What "spare parts" would you keep to allow a rapid recovery from failure?
Author | Message |
---|---|
Joined: 28 Nov 02 Posts: 5126 Credit: 276,046,078 RAC: 462 |
I have reduced my commitment to Seti@Home crunching to one dedicated cruncher and one "daily driver"/cruncher. I have been selling my surplus hardware. I want to keep enough extra parts that I can probably get my dedicated cruncher back up promptly without waiting for an order to ship and arrive. What would you keep? What has failed on you? I have had two MBs go south, an overworked PSU go south and one or two GTX 1060s go south. Tom A proud member of the OFA (Old Farts Association). |
Joined: 25 Nov 01 Posts: 21533 Credit: 7,508,002 RAC: 20 |
Just keep PSUs and disks as spares. All other part failures have been on the order of once a decade! Happy fast crunchin', Martin See new freedom: Mageia Linux Take a look for yourself: Linux Format The Future is what We all make IT (GPLv3) |
rob smith Joined: 7 Mar 03 Posts: 22652 Credit: 416,307,556 RAC: 380 |
PSU most certainly. Disk drive - well, that's a debate. But I do have a spare disk which is a clone of my Windows 7 PC, as that is my daily driver. Anything else - I'm only a couple of hours away from three different component shifters, so it's no big deal. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
Holdolin Joined: 10 Apr 19 Posts: 68 Credit: 88,777,750 RAC: 30 |
As others have suggested, most definitely a PSU and perhaps an HDD/SSD. Those are the most common points of failure in my experience. I can't say much though, as my basement looks like a parts depot and I could easily suggest keeping much more, but if you have no need or desire for that, then just the mentioned stuff should work for ya. |
Ian&Steve C. Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
I have a lot of spare parts, but mainly as a result of parts collection; I haven't really bought anything specifically to use as backups. I have a couple of PSUs. I have tons of memory: most of my systems use DDR3 ECC RDIMMs, and they have more memory installed than they need, so I can redistribute if necessary. I have some spare small SSDs lying around for OS disks. I also have lots of old HDDs that are really too old/small to be useful for anything; these are last-resort backups for OS drives. Seti@Home classic workunits: 29,492 CPU time: 134,419 hours |
Joined: 6 Nov 99 Posts: 716 Credit: 8,032,827 RAC: 62 |
PSU first for sure, but my last one decided to kill my RAM too... Some good caps to repair the MB come in handy sometimes too ^^ |
Phil Burden Joined: 26 Oct 00 Posts: 264 Credit: 22,303,899 RAC: 0 |
And, whatever you do, don't forget Murphy's Law. Whatever you decide to keep as spares, it'll be something else that fails ;-) P. P.S. Never had anything fail in a PC, only hard drives in a NAS box. |
Ianab Joined: 11 Jun 08 Posts: 732 Credit: 20,635,586 RAC: 5 |
Just keep a whole "warm spare": some random dumpster-dived i3 (depends on the quality of your local dumpsters) that's all set up and ready to go. If one of your real machines dies, you either have all the parts to fix it or, if it's lost a system board, you have a working spare PC. Power isn't an issue, because it's not plugged in. Space isn't an issue, because a PC case doesn't take up any more space than a box of PC parts. Wife puts up with my "spares" because she knows that if her (dumpster-dived) PC dies, I can have a serviceable backup under her desk in 5 mins. The 10 "warm spares" in the corner may suggest it's time to have a scrap metal session though :-D There are some perfectly good C2D machines in there with valid Win10s, and if that's all you had, they would run and "work". It's just that they are likely worth more as $2 of scrap metal. But just pick your best "old" machine and stick it in the corner, rather than scrapping or selling it. Then no matter what fails, you have good parts to fix it. |
Ianab Joined: 11 Jun 08 Posts: 732 Credit: 20,635,586 RAC: 5 |
I've seen pretty much EVERY part of a PC fail, but that's over 100s of machines and several decades. Heck, throw in a good thunderstorm and I've seen PCs where EVERY part was toasted. When the modem cable is welded into the socket on both the PC and wall end, it's likely there was some current and voltage slightly over spec... By maybe 100,000 volts? But hey, a "warm spare" in the cupboard would still be good, once you got power and internet back on again. |
Joined: 6 Nov 99 Posts: 716 Credit: 8,032,827 RAC: 62 |
and don't forget to have a backup copy of your BOINC & BOINC_DATA folders ^^ |
Joined: 28 Nov 02 Posts: 5126 Credit: 276,046,078 RAC: 462 |
"and don't forget to have a backup copy of your BOINC & BOINC_DATA folders ^^" +1 A proud member of the OFA (Old Farts Association). |
rob smith Joined: 7 Mar 03 Posts: 22652 Credit: 416,307,556 RAC: 380 |
In reality the only files you need to back up are configuration files, executables and libraries. Data is not worth backing up unless you back up continuously, as it is continuously changing; in the event of a really bad crash you just wave goodbye to a load of tasks. That is why it is a bad idea to have over-inflated caches: having a couple of hundred tasks waiting to time out is one thing, but having several thousand is even worse. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
Ian&Steve C. Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
"In reality the only files you need to back up are configuration files, executables and libraries. Data is not worth backing up unless you back up continuously, as it is continuously changing; in the event of a really bad crash you just wave goodbye to a load of tasks. That is why it is a bad idea to have over-inflated caches: having a couple of hundred tasks waiting to time out is one thing, but having several thousand is even worse." You can detach the system from the project, which abandons all the tasks immediately, and they get redistributed to other users. No waiting for a timeout. Seti@Home classic workunits: 29,492 CPU time: 134,419 hours |
rob smith Joined: 7 Mar 03 Posts: 22652 Credit: 416,307,556 RAC: 380 |
But does that work if the disk on which they are is "a smoldering lump of rubble"? Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
Joined: 15 May 99 Posts: 3824 Credit: 1,114,826,392 RAC: 3,319 |
"...which is why it is a bad idea to have over-inflated caches as having a couple of hundred tasks waiting to time-out is one thing, but to have several thousand is even worse." This assumes that the person is just going to let them be abandoned and time out. When I lost a hard drive full of tasks, I recovered all of them. By doing so I was able to improve the process and, more importantly, get the resend limit increased from 20 to 80 per request. So, it was actually a net positive for everyone. |
Ian&Steve C. Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
"But does that work if the disk on which they are is 'a smoldering lump of rubble'?" I don't see why not: (1) reinstall OS/BOINC to a new disk; (2) set the new host name to be the same as before; (3) grab the "Number of times client has contacted server" value from the host details page; (4) increment that number by one and add it to your client_state.xml file; (5) the new system now looks like the "old" system; (6) detach. Seti@Home classic workunits: 29,492 CPU time: 134,419 hours |
Joined: 30 Nov 03 Posts: 66504 Credit: 55,293,173 RAC: 49 |
Video cards and PSUs are what I'd have a few extras of, just in case. CA HSR built a foundation, is laying Track! PRR T1 Class 4-4-4-4 #5550 Loco, US's 1st HST |
Joined: 24 Jan 00 Posts: 37301 Credit: 261,360,520 RAC: 489 |
Here it's 2 "warm spares", a PSU, 2 monitors and a 250GB SSD. Cheers. |
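
For anyone who wants to script the "configuration files only" backup suggested in the thread, here is a minimal Python sketch. It is not an official BOINC tool. The file-name patterns (`cc_config.xml`, `app_config.xml`, `global_prefs_override.xml`, `account_*.xml`) are assumptions about what a typical install keeps in its data directory, and the function name `backup_boinc_config` is made up for this example. Extend the patterns if you also want to keep custom executables or libraries.

```python
import shutil
from pathlib import Path

# Assumed names of configuration files worth keeping; adjust for your own setup.
CONFIG_PATTERNS = (
    "cc_config.xml",
    "app_config.xml",
    "global_prefs_override.xml",
    "account_*.xml",
)


def backup_boinc_config(data_dir, backup_dir, patterns=CONFIG_PATTERNS):
    """Copy files matching `patterns` from data_dir into backup_dir.

    The search is recursive and relative paths are preserved, so a
    per-project app_config.xml ends up in the same sub-folder of the
    backup. Task data is skipped simply because it matches no pattern.
    Returns the sorted list of relative paths (POSIX style) that were copied.
    """
    data_dir = Path(data_dir)
    backup_dir = Path(backup_dir)
    copied = []
    for pattern in patterns:
        for src in data_dir.rglob(pattern):
            if not src.is_file():
                continue
            rel = src.relative_to(data_dir)
            dest = backup_dir / rel
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dest)  # copy2 also keeps timestamps
            copied.append(rel.as_posix())
    return sorted(copied)
```

Call it with your own paths, for example `backup_boinc_config("/path/to/BOINC_DATA", "/path/to/backup")`, and keep the backup folder outside the data directory so the recursive search does not pick up earlier copies.
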
©2025 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.