BOINC Strange Behavior

Message boards : SETI@home Enhanced : BOINC Strange Behavior

CElliott
Volunteer tester

Joined: 16 Aug 05
Posts: 79
Credit: 71,936,490
RAC: 0
United States
Message 45979 - Posted: 19 May 2013, 23:08:32 UTC
Last modified: 19 May 2013, 23:32:44 UTC

1. Since I installed 2 Kepler-class GPUs, Computer ID 52900 has been unable to download more than a day's supply of WUs; usually there has been less than a day's supply in inventory.

2. On May 14, the server sent 22 AP CPU WUs (which take about 27 hours apiece to complete) with only two hours to finish them; they all timed out and errored out.

3. Last week I suspended all the S@H WUs via Efmer's BoincTasks to finish up the few AP WUs in inventory. Boinc processed only 3 more AP WUs and then quit using the GPUs altogether. It would start using the GPUs again only when I unsuspended the regular S@H WUs, which it then processed.

4. At 21:40:00 5/18/2013, with about 1 day's supply of WUs available, all evenly distributed between AP & S@H and between CPU and GPU, an AP WU finished on GPU 2. After a communication with the server returning that result, Boinc refused to schedule any further work on that GPU; it just left it idle. I did not notice it until about 13:00:00 5/19/2013. I tried everything I could think of with preferences and cc_config.xml to make Boinc use the second GPU; I even reinstalled Boinc 7.0.62, but nothing worked. I could not reboot because I am running a long experiment that would be impossible to restart except at the beginning. Finally, at 13:39:00 I resumed GPUGRID and requested some work. When a GPUGRID WU had completely downloaded at 13:52:00, it preempted the AP WU on GPU 1 and began executing; GPU 2 was still unused. So I requested another WU from GPUGRID, and it was scheduled on GPU 2. At that point Boinc went wild with requests for more work from S@H Beta. At one time there were 971 workunits in the transfer queue; I have not seen that for months. Jealous mistress syndrome? Now I have 306 S@H CPU WUs (2 days), 1169 S@H GPU WUs (5.4 days), 5 AP CPU WUs (0.7 days), and 423 AP GPU WUs (14.5 days). The estimates in days may be a little fuzzy because of changes in apps, but the quantities are exact.

5. Boinc no longer computes the time remaining on a WU by ratio and proportion after the WU is 50% or even 85% complete based on how long it took to complete the first part. When an AP WU is 99.99% complete, the GUI still indicates it has 34 hours remaining.
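For reference, the ratio-and-proportion estimate described in item 5 can be sketched as below. This is only an illustration of the arithmetic, not BOINC's actual code; the function name `remaining_by_proportion` is hypothetical, and the 27-hour elapsed time is borrowed from item 2 purely as a sample figure.

```python
def remaining_by_proportion(elapsed_seconds: float, fraction_done: float) -> float:
    """Estimate time remaining assuming the rest of the task runs at the
    same rate as the part already completed.

    remaining = elapsed * (1 - fraction_done) / fraction_done
    """
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_seconds * (1.0 - fraction_done) / fraction_done


# Sample figures (assumed for illustration): a task that has run for
# 27 hours and reports 99.99% complete.
elapsed = 27 * 3600
estimate = remaining_by_proportion(elapsed, 0.9999)
print(f"estimated remaining: {estimate:.1f} s")
```

Under this estimate a task at 99.99% has only seconds left, which is why a displayed figure of 34 hours at that point looks wrong.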
Richard Haselgrove
Volunteer tester

Joined: 3 Jan 07
Posts: 1451
Credit: 3,272,268
RAC: 0
United Kingdom
Message 45983 - Posted: 20 May 2013, 14:00:41 UTC - in response to Message 45979.  

It's a shame you've never learned to use the tools available in the BOINC client/manager to diagnose issues like this. For example:

20/05/2013 14:53:34 | SETI@home Beta Test | [sched_op] Starting scheduler request
20/05/2013 14:53:34 | SETI@home Beta Test | [work_fetch] request: CPU (0.00 sec, 0.00 inst) NVIDIA (6642.98 sec, 1.00 inst)
20/05/2013 14:53:34 | SETI@home Beta Test | Sending scheduler request: To fetch work.
20/05/2013 14:53:34 | SETI@home Beta Test | Requesting new tasks for NVIDIA
20/05/2013 14:53:34 | SETI@home Beta Test | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
20/05/2013 14:53:34 | SETI@home Beta Test | [sched_op] NVIDIA work request: 6642.98 seconds; 1.00 devices
20/05/2013 14:53:36 | SETI@home Beta Test | Scheduler request completed: got 0 new tasks
20/05/2013 14:53:36 | SETI@home Beta Test | [sched_op] Server version 701
20/05/2013 14:53:36 | SETI@home Beta Test | Project requested delay of 7 seconds
20/05/2013 14:53:36 | SETI@home Beta Test | [work_fetch] backing off NVIDIA 494 sec
20/05/2013 14:53:36 | SETI@home Beta Test | [sched_op] Deferring communication for 7 sec
20/05/2013 14:53:36 | SETI@home Beta Test | [sched_op] Reason: requested by project
20/05/2013 14:53:36 | | [work_fetch] Request work fetch: RPC complete

Raistmer missed that clue yesterday too, which led to both Claggy and myself posting the real reason in the News thread, long before your post.
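For anyone who wants to pull the same information out of their own event log, here is a minimal sketch that summarizes a work-fetch exchange like the one quoted above. It assumes the log lines look exactly like that excerpt; the function name `summarize` and the regular expressions are illustrative, not part of BOINC. The `[sched_op]` and `[work_fetch]` lines appear, as far as I know, only when the `sched_op_debug` and `work_fetch_debug` log flags are enabled in cc_config.xml.

```python
import re

# Three lines copied from the excerpt above.
LOG = """\
20/05/2013 14:53:34 | SETI@home Beta Test | [work_fetch] request: CPU (0.00 sec, 0.00 inst) NVIDIA (6642.98 sec, 1.00 inst)
20/05/2013 14:53:36 | SETI@home Beta Test | Scheduler request completed: got 0 new tasks
20/05/2013 14:53:36 | SETI@home Beta Test | [work_fetch] backing off NVIDIA 494 sec
"""

# Patterns are assumptions based only on the format of the quoted lines.
REQUEST = re.compile(r"\[work_fetch\] request:.*?NVIDIA \(([\d.]+) sec")
GOT = re.compile(r"got (\d+) new tasks")
BACKOFF = re.compile(r"\[work_fetch\] backing off (\w+) (\d+) sec")


def summarize(log_text: str) -> dict:
    """Return seconds of NVIDIA work requested, tasks received, and backoff."""
    summary = {"requested_sec": None, "tasks_received": None, "backoff": None}
    for line in log_text.splitlines():
        if (m := REQUEST.search(line)):
            summary["requested_sec"] = float(m.group(1))
        elif (m := GOT.search(line)):
            summary["tasks_received"] = int(m.group(1))
        elif (m := BACKOFF.search(line)):
            summary["backoff"] = (m.group(1), int(m.group(2)))
    return summary


print(summarize(LOG))
```

Read this way, the excerpt shows the client asking for NVIDIA work, receiving no tasks, and then backing off further NVIDIA requests for that project. The log lines alone do not say why the server sent nothing.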



 