Experiment for server operations check...
Author | Message |
---|---|
Claggy Volunteer tester Send message Joined: 29 May 06 Posts: 1037 Credit: 8,436,558 RAC: 1 |
O.K Thanks Claggy |
Claggy Volunteer tester Send message Joined: 29 May 06 Posts: 1037 Credit: 8,436,558 RAC: 1 |
New scheduler is on. Let me know if you have problems with anything.
Looks to be operating correctly now:
16/11/2012 18:07:42 SETI@home Beta Test [sched_op_debug] Starting scheduler request
16/11/2012 18:07:42 SETI@home Beta Test Sending scheduler request: To fetch work.
16/11/2012 18:07:42 SETI@home Beta Test Requesting new tasks for GPU
16/11/2012 18:07:42 SETI@home Beta Test [sched_op_debug] CPU work request: 0.00 seconds; 0.00 CPUs
16/11/2012 18:07:42 SETI@home Beta Test [sched_op_debug] NVIDIA GPU work request: 0.00 seconds; 0.00 GPUs
16/11/2012 18:07:42 SETI@home Beta Test [sched_op_debug] ATI GPU work request: 10562.03 seconds; 0.00 GPUs
16/11/2012 18:08:01 SETI@home Beta Test Scheduler request completed: got 6 new tasks
16/11/2012 18:08:01 SETI@home Beta Test [sched_op_debug] Server version 701
16/11/2012 18:08:01 SETI@home Beta Test Message from server: Resent lost task 05ap10al.19223.2117.140733193388043.14.245_1
16/11/2012 18:08:01 SETI@home Beta Test Message from server: Resent lost task 05ap10al.19223.2117.140733193388043.14.246_1
16/11/2012 18:08:01 SETI@home Beta Test Message from server: Resent lost task 05ap10al.19223.2117.140733193388043.14.247_1
16/11/2012 18:08:01 SETI@home Beta Test Message from server: Resent lost task 05ap10al.19223.2117.140733193388043.14.248_1
16/11/2012 18:08:01 SETI@home Beta Test Message from server: Resent lost task 05ap10al.19223.2117.140733193388043.14.249_1
16/11/2012 18:08:01 SETI@home Beta Test Message from server: Resent lost task 05ap10al.19223.2117.140733193388043.14.250_1
16/11/2012 18:08:01 SETI@home Beta Test Project requested delay of 7 seconds
16/11/2012 18:08:01 SETI@home Beta Test [sched_op_debug] estimated total CPU job duration: 0 seconds
16/11/2012 18:08:01 SETI@home Beta Test [sched_op_debug] estimated total NVIDIA GPU job duration: 0 seconds
16/11/2012 18:08:01 SETI@home Beta Test [sched_op_debug] estimated total ATI GPU job duration: 9322 seconds
16/11/2012 18:08:01 SETI@home Beta Test [sched_op_debug] Deferring communication for 7 sec
16/11/2012 18:08:01 SETI@home Beta Test [sched_op_debug] Reason: requested by project
and:
16/11/2012 18:26:21 SETI@home Beta Test [sched_op_debug] Starting scheduler request
16/11/2012 18:26:21 SETI@home Beta Test Sending scheduler request: To fetch work.
16/11/2012 18:26:21 SETI@home Beta Test Requesting new tasks for CPU and GPU
16/11/2012 18:26:21 SETI@home Beta Test [sched_op_debug] CPU work request: 54.41 seconds; 0.00 CPUs
16/11/2012 18:26:21 SETI@home Beta Test [sched_op_debug] NVIDIA GPU work request: 0.00 seconds; 0.00 GPUs
16/11/2012 18:26:21 SETI@home Beta Test [sched_op_debug] ATI GPU work request: 5456.83 seconds; 0.00 GPUs
16/11/2012 18:26:37 SETI@home Beta Test Scheduler request completed: got 5 new tasks
16/11/2012 18:26:37 SETI@home Beta Test [sched_op_debug] Server version 701
16/11/2012 18:26:37 SETI@home Beta Test Message from server: Resent lost task 05ap10al.21593.15205.140733193388042.14.18_0
16/11/2012 18:26:37 SETI@home Beta Test Message from server: Resent lost task 05ap10al.21593.15205.140733193388042.14.19_0
16/11/2012 18:26:37 SETI@home Beta Test Message from server: Resent lost task 05ap10al.21593.15205.140733193388042.14.20_0
16/11/2012 18:26:37 SETI@home Beta Test Message from server: Resent lost task 05ap10al.21593.15205.140733193388042.14.21_0
16/11/2012 18:26:37 SETI@home Beta Test Message from server: Resent lost task 05ap10al.21593.15205.140733193388042.14.22_0
16/11/2012 18:26:37 SETI@home Beta Test Project requested delay of 7 seconds
16/11/2012 18:26:37 SETI@home Beta Test [sched_op_debug] estimated total CPU job duration: 9445 seconds
16/11/2012 18:26:37 SETI@home Beta Test [sched_op_debug] estimated total NVIDIA GPU job duration: 0 seconds
16/11/2012 18:26:37 SETI@home Beta Test [sched_op_debug] estimated total ATI GPU job duration: 6199 seconds
16/11/2012 18:26:37 SETI@home Beta Test [sched_op_debug] Deferring communication for 7 sec
16/11/2012 18:26:37 SETI@home Beta Test [sched_op_debug] Reason: requested by project
and normal requests work fine:
16/11/2012 18:34:31 SETI@home Beta Test [sched_op_debug] Starting scheduler request
16/11/2012 18:34:31 SETI@home Beta Test Sending scheduler request: To fetch work.
16/11/2012 18:34:31 SETI@home Beta Test Requesting new tasks for CPU and GPU
16/11/2012 18:34:31 SETI@home Beta Test [sched_op_debug] CPU work request: 269.28 seconds; 0.00 CPUs
16/11/2012 18:34:31 SETI@home Beta Test [sched_op_debug] NVIDIA GPU work request: 14124.74 seconds; 0.00 GPUs
16/11/2012 18:34:31 SETI@home Beta Test [sched_op_debug] ATI GPU work request: 20406.80 seconds; 0.00 GPUs
16/11/2012 18:36:53 SETI@home Beta Test Scheduler request completed: got 29 new tasks
16/11/2012 18:36:53 SETI@home Beta Test [sched_op_debug] Server version 701
16/11/2012 18:36:53 SETI@home Beta Test Project requested delay of 7 seconds
16/11/2012 18:36:53 SETI@home Beta Test [sched_op_debug] estimated total CPU job duration: 9467 seconds
16/11/2012 18:36:53 SETI@home Beta Test [sched_op_debug] estimated total NVIDIA GPU job duration: 12990 seconds
16/11/2012 18:36:53 SETI@home Beta Test [sched_op_debug] estimated total ATI GPU job duration: 18677 seconds
16/11/2012 18:36:53 SETI@home Beta Test [sched_op_debug] Deferring communication for 7 sec
16/11/2012 18:36:53 SETI@home Beta Test [sched_op_debug] Reason: requested by project
Note: this has all been done using a proxy to get round the scheduler timeout that both the Main and Beta projects are suffering.
Claggy |
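For anyone who wants to check a log like the one above without counting by eye, here is a minimal sketch in Python. It pairs each "Scheduler request completed: got N new tasks" line with the number of "Resent lost task" messages that follow it. The function name `summarize` and the `sample` list are made up for illustration; the sample lines are copied from the log in the post, and the script is not part of BOINC.

```python
import re

# Matches the client's "Scheduler request completed: got N new tasks" line.
GOT_TASKS = re.compile(r"Scheduler request completed: got (\d+) new tasks")
RESENT = "Message from server: Resent lost task"


def summarize(log_lines):
    """Return one (tasks_received, tasks_resent) pair per scheduler reply."""
    summaries = []
    for line in log_lines:
        match = GOT_TASKS.search(line)
        if match:
            # A new reply starts here; its resent-task messages follow it.
            summaries.append([int(match.group(1)), 0])
        elif RESENT in line and summaries:
            summaries[-1][1] += 1
    return [tuple(pair) for pair in summaries]


prefix = "16/11/2012 18:26:37 SETI@home Beta Test "
task = "05ap10al.21593.15205.140733193388042.14."
sample = [prefix + "Scheduler request completed: got 5 new tasks"]
# The five resent tasks from the 18:26:37 reply (.18_0 through .22_0).
sample += [f"{prefix}{RESENT} {task}{n}_0" for n in range(18, 23)]
sample.append(
    "16/11/2012 18:36:53 SETI@home Beta Test "
    "Scheduler request completed: got 29 new tasks"
)

for received, resent in summarize(sample):
    print(f"received {received} tasks, {resent} of them resent lost tasks")
```

In the two replies sampled here, the first consists entirely of resent lost tasks and the second contains none, which matches the "normal requests work fine" remark in the post.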
Juha Volunteer tester Send message Joined: 18 Jun 08 Posts: 76 Credit: 113,089 RAC: 0 |
Eric, check validator logic once again. I haven't read the server code carefully enough to be sure, but there might be a small window of opportunity for a late report just as the files are being scheduled for deletion. Or it might be something else. Anyway, I made some changes that should take care of the stuck results for good. The first part is the same as before. If the result is invalid or can't be opened, the Astropulse side of the validator lies to the BOINC side, telling it to retry the validation later. I fixed that by making sure retry is signalled only when necessary, which is when your file server is not accessible. Previously the code couldn't tell the difference between a missing file and a missing file server. I improved the ResultFile class to do a better diagnosis of the problem. The information is then carried in the ResultFileError exception to the rest of the code. In the case of a missing file server the BOINC side is told to retry later; otherwise a missing file will get the result marked with Validate Error and Invalid. Also, the log messages got a bit of a touch-up. Instead of logging that an error occurred, the code now tries to tell where and what kind of error occurred and what caused it in the first place. And the rest of the changes. The exception handling in the code was a very fine example of how not to do exception handling. Basically it was just emulating the traditional way of returning an error code. Cleaning that up made the code easier to follow imho. Combining all three changes into one made the patch a bit messy, but considering that all of them are to the same parts of the code I don't think separating the changes would have made the patch that much easier to follow. (Or I'm just too lazy to redo it.) The number of changes is a bit larger than last time, so instead of inlining the patch I'm just going to give a link to the patch file(1). It was made with git-svn. I don't know if svn patch can handle it but it does seem to be readable by patch. 
You very likely need to be in the astropulse directory to apply the patch. Just in case there are some problems with the patch, I packaged the changed files. The end result should be the same whichever you choose to use. Link to package(2). (1) That was a direct link. In case it doesn't work, here's the patch via Google Drive UI. (2) Same as (1). Package via UI. |
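The retry decision described in the post can be sketched roughly as follows. This is an illustration in Python, not the patch itself. Only the name `ResultFileError` comes from the post; `Failure`, `open_result_file`, `validate`, and the "is the upload root directory visible?" test are assumptions made for the example, since the post does not say how the ResultFile class tells a missing file from a missing file server.

```python
import os
import tempfile
from enum import Enum


class Failure(Enum):
    FILE_MISSING = "file missing"
    SERVER_UNREACHABLE = "file server unreachable"


class ResultFileError(Exception):
    """Carries the diagnosis to the caller instead of a bare error code."""

    def __init__(self, failure, path):
        super().__init__(f"{failure.value}: {path}")
        self.failure = failure
        self.path = path


def open_result_file(path, server_root):
    """Open a result file; if that fails, diagnose why before raising."""
    try:
        return open(path, "rb")
    except OSError as exc:
        # Assumption for this sketch: if the root directory itself is not
        # visible, the file server is down rather than the file missing.
        if not os.path.isdir(server_root):
            raise ResultFileError(Failure.SERVER_UNREACHABLE, path) from exc
        raise ResultFileError(Failure.FILE_MISSING, path) from exc


def validate(path, server_root):
    """Return 'ok', 'retry' (server down) or 'invalid' (file missing)."""
    try:
        with open_result_file(path, server_root) as handle:
            handle.read(1)
    except ResultFileError as err:
        # Retry only when waiting can help; a missing file never comes back.
        if err.failure is Failure.SERVER_UNREACHABLE:
            return "retry"
        return "invalid"
    return "ok"


with tempfile.TemporaryDirectory() as root:
    present = os.path.join(root, "result.out")
    with open(present, "wb") as f:
        f.write(b"x")
    outcomes = {
        "present": validate(present, root),
        "missing": validate(os.path.join(root, "gone.out"), root),
    }
# The temporary directory is gone now, standing in for a dead file server.
outcomes["server down"] = validate(present, root)
print(outcomes)
```

The point of the design is that the exception carries the cause, so the one place that talks to the BOINC side can decide between retrying and marking the result invalid, instead of every caller passing an error code upward.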
Volunteer tester Send message Joined: 18 Aug 05 Posts: 2423 Credit: 15,878,738 RAC: 0 |
Today I received a fresh pack of cuda22 tasks along with about the same number of cuda23 tasks. cuda22 is almost twice as slow on this host. Does that mean this issue remains unfixed? |
Richard Haselgrove Volunteer tester Send message Joined: 3 Jan 07 Posts: 1444 Credit: 3,264,298 RAC: 1 |
Today I received a fresh pack of cuda22 tasks along with about the same number of cuda23 tasks. Different bug. Message 44303. |