Message boards :
Number crunching :
Cannot get any work with 3 GPU, no queue size, one GPU sometimes idle
Joined: 27 May 99 · Posts: 309 · Credit: 70,759,933 · RAC: 3
====== Brought this over from the BOINC forum, as maybe the problem is on SETI's side ======

For some time I've noticed that SETI has exactly one task running on each of the three GPUs, with no queue depth behind them. Since the project generally has 100,000 or so tasks ready to send, something is wrong. All my systems use the account manager BAM!, but it seems the preferences at BAM! are not used (they show 0.1 and 0.25), the same as the local client preferences (according to BoincTasks). I went to SETI and set preferences there for a 0.25-day queue with 0.50 days additional (it used to be 0.1 and 0.25) just to see what happened. I did an update, as the project requires, and the event log reported:
3348 SETI@home 5/2/2019 10:53:52 AM Sending scheduler request: Requested by user.
3349 SETI@home 5/2/2019 10:53:52 AM Not requesting tasks: don't need (CPU: ; AMD/ATI GPU: )
3350 SETI@home 5/2/2019 10:53:54 AM Scheduler request completed
3351 SETI@home 5/2/2019 10:53:54 AM General prefs: from SETI@home (last modified 02-May-2019 10:53:54)
3352 SETI@home 5/2/2019 10:53:54 AM Host location: none
3353 SETI@home 5/2/2019 10:53:54 AM General prefs: using your defaults
3354 5/2/2019 10:53:54 AM Reading preferences override file
3355 5/2/2019 10:53:54 AM Preferences:
3356 5/2/2019 10:53:54 AM max memory usage when active: 6139.56 MB
3357 5/2/2019 10:53:54 AM max memory usage when idle: 11051.20 MB
3358 5/2/2019 10:53:54 AM max disk usage: 116.17 GB
3359 5/2/2019 10:53:54 AM max CPUs used: 20
3360 5/2/2019 10:53:54 AM (to change preferences, visit a project web site or select Preferences in the Manager)
3603 SETI@home 5/2/2019 11:42:31 AM General prefs: from SETI@home (last modified 02-May-2019 10:53:55)
3604 SETI@home 5/2/2019 11:42:31 AM Host location: none
3605 SETI@home 5/2/2019 11:42:31 AM General prefs: using your defaults
3606 5/2/2019 11:42:31 AM Reading preferences override file
3607 5/2/2019 11:42:31 AM Preferences:
3608 5/2/2019 11:42:31 AM max memory usage when active: 6139.56 MB
3609 5/2/2019 11:42:31 AM max memory usage when idle: 11051.20 MB
3610 5/2/2019 11:42:31 AM max disk usage: 116.17 GB
3611 5/2/2019 11:42:31 AM max CPUs used: 20
3612 5/2/2019 11:42:31 AM (to change preferences, visit a project web site or select Preferences in the Manager)
Joined: 29 Apr 01 · Posts: 13164 · Credit: 1,160,866,277 · RAC: 1,873
You must have some configuration conflicting with resource allocation: either the host has more GPU work for other projects and SETI only gets the last little slice of the GPU allocation, or you have mistakenly put a decimal point in the wrong place in your Preferences.

What does setting the sched_op_debug flag in Logging options show for the work request? It will show the number of seconds of work requested for both CPU and GPU. You could also set work_fetch_debug and look at its more detailed report. With even a 0.5-day work cache you should get 100 tasks for each GPU and another 100 tasks for the CPU.

Seti@Home classic workunits: 20,676 · CPU time: 74,226 hours
A proud member of the OFA (Old Farts Association)
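For reference, both flags named here live in the client's log_flags section; a minimal cc_config.xml fragment that enables them would look like this (after editing, use Options → Read config files, or restart the client):

```xml
<cc_config>
  <log_flags>
    <!-- per-request summary: seconds of work asked for, tasks received -->
    <sched_op_debug>1</sched_op_debug>
    <!-- detailed work-fetch decisions, including the target work buffer -->
    <work_fetch_debug>1</work_fetch_debug>
  </log_flags>
</cc_config>
```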
Joined: 27 May 99 · Posts: 309 · Credit: 70,759,933 · RAC: 3
> You must have some configuration conflicting with resource allocation. The host has more GPU work for other projects and SETI only gets the last little slice of GPU allocation. Or you have mistakenly put a decimal point in the wrong place in your Preferences.

OK, I set those debug flags. Results are here. If the above does not work, remove the www. I have no idea which sites use which protocol. It would be nice if all BOINC projects upgraded to allow storage in the cloud like newer forums/communities.

Going to make a guess after looking at the chatter: I have resource share set to 0 because I want SETI to run behind all other GPU tasks. There are no other GPU tasks on this system, nor do I plan on any, but that might change. Maybe that is the problem?
Joined: 29 Apr 01 · Posts: 13164 · Credit: 1,160,866,277 · RAC: 1,873
Yes, that is exactly the problem. When you set the share to 0 for a project, the client requests exactly one task at a time, finishes it when the other projects are idle, and continues to ask for only a single task until the other projects pick up work again. That way you don't get a ton of work for a backup project when your prime projects get work again, and then have to crunch through the unwanted secondary backup-project work before being able to request work from your prime project.

If you look at your work_fetch_debug output, you are asking for exactly 0.75 days of work, which is the 0.25-day cache plus 0.50 days of additional work:

[work_fetch] target work buffer: 21600.00 + 43200.00 sec

If you wanted to get more SETI work, you could set the host to another venue and then bump the project share up to 1 or 5 or something in relation to your prime projects' shares. I would also remove the additional days of work by setting it to 0.01 days, which is the lowest BOINC will allow since it won't take zero as input, and I would reduce your work cache even further, to maybe 0.1 days. That would only request 8,640 seconds of work, which the GPU could easily crunch through in a short time.
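The buffer arithmetic behind that log line is just a days-to-seconds conversion; a minimal sketch (the function name is mine, not BOINC's):

```python
# BOINC derives its target work buffer, in seconds, from the two
# day-valued cache preferences (86,400 seconds per day).
SECONDS_PER_DAY = 86400

def target_work_buffer(min_days, additional_days):
    """Return (min_buf, additional_buf) in seconds,
    matching the [work_fetch] 'target work buffer' log line."""
    return min_days * SECONDS_PER_DAY, additional_days * SECONDS_PER_DAY

# The 0.25 + 0.50 day settings from the post:
print(target_work_buffer(0.25, 0.50))  # (21600.0, 43200.0)

# The suggested smaller 0.1 + 0.01 day cache:
print(target_work_buffer(0.1, 0.01))   # (8640.0, 864.0)
```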
Richard Haselgrove · Joined: 4 Jul 99 · Posts: 14690 · Credit: 200,643,578 · RAC: 874
Keith is right - Resource share 0 is a special case value meaning 'backup project', work to be fetched only when resources are idle and no other project can supply work. Using it here on a machine with 3 GPUs is especially problematic because of the enforced 5-minute delay between work requests: if you need work between those requests, you can't get it. |
Joined: 27 May 99 · Posts: 309 · Credit: 70,759,933 · RAC: 3
Thanks Keith, Richard! Yes, after changing the priority at BAM! and doing a sync, about 5 minutes later I got a boatload of tasks. I was unaware of the "0" special case, but I did know about the problem of low-priority projects ending up with a lot of WUs that never complete.

This is what I have been working on:

- I have GPUs with extremely fast double-precision float (S9x00, HD79x0): they work best on Milkyway.
- All my other GPUs (RX5x0, GTX1070) have superior single precision over the above AMD boards but really suck at double precision, typically a 1:16 ratio — a waste of electricity on Milkyway.
- Milkyway and SETI go offline for maintenance regularly. I want priority on science projects, with fallback to non-science projects when those are offline.
- Not all projects have ATI apps; most have nVidia. So falling back to Asteroids cannot be done on ATI systems (for example).
- There is a problem with Milkyway in that they have work but do not supply it for some reason — some type of bug; a perfect example is HERE. During those 10-15 minute gaps my secondary projects suck up work units, and if I set their priority too low there will be real problems later near their deadlines.

I am thinking that I cannot use BAM! or the BOINC client's general preferences, and need to use project preferences instead. I'm not sure how to do this, or if it is even possible while keeping BAM! as account manager. I am not sure if Milkyway even looks at project preferences like 0.1 and 0.25; I seem to get exactly 200 WUs for each GPU. I cannot change what Milkyway is doing. The best I can do to avoid idle time is to fall back to SETI or Einstein on those double-precision AMD boards, so I will try a low resource share for SETI and Einstein there. My other GPUs do not run Milkyway, nor do I plan for them to, other than getting statistics for various studies I am doing.

If you have any suggestions let me know. Maybe there should be a wiki about this, and also about the cost (kWh) of running various projects.
Joined: 29 Apr 01 · Posts: 13164 · Credit: 1,160,866,277 · RAC: 1,873
> I am not sure if Milkyway even looks at project preferences like 0.1 and 0.25. I seem to get exactly 200 WUs for each GPU.

I think you may be correct that MW doesn't obey the normal BOINC convention of cache allotment. I think you just get 200 tasks for each GPU — how fast or slow the GPUs are, or the GFLOPS performance of the app, doesn't matter in determining how much work you get.

Right now MW@home is very broken since the server code update: the preference choices for controlling which project you get work for have been removed, there is the ongoing issue with fast clients going idle until all their work is reported before asking for more, plus the obvious misconfiguration in the feeder size. But we have new scientists maintaining the project and they are starting off at the bottom of the learning curve. We will just have to have patience until they can get the project working correctly again.
Richard Haselgrove · Joined: 4 Jul 99 · Posts: 14690 · Credit: 200,643,578 · RAC: 874
> I think you may be correct that MW doesn't obey the normal BOINC convention of cache allotment....

I'd be interested if you can stand that up. Remember that what you ask for is a client decision: what you get is a server decision. They are independent, but should be correlated. The ideal is that you would see

02/05/2019 23:20:56 | SETI@home | [sched_op] NVIDIA GPU work request: 7897.15 seconds; 0.00 devices
02/05/2019 23:20:59 | SETI@home | [sched_op] estimated total NVIDIA GPU task duration: 7922 seconds

where 'received' exceeds 'request' by no more than one task's estimated runtime.
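That rule of thumb can be sketched as a small calculation (illustrative names only, not BOINC code): a scheduler honouring the request sends the smallest number of tasks whose total estimated runtime covers the requested seconds.

```python
import math

def tasks_to_fill(request_seconds, est_runtime):
    """Smallest task count whose total estimated runtime
    covers the requested seconds of work."""
    return math.ceil(request_seconds / est_runtime)

# The 7897-second request from the log, with ~7922-second tasks,
# is satisfied by a single task, so 'received' exceeds 'request'
# by less than one task's estimated runtime.
print(tasks_to_fill(7897.15, 7922))  # 1

# A 0.25-day (21600 s) request with the same task size needs 3 tasks.
print(tasks_to_fill(21600, 7922))    # 3
```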
Joined: 29 Apr 01 · Posts: 13164 · Credit: 1,160,866,277 · RAC: 1,873
> I think you may be correct that MW doesn't obey the normal BOINC convention of cache allotment....

I'd have to replicate my earlier test from back when the number of tasks allowed per GPU was set at 40 or 80. I set my cache allotment to 0.1 days and 0.01 additional days and still received exactly 40 tasks; changing to 4 days of cache changed nothing. The server configuration sets the max allowed per GPU: it was 40 a few years ago, then bumped to 80 a year ago, and then earlier this year bumped to 200 per GPU because of all the fast ATI hosts complaining they crunched through the tasks too fast.

https://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=4424&postid=68441
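A per-GPU cap like the one described would make the client's cache setting irrelevant; a hypothetical sketch of such server-side logic (names and structure are illustrative, not MilkyWay's actual code):

```python
def tasks_to_send(requested, in_progress, per_gpu_cap, n_gpus):
    """Cap the tasks sent at per_gpu_cap per GPU, counting tasks
    already in progress, regardless of how much the client asked for."""
    cap = per_gpu_cap * n_gpus
    return max(0, min(requested, cap - in_progress))

# With the 200-per-GPU cap and an empty queue, a 3-GPU host is
# topped up to at most 600 tasks no matter how big its cache request:
print(tasks_to_send(requested=1000, in_progress=0, per_gpu_cap=200, n_gpus=3))  # 600

# Once near the cap, even a large request gets only the remainder:
print(tasks_to_send(requested=50, in_progress=590, per_gpu_cap=200, n_gpus=3))  # 10
```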
©2025 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.