opencl_nvidia_SoG issues thread

Profile Raistmer
Volunteer tester
Joined: 18 Aug 05
Posts: 2423
Credit: 15,878,738
RAC: 0
Russia
Message 56442 - Posted: 3 Feb 2016, 6:18:49 UTC

Please post here any issues you have with this application: opencl_nvidia_SoG

If you experience lags or driver restarts, post your host configuration and the results of the troubleshooting sequence described here:

http://setiweb.ssl.berkeley.edu/beta/forum_thread.php?id=2266&postid=55879#55879

Specifically, report the values at which the restarts stopped or the lags disappeared.
News about SETI opt app releases: https://twitter.com/Raistmer
ID: 56442 · Report as offensive
Profile Jimbocous
Volunteer tester
Joined: 9 Jan 16
Posts: 51
Credit: 1,038,205
RAC: 0
United States
Message 56478 - Posted: 4 Feb 2016, 6:02:22 UTC

Not sure if this is a problem, or just a difference in how SoG processes the data.
The progress figure (percentage complete) does not seem to track actual progress accurately, or at least does not advance linearly. It stays below 10% for almost half of the processing time. At 50% of the run time it shows 15-20%. From about 60% of the run time, it starts jumping by as much as 10% per 3-second update. I'm not saying this is a problem per se, but the behavior is not consistent with OpenCL_sah or Cuda42/50 on V8, so I thought I'd mention it in case it is a necessary or easy fix.
Other than that, SoG is a significant improvement over other OpenCL or Cuda apps on my crunchers.
I turned all my boxes over to SoG work for this evening, to build the count, and have done a few hundred. If there are results I can provide that cannot be found by looking at my public profile or computer data, please let me know what data and how and where to provide it and I will gladly do so. Regards, Jim ...
If I can help out by testing something, please let me know.
Available hardware and software is listed in my profile here.
ID: 56478 · Report as offensive
Zalster
Volunteer tester

Joined: 30 Dec 13
Posts: 258
Credit: 12,340,341
RAC: 0
United States
Message 56479 - Posted: 4 Feb 2016, 6:08:22 UTC - in response to Message 56478.  

When I asked him what SoG stood for, he said it was Signals on GPU, then gave this explanation:

It means spikes, autocorrs and gaussians are stored completely on GPU now and retrieved only on checkpoints. Much less synching for some ARs and hence far fewer issues with NV's implementation of that synching in their OpenCL runtime.


I'm guessing that means it keeps all those results on the GPU until it's nearly finished and then starts moving them back, which is why we see a slow start but a rapid finish.

I have to wonder whether the amount of RAM on the card in any way affects how fast the work runs or how many instances can run on the card.
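
The explanation quoted above (signals kept on the GPU and retrieved only at checkpoints) can be illustrated with a toy model. This is only a sketch, not the app's actual code: `ToyGpu`, `find_signals`, `read_back`, `run`, the chunk count and the checkpoint interval are all made-up names and numbers, and no real OpenCL calls are involved. It simply counts how many host/device read-backs happen when results are fetched after every chunk versus only at checkpoints.

```python
from dataclasses import dataclass, field


@dataclass
class ToyGpu:
    """Hypothetical stand-in for a GPU; counts host read-backs."""
    device_signals: list = field(default_factory=list)
    readbacks: int = 0

    def find_signals(self, chunk):
        # Pretend every chunk yields one signal that stays in device memory.
        self.device_signals.append(f"signal-from-chunk-{chunk}")

    def read_back(self):
        # Device-to-host copy: the costly synchronisation point.
        self.readbacks += 1
        found, self.device_signals = self.device_signals, []
        return found


def run(chunks, checkpoint_every):
    """Process chunks, reading results back every `checkpoint_every` chunks."""
    gpu, host_signals = ToyGpu(), []
    for chunk in range(1, chunks + 1):
        gpu.find_signals(chunk)
        # Always read back after the last chunk so nothing is left on the device.
        if chunk % checkpoint_every == 0 or chunk == chunks:
            host_signals.extend(gpu.read_back())
    return gpu.readbacks, host_signals


# Read back after every chunk versus only at (assumed) checkpoints.
per_chunk_syncs, per_chunk_signals = run(chunks=100, checkpoint_every=1)
sog_syncs, sog_signals = run(chunks=100, checkpoint_every=25)
print(per_chunk_syncs, sog_syncs)
```

In this toy model both runs end with the same signals on the host; only the number of read-backs differs, which is the effect the quoted explanation describes.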
ID: 56479 · Report as offensive
Profile Jimbocous
Volunteer tester
Joined: 9 Jan 16
Posts: 51
Credit: 1,038,205
RAC: 0
United States
Message 56480 - Posted: 4 Feb 2016, 6:18:17 UTC - in response to Message 56479.  

When I asked him what SoG stood for, he said it was Signals on GPU, then gave this explanation:


Understood.

I'm guessing that means it keeps all those results on the GPU until it's nearly finished and then starts moving them back, which is why we see a slow start but a rapid finish.


Either way, it all comes back to how remaining time and percentage complete are calculated, regardless of where the data is. As I said, an observation, not a complaint.

I have to wonder whether the amount of RAM on the card in any way affects how fast the work runs or how many instances can run on the card.


Interesting question. I used to run 2 jobs per GT610 with 512 MB. I doubt that would fly here: with 2x SoG on a 750 Ti with 2 GB, I'm seeing memory usage bouncing between 25% and 50%, with GPU use nicely in the high 90s but not maxed out. In the testing I've done, more than 2 jobs per 750 Ti didn't give any appreciable gain and, with OpenCL's appetite for CPU time, actually degraded performance.
ID: 56480 · Report as offensive
Profile Raistmer
Volunteer tester
Joined: 18 Aug 05
Posts: 2423
Credit: 15,878,738
RAC: 0
Russia
Message 56484 - Posted: 4 Feb 2016, 10:21:04 UTC - in response to Message 56480.  

Non-linear progress has ALWAYS been there, even with the CPU Multibeam apps. It's an inherent property of how the different types of data processing are distributed within the processing chain. SoG just makes this difference more visible.

SoG's GPU memory demands are higher, but the extra memory is used only on GPUs capable of allocating it. In stderr this shows up as:

LotOfMem path: yes


and amount of used memory:

Currently allocated 201 MB for GPU buffers


As one can see, the current amount is not critical even for modern low-end GPUs.
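
For anyone who wants to pull those two values out of a task's stderr automatically, here is a small sketch. It assumes the lines look exactly like the two samples above; the `no` alternative for the LotOfMem line and the function name `summarize_stderr` are assumptions for illustration, not something confirmed in this thread.

```python
import re

# Patterns based on the two stderr lines quoted above.
# The "no" alternative is an assumption; only "yes" appears in the thread.
LOT_OF_MEM = re.compile(r"LotOfMem path:\s*(yes|no)", re.IGNORECASE)
ALLOCATED = re.compile(r"Currently allocated\s+(\d+)\s*MB for GPU buffers")


def summarize_stderr(text):
    """Return the LotOfMem flag and allocated MB, or None where not found."""
    lot = LOT_OF_MEM.search(text)
    alloc = ALLOCATED.search(text)
    return {
        "lot_of_mem": None if lot is None else lot.group(1).lower() == "yes",
        "allocated_mb": None if alloc is None else int(alloc.group(1)),
    }


sample = """LotOfMem path: yes
Currently allocated 201 MB for GPU buffers
"""
summary = summarize_stderr(sample)
print(summary)
```

A value of `None` in the result means the corresponding line was not present in the text that was passed in.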
ID: 56484 · Report as offensive
Profile David S
Volunteer tester
Joined: 10 Sep 13
Posts: 1187
Credit: 2,791,507
RAC: 0
United States
Message 56492 - Posted: 4 Feb 2016, 15:08:03 UTC

This isn't an issue with your new app per se. I just got my main cruncher to download new work, but the GPU tasks it got were opencl_nvidia_sah, not _SoG. Is there a way I can force it to get that specific type?
David
signature sent back to alpha testing
ID: 56492 · Report as offensive
Profile Raistmer
Volunteer tester
Joined: 18 Aug 05
Posts: 2423
Credit: 15,878,738
RAC: 0
Russia
Message 56496 - Posted: 4 Feb 2016, 16:08:55 UTC - in response to Message 56492.  

This isn't an issue with your new app per se. I just got my main cruncher to download new work, but the GPU tasks it got were opencl_nvidia_sah, not _SoG. Is there a way I can force it to get that specific type?

As with any other app: abort tasks until the quota for that particular plan class is reached. Then the host will get tasks from the other compatible plan classes.
ID: 56496 · Report as offensive
Profile Jimbocous
Volunteer tester
Joined: 9 Jan 16
Posts: 51
Credit: 1,038,205
RAC: 0
United States
Message 56497 - Posted: 4 Feb 2016, 19:37:10 UTC - in response to Message 56484.  

Non-linear progress has ALWAYS been there, even with the CPU Multibeam apps... SoG just makes this difference more visible.


OK, thanks. Very much more visible...
ID: 56497 · Report as offensive
Rob Smith
Volunteer moderator
Volunteer tester

Joined: 21 Nov 12
Posts: 1015
Credit: 5,459,295
RAC: 0
United Kingdom
Message 56507 - Posted: 4 Feb 2016, 21:58:52 UTC

Wow, the progress timings are really crazy - I just followed one task:
1 min ~3.5%
2 min ~5%
3 min ~1.2%
3.5 min ~30%
complete in ~3:50

(one task at a time on a GTX980)
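
The numbers above can be turned into a rate per interval to show how uneven the progress readout is. A minimal sketch using only the figures as posted (the ~1.2% reading at 3 min is kept as reported, and completion at 3:50 is taken as 100%); `progress_rates` is just an illustrative helper name.

```python
# (elapsed minutes, reported percent) as posted; 3:50 is taken as 100 %.
samples = [(1.0, 3.5), (2.0, 5.0), (3.0, 1.2), (3.5, 30.0), (3 + 50 / 60, 100.0)]


def progress_rates(points):
    """Return percent-per-minute between each pair of consecutive samples."""
    return [
        (p1 - p0) / (t1 - t0)
        for (t0, p0), (t1, p1) in zip(points, points[1:])
    ]


rates = progress_rates(samples)
for (t0, _), (t1, _), rate in zip(samples, samples[1:], rates):
    print(f"{t0:.2f} to {t1:.2f} min: {rate:+.1f} % per minute")
```

Going by these figures, the last 20 seconds account for roughly 70% of the displayed progress.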

CPU usage is minimal after the first few seconds - I guess that is when everything is getting loaded onto the GPU. Then a quick burst at the end, when I assume the GPU is unloading.

No noticeable stagger during the body of the run, with only a little bit at the start.
ID: 56507 · Report as offensive
Profile Raistmer
Volunteer tester
Joined: 18 Aug 05
Posts: 2423
Credit: 15,878,738
RAC: 0
Russia
Message 56528 - Posted: 5 Feb 2016, 16:45:45 UTC

The app has been released as a standalone pack for use on main too.
But please continue beta testing as well.
ID: 56528 · Report as offensive
Richard Haselgrove
Volunteer tester

Joined: 3 Jan 07
Posts: 1451
Credit: 3,272,268
RAC: 0
United Kingdom
Message 56529 - Posted: 5 Feb 2016, 18:40:35 UTC

Please heed the warnings about non-standard version/plan_class strings in OpenCL NV MultiBeam v8 SoG edition for Windows, if you intend to use this build long-term. This is not yet a full production-grade release.
ID: 56529 · Report as offensive
Profile David S
Volunteer tester
Joined: 10 Sep 13
Posts: 1187
Credit: 2,791,507
RAC: 0
United States
Message 56548 - Posted: 6 Feb 2016, 15:09:26 UTC

Unfortunately, the splitter isn't running, so I can't get any SoGs to test.
David
signature sent back to alpha testing
ID: 56548 · Report as offensive



 
©2022 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.