SAH v8 on Linux & Nvidia OpenCL

Metod, S56RKO
Volunteer tester
Joined: 27 Feb 06
Posts: 13
Credit: 924,689
RAC: 0
Slovenia
Message 55926 - Posted: 14 Jan 2016, 13:21:28 UTC

I've got a few MB v8 opencl_nvidia_sah tasks on my Linux x64 machine (host ID 5310). They seem to work fine science-wise (tasks validate just fine), but they consume an ungodly amount of CPU time. They are supposed to use less than 15% of a CPU per task, while in reality they use 100% of a CPU per task.

On some other forums it was said that this is due to the way NVIDIA implemented waiting for an OpenCL kernel to do its work, and some workarounds were mentioned. Has anybody had a look at this behaviour?

Some of my peers run NVIDIA OpenCL on Windows and they seem to suffer from the same problem. ATI OpenCL seems to behave better; CPU time there is roughly one order of magnitude lower than run time.

It seems that there are some unofficial v8 beta apps floating around, and some of them don't seem to suffer from this problem. User paulT has one whose CPU time is much lower than its run time (host ID 77426), which makes me think that a solution to this problem does indeed exist.

Any thoughts from developers?
Metod ...
ID: 55926
Raistmer
Volunteer tester
Joined: 18 Aug 05
Posts: 2423
Credit: 15,878,738
RAC: 0
Russia
Message 55933 - Posted: 14 Jan 2016, 17:44:38 UTC - in response to Message 55926.  

Answer to "why": https://setiweb.ssl.berkeley.edu/beta//forum_thread.php?id=2266&postid=55931
Answer to "what to do": -use_sleep
News about SETI opt app releases: https://twitter.com/Raistmer
ID: 55933
Metod, S56RKO
Volunteer tester
Joined: 27 Feb 06
Posts: 13
Credit: 924,689
RAC: 0
Slovenia
Message 55965 - Posted: 15 Jan 2016, 14:54:15 UTC - in response to Message 55933.  

Thank you for the pointers.

I've constructed the required app_info.xml and verified that the executable (setiathome_8.04_x86_64-pc-linux-gnu__opencl_nvidia_sah) really got the command line option. It did. However, I did not notice any difference in behaviour (still more than 95% CPU usage).

I noticed that the task's stdout now says

Sleep() & wait for event loops will be used in some places

and that line was not in there previously. I guess it proves that the cmdline parameter was accepted.
Metod ...
ID: 55965
Raistmer
Volunteer tester
Joined: 18 Aug 05
Posts: 2423
Credit: 15,878,738
RAC: 0
Russia
Message 55981 - Posted: 15 Jan 2016, 19:01:42 UTC - in response to Message 55965.  
Last modified: 15 Jan 2016, 19:02:00 UTC

and that line was not in there previously. I guess it proves that the cmdline parameter was accepted.

Yes, it was accepted. Whether it is really implemented on Linux is another question.
The Windows version uses Sleep(), which is a Windows-specific call, so porting is required.
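
For readers unfamiliar with the trade-off behind -use_sleep, here is a minimal Python sketch of the general idea. It is not the app's actual code. It assumes a hypothetical is_done() check standing in for "has the GPU kernel finished?" and compares the CPU time burned by spinning on that check with the CPU time used when the waiting loop sleeps between checks. All function names are made up for illustration.

```python
import time


def wait_busy(is_done):
    """Spin on is_done() without yielding; keeps one CPU core fully busy."""
    while not is_done():
        pass


def wait_sleeping(is_done, interval=0.001):
    """Poll is_done(), but give the CPU back to the OS between checks."""
    while not is_done():
        time.sleep(interval)


def cpu_seconds(wait, duration=0.2):
    """Return the CPU time this process uses while waiting for a
    simulated kernel that 'finishes' after `duration` seconds."""
    deadline = time.monotonic() + duration
    start = time.process_time()
    wait(lambda: time.monotonic() >= deadline)
    return time.process_time() - start


if __name__ == "__main__":
    print("busy-wait CPU seconds :", cpu_seconds(wait_busy))
    print("sleep-poll CPU seconds:", cpu_seconds(wait_sleeping))
```

The busy loop typically uses about as much CPU time as the simulated kernel takes, while the sleeping loop uses close to none. The price is that completion is noticed up to one sleep interval late, which is one plausible reason a sleeping build can run slower.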
ID: 55981
Urs Echternacht
Volunteer tester
Joined: 18 Jan 06
Posts: 1038
Credit: 18,734,730
RAC: 0
Germany
Message 55983 - Posted: 15 Jan 2016, 23:51:45 UTC
Last modified: 16 Jan 2016, 0:40:28 UTC

Just checked the -use_sleep option, and it currently does nothing.

I will make sure that the next version has it active by default, as it was with AstroPulse v7.
_\|/_
U r s
ID: 55983
Zalster
Volunteer tester
Joined: 30 Dec 13
Posts: 258
Credit: 12,340,341
RAC: 0
United States
Message 55984 - Posted: 16 Jan 2016, 0:08:50 UTC - in response to Message 55983.  
Last modified: 16 Jan 2016, 0:09:01 UTC

Urs,

Will that default apply to Linux only, or will Windows also be affected?

I did not speak up earlier, as the thread title says Linux, but in his initial post the OP mentions Windows users suffering from the same issue.

I know from testing that the -use_sleep option for OpenCL VLAR tasks on Windows lowers CPU usage to the low 20% range or even the teens, but the time to complete increases by a similar factor: from 24 minutes to 1 hour 20 minutes.

I've used the command lines that Mike gave me for those OpenCL VLARs, minus -use_sleep, and had good throughput. CPU usage is extremely high, but I don't use the CPU for anything other than GPU support.
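
A rough back-of-the-envelope check of those numbers, as a Python sketch. The run times come from the post; the 100% and 20% CPU figures are assumed round values read off the wording ("extremely high" and "low 20%"), not measurements.

```python
# Run times from the post; CPU fractions are assumed round values.
RUN_MIN_NO_SLEEP = 24      # minutes per task without -use_sleep
RUN_MIN_SLEEP = 80         # 1 hour 20 minutes with -use_sleep
CPU_FRAC_NO_SLEEP = 1.00   # assumption: one full core
CPU_FRAC_SLEEP = 0.20      # assumption: "low 20%"

# How much longer a task takes with -use_sleep.
slowdown = RUN_MIN_SLEEP / RUN_MIN_NO_SLEEP

# CPU-minutes spent per task in each mode.
cpu_min_no_sleep = RUN_MIN_NO_SLEEP * CPU_FRAC_NO_SLEEP
cpu_min_sleep = RUN_MIN_SLEEP * CPU_FRAC_SLEEP

print(f"run time grows by a factor of {slowdown:.2f}")
print(f"CPU-minutes per task: {cpu_min_no_sleep:.0f} without, "
      f"{cpu_min_sleep:.0f} with -use_sleep")
```

Under these assumptions the run time grows by a factor of about 3.3, while the CPU-minutes per task only drop from 24 to 16. That is why skipping -use_sleep can make sense when the CPU has nothing else to do.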
ID: 55984
Urs Echternacht
Volunteer tester
Joined: 18 Jan 06
Posts: 1038
Credit: 18,734,730
RAC: 0
Germany
Message 55986 - Posted: 16 Jan 2016, 0:42:02 UTC - in response to Message 55984.  

Linux only so far.
_\|/_
U r s
ID: 55986
Iztok s52d (and friends)
Volunteer tester
Joined: 22 Jan 16
Posts: 2
Credit: 6,399
RAC: 0
Slovenia
Message 56239 - Posted: 23 Jan 2016, 10:56:01 UTC - in response to Message 55983.  

Hello!

Main report: CPU at 100% with the 352.63 driver; all normal with the old one.

I recently joined beta to help with v8 NVIDIA for Linux.
With the 304.125 driver it sort of worked: CPU usage was normal,
but the WU got stuck at about 66%. After a restart, it started from zero again.

So I upgraded to the 352.63 driver. Now it works and validates,
but CPU load is at 100%.

https://setiweb.ssl.berkeley.edu/beta/show_host_detail.php?hostid=77887

Besides: X works fine, with no noticeable delays due to the GPU being busy.
(The same PC running AstroPulse v7 on the GPU has some problems with this.)

BR
s52d
ID: 56239
Raistmer
Volunteer tester
Joined: 18 Aug 05
Posts: 2423
Credit: 15,878,738
RAC: 0
Russia
Message 56247 - Posted: 23 Jan 2016, 21:13:37 UTC - in response to Message 56239.  


I recently joined beta to help with v8 nvidia for linux.
With 304.125 driver it worked somehow: CPU usage was normal,
but WU was stuck on some 66%. After restart, it started from zero again.

BR
s52d


?? With the 304 driver, OpenCL NV should abort its execution and complain about driver incompatibility. Could you provide a link to any completed result under that driver?
ID: 56247
Iztok s52d (and friends)
Volunteer tester
Joined: 22 Jan 16
Posts: 2
Credit: 6,399
RAC: 0
Slovenia
Message 56248 - Posted: 24 Jan 2016, 6:51:20 UTC - in response to Message 56247.  


I recently joined beta to help with v8 nvidia for linux.
With 304.125 driver it worked somehow: CPU usage was normal,
but WU was stuck on some 66%. After restart, it started from zero again.

BR
s52d


?? With 304 driver OpenCL NV should abort its execution and complain about driver incompatibility. Could you provide link to any completed result under that driver?


With the old driver it was running, but never finished. Nothing completed, sorry.
I aborted and upgraded.
But while it was running, CPU usage was low. (No complaint found in the log; BOINC client version 7.2.42 for x86_64-pc-linux-gnu.)

BR, thanks

-- 16no11aa.26351.19727.8.42.46_0 ---
22-Jan-2016 16:41:20 [SETI@home Beta Test] Starting task 16no11aa.26351.19727.8.42.46_0
23-Jan-2016 10:41:13 [SETI@home Beta Test] task 16no11aa.26351.19727.8.42.46_0 aborted by user
23-Jan-2016 10:41:14 [SETI@home Beta Test] Computation for task 16no11aa.26351.19727.8.42.46_0 finished
-- 16no11aa.2951.15633.8.42.225_3 ---
22-Jan-2016 16:46:21 [SETI@home Beta Test] Starting task 16no11aa.2951.15633.8.42.225_3
23-Jan-2016 10:41:13 [SETI@home Beta Test] task 16no11aa.2951.15633.8.42.225_3 aborted by user
23-Jan-2016 10:41:14 [SETI@home Beta Test] Computation for task 16no11aa.2951.15633.8.42.225_3 finished
ID: 56248
Laurent Domisse
Volunteer tester
Joined: 10 Feb 16
Posts: 9
Credit: 1,004,348
RAC: 0
France
Message 56652 - Posted: 11 Feb 2016, 21:11:44 UTC

Same issue running opencl_nvidia_sah on Linux with driver 346.

It stops at 52% with the following message:

SETI@home v8 8.04 (opencl_nvidia_sah)
Waiting to run (Scheduler wait: nVidia has adequate OpenCL support for
FERMI+ devices starting with 350.xx drivers. Please upgrade driver to
350.xx to use OpenCL MultiBeam SETI apps on FERMI+ CC2+ device

Then another GPU WU starts, runs until it reaches 52% and stops... forever...

I have a GTX 950.
Sadly, I'm running Mageia 5 and there is no 352 driver available for it right now.

Maybe GPU WUs should not be started when the NVIDIA driver is older than 350, and the error message should be sent instead?

Let me know if I can help!
ID: 56652
Raistmer
Volunteer tester
Joined: 18 Aug 05
Posts: 2423
Credit: 15,878,738
RAC: 0
Russia
Message 56655 - Posted: 11 Feb 2016, 21:36:04 UTC - in response to Message 56652.  

The new build will allow 347+ drivers.
ID: 56655
Olle Kalesche
Volunteer tester
Joined: 30 Jan 16
Posts: 1
Credit: 90,363
RAC: 0
Germany
Message 56657 - Posted: 12 Feb 2016, 1:44:35 UTC

Although I did not have any problems running "SAH v8 on Linux & Nvidia OpenCL",
I was asked to write a little report here to give some positive hints. Well, here it is:

Hardware: Athlon 5350 (OC at 23xx MHz), ASRock AM1B-ITX, 8 GB RAM, EVGA GTX 750Ti FTW ACX Cooler, 1189 MHz-1268 MHz (OC).
Software: openSUSE Leap 42.1 (daily updates if necessary), NVIDIA driver version 352.79
Special BOINC settings: NONE for CPU, GPU or any other ???.xml!

Report from BOINC :
Di 09 Feb 2016 21:41:41 CET | | cc_config.xml not found - using defaults
Di 09 Feb 2016 21:41:42 CET | | Starting BOINC client version 7.2.42 for x86_64-pc-linux-gnu
Di 09 Feb 2016 21:41:42 CET | | log flags: file_xfer, sched_ops, task
Di 09 Feb 2016 21:41:42 CET | | Libraries: libcurl/7.37.0 OpenSSL/1.0.1i zlib/1.2.8 libidn/1.28 libssh2/1.4.3
Di 09 Feb 2016 21:41:42 CET | | Data directory: /usr/bin
Di 09 Feb 2016 21:41:42 CET | | CUDA: NVIDIA GPU 0: GeForce GTX 750 Ti (driver version unknown, CUDA version 7.5, compute capability 5.0, 2047MB, 1827MB available, 2434 GFLOPS peak)
Di 09 Feb 2016 21:41:42 CET | | OpenCL: NVIDIA GPU 0: GeForce GTX 750 Ti (driver version 352.79, device version OpenCL 1.2 CUDA, 2047MB, 1827MB available, 2434 GFLOPS peak)
Di 09 Feb 2016 21:41:42 CET | | Host name: linux-hmci
Di 09 Feb 2016 21:41:42 CET | | Processor: 4 AuthenticAMD AMD Athlon(tm) 5350 APU with Radeon(tm) R3 [Family 22 Model 0 Stepping 1]
Di 09 Feb 2016 21:41:42 CET | | Processor features: fpu vme ...
...
Di 09 Feb 2016 21:41:42 CET | | OS: Linux: 4.1.15-8-default
Di 09 Feb 2016 21:41:42 CET | | Memory: 7.67 GB physical, 2.01 GB virtual

Workunits calculated: 684, of which 678 validated, 5 errored and 1 is pending. This is pretty normal.
OpenCL seems to push one CPU core to 100% usage (I have seen that very often in combination with OpenCL),
but when I put work on all CPU cores, the workunits took only a few seconds longer.
I do not think that this is a problem.

There was no infinite loop, and there was no crash of BOINC or the system...
sorry, everything worked fine...
ID: 56657
Laurent Domisse
Volunteer tester
Joined: 10 Feb 16
Posts: 9
Credit: 1,004,348
RAC: 0
France
Message 56662 - Posted: 12 Feb 2016, 10:12:34 UTC - in response to Message 56655.  

Thanks.

No hope for 346 drivers ?
ID: 56662
Raistmer
Volunteer tester
Joined: 18 Aug 05
Posts: 2423
Credit: 15,878,738
RAC: 0
Russia
Message 56663 - Posted: 12 Feb 2016, 11:10:01 UTC - in response to Message 56662.  

Thanks.

No hope for 346 drivers ?

No. nVidia fixed their drivers only from 347 onward.
ID: 56663
Urs Echternacht
Volunteer tester
Joined: 18 Jan 06
Posts: 1038
Credit: 18,734,730
RAC: 0
Germany
Message 56678 - Posted: 12 Feb 2016, 23:04:09 UTC - in response to Message 56662.  

Thanks.

No hope for 346 drivers ?

And do not try the 349 drivers on Linux. They seem to have some other issue that makes them unusable for our OpenCL purposes.

Download the 352.79 driver from nVidia's website. That one is tested and should work with the OpenCL NV app.
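
The driver advice scattered through this thread can be summarised in a small Python sketch. This is not the check that the app or the scheduler performs. The function names and structure are hypothetical, and the thresholds are just the numbers quoted in the posts above (347 as the minimum for the new build, the 349 series reported unusable on Linux).

```python
# Hypothetical helper; all thresholds are taken from posts in this thread.
MIN_MAJOR = 347            # "new build will allow 347+ drivers"
BAD_MAJORS_LINUX = {349}   # reported unusable for OpenCL on Linux
TESTED_VERSION = "352.79"  # the only version reported as tested here


def parse_driver(version):
    """Split a driver string such as '352.79' into (major, minor) integers."""
    major, _, minor = version.partition(".")
    return int(major), int(minor or 0)


def driver_not_ruled_out(version):
    """Return True if nothing in this thread rules the driver out.

    A True result does not mean the driver was tested; only
    TESTED_VERSION is reported as tested in the thread.
    """
    major, _ = parse_driver(version)
    return major >= MIN_MAJOR and major not in BAD_MAJORS_LINUX


if __name__ == "__main__":
    for v in ("304.125", "346.00", "349.16", TESTED_VERSION):
        print(v, driver_not_ruled_out(v))
```

Versions newer than 352.79 pass this check only because nothing above rules them out, not because anyone here confirmed them.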
_\|/_
U r s
ID: 56678
Laurent Domisse
Volunteer tester
Joined: 10 Feb 16
Posts: 9
Credit: 1,004,348
RAC: 0
France
Message 56683 - Posted: 13 Feb 2016, 11:41:15 UTC - in response to Message 56678.  

Thanks.

According to the NVIDIA website, the current driver for the GTX 900 series is 361.28.
Has anyone tested it? Or is it safer to use 352.79?
ID: 56683
WezH
Volunteer tester
Joined: 3 Jun 07
Posts: 51
Credit: 2,861,562
RAC: 0
Finland
Message 56728 - Posted: 15 Feb 2016, 15:56:56 UTC

Does anyone have an idea when opencl_nvidia_sah will be released to main?
ID: 56728
Zalster
Volunteer tester
Joined: 30 Dec 13
Posts: 258
Credit: 12,340,341
RAC: 0
United States
Message 56733 - Posted: 15 Feb 2016, 18:15:53 UTC - in response to Message 56728.  

It's not.

There were too many issues with it.

It's been replaced with the test app OpenCL_nvidia_SoG, which is currently being tested on both Beta and Main.
ID: 56733
WezH
Volunteer tester
Joined: 3 Jun 07
Posts: 51
Credit: 2,861,562
RAC: 0
Finland
Message 56735 - Posted: 15 Feb 2016, 18:47:35 UTC - in response to Message 56733.  
Last modified: 15 Feb 2016, 18:48:05 UTC

It's not.

There were too many issues with it.

It's been replaced with the test app OpenCL_nvidia_SoG which is currently being tested on both Beta and Main.


Yes.

I know.

That SoG is for Windows.

This thread is for LINUX and opencl_nvidia_sah.

Anyone else?
ID: 56735