Astropulse 7.00 released for Linux 32&64, Win 32&64, Win32+AMD/NVIDIA/Intel GPU

Eric J Korpela
Volunteer moderator
Project administrator
Project developer
Project scientist
Message 51419 - Posted: 3 Jul 2014, 18:25:09 UTC

Please use this thread to report problems with work distribution or compatibility.

Raistmer
Volunteer tester
Message 51424 - Posted: 3 Jul 2014, 20:36:22 UTC
Last modified: 3 Jul 2014, 20:50:29 UTC

The NV GT9400 can't handle the default settings; the driver restarts with the following message:

ERROR: OpenCL kernel/call 'clEnqueueReadBuffer->CPU_result' call failed (-5) in file ..\..\ap_fold.cpp near line 6766.
Waiting 30 sec before restart...

BOINC performs a temporary exit in this case, so a boinc_temporary_exit file with the value 30 appears in the corresponding slot.

Possible solutions to this issue:

1) Increase the TDR value for the Windows video driver in the registry.
2) Add -ffa_block 1024 or less and -ffa_block_fetch 512 or less to the ap_cmdline_7.00_windows_intelx86__opencl_nvidia_100.txt file in the project directory.

I will try both methods and specify the limits more precisely later.

EDIT: this is happening on the quite old driver 263.06; it may not occur on newer ones.
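
For option 1, a minimal sketch, assuming that "TDR value" refers to the usual TdrDelay DWORD under the GraphicsDrivers registry key. It only builds the text of a .reg file; the 10-second figure is an arbitrary illustration, not a limit tested in this thread, and the helper name tdr_reg_text is made up for the example.

```python
# Sketch only: produce .reg file text that raises the Windows GPU watchdog
# timeout (TdrDelay, in seconds). The value 10 is an example, not a
# recommendation from this thread.

TDR_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers"


def tdr_reg_text(delay_seconds: int) -> str:
    """Return .reg file content that sets TdrDelay to delay_seconds."""
    if not 0 < delay_seconds <= 0xFFFFFFFF:
        raise ValueError("delay_seconds must be a positive value that fits in a DWORD")
    return (
        "Windows Registry Editor Version 5.00\r\n"
        "\r\n"
        f"[{TDR_KEY}]\r\n"
        f'"TdrDelay"=dword:{delay_seconds:08x}\r\n'
    )


print(tdr_reg_text(10))
```

Importing such a file needs administrator rights, and the new timeout normally takes effect only after a reboot.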

Claggy
Volunteer tester
Message 51425 - Posted: 3 Jul 2014, 20:46:22 UTC - in response to Message 51424.  

> 2) to add -ffa_block 1024 or less and -ffa_block_fetch of 512 or less into ap_cmdline_7.00_windows_intelx86__opencl_nvidia_100.txt file in project directory.

The project can't do that; it supplies the file empty so that you can put values in it.
If it supplied a file of non-zero size and you then modified it, BOINC would discard the file and re-download it. (I don't remember whether it errors the WU too.)

Claggy

Raistmer
Volunteer tester
Message 51426 - Posted: 3 Jul 2014, 20:51:49 UTC - in response to Message 51425.  
Last modified: 3 Jul 2014, 20:54:02 UTC

> > 2) to add -ffa_block 1024 or less and -ffa_block_fetch of 512 or less into ap_cmdline_7.00_windows_intelx86__opencl_nvidia_100.txt file in project directory.
>
> The project can't do that, it supplies the file empty so you can put values in it,
> If it supplies a file that is non-zero in size, then you modify it, then Boinc will discard the file and redownload it. (I don't remember if it errors the Wu too)
>
> Claggy

I know that the project can't modify that file, just as I'm fully aware that the project can't modify a host's TDR value, btw ;)

It's info for participants in case they run into the same issue.

EDIT: for the project, one needs to establish how many hosts are affected by this particular issue. If it's only my own GT9400 driven by a too-old driver, then changing the defaults would not be the right decision. It would be more appropriate to direct such users to a post on how to solve the issue.

Eric J Korpela
Volunteer moderator
Project administrator
Project developer
Project scientist
Message 51427 - Posted: 3 Jul 2014, 21:50:19 UTC

I'll look around for newer driver sets on the same GPU and see if it's a widespread problem.

Raistmer
Volunteer tester
Message 51428 - Posted: 3 Jul 2014, 22:15:39 UTC
Last modified: 3 Jul 2014, 22:56:04 UTC

So far, in offline testing, it survived -ffa_block 512 -ffa_block_fetch 256 and crashed the driver with -ffa_block 512 -ffa_block_fetch 512.

So I think the runtime error is reported in the wrong place and the real cause is the fetch kernel (the longest kernel in the whole app, btw).
Hence it's quite possible that higher -ffa_block values will work too; only -ffa_block_fetch needs to be severely limited (to 256 in my case).
I will check this in the next few tests.

Also, it seems time to implement scaling of the default values based on device capabilities for 7.01.
It's not uncommon to have two devices from the same vendor with vastly different capabilities (almost any upgrade that keeps the old card plugged in leads to this situation).

EDIT: it survived -ffa_block 1024 -ffa_block_fetch 256 too, so I think the recommended workaround is:
add -ffa_block 1024 -ffa_block_fetch 256 to the ap_cmdline_7.00_windows_intelx86__opencl_nvidia_100.txt file
[-ffa_block is required for the block_fetch change to take effect]

I will test live on a loaded host now, and the TDR solution later.

EDIT2: I can confirm that it works and makes progress with the settings recommended above.
Next I will check the TDR solution (more appropriate for a fast GPU/slow GPU mix).
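
The recommended workaround can be sketched as a small script. This is only an illustration: the function names are made up, the project directory must be supplied by the user, and whether the app expects a trailing newline in the file is not stated in this thread (none is written here).

```python
from pathlib import Path

# File name as given in the post; the project directory location is up to the user.
CMDLINE_FILE = "ap_cmdline_7.00_windows_intelx86__opencl_nvidia_100.txt"


def build_cmdline(ffa_block: int = 1024, ffa_block_fetch: int = 256) -> str:
    """Return the option string from the recommended workaround."""
    return f"-ffa_block {ffa_block} -ffa_block_fetch {ffa_block_fetch}"


def write_cmdline(project_dir: str, ffa_block: int = 1024, ffa_block_fetch: int = 256) -> Path:
    """Overwrite the (initially empty) cmdline file in project_dir with the options."""
    target = Path(project_dir) / CMDLINE_FILE
    target.write_text(build_cmdline(ffa_block, ffa_block_fetch))
    return target


print(build_cmdline())
```

Calling write_cmdline with the path of the project directory writes the default pair, 1024 and 256, which are the values that survived testing on the GT9400 above.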

Thomas
Volunteer tester
Message 51429 - Posted: 4 Jul 2014, 5:25:32 UTC

10 tasks in progress on NVIDIA GeForce GT 640 (2048MB) driver: 337.88 :)
I'll give you my report when the first result is returned.
Estimated processing time: 16 hours

Matthias Lehmkuhl
Volunteer tester
Message 51430 - Posted: 4 Jul 2014, 7:37:58 UTC
Last modified: 4 Jul 2014, 7:39:41 UTC

I got some errors on one Windows 7 x64 machine with AstroPulse v7 v7.00 (sse2) and AstroPulse v7 v7.00 (sse)
<core_client_version>7.3.19</core_client_version>
CPUID: Intel(R) Core(TM) i5-3470 CPU @ 3.20GHz
AstroPulse v7 Windows x64 rev 2488, V7 match

Exit status -226 (0xffffffffffffff1e) ERR_TOO_MANY_EXITS

BOINC client no longer exists - exiting
timer handler: client dead, exiting

http://setiweb.ssl.berkeley.edu/beta/result.php?resultid=17248746
http://setiweb.ssl.berkeley.edu/beta/result.php?resultid=17248806
http://setiweb.ssl.berkeley.edu/beta/result.php?resultid=17271736
http://setiweb.ssl.berkeley.edu/beta/result.php?resultid=17271741
http://setiweb.ssl.berkeley.edu/beta/result.php?resultid=17271813

all results had a run time of 10 sec.

Edit: BOINC is running as a service.
Matthias
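
A small sketch of how the two numbers in the exit status line relate: 0xffffffffffffff1e is simply -226 written as an unsigned 64-bit (two's complement) value. The helper names are illustrative only.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def as_unsigned64(status: int) -> int:
    """Reinterpret a signed exit status as an unsigned 64-bit value."""
    return status & MASK64


def as_signed64(raw: int) -> int:
    """Reinterpret an unsigned 64-bit value as a signed exit status."""
    return raw - (1 << 64) if raw & (1 << 63) else raw


print(hex(as_unsigned64(-226)))
print(as_signed64(0xFFFFFFFFFFFFFF1E))
```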

Thomas
Volunteer tester
Message 51431 - Posted: 4 Jul 2014, 8:13:45 UTC

First result sent, no problem for me :)
ap_10fe09ab_B0_P1_00328_20140630_13924.wu_0

<core_client_version>7.2.42</core_client_version>
<![CDATA[
<stderr_txt>
Running on device number: 0
Priority of worker thread raised successfully
Priority of process adjusted successfully, below normal priority class used
OpenCL platform detected: NVIDIA Corporation
BOINC assigns device 0
Info: BOINC provided OpenCL device ID used
Used GPU device parameters are:
Number of compute units: 2
Single buffer allocation size: 256MB
max WG size: 1024
FERMI path used: yes

Name: GeForce GT 640
Vendor: NVIDIA Corporation
Driver version: 337.88
Version: OpenCL 1.1 CUDA

Matthias Lehmkuhl
Volunteer tester
Message 51434 - Posted: 4 Jul 2014, 8:27:46 UTC
Last modified: 4 Jul 2014, 8:34:46 UTC

I found 2 valid AstroPulse v7 v7.00 (opencl_nvidia_100) results in my list. :)
http://setiweb.ssl.berkeley.edu/beta/result.php?resultid=17248692
validated against AstroPulse v7 v7.00 (opencl_nvidia_100);
here I got 1400 credits

http://setiweb.ssl.berkeley.edu/beta/result.php?resultid=17257113
validated against AstroPulse v7 v7.00;
here I got 573 credits

Edit:
I could also finish my first Linux x86 AstroPulse v7 v7.00 (sse2) result:
AstroPulse v7.00
Linux 32 bit, Rev 2438
V7 match, by Raistmer with support of Lunatics.kwsn.net team.
http://setiweb.ssl.berkeley.edu/beta/result.php?resultid=17248340
Matthias

Matthias Lehmkuhl
Volunteer tester
Message 51438 - Posted: 4 Jul 2014, 10:22:08 UTC - in response to Message 51430.  

One additional point:
the executables of AstroPulse v7 v7.00 (sse2) and AstroPulse v7 v7.00 (sse) are around 700 kB each, while the executable of AstroPulse v7 v7.00 intel86 is around 7 MB,
and no additional program files are downloaded for sse and sse2.
Matthias

Alex Storey
Volunteer tester
Message 51439 - Posted: 4 Jul 2014, 10:48:58 UTC

Extremely busy, apologies in advance...

Just enabled my ION but unfortunately @"set & forget" for now.

Anyway I'm pretty sure (but might be wrong) my version of ION is the exact same chip as your 9400, Raistmer. Only difference I think is mine is on 40nm instead of 55nm and that is why the chips have different code-names.

If you have time and your card supports it, I recommend the 270.xx driver I'm using right now. Not the 275.xx though, for some reason I can't remember anymore:) 280 & 290 series had no extra benefit on my ION, may even have been slower. Never tried 300+

Have a wonderful Summer everyone!

Raistmer
Volunteer tester
Message 51440 - Posted: 4 Jul 2014, 11:18:49 UTC - in response to Message 51439.  


> If you have time and your card supports it, I recommend the 270.xx driver I'm using right now.

I'm staying with the CUDA 3.2 development driver, 263.06, simply because it allows running OpenCL without full CPU utilization. Soon after that driver NV changed the driver model, and 100% core load occurred.

Raistmer
Volunteer tester
Message 51441 - Posted: 4 Jul 2014, 11:20:57 UTC - in response to Message 51430.  

Sounds like the app can't communicate with the BOINC client.
Try changing the BOINC version, and ask about this issue on the BOINC forums too.

Raistmer
Volunteer tester
Message 51444 - Posted: 4 Jul 2014, 15:00:27 UTC
Last modified: 4 Jul 2014, 15:02:59 UTC

I like the credits granted so far:
http://setiweb.ssl.berkeley.edu/beta/workunit.php?wuid=6467836
:DDD
and another nice payout:
http://setiweb.ssl.berkeley.edu/beta/workunit.php?wuid=6467012
A few others are lower, with a minimum of ~1100 credits per task.

Raistmer
Volunteer tester
Message 51446 - Posted: 4 Jul 2014, 15:23:34 UTC
Last modified: 4 Jul 2014, 15:40:14 UTC

On RV770 the ATi app fails to compile:

http://setiweb.ssl.berkeley.edu/beta/result.php?resultid=17271828

Error : Building Program (source, clBuildProgram):main kernels: not OK code -11

Warning: GPU_fetch_array_kernel_twin_1D_cl kernel has register spilling. Lower performance is expected.
Error: Requested compile size is bigger than the required workgroup size of 64 elements
Error: Creating kernel GPU_fetch_array_kernel_twin_1D_cl failed!


I'll check whether that kernel is really used in processing.

Same issue with the NV app on GT120:
http://setiweb.ssl.berkeley.edu/beta/result.php?resultid=17271698

Error : Building Program (source, clBuildProgram):main kernels: not OK code -42
ptxas : error : Entry function 'GPU_fetch_array_kernel_twin_1D_cl' uses too much shared data (0x4034 bytes, 0x4000 max)
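
For readers decoding the messages quoted in this thread, a small sketch. The symbolic names are the standard ones from the Khronos OpenCL header (cl.h); the helper functions are made up for the example. The second function just does the arithmetic on the ptxas figures: the kernel asks for 0x4034 bytes of shared data against a 0x4000 limit.

```python
# Symbolic names from the Khronos cl.h header for the codes seen in this thread.
CL_ERROR_NAMES = {
    -5: "CL_OUT_OF_RESOURCES",
    -11: "CL_BUILD_PROGRAM_FAILURE",
    -42: "CL_INVALID_BINARY",
}


def describe(code: int) -> str:
    """Return the symbolic name for an OpenCL error code, if listed above."""
    return CL_ERROR_NAMES.get(code, f"unknown OpenCL error {code}")


def shared_overflow(used: int, limit: int) -> int:
    """Return by how many bytes a kernel exceeds the shared data limit (0 if it fits)."""
    return max(0, used - limit)


print(describe(-11))
print(shared_overflow(0x4034, 0x4000))
```

So on the GT120 the kernel is over the limit by only 52 bytes (16436 requested against 16384 available).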


EDIT: a quick fix for this issue:

1) Stop BOINC.
2) Locate the AstroPulse_Kernels_r2488.cl file in the project directory.
3) Locate these lines inside the file:

#define FARRAY_SIZE 12
void fetch_caches(__global float* src, __global float* src_twin, __local float* src_cache, __local float* src_twin_cache){

4) Put an
#if 0
line before the "#define FARRAY_SIZE 12" line, so that it reads:
#if 0
#define FARRAY_SIZE 12
void fetch_caches(__global float* src, __global float* src_twin, __local float* src_cache, __local float* src_twin_cache){


5) Locate these lines inside the file:
}

/*
#if 0 //scalar fetch
__kernel void GPU_fetch_array_kernel_twin_cl(__global float* src, __global float* src_twin,

6) Put
#endif
between the closing brace "}" and the opening comment "/*".
That is, after editing those lines should read:
}
#endif
/*
#if 0 //scalar fetch
__kernel void GPU_fetch_array_kernel_twin_cl(__global float* src, __global float* src_twin,

7) Save the file.
8) Delete all *.bin* files inside the project directory, especially the
AstroPulse_Kernels_r2488.*.bin_V6_TWIN_FFA_*
one.
9) Restart BOINC.
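
The manual edit above could be automated roughly as follows. This is a sketch under assumptions, not a tested tool: it expects the kernel file to contain the exact lines quoted in the steps, the function names are made up, and BOINC still has to be stopped before and restarted after running it.

```python
from pathlib import Path

DEFINE_LINE = "#define FARRAY_SIZE 12"
SCALAR_MARK = "#if 0 //scalar fetch"


def _find(lines, predicate, start=0):
    """Return the index of the first line at or after start that matches predicate."""
    for i in range(start, len(lines)):
        if predicate(lines[i]):
            return i
    raise ValueError("expected line not found; the file layout differs from the post")


def disable_twin_1d_block(source: str) -> str:
    """Wrap everything from DEFINE_LINE up to the '/*' before SCALAR_MARK in #if 0 / #endif."""
    eol = "\r\n" if "\r\n" in source else "\n"
    lines = source.split(eol)
    first = _find(lines, lambda l: l.strip() == DEFINE_LINE)
    if first > 0 and lines[first - 1].strip() == "#if 0":
        return source  # already patched, leave untouched
    marker = _find(lines, lambda l: l.strip().startswith(SCALAR_MARK), first)
    if lines[marker - 1].strip() != "/*":
        raise ValueError("no '/*' line directly before the scalar fetch marker")
    lines.insert(marker - 1, "#endif")  # after the closing brace, before '/*'
    lines.insert(first, "#if 0")        # before the #define line
    return eol.join(lines)


def apply_fix(project_dir: str, kernel_name: str = "AstroPulse_Kernels_r2488.cl") -> None:
    """Patch the kernel file in place, then delete cached *.bin* kernel binaries."""
    root = Path(project_dir)
    kernel = root / kernel_name
    with open(kernel, "r", newline="") as f:  # newline="" keeps original line endings
        patched = disable_twin_1d_block(f.read())
    with open(kernel, "w", newline="") as f:
        f.write(patched)
    for cached in root.glob("*.bin*"):
        cached.unlink()
```

Running disable_twin_1d_block on an already patched file returns it unchanged, so applying the fix twice should be harmless.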

Thomas
Volunteer tester
Message 51447 - Posted: 4 Jul 2014, 15:27:01 UTC - in response to Message 51444.  

> I like credits granted so far:
> http://setiweb.ssl.berkeley.edu/beta/workunit.php?wuid=6467836
> :DDD
> and another nice pay:
> http://setiweb.ssl.berkeley.edu/beta/workunit.php?wuid=6467012
> few other are lower with min @~1100 credits per task.

+100 Raistmer
Testing is good ! :)
http://setiweb.ssl.berkeley.edu/beta/workunit.php?wuid=6475259

Josef W. Segur
Volunteer tester
Message 51452 - Posted: 4 Jul 2014, 17:10:25 UTC - in response to Message 51441.  

> Sounds like app can't communicate with BOINC client.
> Try to change BOINC version and ask for this issue on BOINC forums too.

It's no longer a matter of communicating with the BOINC client. BOINC puts its ProcessID in the init_data.xml file; the application periodically checks whether that ProcessID is still active and exits if not.

That checking is of course done by BOINC API code built into the application, and for the issue to get any attention from the BOINC developers they would want to know which revision of that API code is being used. However, I agree it might make sense to replace the 7.3.19 alpha with the latest alpha of BOINC, just in case the issue has already been fixed.

BOINC running as a service is sort of rare these days; that may be significant.
Joe
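
The liveness check described above can be sketched like this. It is an illustration only, not the BOINC API code: it works on POSIX systems only, the function names are made up, and the PID is passed in directly because this thread does not say how it is stored inside init_data.xml.

```python
import os


def client_alive(pid: int) -> bool:
    """Return True if a process with this PID currently exists (POSIX only)."""
    if os.name != "posix":
        # On Windows, os.kill(pid, 0) would terminate the process instead of probing it.
        raise NotImplementedError("this sketch only supports POSIX systems")
    try:
        os.kill(pid, 0)  # signal 0: existence/permission check, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def should_exit(client_pid: int) -> bool:
    """An application polling periodically would exit once its client is gone."""
    return not client_alive(client_pid)


print(should_exit(os.getpid()))
```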

Eric J Korpela
Volunteer moderator
Project administrator
Project developer
Project scientist
Message 51453 - Posted: 4 Jul 2014, 18:35:07 UTC - in response to Message 51446.  

Is there any way (predefined symbols) to tell at (OpenCL) compile time what the maximum workgroup size or shared data size is?

I would hope that OpenCL will eventually become a more general language where the compilers will do what is necessary to generate functional code regardless of what the inner workings of the GPU are.
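
As far as the OpenCL C specification goes, device limits such as the maximum workgroup size or the local memory size are not exposed as predefined macros. A common workaround, sketched below under assumptions, is for the host to query the device (clGetDeviceInfo with CL_DEVICE_MAX_WORK_GROUP_SIZE and CL_DEVICE_LOCAL_MEM_SIZE) and hand the values to the compiler as -D options in the clBuildProgram options string. The macro names MAX_WG_SIZE and LOCAL_MEM_BYTES are hypothetical, and nothing in this thread says the Astropulse app does this.

```python
def build_options(max_wg_size: int, local_mem_bytes: int) -> str:
    """Format device limits as -D options for the OpenCL compiler.

    The values would come from clGetDeviceInfo on the host; the macro
    names are hypothetical and chosen for this example only.
    """
    return f"-D MAX_WG_SIZE={max_wg_size} -D LOCAL_MEM_BYTES={local_mem_bytes}"


# Example values only: 64 and 0x4000 are the limits that appear in the
# RV770 and GT120 error messages quoted earlier in this thread.
print(build_options(64, 0x4000))

# Inside the kernel source, such macros could then guard a variant that
# needs more resources (hypothetical usage):
#   #if MAX_WG_SIZE >= 256
#   ... kernel variant for larger workgroups ...
#   #endif
```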

Raistmer
Volunteer tester
Message 51456 - Posted: 4 Jul 2014, 19:45:01 UTC - in response to Message 51452.  
Last modified: 4 Jul 2014, 19:46:14 UTC


> That checking of course is done by BOINC API code built into the application, and for the issue to get any attention from BOINC developers they would want to know what revision of that API code is being used. However, I agree it might make sense to replace the 7.3.19 alpha with the latest alpha of BOINC just in case the issue has already been fixed.
>
> BOINC running as a service is sort of rare these days, that may be significant.
> Joe

1) I use the 7.2.33 BOINC API for all builds.
2) My Core2Duo runs BOINC as a service without issues (an older BOINC, of course); both my Athlon 64s run BOINC as a service too (again, old BOINC). With BOINC, the rule "don't fix what isn't broken" is very important. I would avoid _any_ BOINC upgrade without a _real_ need, especially BOINC alphas, unless you are a BOINC alpha tester in close contact with the dev team.