SETI applications for NVIDIA GPU improvement - how you can help
Author | Message |
---|---|
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
The script recognizes only older versions of CUDA; zi is missing there. I'll add recognition of the new CUDA build in the next version. SETI apps news We're not gonna fight them. We're gonna transcend them. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Quote: "What would the perfect report look like for you?" Actually, this script is for your convenience, because you asked for a tool to look into the AR field en masse. I usually post the type of testing I currently need here or in the beta news thread. There will be a new build soon that requires testing. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Send them to me. P.S. Ah, and before doing that, make sure the missed tasks are not overflowed ones. The script recognizes and excludes overflows automatically, because they have distorted timings. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Please test this build: https://cloud.mail.ru/public/9ifo/XQTdqmfJJ Compare its CPU usage with r3486. |
Stubbles Joined: 29 Nov 99 Posts: 358 Credit: 5,909,255 RAC: 0 |
Quote: "I never run the script in BOINC's own dir. Copy the xml file into another dir and run it there." I shut down the client and made a copy of client_state.xml. 5 tasks were in status "uploading" and only 3 showed up in Times.txt. Both of the missing ones are in client_state.xml, and I can see the AR value (e.g. ar=0.427524) in the stderr. Quote: "P.S. Ah, and before doing that, make sure the missed tasks are not overflowed ones. The script recognizes and excludes overflows automatically, because they have distorted timings." If by overflows you mean "SETI@Home Informational message -9 result_overflow", both of the tasks missing from Times.txt have it in their stderr. So it seems (to me) that it works as intended, except that cuda50 tasks are missing (on my 2nd rig, which I am using to compare results with your latest NV_SoG revision). |
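The check discussed above (read the AR from a task's stderr and set overflowed tasks aside) can be sketched roughly as follows. This is only an illustration, not Raistmer's script: the function name `classify_stderr` is hypothetical, and it assumes the stderr text contains the "WU true angle range is :" line and the "result_overflow" message in the form quoted in this thread.

```python
import re

# Pattern for the AR line as it appears in the stderr excerpts posted in this thread.
AR_RE = re.compile(r"WU true angle range is\s*:\s*([0-9]*\.?[0-9]+)")
# Marker taken from "SETI@Home Informational message -9 result_overflow".
OVERFLOW_MARK = "result_overflow"


def classify_stderr(stderr_text):
    """Return the AR (or None if absent) and whether the task overflowed.

    Hypothetical helper for illustration only.
    """
    match = AR_RE.search(stderr_text)
    ar = float(match.group(1)) if match else None
    return {"ar": ar, "overflow": OVERFLOW_MARK in stderr_text}


sample_ok = "WU true angle range is : 0.427635\n"
sample_overflow = (
    "SETI@Home Informational message -9 result_overflow\n"
    "WU true angle range is : 0.415784\n"
)

for name, text in (("normal", sample_ok), ("overflow", sample_overflow)):
    print(name, classify_stderr(text))
```

A task whose stderr has no AR line at all comes back with `ar` set to `None`, so it can be reported separately instead of silently disappearing from the timing table.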
Stubbles Joined: 29 Nov 99 Posts: 358 Credit: 5,909,255 RAC: 0 |
Quote: "Please test this build: https://cloud.mail.ru/public/9ifo/XQTdqmfJJ" Wow, that's a big improvement in completion time for non-VLARs compared to r3484 on a GTX 750 Ti! NV_SoG_r3484: 15-16 mins; NV_SoG_r3486: 12.5 mins, with no command line for either. It's even the first time that it's better than cuda50 (3 tasks = ~39 mins). Keep up the great work! |
Harri Liljeroos Joined: 29 May 99 Posts: 4635 Credit: 85,281,665 RAC: 126 |
Quote: "Please test this build: https://cloud.mail.ru/public/9ifo/XQTdqmfJJ" Here's one that was crunched with this version: http://setiathome.berkeley.edu/result.php?resultid=5042270028 The stderr lacks the normal output of the parameters used but includes a lot of other information. The CPU use doesn't seem to be any different from the previous r3486 version. The parameters used were: -sbs 384 -use_sleep -hp -instances_per_device 1 -spike_fft_thresh 4096 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 32 -oclfft_tune_cw 32 It was run on my GTX970. In general, all these SoG applications are very sensitive to CPU load. If it reaches 100%, a driver reset is bound to happen, even though I now have TdrDelay set to 8 seconds. So I have reserved 1.6 CPU cores per MB task, because this is also my daily driver and extra headroom is required. I will now set the non-BOINC use limit to 18% to ease the effect of other running applications. |
Stubbles Joined: 29 Nov 99 Posts: 358 Credit: 5,909,255 RAC: 0 |
Quote: "Send them to me." Here are the files for a situation where I have 8 GPU and 2 CPU tasks being blocked from uploading. Only 1 CPU task is in Times.txt and there is no overflow in client_state.xml. https://www.dropbox.com/sh/5jbnvivo9mtcq5g/AADIpSRrHx_rnPzaDyecZEqsa?dl=0 I'm tired, but I double-checked, so I don't know if I forgot something. Cheers, Rob zzzz soon |
Zalster Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Command lines are the same, with neither -tt 90 nor -use_sleep. r3486 Final: WU true angle range is : 0.008880 (http://setiathome.berkeley.edu/result.php?resultid=5046873396): Run time 18 min 25 sec, CPU time 16 min 46 sec; WU true angle range is : 0.009471 (http://setiathome.berkeley.edu/result.php?resultid=5046872611): Run time 21 min 1 sec, CPU time 20 min 48 sec. r3430: WU true angle range is : 0.008136 (http://setiathome.berkeley.edu/result.php?resultid=5044042638): Run time 20 min 29 sec, CPU time 18 min 29 sec; WU true angle range is : 0.008136 (http://setiathome.berkeley.edu/result.php?resultid=5044042824): Run time 20 min 58 sec, CPU time 17 min 56 sec. There is some variance, as they speed up when teamed up with non-guppi tasks. From what I can see, the two versions are comparable. Non-guppi times are exactly the same in both versions. None of the r3486 Final non-guppi tasks have validated yet. If you want those, we will need to wait for my wingmen. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Quote: "Here are the files for a situation where I have 8 GPU and 2 CPU tasks being blocked from uploading." Which task names should I check? |
Zalster Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Command lines with neither -use_sleep nor -tt 90. r3486 Final: WU true angle range is : 0.426888 (http://setiathome.berkeley.edu/result.php?resultid=5046867280): Run time 12 min 24 sec, CPU time 8 min 29 sec. r3430: WU true angle range is : 0.423408 (http://setiathome.berkeley.edu/result.php?resultid=5044010448): Run time 12 min 34 sec, CPU time 7 min 6 sec. Do you want -use_sleep and -tt 90? |
Stubbles Joined: 29 Nov 99 Posts: 358 Credit: 5,909,255 RAC: 0 |
Quote: "Here are the files for a situation where I have 8 GPU and 2 CPU tasks being blocked from uploading." All the ones with a <stderr>, except task 28jn10ac.17896.6206.12.39.68 |
Zalster Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
-use_sleep -tt 90: WU true angle range is : 0.426932 (http://setiathome.berkeley.edu/result.php?resultid=5046873026): Run time 12 min 56 sec, CPU time 6 min 18 sec. Command line plus -tt 90: WU true angle range is : 0.426891 (http://setiathome.berkeley.edu/result.php?resultid=5046873467): Run time 12 min 54 sec, CPU time 6 min 31 sec; WU true angle range is : 0.006725 (http://setiathome.berkeley.edu/result.php?resultid=5047342941): Run time 21 min 8 sec, CPU time 20 min 57 sec. r3430: WU true angle range is : 0.007948 (http://setiathome.berkeley.edu/result.php?resultid=5043925269): Run time 20 min 47 sec, CPU time 17 min 54 sec. OK, that should be it. I will keep running this version for now. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Another test for volunteers: on beta (or in anonymous platform mode on main), take the 8.16 OpenCL app or later and add the -tt F parameter to the command line. Increase the F value (the default is 15, which means a 15 ms target time for the partial PulseFind kernel) and watch host usability (that is, GUI lags, missing letters when typing, and so on). At what F value do lags appear? From a performance point of view it's better to have longer kernels, but this can result in GUI lags. This testing is needed to establish the best possible default value for unattended runs. EDIT: please describe your host config (preferably with a link to the host on beta) along with your report. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Please test in single-task-per-GPU mode, with and without -use_sleep, for comparison: https://cloud.mail.ru/public/6wp7/cgfuAXmnc |
Zalster Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Quote: "Please test in single-task-per-GPU mode, with and without -use_sleep, for comparison:" Is this a new version or still part of the older r3486 Final? |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Quote: "Please test in single-task-per-GPU mode, with and without -use_sleep, for comparison:" The revision is the same, but the binary is new. |
Zalster Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
OK, I downloaded your new app and have it running. Single instance per card. No command line: WU true angle range is : 1.155181 (http://setiathome.berkeley.edu/result.php?resultid=5049630122): Run time 2 min 35 sec, CPU time 39 sec; WU true angle range is : 2.527829 (http://setiathome.berkeley.edu/result.php?resultid=5049227148): Run time 2 min 30 sec, CPU time 30 sec; WU true angle range is : 0.427635 (http://setiathome.berkeley.edu/result.php?resultid=5049222197): Run time 4 min 24 sec, CPU time 4 min 22 sec; WU true angle range is : 0.228564 (http://setiathome.berkeley.edu/result.php?resultid=5049610671): Run time 7 min 58 sec, CPU time 7 min 56 sec; WU true angle range is : 0.007531 (http://setiathome.berkeley.edu/result.php?resultid=5049222194): Run time 10 min 15 sec, CPU time 10 min 12 sec; WU true angle range is : 0.006300 (http://setiathome.berkeley.edu/result.php?resultid=5049227551): Run time 10 min 58 sec, CPU time 10 min 51 sec. Overflow (SETI@Home Informational message -9 result_overflow): WU true angle range is : 0.415784 (http://setiathome.berkeley.edu/result.php?resultid=5049216828): Run time 11 sec, CPU time 8 sec. -use_sleep only: WU true angle range is : 1.132067 (http://setiathome.berkeley.edu/result.php?resultid=5049583939): Run time 2 min 36 sec, CPU time 27 sec; WU true angle range is : 1.087592 (http://setiathome.berkeley.edu/result.php?resultid=5049563469): Run time 3 min 9 sec, CPU time 50 sec; WU true angle range is : 0.315959 (http://setiathome.berkeley.edu/result.php?resultid=5049554025): Run time 5 min 55 sec, CPU time 2 min 27 sec; WU true angle range is : 0.315959 (http://setiathome.berkeley.edu/result.php?resultid=5049553957): Run time 5 min 57 sec, CPU time 2 min 27 sec; WU true angle range is : 0.007006 (http://setiathome.berkeley.edu/result.php?resultid=5049583950): Run time 11 min, CPU time 3 min 44 sec; WU true angle range is : 0.007006 (http://setiathome.berkeley.edu/result.php?resultid=5049583887): Run time 10 min 55 sec, CPU time 3 min 42 sec. I'll let them run for a while without any command lines. What else would you like to see? |
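To make the effect of -use_sleep in the report above easier to see, the posted "Run time" and "CPU time" strings can be converted to seconds and expressed as a CPU-to-run-time fraction. The sketch below is illustrative only: `to_seconds` and `cpu_fraction` are hypothetical helper names, and the two sample rows are the low-AR results copied from the report (AR 0.007531 with no command line, AR 0.007006 with -use_sleep only).

```python
import re

# Matches durations written as in the reports: "10 min 15 sec", "11 min", "39 sec".
DURATION_RE = re.compile(r"\s*(?:(\d+)\s*min)?\s*(?:(\d+)\s*sec)?\s*")


def to_seconds(text):
    """Convert a duration string from a task report into seconds."""
    match = DURATION_RE.fullmatch(text)
    if match is None or (match.group(1) is None and match.group(2) is None):
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes = int(match.group(1) or 0)
    seconds = int(match.group(2) or 0)
    return minutes * 60 + seconds


def cpu_fraction(run_time, cpu_time):
    """Fraction of the wall-clock run time during which a CPU core was busy."""
    return to_seconds(cpu_time) / to_seconds(run_time)


# (label, run time, CPU time) copied from the report above.
rows = [
    ("no command line, AR 0.007531", "10 min 15 sec", "10 min 12 sec"),
    ("-use_sleep only, AR 0.007006", "11 min", "3 min 44 sec"),
]

for label, run_time, cpu_time in rows:
    print(f"{label}: CPU/run = {cpu_fraction(run_time, cpu_time):.3f}")
```

For these two rows the fraction drops from roughly 1.0 to roughly a third, while the run time grows by less than a minute.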
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Thanks for such a well-structured and detailed report. Nothing else is needed with this build. I'll post another one soon; it would be very good if you could do a similar run with it too. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Quote: "Here are the files for a situation where I have 8 GPU and 2 CPU tasks being blocked from uploading." <stderr_txt> result map took 35.93ms, num_iter_wo_sync=12, mean_per_iter=2.994ms icfft=194302, FFTLength=4096, sync before triplet result map took 6.859ms, num_iter_wo_sync=3, mean_per_iter=2.286ms icfft=194315, FFTLength=2048, sync before triplet result map took 35.8ms, num_iter_wo_sync=12, mean_per_iter=2.983ms icfft=194319, FFTLength=2048, sync before triplet result map took 6.871ms, num_iter_wo_sync=3, mean_per_iter=2.29ms icfft=194333, FFTLength=4096, sync before triplet result map took 34.14ms, num_iter_wo_sync=13, mean_per_iter=2.626ms icfft=194336, FFTLength=4096, sync before triplet result map took 6.721ms, num_iter_wo_sync=3, mean_per_iter=2.24ms icfft=194349, FFTLength=512, sync before triplet result map took 33.37ms, num_iter_wo_sync=13, mean_per_iter=2.567ms icfft=194355, FFTLength=512, sync before triplet result map took 6.464ms, num_iter_wo_sync=3, mean_per_iter=2.155ms icfft=194371, FFTLength=4096, sync before triplet result map took 34.14ms, num_iter_wo_sync=12, mean_per_iter=2.845ms icfft=194374, FFTLength=4096, sync before triplet result map took 6.75ms, num_iter_wo_sync=3, mean_per_iter=2.25ms icfft=194387, FFTLength=2048, sync before triplet result map took 35.11ms, num_iter_wo_sync=12, mean_per_iter=2.926ms icfft=194391, FFTLength=2048, sync before triplet result map took 7.229ms, num_iter_wo_sync=3, mean_per_iter=2.41ms icfft=194405, FFTLength=4096, sync before triplet result map took 36.23ms, num_iter_wo_sync=12, mean_per_iter=3.019ms icfft=194408, FFTLength=4096, sync before triplet result map took 7.57ms, num_iter_wo_sync=3, mean_per_iter=2.523ms icfft=194421, FFTLength=1024, sync before triplet result map took 35.89ms, num_iter_wo_sync=12, mean_per_iter=2.991ms There is no stderr header with AR info available, so such records are skipped. Verbose builds are not suitable for performance analysis, BTW; they are special-purpose ones. |
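For anyone reading the verbose excerpt above, the three numbers in each "result map took" record appear to be related as mean_per_iter ≈ time / num_iter_wo_sync. The sketch below checks that on the first two records from the excerpt. It is only a reading aid: the relation is inferred from the posted numbers, not from the application source, and `parse_records` is a hypothetical helper name.

```python
import re

# One "result map took" record as it appears in the verbose stderr excerpt.
RECORD_RE = re.compile(
    r"result map took ([0-9.]+)ms, num_iter_wo_sync=(\d+), mean_per_iter=([0-9.]+)ms"
)


def parse_records(text):
    """Return (took_ms, iterations, mean_ms) tuples for every record found."""
    return [
        (float(took), int(iters), float(mean))
        for took, iters, mean in RECORD_RE.findall(text)
    ]


# First two records copied from the excerpt above.
excerpt = (
    "result map took 35.93ms, num_iter_wo_sync=12, mean_per_iter=2.994ms "
    "icfft=194302, FFTLength=4096, sync before triplet "
    "result map took 6.859ms, num_iter_wo_sync=3, mean_per_iter=2.286ms"
)

records = parse_records(excerpt)
for took, iters, mean in records:
    # Compare the reported mean with took / iterations (assumed relation).
    print(f"took={took} ms, iters={iters}, reported={mean}, computed={took / iters:.4f}")
```

Because the excerpt starts mid-run and carries no header, a parser like this recovers only per-kernel timings; the task's AR cannot be recovered from it.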