have more GPUs than actually exist
Author | Message |
---|---|
Joined: 8 Dec 08 Posts: 231 Credit: 28,112,547 RAC: 1 |
I'm missing something on this. What is it relating to? |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
"I'm missing something on this. What is it relating to?" Petri says version 0.99 of the special app is working and close to release. That would let you set <ngpus>0.5</ngpus> or <count>0.5</count> for the app and run two tasks at the same time on the card. It won't really run both simultaneously, as you could with the older CUDA42 or CUDA50 apps or the SoG app. What it will do is load two tasks and prep both for running by pre-initializing the FFT search mechanism. As soon as the current task finishes, it will immediately start crunching the other staged task on the card, then preload another task to be ready to start when the current task finishes in turn. Because of the large reduction in memory usage that the 0.98 special app achieved, there is plenty of room to load two tasks in GPU memory, even on the lesser cards with only 3 or 4GB of memory. This will save 2-4 seconds of idle time in the task loading transaction for each task. Over an hour or a day, that will allow more tasks to be crunched and production will go up. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
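To put a rough number on the "2-4 seconds per task" saving described above, here is a minimal back-of-the-envelope sketch. The 60-second task run time is a hypothetical figure chosen only for illustration (the thread does not give one), and the function names are made up for this example.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def tasks_per_day(run_seconds: float, idle_seconds: float) -> float:
    """Tasks completed per day when each task costs its run time plus an idle gap."""
    return SECONDS_PER_DAY / (run_seconds + idle_seconds)


def daily_gain(run_seconds: float, idle_seconds: float) -> float:
    """Extra tasks per day if the idle gap between tasks is removed entirely."""
    return tasks_per_day(run_seconds, 0.0) - tasks_per_day(run_seconds, idle_seconds)


if __name__ == "__main__":
    run_seconds = 60.0  # hypothetical per-task run time, not a figure from the thread
    for idle_seconds in (2.0, 4.0):  # the 2-4 second range quoted in the post
        gain = daily_gain(run_seconds, idle_seconds)
        print(f"idle {idle_seconds:.0f}s: about {gain:.0f} extra tasks per day")
```

Under that assumed run time, removing a 4-second gap lifts throughput from 1350 to 1440 tasks per day. The shorter the tasks, the larger the relative gain, so plug in your own run times.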
Joined: 9 Mar 06 Posts: 21140 Credit: 33,933,039 RAC: 23 |
"I'm missing something on this. What is it relating to?" What about cards with 2GB VRAM? (EVGA GTX-1050 2GB GDDR5 VRAM.) TL TimeLord04 Have TARDIS, will travel... Come along K-9! Join Calm Chaos |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
That might be cutting it too close. Only Petri knows for sure. I am seeing about 1600MB of GPU memory in use for a GPU task on the 0.98b1CUDA10.1 app. I think two tasks loaded on a 2GB card is unlikely to work. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
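The arithmetic behind "cutting it too close" can be sketched as follows. The only figure taken from the thread is the roughly 1600MB used by one running task. How much memory a staged (pre-initialized but not yet running) task needs is not stated anywhere here, so the sketch only reports the headroom left over for it. The function name is hypothetical.

```python
MB_PER_GB = 1024
RUNNING_TASK_MB = 1600  # approximate per-task usage reported in the post above


def staging_headroom_mb(card_gb: int, running_task_mb: int = RUNNING_TASK_MB) -> int:
    """Memory left for a staged task once one task is already running.

    Ignores whatever the driver and desktop reserve, so real headroom is lower.
    """
    return card_gb * MB_PER_GB - running_task_mb


if __name__ == "__main__":
    for card_gb in (2, 3, 4):
        print(f"{card_gb}GB card: {staging_headroom_mb(card_gb)}MB left for a staged task")
```

On a 2GB card that leaves under 450MB before the driver takes its share, which fits the doubt expressed above. Whether it is enough depends on the staged task's real footprint, which only the app's author can confirm.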
Joined: 27 May 99 Posts: 309 Credit: 70,759,933 RAC: 3 |
"This will save 2-4 seconds of idle time in the task loading transaction for each task. Over an hour or a day, that will allow more tasks to be crunched and production will go up." Not sure how useful this would be, but I have a Windows program that reads the BoincTasks history file and shows idle time for various projects and systems. It is at https://github.com/BeemerBiker/Gridcoin/tree/master/BTHistoryReader and would have to be built with VS2017. Here is a sample output that shows an idle problem on Milkyway: https://github.com/BeemerBiker/Gridcoin/blob/master/BTHistoryReader/BTHistory_Demo3.png |
Oddbjornik Joined: 15 May 99 Posts: 220 Credit: 349,610,548 RAC: 1,728 |
"I guess you managed to wrangle the code snippet oddbjornik threw in here for pre-initialization." I have turned that code snippet inside out and upside down several times since I first aired it here. The current version has run for two weeks without incident on my Linux hosts, and as Petri indicates, it should be in the pipeline for release with 0.99. As usual, Petri and TBar will handle the full testing/compiling/packaging/release of 0.99 to the public. |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Hi Oddbjornik, so does the extra code for the mutex lock handle error conditions gracefully? Or have neither you nor Petri run into that experimental condition yet? Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Oddbjornik Joined: 15 May 99 Posts: 220 Credit: 349,610,548 RAC: 1,728 |
"Hi Oddbjornik, so does the extra code for the mutex lock handle error conditions gracefully? Or have neither you nor Petri run into that experimental condition yet?" That's what the turning inside out of the code has been for. I believe it is pretty near bulletproof by now. PM me if you want a link to the source. |
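For readers wondering what "a mutex lock that handles error conditions gracefully" means in practice, here is a minimal sketch of the general pattern. It is not the special app's code (that source is only available from its author). It uses Python threads and `threading.Lock` as a stand-in, whereas separate BOINC task processes would need a cross-process lock, and all names are hypothetical. Each task stages itself without holding the lock, then takes the lock to crunch. The `with` block releases the lock even when a task fails, so the staged task is not left waiting forever.

```python
import threading


def run_staged_tasks(tasks):
    """Run (name, should_fail) tasks that stage freely but crunch one at a time."""
    gpu_lock = threading.Lock()   # stand-in for the mutex guarding the GPU
    log_lock = threading.Lock()   # protects the shared event list
    events = []

    def record(message):
        with log_lock:
            events.append(message)

    def worker(name, should_fail):
        record(f"{name}: staged")  # pre-initialization happens without the GPU lock
        try:
            with gpu_lock:         # released automatically, even on an exception
                record(f"{name}: crunching")
                if should_fail:
                    raise RuntimeError("simulated GPU error")
                record(f"{name}: finished")
        except RuntimeError as exc:
            record(f"{name}: failed ({exc})")

    threads = [threading.Thread(target=worker, args=task) for task in tasks]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return events


if __name__ == "__main__":
    for line in run_staged_tasks([("task_a", True), ("task_b", False)]):
        print(line)
```

The point of the design is that a failing task cannot strand the lock: `task_b` still gets its turn after `task_a` errors out.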
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Thanks, I'll wait for the official release. I'm in no hurry. My time right now is trying to get a gpu app working on my new Jetson Nano. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
"My time right now is trying to get a gpu app working on my new Jetson Nano." He tasks me. He tasks me, and I shall have him. I'll chase him round the Moons of Nibia and round the Antares Maelstrom and round Perdition's flames before I give him up! |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
"My time right now is trying to get a gpu app working on my new Jetson Nano." Love the quote! Ha ha. LOL. I'm close. I got tasks this time but errored out on amount of disk space required. Bumped up everything I could think of plus suspending Seti for the time being so just Einstein is able to run unobstructed hopefully. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Back to the penalty box again for another 24 hours. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Joined: 23 May 99 Posts: 7381 Credit: 44,181,323 RAC: 238 |
"My time right now is trying to get a gpu app working on my new Jetson Nano." Hi Zalster, ST: The Wrath of Khan. :) Love that movie! Have a great day! :) Siran CAPT Siran d'Vel'nahr - L L & P _\\// Winders 11 OS? "What a piece of junk!" - L. Skywalker "Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874 |
"I'm close. I got tasks this time but errored out on amount of disk space required. Bumped up everything I could think of plus suspending Seti for the time being so just Einstein is able to run unobstructed hopefully." Diagnosis and suggestion posted at BOINC message 91274 |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Thanks for explaining the misleading error message. See that the message has nothing to do with BOINC disk usage limits now. My third 24 hour delay period just expired and then BOINC set another 24 hour delay. So still unable to test if my present app_info will work until I can finally get some work to test with tomorrow. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874 |
"Thanks for explaining the misleading error message. See that the message has nothing to do with BOINC disk usage limits now. My third 24 hour delay period just expired and then BOINC set another 24 hour delay. So still unable to test if my present app_info will work until I can finally get some work to test with tomorrow." There must be a reason behind that delay. I'll go and look for it. |
Joined: 6 Nov 99 Posts: 717 Credit: 8,032,827 RAC: 62 |
Does Cuda V0.98B1 make some errors on overflow WUs? 3452554813 Lan Computer here 7632732562
then mine 7632732563
then third wingman 7638624591
|
Richard Haselgrove Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874 |
"Does Cuda V0.98B1 make some errors on overflow WUs?" Not errors - all three tasks are 'similar enough' to have validated and been granted credit. But I believe it has been acknowledged that with these short-running overflow tasks, there is some imprecision in the process of selecting the 'first' 30 signals to report (processing performed in a different order, I think), resulting in the inconclusive validation between the first two results returned. This makes no difference to end users, but does place some additional strain on the project (servers have to create an extra task replication in the database, network has to support an additional data file download). Whether that matters depends on whether you are thinking as a user, or as a project administrator. |
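A toy illustration of why the 'first' 30 signals can differ between two apps that find exactly the same signals: if a task overflows and only the first 30 found are reported, the reported set depends on the order in which the search visits the data. Everything below is hypothetical (the signal IDs, the two visiting orders, and the function names). Only the limit of 30 comes from the post above, and the real validator compares signals with tolerances rather than by ID.

```python
REPORT_LIMIT = 30  # overflow tasks report only the first 30 signals found


def reported_signals(visit_order, limit=REPORT_LIMIT):
    """Return the set of signals reported when the search stops at the limit."""
    return set(visit_order[:limit])


def overlap_between_orders(order_a, order_b, limit=REPORT_LIMIT):
    """Count signals that both visiting orders would report."""
    return len(reported_signals(order_a, limit) & reported_signals(order_b, limit))


if __name__ == "__main__":
    signals = list(range(40))                # 40 detectable signals, IDs 0..39
    serial_order = signals                   # one app scans front to back
    interleaved_order = signals[0::2] + signals[1::2]  # another scans evens, then odds
    common = overlap_between_orders(serial_order, interleaved_order)
    print(f"{common} of {REPORT_LIMIT} reported signals match")
```

Both orders find valid signals, yet in this made-up case the two reports share only 25 of 30 entries, which is the kind of mismatch that could send a workunit to a third host for a tie-break.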
Grant (SSSF) Joined: 19 Aug 99 Posts: 13915 Credit: 208,696,464 RAC: 304 |
Did we ever get a final fix for this issue? Just bought myself a late birthday present, which required upgrading the video driver. Tried other suggestions for downgrading the driver, but am still stuck with 2 OpenCL lines in the BOINC event log for each video card, resulting in only one of them processing work.
19/05/2019 08:33:08 | | Starting BOINC client version 7.6.33 for windows_x86_64
19/05/2019 08:33:08 | | log flags: file_xfer, sched_ops, task
19/05/2019 08:33:08 | | Libraries: libcurl/7.47.1 OpenSSL/1.0.2g zlib/1.2.8
19/05/2019 08:33:08 | | Data directory: C:\ProgramData\BOINC
19/05/2019 08:33:08 | | Running under account USER
19/05/2019 08:33:09 | | CUDA: NVIDIA GPU 0: GeForce RTX 2060 (driver version 418.81, CUDA version 10.1, compute capability 7.5, 4096MB, 3556MB available, 14054 GFLOPS peak)
19/05/2019 08:33:09 | | CUDA: NVIDIA GPU 1: GeForce GTX 1070 (driver version 418.81, CUDA version 10.1, compute capability 6.1, 4096MB, 3556MB available, 6852 GFLOPS peak)
19/05/2019 08:33:09 | | OpenCL: NVIDIA GPU 0: GeForce RTX 2060 (driver version 418.81, device version OpenCL 1.2 CUDA, 6144MB, 3556MB available, 14054 GFLOPS peak)
19/05/2019 08:33:09 | | OpenCL: NVIDIA GPU 0: GeForce RTX 2060 (driver version 418.81, device version OpenCL 1.2 CUDA, 6144MB, 3556MB available, 14054 GFLOPS peak)
19/05/2019 08:33:09 | | OpenCL: NVIDIA GPU 1: GeForce GTX 1070 (driver version 418.81, device version OpenCL 1.2 CUDA, 8192MB, 3556MB available, 6852 GFLOPS peak)
19/05/2019 08:33:09 | | OpenCL: NVIDIA GPU 1: GeForce GTX 1070 (driver version 418.81, device version OpenCL 1.2 CUDA, 8192MB, 3556MB available, 6852 GFLOPS peak)
19/05/2019 08:33:09 | SETI@home | Found app_info.xml; using anonymous platform
19/05/2019 08:33:09 | | Host name: Grant-PC
19/05/2019 08:33:09 | | Processor: 12 GenuineIntel Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz [Family 6 Model 158 Stepping 10]
19/05/2019 08:33:09 | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 fma cx16 sse4_1 sse4_2 movebe popcnt aes f16c rdrandsyscall nx lm avx avx2 vmx smx tm2 pbe fsgsbase bmi1 hle smep bmi2
19/05/2019 08:33:09 | | OS: Microsoft Windows 10: Professional x64 Edition, (10.00.17134.00)
19/05/2019 08:33:09 | | Memory: 15.95 GB physical, 18.33 GB virtual
19/05/2019 08:33:09 | | Disk: 930.56 GB total, 850.84 GB free
19/05/2019 08:33:09 | | Local time is UTC +9 hours
19/05/2019 08:33:09 | SETI@home | Found app_config.xml
19/05/2019 08:33:09 | SETI@home Beta Test | Found app_config.xml
19/05/2019 08:33:09 | | Config: use all coprocessors
Even with the oldest driver I can use, the coproc_info.xml gets overwritten with the extra entries, so I've used BeemerBiker's workaround for now (many thanks for that BTW). Interestingly, checking out the results of work processed on my account, the new results show double entries for the OpenCL as well. eg
SETI8 update by Raistmer
OpenCL version by Raistmer, r3557
Number of OpenCL platforms: 2
OpenCL Platform Name: NVIDIA CUDA
Number of devices: 2
Max compute units: 30
Max work group size: 1024
Max clock frequency: 1830Mhz
Max memory allocation: 1610612736
Cache type: Read/Write
Cache line size: 128
Cache size: 491520
Global memory size: 6442450944
Constant buffer size: 65536
Max number of constant args: 9
Local memory type: Scratchpad
Local memory size: 49152
Queue properties:
Out-of-Order: Yes
Name: GeForce RTX 2060
Vendor: NVIDIA Corporation
Driver version: 418.81
Version: OpenCL 1.2 CUDA
Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_copy_opts cl_nv_create_buffer
Max compute units: 15
Max work group size: 1024
Max clock frequency: 1784Mhz
Max memory allocation: 2147483648
Cache type: Read/Write
Cache line size: 128
Cache size: 245760
Global memory size: 8589934592
Constant buffer size: 65536
Max number of constant args: 9
Local memory type: Scratchpad
Local memory size: 49152
Queue properties:
Out-of-Order: Yes
Name: GeForce GTX 1070
Vendor: NVIDIA Corporation
Driver version: 418.81
Version: OpenCL 1.2 CUDA
Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_copy_opts cl_nv_create_buffer
OpenCL Platform Name: NVIDIA CUDA
Number of devices: 2
Max compute units: 30
Max work group size: 1024
Max clock frequency: 1830Mhz
Max memory allocation: 1610612736
Cache type: Read/Write
Cache line size: 128
Cache size: 491520
Global memory size: 6442450944
Constant buffer size: 65536
Max number of constant args: 9
Local memory type: Scratchpad
Local memory size: 49152
Queue properties:
Out-of-Order: Yes
Name: GeForce RTX 2060
Vendor: NVIDIA Corporation
Driver version: 418.81
Version: OpenCL 1.2 CUDA
Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_copy_opts cl_nv_create_buffer
Max compute units: 15
Max work group size: 1024
Max clock frequency: 1784Mhz
Max memory allocation: 2147483648
Cache type: Read/Write
Cache line size: 128
Cache size: 245760
Global memory size: 8589934592
Constant buffer size: 65536
Max number of constant args: 9
Local memory type: Scratchpad
Local memory size: 49152
Queue properties:
Out-of-Order: Yes
Name: GeForce GTX 1070
Vendor: NVIDIA Corporation
Driver version: 418.81
Version: OpenCL 1.2 CUDA
Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_copy_opts cl_nv_create_buffer
Work Unit Info:
Grant Darwin NT |
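Anyone wanting to check their own startup log for the same symptom could use something like the sketch below. It scans BOINC event-log lines for OpenCL device entries and reports any device listed more than once. The regular expression is an assumption based solely on the line format shown in the post above, and the function name is hypothetical. The sample lines are copied from that log.

```python
import re
from collections import Counter

# Matches e.g. "OpenCL: NVIDIA GPU 0: GeForce RTX 2060 (driver version ..."
# Pattern inferred from the log lines in this thread, not from BOINC's source.
OPENCL_LINE = re.compile(r"OpenCL: (\w+) GPU (\d+): ([^(]+) \(")

SAMPLE_LOG = """\
19/05/2019 08:33:09 | | CUDA: NVIDIA GPU 0: GeForce RTX 2060 (driver version 418.81, CUDA version 10.1, compute capability 7.5, 4096MB, 3556MB available, 14054 GFLOPS peak)
19/05/2019 08:33:09 | | OpenCL: NVIDIA GPU 0: GeForce RTX 2060 (driver version 418.81, device version OpenCL 1.2 CUDA, 6144MB, 3556MB available, 14054 GFLOPS peak)
19/05/2019 08:33:09 | | OpenCL: NVIDIA GPU 0: GeForce RTX 2060 (driver version 418.81, device version OpenCL 1.2 CUDA, 6144MB, 3556MB available, 14054 GFLOPS peak)
19/05/2019 08:33:09 | | OpenCL: NVIDIA GPU 1: GeForce GTX 1070 (driver version 418.81, device version OpenCL 1.2 CUDA, 8192MB, 3556MB available, 6852 GFLOPS peak)
19/05/2019 08:33:09 | | OpenCL: NVIDIA GPU 1: GeForce GTX 1070 (driver version 418.81, device version OpenCL 1.2 CUDA, 8192MB, 3556MB available, 6852 GFLOPS peak)
"""


def duplicated_opencl_devices(lines):
    """Return {(vendor, index, name): count} for OpenCL devices listed more than once."""
    counts = Counter()
    for line in lines:
        match = OPENCL_LINE.search(line)
        if match:
            vendor, index, name = match.groups()
            counts[(vendor, int(index), name)] += 1
    return {device: n for device, n in counts.items() if n > 1}


if __name__ == "__main__":
    for (vendor, index, name), n in duplicated_opencl_devices(SAMPLE_LOG.splitlines()).items():
        print(f"{vendor} GPU {index} ({name}) appears {n} times as an OpenCL device")
```

The CUDA lines are deliberately ignored, since the problem described here is the doubled OpenCL detection rather than the CUDA detection.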
Grant (SSSF) Joined: 19 Aug 99 Posts: 13915 Credit: 208,696,464 RAC: 304 |
"In the previous thread, Juha suggested that you inspect HKEY_LOCAL_MACHINE\SOFTWARE\Khronos\OpenCL\Vendors, but I can't see any reply to that particular question. When I look at that key on my machine here, I see" On my system: Computer\HKEY_LOCAL_MACHINE\SOFTWARE\Khronos\OpenCL\Vendors C:\WINDOWS\System32\DriverStore\FileRepository\igdlh64.inf_amd64_9929e26743d53831\IntelOpenCL64.dll Grant Darwin NT |
©2025 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.