Developing a Multi-Threaded Benchmarking App for Linux
Send message Joined: 14 Feb 16 Posts: 492 Credit: 378,512,430 RAC: 785
I have run my first benchmark of CPU app performance. It uses 30 cores and does 30 repetitions for each app. [benchmark results chart] The high variability is likely an effect of the 2990WX having only half its cores with direct memory access, and of the last run of jobs occurring under low loading. I should re-run this on my 1950X system in the future. Also, it would probably be better to use 30 different WUs rather than 30 repetitions of the same WU. GitHub: Ricks-Lab Instagram: ricks_labs
Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873
The AVX2 app is looking good. I never retested that app on Ryzen+, I don't think. Seti@Home classic workunits: 20,676 CPU time: 74,226 hours A proud member of the OFA (Old Farts Association)
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13913 Credit: 208,696,464 RAC: 304
Also, probably would be better to use 30 different WUs rather than 30 repetitions of the same WU. A mix of Arecibo & GBT tasks would be interesting. One of the issues with running 2 GPU WUs at a time under CUDA50 was that when an Arecibo & a GBT WU were running on the same GPU, the runtime for the Arecibo task would generally triple. I don't see that happening on the CPU, but I wouldn't be surprised if there were some performance impact there. Grant Darwin NT
Send message Joined: 14 Feb 16 Posts: 492 Credit: 378,512,430 RAC: 785
Also, probably would be better to use 30 different WUs rather than 30 repetitions of the same WU. Good idea. I will set up a new benchmark run to execute during this week's outage. GitHub: Ricks-Lab Instagram: ricks_labs
Send message Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835
Curious: did you run each app individually for a day, then the next, etc., or run a mixture of apps all at once?
Send message Joined: 14 Feb 16 Posts: 492 Credit: 378,512,430 RAC: 785
Curious: did you run each app individually for a day, then the next, etc.? I ran 30 iterations of all apps against a single WU, which is 180 tasks. These 180 tasks were loaded across 30 cores until complete. GitHub: Ricks-Lab Instagram: ricks_labs
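For anyone curious how such a run is structured, the scheduling described above (all apps' tasks queued together and fed to 30 cores until the queue drains) can be sketched in Python with a worker pool. The app names and the benchmark_one function below are hypothetical stand-ins for illustration, not benchMT's actual internals.

```python
from concurrent.futures import ThreadPoolExecutor
import itertools

# Hypothetical app list -- stand-ins, not the actual benchMT configuration.
APPS = ["sse3", "ssse3", "sse42", "avx", "avx2", "generic"]
REPS = 30      # repetitions of the same WU per app
WORKERS = 30   # cores used (30 of the machine's 32 threads)

def benchmark_one(job):
    """Placeholder for launching one science app against one WU.
    A real runner would spawn the app and return its elapsed time."""
    app, rep = job
    return (app, rep)

def run_benchmark():
    # 6 apps x 30 repetitions = 180 tasks, kept loaded across 30 workers
    jobs = list(itertools.product(APPS, range(REPS)))
    with ThreadPoolExecutor(max_workers=WORKERS) as pool:
        results = list(pool.map(benchmark_one, jobs))
    return results

print(len(run_benchmark()))  # 180
```

The point of queuing everything at once, rather than one app per day, is that every app sees the same mix of system load over the run.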
Send message Joined: 14 Feb 16 Posts: 492 Credit: 378,512,430 RAC: 785
Here are the 1950X results for a benchmark run identical to the one I did for the 2990WX. The 1950X has SMT enabled, while the 2990WX has it disabled, so both runs used 30 of 32 available threads. [benchmark results chart]
Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873
Interesting: it looks like the Gen. 1 TR likes the standard AVX application, while the Gen. 2 TR likes the AVX2 application. Seti@Home classic workunits: 20,676 CPU time: 74,226 hours A proud member of the OFA (Old Farts Association)
Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873
I'm currently running the r3711 SSE41 app against the AVX2 app, since that one wasn't included in Rick's set of default apps for some reason. I'm also seeing some anomalous behavior in the number of GPU instances that can be invoked. Seti@Home classic workunits: 20,676 CPU time: 74,226 hours A proud member of the OFA (Old Farts Association)
Send message Joined: 14 Feb 16 Posts: 492 Credit: 378,512,430 RAC: 785
I'm currently running the r3711 SSE41 app against the AVX2 app, since that one wasn't included in Rick's set of default apps for some reason. Can you provide a link to where a set of r3711 apps can be found? I plan to include them in the benchmark run planned during the outage.
Send message Joined: 14 Feb 16 Posts: 492 Credit: 378,512,430 RAC: 785
I'm also seeing some anomalous behavior in the number of GPU instances that can be invoked. benchMT currently only allows 1 task per GPU. The number of GPUs is determined by: lshw -short | grep display Does the log file indicate the correct number of GPUs?
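As a rough illustration of that detection step, here is a small parser equivalent to the lshw -short | grep display pipeline. The sample lshw output and the count_gpus helper are illustrative guesses, not benchMT's actual code.

```python
import subprocess

def count_gpus(lshw_short_output: str) -> int:
    """Count display-class devices in `lshw -short` output,
    mirroring benchMT's `lshw -short | grep display` pipeline."""
    return sum(1 for line in lshw_short_output.splitlines()
               if "display" in line)

def count_gpus_live() -> int:
    """Run lshw on the local machine (requires lshw installed)."""
    out = subprocess.run(["lshw", "-short"],
                         capture_output=True, text=True).stdout
    return count_gpus(out)

# Abridged, illustrative `lshw -short` lines:
SAMPLE = """\
/0/100/1/0     display    GP102 [GeForce GTX 1080 Ti]
/0/100/2/0     display    GP104 [GeForce GTX 1070]
/0/100/3/0     display    GP104 [GeForce GTX 1070]
/0/1           memory     32GiB System Memory
"""
print(count_gpus(SAMPLE))  # 3
```

One caveat with grepping lshw output: integrated graphics also appear as display devices, so the count can exceed the number of usable compute GPUs.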
Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873
The r3711 SSE41 app is the default app installed by TBar's BOINC All-in-One packages. http://www.arkayn.us/lunatics/BOINC-7.8.3.7z Seti@Home classic workunits: 20,676 CPU time: 74,226 hours A proud member of the OFA (Old Farts Association)
Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873
I'm also seeing some anomalous behavior in the number of GPU instances that can be invoked. I'm only running one task per GPU since I am running the CUDA92 app. But if I only invoke 3 instances of the application, it only runs two tasks on two GPUs and holds the third instance pending until the first two complete; then the pending task runs on the first GPU. I see all 3 GPUs always. This is the entry in benchCFG:

setiathome_x41p_V0.97b2_Linux-Pascal+_cuda92
setiathome_x41p_V0.97b2_Linux-Pascal+_cuda92
setiathome_x41p_V0.97b2_Linux-Pascal+_cuda92
#setiathome_x41p_V0.97b2_Linux-Pascal+_cuda92

This is what the benchmark is going to execute:

Only 0 CPU jobs and 3 GPU jobs. Max Threads reduced to 3
List of Initialized Slots
SlotNum | platform | device | state | job | SlotDir
-0------| GPU | 0 | EMPTY | None | /home/keith/Downloads/Utils/benchMT/Slots/0
-1------| GPU | 1 | EMPTY | None | /home/keith/Downloads/Utils/benchMT/Slots/1
-2------| CPU | NA | EMPTY | None | /home/keith/Downloads/Utils/benchMT/Slots/2
##### 3 total slots
Pending jobs (CPU/GPU): 0 / 3
Pending reference jobs: 0
Execute listed jobs? [y/N]

With this benchCFG file entry:

setiathome_x41p_V0.97b2_Linux-Pascal+_cuda92
setiathome_x41p_V0.97b2_Linux-Pascal+_cuda92
setiathome_x41p_V0.97b2_Linux-Pascal+_cuda92
setiathome_x41p_V0.97b2_Linux-Pascal+_cuda92

this is what the benchmark is going to execute:

Only 0 CPU jobs and 4 GPU jobs. Max Threads reduced to 4
List of Initialized Slots
SlotNum | platform | device | state | job | SlotDir
-0------| GPU | 0 | EMPTY | None | /home/keith/Downloads/Utils/benchMT/Slots/0
-1------| GPU | 1 | EMPTY | None | /home/keith/Downloads/Utils/benchMT/Slots/1
-2------| GPU | 2 | EMPTY | None | /home/keith/Downloads/Utils/benchMT/Slots/2
-3------| CPU | NA | EMPTY | None | /home/keith/Downloads/Utils/benchMT/Slots/3
##### 4 total slots
Pending jobs (CPU/GPU): 0 / 4
Pending reference jobs: 0
Execute listed jobs? [y/N]

Seti@Home classic workunits: 20,676 CPU time: 74,226 hours A proud member of the OFA (Old Farts Association)
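Both logs share the same pattern: N GPU entries in benchCFG produce N slots, but the last slot is always typed CPU, leaving only N-1 usable GPU slots. The sketch below is a hypothetical reconstruction of slot typing that would produce exactly that output; it is my guess at the logic, not benchMT's actual source.

```python
def init_slots_intended(n_gpu_jobs: int, n_gpus: int):
    """Intended behavior: one GPU slot per requested GPU job,
    up to the number of physical GPUs."""
    return [("GPU", i) for i in range(min(n_gpu_jobs, n_gpus))]

def init_slots_buggy(n_gpu_jobs: int, n_gpus: int):
    """Plausible off-by-one matching the logs above: the last
    slot is always typed CPU, so only N-1 GPU slots materialize."""
    slots = [("GPU", i) for i in range(n_gpu_jobs - 1)]
    slots.append(("CPU", None))
    return slots

# Three CUDA92 entries in benchCFG, three physical GPUs:
print(init_slots_buggy(3, 3))
# [('GPU', 0), ('GPU', 1), ('CPU', None)]  -- matches the first log
print(init_slots_intended(3, 3))
# [('GPU', 0), ('GPU', 1), ('GPU', 2)]
```

With four entries the buggy variant yields GPU slots 0-2 plus a CPU slot 3, matching the second log, which is why listing one extra (even commented-out) entry works around the problem.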
Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873
This is the output of the r3711 SSE41 app versus the r3712 AVX2 app. I ran 4 instances of each app. https://www.dropbox.com/s/wjgz56tqmrn1zi1/Screenshot%20from%202018-11-25%2018-52-54.png?dl=0 The SSE41 app is up to 10% faster than the AVX2 app. That is what I found on my old Gen. 1 1700X and 1800X CPUs, so I'm not seeing any improvement on Ryzen+ 2700X CPUs either. It might be something different on Threadrippers. I will be able to test on TR once I get my TR platform built. Seti@Home classic workunits: 20,676 CPU time: 74,226 hours A proud member of the OFA (Old Farts Association)
Send message Joined: 14 Feb 16 Posts: 492 Credit: 378,512,430 RAC: 785
I'm also seeing some anomalous behavior in the number of GPU instances that can be invoked. Looks like a bug. Let me try to reproduce it on my system this evening.
Send message Joined: 14 Feb 16 Posts: 492 Credit: 378,512,430 RAC: 785
This is the output of the r3711 SSE41 app versus the r3712 AVX2 app. I ran 4 instances of each app. For analysis, I suggest using the .psv file in the testData directory. This file is easy to import into Excel and summarize with a pivot table. It is pipe-delimited.
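For anyone avoiding spreadsheets, the same pivot-style summary can be done with a few lines of stdlib Python. The column names here ("app" and "elapsed") are assumed for illustration; the actual .psv header depends on the benchMT version.

```python
import csv
import io
from collections import defaultdict

def mean_elapsed_by_app(psv_text: str) -> dict:
    """Summarize a pipe-delimited results file: mean elapsed
    time per app, like an Excel pivot table would."""
    reader = csv.DictReader(io.StringIO(psv_text), delimiter="|")
    times = defaultdict(list)
    for row in reader:
        times[row["app"]].append(float(row["elapsed"]))
    return {app: sum(v) / len(v) for app, v in times.items()}

# Illustrative data -- real column names may differ by benchMT version.
SAMPLE = """\
app|elapsed
r3711_sse41|1000.0
r3711_sse41|1100.0
r3712_avx2|1150.0
r3712_avx2|1250.0
"""
print(mean_elapsed_by_app(SAMPLE))
# {'r3711_sse41': 1050.0, 'r3712_avx2': 1200.0}
```

The csv module handles the pipe delimiter directly via delimiter="|", so no spreadsheet import step is needed.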
Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873
I'll have to pass since I flunked OpenCalc and Excel. Seti@Home classic workunits: 20,676 CPU time: 74,226 hours A proud member of the OFA (Old Farts Association)
Send message Joined: 14 Feb 16 Posts: 492 Credit: 378,512,430 RAC: 785
I'm also seeing some anomalous behavior in the number of GPU instances that can be invoked. When I run this on my system, it all appears normal. Can you post or send me your complete benchCFG file and the hostname*.txt file in the run subdirectory of testData? Running with the --debug option might also give more insight. Also, does this app require a .cl file? If so, it needs to be in the APPS_GPU directory. Thanks!
Send message Joined: 14 Feb 16 Posts: 492 Credit: 378,512,430 RAC: 785
The r3711 SSE41 app is the default app installed by TBar's BOINC All-in-One packages. I just downloaded and extracted it, but did not find the r3711 SSE41 app.
Send message Joined: 14 Feb 16 Posts: 492 Credit: 378,512,430 RAC: 785
The r3711 SSE41 app is the default app installed by TBar's BOINC All-in-One packages. I found it in a different download on the site and will include it in my next run.
©2025 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.