SETI@home v8 beta to begin on Tuesday

Zalster
Volunteer tester
Joined: 30 Dec 13
Posts: 258
Credit: 12,340,341
RAC: 0
United States
Message 58820 - Posted: 1 Jul 2016, 0:39:01 UTC - in response to Message 58819.  

I've noticed that non-guppi tasks are taking longer with r3480.

Since I run multiple instances after I get the 100 validated at stock, the change in time to complete is more apparent.

I'm seeing a 180s to 240s increase in time to complete while running multiple non-guppi tasks on r3480.

The Guppis, on the other hand, have sped up by about 120s.

It doesn't sound like much, but when you look at the difference in time, I see that the Guppis are now completing 300s faster than the non-guppis.

With r3430 the non-guppis used to be 120s faster than the GUPPIs, so this is a significant change.

I'll keep running 8.12 at stock, 1 at a time, just so they can have the data, but I'm not going to upgrade my 2 main SETI machines, since r3430 works better on them.
ID: 58820

Grumpy Swede
Volunteer tester
Joined: 10 Mar 12
Posts: 1700
Credit: 13,216,373
RAC: 0
Sweden
Message 58821 - Posted: 1 Jul 2016, 0:50:24 UTC - in response to Message 58820.  
Last modified: 1 Jul 2016, 0:53:05 UTC

I've noticed that non-guppi tasks are taking longer with r3480.

Since I run multiple instances after I get the 100 validated at stock, the change in time to complete is more apparent.

I'm seeing a 180s to 240s increase in time to complete while running multiple non-guppi tasks on r3480.

The Guppis, on the other hand, have sped up by about 120s.

It doesn't sound like much, but when you look at the difference in time, I see that the Guppis are now completing 300s faster than the non-guppis.

With r3430 the non-guppis used to be 120s faster than the GUPPIs, so this is a significant change.

I'll keep running 8.12 at stock, 1 at a time, just so they can have the data, but I'm not going to upgrade my 2 main SETI machines, since r3430 works better on them.

Have you added "-high_perf" to the command file, to set the app to the high-performance path?

I'm not getting any SoGs at the moment, only the opencl_nvidia_sah app, because I screwed up (nothing new there) :-), but I added "-high_perf" to the command file of the opencl_nvidia_sah app, and it seems to make it faster.
WARNING!! "THIS IS A SIGNATURE", of the "IT MAY CHANGE AT ANY MOMENT" type. It may, or may not be considered insulting, all depending upon HOW SENSITIVE THE VIEWER IS, to certain inputs to/from the nervous system.
ID: 58821

Zalster
Volunteer tester
Joined: 30 Dec 13
Posts: 258
Credit: 12,340,341
RAC: 0
United States
Message 58822 - Posted: 1 Jul 2016, 1:11:29 UTC - in response to Message 58821.  
Last modified: 1 Jul 2016, 1:22:01 UTC

Yes, I did add -high_perf to the command line. It sped up the GUPPIs but did nothing for the non-guppi tasks.

Edit..

I'm not running with it right now, so I can see how v8.14 runs without it first.

After the 100 validate, I'll re-add app_config and put the command lines back in.
ID: 58822

Raistmer
Volunteer tester
Joined: 18 Aug 05
Posts: 2423
Credit: 15,878,738
RAC: 0
Russia
Message 58826 - Posted: 1 Jul 2016, 8:41:55 UTC - in response to Message 58822.  
Last modified: 1 Jul 2016, 8:52:23 UTC

Also, I need examples of -v 6 runs on different tasks and different GPUs.

EDIT: currently the high-perf path only omits 2 sync points in the base PulseFind sequence (but it seems that really matters for high-perf cards).
With the profiling data I'm requesting, more changes could be possible to speed up high-end cards.

@Zalster, could you please answer the questions in the v8.12 support thread on main?

EDIT2: since the last "official" release (that is, since 8.12), sync points were added in partial PulseFind pass 4/5 too (I expect their omission made the app insensitive to an increase in -pulse_iterations_num N on some NV GPUs). It seems high-end GPUs react very strongly to each such additional sync point. That's why the high-perf path is now separated too. Without any profiling data I decided not to turn it ON under pre-defined conditions, so it is manual enabling only for now.
News about SETI opt app releases: https://twitter.com/Raistmer
ID: 58826

Raistmer
Volunteer tester
Joined: 18 Aug 05
Posts: 2423
Credit: 15,878,738
RAC: 0
Russia
Message 58829 - Posted: 1 Jul 2016, 8:56:49 UTC - in response to Message 58819.  
Last modified: 1 Jul 2016, 9:00:14 UTC

Hm.....

#define CL_OUT_OF_HOST_MEMORY                       -6


¯\_(ツ)_/¯

EDIT:
Please look at the memory working set size. Does it continuously increase?

No, Guppi fails totally on my iGPU with 8.14 SETI@home v8 (opencl_intel_gpu_sah) also.

Aborted, since it will only repeat itself.

In this case:

ERROR: OpenCL kernel/call 'Enqueueing kernel:PC_find_pulse_partial_kernel_cl,pass 3' call failed (-6) in file ..\analyzePoT.cpp near line 2340.
Waiting 30 sec before restart...
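
For anyone collecting these failures from stderr logs, the error line can be pulled apart mechanically. This is only a sketch in Python: the regular expression is written against the single log line quoted above, and `parse_opencl_error` and `CL_ERROR_NAMES` are hypothetical helper names, not part of the SETI@home app. The status codes listed are a small subset of those defined in the OpenCL `cl.h` header.

```python
import re

# Small subset of OpenCL status codes as defined in the Khronos cl.h header.
CL_ERROR_NAMES = {
    0: "CL_SUCCESS",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
}

# Assumption: the pattern mirrors the one log line quoted in this thread;
# other app builds may format the message differently.
ERROR_LINE = re.compile(
    r"ERROR: OpenCL kernel/call '(?P<call>[^']+)' call failed "
    r"\((?P<code>-?\d+)\) in file (?P<file>\S+) near line (?P<line>\w+)\."
)


def parse_opencl_error(text):
    """Return a dict describing the failed call, or None if nothing matches."""
    match = ERROR_LINE.search(text)
    if match is None:
        return None
    code = int(match.group("code"))
    return {
        "call": match.group("call"),
        "code": code,
        "name": CL_ERROR_NAMES.get(code, "unknown"),
        "file": match.group("file"),
        # Kept as text because some reports elide the line number.
        "line": match.group("line"),
    }


sample = (
    "ERROR: OpenCL kernel/call "
    "'Enqueueing kernel:PC_find_pulse_partial_kernel_cl,pass 3' "
    "call failed (-6) in file ..\\analyzePoT.cpp near line 2340."
)
info = parse_opencl_error(sample)
print(info["name"], "in", info["call"])
```

Grouping parsed results by `call` and `code` across several hosts would show whether the failure always lands in the same PulseFind kernel.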

WU:
http://setiweb.ssl.berkeley.edu/beta/result.php?resultid=24211237

I'll abort all Guppi's on my iGPU here on Beta, and let it run the Arecibo tasks it got, and then put it at use on Main, with 8.12 (opencl_intel_gpu_sah) MB8_win_x86_SSSE3_OpenCL_Intel_r3430.exe, where it works flawlessly, even with Guppi tasks.

(It used to work here on Beta too, with r3430) see two examples: http://setiweb.ssl.berkeley.edu/beta/result.php?resultid=24198175, and http://setiweb.ssl.berkeley.edu/beta/result.php?resultid=24198719

In other words: "r3480, is fail fail fail with Guppi's on my iGPU HD4600, while r3430 was win win win". :-)

News about SETI opt app releases: https://twitter.com/Raistmer
ID: 58829

Grumpy Swede
Volunteer tester
Joined: 10 Mar 12
Posts: 1700
Credit: 13,216,373
RAC: 0
Sweden
Message 58832 - Posted: 1 Jul 2016, 10:23:24 UTC - in response to Message 58829.  
Last modified: 1 Jul 2016, 10:45:40 UTC

Hm.....

#define CL_OUT_OF_HOST_MEMORY                       -6


¯\_(ツ)_/¯

EDIT:
Please look at the memory working set size. Does it continuously increase?



Just snagged another 8.14 Guppi to test with

Oh yes, in the most terrible way. It keeps increasing all the time until the working set is way over 500MB; the private working set climbs to over 450MB, and virtual memory to over 500MB too. Then it enters a 30-second pause with the ERROR: OpenCL kernel/call 'Enqueueing kernel:PC_find_pulse_kernel_cl,pass 4' call failed (-6) in file ..\analyzePoT.cpp near line XXXX.

And so it goes all over again. The one I have now will of course ultimately fail, so I will abort it. That is, unless you want me to let it go on until it fails by itself, after I don't know how many restarts.

Edit, added: And this is with default settings, nothing in the command line file.

Edit, added 2: This one seems like it might possibly be able to finish, after a number of restarts though. Each restart comes at a higher percentage than the previous restart. I'll let it run, but this is not a viable way to run them; they take a very long time compared to when they can run uninterrupted :-)

And Edit 3: No, I take back that it might be able to finish. Next Pause for 30 seconds, was before the checkpoint, so it starts over again at 27.24%. That's what I get for having 600 seconds in between checkpoints. I'll change that now, to 60 seconds.
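
Why the checkpoint interval decides whether such a task can ever finish can be shown with a toy model. This is a rough sketch under simplifying assumptions, not a description of how BOINC or the app actually schedules checkpoints: the app is assumed to die a fixed number of seconds after every (re)start, and each restart resumes from the last checkpoint. Only the 600-second and 60-second intervals come from the post above; the task length and time to failure are made-up illustrative numbers, and `restarts_needed` is a hypothetical helper.

```python
def restarts_needed(total_s, fail_after_s, checkpoint_s, max_restarts=1000):
    """Count restarts until the task completes, or None if it never gets there.

    Toy model: the app always fails fail_after_s seconds after a (re)start,
    a checkpoint is written every checkpoint_s seconds of run time, and a
    restart resumes from the most recent checkpoint.
    """
    done = 0
    restarts = 0
    while restarts <= max_restarts:
        if total_s - done <= fail_after_s:
            return restarts  # the remaining work fits into a single run
        # Work that was safely checkpointed before this run failed.
        saved = (fail_after_s // checkpoint_s) * checkpoint_s
        if saved == 0:
            return None  # fails before the first checkpoint: no net progress
        done += saved
        restarts += 1
    return None


# Illustrative numbers only: a 3600 s task that fails 500 s after each start.
print("600 s checkpoints:", restarts_needed(3600, 500, 600))
print(" 60 s checkpoints:", restarts_needed(3600, 500, 60))
```

In this model the 600-second interval never makes progress because every run dies before its first checkpoint, while the 60-second interval loses at most a minute of work per restart.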
WARNING!! "THIS IS A SIGNATURE", of the "IT MAY CHANGE AT ANY MOMENT" type. It may, or may not be considered insulting, all depending upon HOW SENSITIVE THE VIEWER IS, to certain inputs to/from the nervous system.
ID: 58832

Winterknight
Volunteer tester
Joined: 15 Jun 05
Posts: 709
Credit: 5,834,108
RAC: 0
United Kingdom
Message 58833 - Posted: 1 Jul 2016, 10:37:22 UTC

On my Nvidia 670 the 8.14 SOG app looks to be much faster than the CUDA32 app for the VLAR tasks.
ID: 58833

Grumpy Swede
Volunteer tester
Joined: 10 Mar 12
Posts: 1700
Credit: 13,216,373
RAC: 0
Sweden
Message 58834 - Posted: 1 Jul 2016, 10:52:16 UTC - in response to Message 58833.  

On my Nvidia 670 the 8.14 SOG app looks to be much faster than the CUDA32 app for the VLAR tasks.

Oh yes, SoG of any version has always been faster than CUDA of any version. At least on my GTX980.
WARNING!! "THIS IS A SIGNATURE", of the "IT MAY CHANGE AT ANY MOMENT" type. It may, or may not be considered insulting, all depending upon HOW SENSITIVE THE VIEWER IS, to certain inputs to/from the nervous system.
ID: 58834

Grumpy Swede
Volunteer tester
Joined: 10 Mar 12
Posts: 1700
Credit: 13,216,373
RAC: 0
Sweden
Message 58835 - Posted: 1 Jul 2016, 11:25:45 UTC
Last modified: 1 Jul 2016, 12:14:23 UTC

Well, with a low enough checkpoint time, it is possible to run 8.14 on my iGPU, but there are lots of restarts due to it increasing the working memory set, private memory set, and Virtual memory, until a 30 seconds pause occurs.

Something must have changed from r3430 (8.12), since this never happened with that version.

Anyhow, here's the finished WU, after lots of restarts:
http://setiweb.ssl.berkeley.edu/beta/result.php?resultid=24214959

I'll abort the rest of my 8.14 Guppis, because this is not a viable way to run the guppies on my iGPU.

Thanks for the coffee :-)

Edit, added: Well, it's showing the same kind of memory increase on Arecibo tasks (non-VLARs), but it is able to finish them without any restarts (up to 480 MB working set). The memory increases constantly. However, if they were Arecibo VLARs, I bet it would run into the same problem as with the Guppis.
WARNING!! "THIS IS A SIGNATURE", of the "IT MAY CHANGE AT ANY MOMENT" type. It may, or may not be considered insulting, all depending upon HOW SENSITIVE THE VIEWER IS, to certain inputs to/from the nervous system.
ID: 58835

SusieQ
Volunteer tester
Joined: 12 Nov 10
Posts: 1149
Credit: 32,460,657
RAC: 1
United Kingdom
Message 58836 - Posted: 1 Jul 2016, 11:32:14 UTC
Last modified: 1 Jul 2016, 11:47:19 UTC

Just forced my client to run an 8.14 task on my Intel GPU - INTEL Intel(R) HD Graphics 530 (4859MB) OpenCL: 2.0

It did eventually finish, and validate, but it kept stopping and swapping to another task before restarting.

Stderr log shows - similar to Tutankhamon

ERROR: OpenCL kernel/call 'Enqueueing kernel:PC_find_pulse_kernel_cl' call failed (-6) in file ..\analyzePoT.cpp near line 3287.

http://setiweb.ssl.berkeley.edu/beta/result.php?resultid=24214398

EDIT: If the figures in Task Manager are to be believed, not only did the memory usage climb constantly to over 550 MB, but the CPU usage also ran mostly between 3.x% and 6.x%, sometimes falling to 2.x% and sometimes rising to 8.x%.

In contrast, a non-guppi GPU task is now running at 250-300 MB of memory, rising and falling, and 0.x% CPU.
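
The difference between "climbing constantly" and "rising and falling" can be checked on a series of readings rather than by eye. The sketch below is a simple heuristic in Python, assuming the working-set size is noted at regular intervals (for example from Task Manager). `looks_like_leak` and its thresholds are hypothetical, and the sample series are invented for illustration; only the rough 550 MB and 250-300 MB levels echo the figures reported above.

```python
def looks_like_leak(samples_mb, min_growth_mb=100.0, max_dip_fraction=0.1):
    """Heuristic: True if memory use climbs overall with almost no dips.

    samples_mb: working-set readings in MB, taken at regular intervals.
    The thresholds are illustrative guesses, not values from the thread.
    """
    if len(samples_mb) < 3:
        return False  # too few readings to say anything
    steps = [later - earlier for earlier, later in zip(samples_mb, samples_mb[1:])]
    dips = sum(1 for step in steps if step < 0)
    total_growth = samples_mb[-1] - samples_mb[0]
    return total_growth >= min_growth_mb and dips / len(steps) <= max_dip_fraction


# Invented readings: one series keeps rising, the other hovers in a band.
climbing = [120, 180, 240, 310, 380, 450, 520, 550]
steady = [250, 280, 300, 270, 255, 290, 300, 260]

print("climbing series flagged:", looks_like_leak(climbing))
print("steady series flagged:", looks_like_leak(steady))
```

A series that is flagged here would match the pattern reported for the guppi task on the Intel GPU, while the non-guppi task's readings would not.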
ID: 58836

Grumpy Swede
Volunteer tester
Joined: 10 Mar 12
Posts: 1700
Credit: 13,216,373
RAC: 0
Sweden
Message 58837 - Posted: 1 Jul 2016, 11:34:18 UTC - in response to Message 58836.  

Just forced my client to run an 8.14 task on my Intel GPU - INTEL Intel(R) HD Graphics 530 (4859MB) OpenCL: 2.0

It did eventually finish, and validate, but it kept stopping and swapping to another task before restarting.

Stderr log shows - similar to Tutankhamon

ERROR: OpenCL kernel/call 'Enqueueing kernel:PC_find_pulse_kernel_cl' call failed (-6) in file ..\analyzePoT.cpp near line 3287.

http://setiweb.ssl.berkeley.edu/beta/result.php?resultid=24214398

Heh, yes that was also an ugly one :-)
WARNING!! "THIS IS A SIGNATURE", of the "IT MAY CHANGE AT ANY MOMENT" type. It may, or may not be considered insulting, all depending upon HOW SENSITIVE THE VIEWER IS, to certain inputs to/from the nervous system.
ID: 58837

Raistmer
Volunteer tester
Joined: 18 Aug 05
Posts: 2423
Credit: 15,878,738
RAC: 0
Russia
Message 58838 - Posted: 1 Jul 2016, 16:51:11 UTC - in response to Message 58835.  


Something must have changed from r3430 (8.12), since this never happened with that version.

Sure.

OK, thanks for testing. You gave me some ideas what it could be.
News about SETI opt app releases: https://twitter.com/Raistmer
ID: 58838

[AF>EDLS]GuL
Volunteer tester
Joined: 4 Mar 11
Posts: 1
Credit: 2,165,853
RAC: 0
France
Message 58839 - Posted: 1 Jul 2016, 16:53:17 UTC - in response to Message 58794.  

Hi all,
I also had the problem of the missing CL file with version 8.13 and a GTX 780.
Task postponed: Can't read CL file


The MultiBeam_Kernels_r3480.cl file fixed the problem.

Cheers
ID: 58839

Grumpy Swede
Volunteer tester
Joined: 10 Mar 12
Posts: 1700
Credit: 13,216,373
RAC: 0
Sweden
Message 58842 - Posted: 1 Jul 2016, 22:57:15 UTC - in response to Message 58838.  
Last modified: 1 Jul 2016, 23:13:12 UTC


Something must have changed from r3430 (8.12), since this never happened with that version.

Sure.

OK, thanks for testing. You gave me some ideas what it could be.

Very good, Raistmer. Just one more note: 8.14 opencl_nvidia_sah and 8.14 opencl_nvidia_SoG do not show this continuous increase of memory usage. They both seem pretty stable once they have reached their max memory usage. Sure, they go up and down a little during the runs, but that's normal, I believe.
WARNING!! "THIS IS A SIGNATURE", of the "IT MAY CHANGE AT ANY MOMENT" type. It may, or may not be considered insulting, all depending upon HOW SENSITIVE THE VIEWER IS, to certain inputs to/from the nervous system.
ID: 58842

Jimbocous
Volunteer tester
Joined: 9 Jan 16
Posts: 51
Credit: 1,038,205
RAC: 0
United States
Message 58843 - Posted: 2 Jul 2016, 1:24:50 UTC

No idea if this is related, but in recent months I've developed an issue here with memory (not GPU memory, but physical, virtual and page file main memory) becoming exhausted and requiring reboots. The biggest issue seems to be page file usage growing to 100% over a period of 2-3 days. 3 of the 4 machines are dedicated crunchers, with little or no other work going on.
GPU memory use seems to be OK, and varies within the normal range of 25-50% allocation of 2 GB for 2 tasks per GPU.
Applications are x41zj_win32_cuda50 for the GPUs (2x750ti), and MB8_win_x64_SSE3_VS2008_r3330 for the CPUs.
Wondering if anyone else has seen this?
Thanks!
If I can help out by testing something, please let me know.
Available hardware and software is listed in my profile here.
ID: 58843

Grumpy Swede
Volunteer tester
Joined: 10 Mar 12
Posts: 1700
Credit: 13,216,373
RAC: 0
Sweden
Message 58844 - Posted: 2 Jul 2016, 1:41:30 UTC - in response to Message 58843.  
Last modified: 2 Jul 2016, 1:51:29 UTC

No idea if this is related, but in recent months I've developed an issue here with memory (not GPU memory, but physical, virtual and page file main memory) becoming exhausted and requiring reboots. The biggest issue seems to be page file usage growing to 100% over a period of 2-3 days. 3 of the 4 machines are dedicated crunchers, with little or no other work going on.
GPU memory use seems to be OK, and varies within the normal range of 25-50% allocation of 2 GB for 2 tasks per GPU.
Applications are x41zj_win32_cuda50 for the GPUs (2x750ti), and MB8_win_x64_SSE3_VS2008_r3330 for the CPUs.
Wondering if anyone else has seen this?
Thanks!

No, never seen that on any of my computers, Windows XP, Windows Vista, or Windows 8.1. They can run for weeks, and are only rebooted when I need to update something. They have sometimes been running for a month or more, if I forget to update.

Sounds as if you have something running on that computer, that develops memory leaks over time. Some service, or driver, or Anti Virus program, or something....
WARNING!! "THIS IS A SIGNATURE", of the "IT MAY CHANGE AT ANY MOMENT" type. It may, or may not be considered insulting, all depending upon HOW SENSITIVE THE VIEWER IS, to certain inputs to/from the nervous system.
ID: 58844

Grant (SSSF)
Volunteer tester
Joined: 15 Jun 16
Posts: 45
Credit: 1,836,741
RAC: 0
Australia
Message 58845 - Posted: 2 Jul 2016, 4:06:19 UTC - in response to Message 58843.  

3 of the 4 machines are dedicated cruncher, with little or no other work going on.

Is this occurring on all systems?

Like King Tut, I haven't had any issues like you're describing.
Grant
Darwin NT.
ID: 58845

Jimbocous
Volunteer tester
Joined: 9 Jan 16
Posts: 51
Credit: 1,038,205
RAC: 0
United States
Message 58846 - Posted: 2 Jul 2016, 4:48:33 UTC - in response to Message 58845.  

3 of the 4 machines are dedicated cruncher, with little or no other work going on.

Is this occurring on all systems?

Like King Tut, I haven't had any issues like you're describing.

Yes, it does happen on all 4 boxes, to a greater or lesser degree. I generally forgive most issues on the HP8000, as it's a much less capable box (Win7 Pro x64, 3 GHz Core 2 Quad, 8 GB RAM) and I'm really asking a bit too much from it (security video, Firefox, BOINCTasks, Malik's HWInfo64, and any general computing I do).
The two Z400s (Win7 Pro x64, Xeon quad) literally have nothing running on them besides MS Security Essentials, BOINCManager and HWInfo64.
The same applies to the Z600 (Win7 Pro x64, 2x Xeon quad), except that 1) I have an external USB drive connected that is mapped for all systems to use as a data store for docs, music, video and pics, and 2) a secondary internal drive is used across the network for storage of security video managed by the HP8000.
Puzzling ...
If I can help out by testing something, please let me know.
Available hardware and software is listed in my profile here.
ID: 58846

Grant (SSSF)
Volunteer tester
Joined: 15 Jun 16
Posts: 45
Credit: 1,836,741
RAC: 0
Australia
Message 58847 - Posted: 2 Jul 2016, 7:14:34 UTC - in response to Message 58846.  
Last modified: 2 Jul 2016, 7:15:14 UTC

I'd give Process Explorer a go to see which application or service is grabbing all the RAM.
It's just very odd that it's occurring on all 4 systems
Grant
Darwin NT.
ID: 58847

Jimbocous
Volunteer tester
Joined: 9 Jan 16
Posts: 51
Credit: 1,038,205
RAC: 0
United States
Message 58848 - Posted: 2 Jul 2016, 7:58:40 UTC - in response to Message 58847.  
Last modified: 2 Jul 2016, 8:00:46 UTC

I'd give Process Explorer a go to see which application or service is grabbing all the RAM.
It's just very odd that it's occurring on all 4 systems

Tut, Grant, thanks! Just needed another set of eyes to point out the obvious, which you both did quite nicely. Smack upside the head cheerfully self-administered!
Problem identified, if not resolved. As it turns out, it's nothing related to BOINC or Seti/Lunatics, so sorry for using the bandwidth here ...

The culprit is HWInfo64 itself, which runs on all 4 boxes and seems to have a slow incremental memory leak. Exiting the app seems to release the memory, and it can then be restarted with no reboot and begin the leak anew:)
Loaded up task mgr on the Z600, looked at processes and HWInfo64 was 'only' using a tad more than 10 gig of memory. Sheesh. Was running 5.22, did the upgrade to 5.30 and if that doesn't solve it I guess I'll have to do a bug report on that one; too useful a tool not to keep using it.

Thanks again, guys, and a heads-up to anyone else that uses it.
Later, ...
If I can help out by testing something, please let me know.
Available hardware and software is listed in my profile here.
ID: 58848


 
©2023 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.