SETI@home v8 beta to begin on Tuesday

Profile [AF>EDLS]GuL
Volunteer tester

Send message
Joined: 4 Mar 11
Posts: 1
Credit: 2,165,853
RAC: 793
France
Message 58839 - Posted: 1 Jul 2016, 16:53:17 UTC - in response to Message 58794.  

Hi all,
I also had the missing CL file problem with version 8.13 and a GTX 780.
Task postponed: Can't read CL file


The MultiBeam_Kernels_r3480.cl file fixed the problem.

Cheers
ID: 58839 · Report as offensive
Grumpy Old Man
Volunteer tester
Avatar

Send message
Joined: 10 Mar 12
Posts: 1488
Credit: 8,634,021
RAC: 22,735
Sweden
Message 58842 - Posted: 1 Jul 2016, 22:57:15 UTC - in response to Message 58838.  
Last modified: 1 Jul 2016, 23:13:12 UTC


Something must have changed from r3430 (8.12), since this never happened with that version.

Sure.

OK, thanks for testing. You gave me some ideas what it could be.

Very good, Raistmer. Just one more note: 8.14 opencl_nvidia_sah and 8.14 opencl_nvidia_SoG do not show this continuous increase in memory usage. They both seem pretty stable once they have reached their max memory usage. Sure, it goes up and down a little during the runs, but that's normal, I believe.
ID: 58842 · Report as offensive
Profile Jimbocous
Volunteer tester
Avatar

Send message
Joined: 9 Jan 16
Posts: 51
Credit: 1,038,205
RAC: 0
United States
Message 58843 - Posted: 2 Jul 2016, 1:24:50 UTC

No idea if this is related, but in the last few months I've developed an issue here with memory (not GPU memory, but physical, virtual and page file main memory) becoming exhausted and requiring reboots. The biggest issue seems to be page file usage growing to 100% over a period of 2-3 days. 3 of the 4 machines are dedicated crunchers, with little or no other work going on.
GPU memory use seems to be OK, and varies within the normal range of 25-50% allocation of 2 GB for 2 tasks per GPU.
Applications are x41zj_win32_cuda50 for the GPUs (2x750ti), and MB8_win_x64_SSE3_VS2008_r3330 for the CPUs.
Wondering if anyone else has seen this?
Thanks!
If I can help out by testing something, please let me know.
Available hardware and software is listed in my profile here.
ID: 58843 · Report as offensive
Grumpy Old Man
Volunteer tester
Avatar

Send message
Joined: 10 Mar 12
Posts: 1488
Credit: 8,634,021
RAC: 22,735
Sweden
Message 58844 - Posted: 2 Jul 2016, 1:41:30 UTC - in response to Message 58843.  
Last modified: 2 Jul 2016, 1:51:29 UTC

No idea if this is related, but in the last few months I've developed an issue here with memory (not GPU memory, but physical, virtual and page file main memory) becoming exhausted and requiring reboots. The biggest issue seems to be page file usage growing to 100% over a period of 2-3 days. 3 of the 4 machines are dedicated crunchers, with little or no other work going on.
GPU memory use seems to be OK, and varies within the normal range of 25-50% allocation of 2 GB for 2 tasks per GPU.
Applications are x41zj_win32_cuda50 for the GPUs (2x750ti), and MB8_win_x64_SSE3_VS2008_r3330 for the CPUs.
Wondering if anyone else has seen this?
Thanks!

No, never seen that on any of my computers, Windows XP, Windows Vista, or Windows 8.1. They can run for weeks, and are only rebooted when I need to update something. They have sometimes been running for a month or more, if I forget to update.

Sounds as if you have something running on that computer, that develops memory leaks over time. Some service, or driver, or Anti Virus program, or something....
ID: 58844 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 15 Jun 16
Posts: 43
Credit: 1,836,741
RAC: 0
Australia
Message 58845 - Posted: 2 Jul 2016, 4:06:19 UTC - in response to Message 58843.  

3 of the 4 machines are dedicated crunchers, with little or no other work going on.

Is this occurring on all systems?

Like King Tut, I haven't had any issues like you're describing.
Grant
Darwin NT.
ID: 58845 · Report as offensive
Profile Jimbocous
Volunteer tester
Avatar

Send message
Joined: 9 Jan 16
Posts: 51
Credit: 1,038,205
RAC: 0
United States
Message 58846 - Posted: 2 Jul 2016, 4:48:33 UTC - in response to Message 58845.  

3 of the 4 machines are dedicated crunchers, with little or no other work going on.

Is this occurring on all systems?

Like King Tut, I haven't had any issues like you're describing.

Yes, it does happen on all 4 boxes, to a greater or lesser degree. I generally forgive most issues on the HP8000, as it's a much less capable box (Win7 Pro x64, 3 GHz Core 2 Quad, 8 GB RAM) and I'm really asking a bit too much from it (security video, Firefox, BOINCTasks, Malik's HWInfo64, and any general computing I do).
The two Z400s (Win7 pro x64, Xeon quad) literally have nothing running on them besides MS Security Essentials, BOINCManager and HWInfo64.
Same applies for the Z600 (Win7 pro x64, 2x Xeon quad), except that 1) I have an external USB drive connected that is mapped for all systems to use as data store for docs, music, video and pics, and 2) a secondary internal drive is used across the network for storage of security video managed by the HP8000.
Puzzling ...
If I can help out by testing something, please let me know.
Available hardware and software is listed in my profile here.
ID: 58846 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 15 Jun 16
Posts: 43
Credit: 1,836,741
RAC: 0
Australia
Message 58847 - Posted: 2 Jul 2016, 7:14:34 UTC - in response to Message 58846.  
Last modified: 2 Jul 2016, 7:15:14 UTC

I'd give Process Explorer a go to see which application or service is grabbing all the RAM.
It's just very odd that it's occurring on all 4 systems.
Grant
Darwin NT.
ID: 58847 · Report as offensive
Profile Jimbocous
Volunteer tester
Avatar

Send message
Joined: 9 Jan 16
Posts: 51
Credit: 1,038,205
RAC: 0
United States
Message 58848 - Posted: 2 Jul 2016, 7:58:40 UTC - in response to Message 58847.  
Last modified: 2 Jul 2016, 8:00:46 UTC

I'd give Process Explorer a go to see which application or service is grabbing all the RAM.
It's just very odd that it's occurring on all 4 systems.

Tut, Grant, thanks! Just needed another set of eyes to point out the obvious, which you both did quite nicely. Smack upside the head cheerfully self-administered!
Problem identified, if not resolved. As it turns out, it's nothing related to BOINC or Seti/Lunatics, so sorry for using the bandwidth here ...

The culprit is HWInfo64 itself, which runs on all 4 boxes and seems to have a slow incremental memory leak. Exiting the app seems to release the memory, and it can then be restarted with no reboot and begin the leak anew:)
Loaded up task mgr on the Z600, looked at processes and HWInfo64 was 'only' using a tad more than 10 gig of memory. Sheesh. Was running 5.22, did the upgrade to 5.30 and if that doesn't solve it I guess I'll have to do a bug report on that one; too useful a tool not to keep using it.
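
For anyone who wants to spot this kind of slow leak without watching Task Manager, here is a minimal sketch of the idea: take periodic memory readings per process and flag anything that rises in nearly every interval. The function name, thresholds, process names and readings below are all made up for illustration; they are not measurements from HWInfo64 or from any of the machines in this thread.

```python
def looks_like_leak(samples_mb, min_growth_fraction=0.9, min_total_growth_mb=100.0):
    """Return True if memory rises in most intervals and the total growth is large.

    samples_mb: memory readings for one process, oldest first, in MB.
    Both thresholds are arbitrary assumptions, not established values.
    """
    if len(samples_mb) < 3:
        return False  # too few readings to call it a trend
    deltas = [later - earlier for earlier, later in zip(samples_mb, samples_mb[1:])]
    rising = sum(1 for d in deltas if d > 0)
    total_growth = samples_mb[-1] - samples_mb[0]
    return (rising / len(deltas) >= min_growth_fraction
            and total_growth >= min_total_growth_mb)


# Hypothetical hourly readings in MB, invented for this example.
readings = {
    "steady_app": [210, 214, 209, 212, 211, 213],
    "leaky_app": [150, 420, 700, 985, 1260, 1540],
}

for name, samples in readings.items():
    print(name, looks_like_leak(samples))
```

A process that merely fluctuates around a level is not flagged; one that climbs on every reading is. Process Explorer or Task Manager would be the source of the readings in practice.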

Thanks again, guys, and a heads-up to anyone else that uses it.
Later, ...
If I can help out by testing something, please let me know.
Available hardware and software is listed in my profile here.
ID: 58848 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 15 Jun 16
Posts: 43
Credit: 1,836,741
RAC: 0
Australia
Message 58849 - Posted: 2 Jul 2016, 8:42:08 UTC - in response to Message 58848.  
Last modified: 2 Jul 2016, 8:47:47 UTC

At least it was quickly resolved, unlike Keith Myers' intermittent download issues on main.

EDIT-
From the Release notes for v5.24
- Fixed a memory leak in server side of remote sensor monitoring.
Grant
Darwin NT.
ID: 58849 · Report as offensive
BetelgeuseFive
Volunteer tester

Send message
Joined: 3 Jun 12
Posts: 64
Credit: 2,300,431
RAC: 381
Netherlands
Message 58853 - Posted: 2 Jul 2016, 13:49:02 UTC

Why are the new (8.13/8.14) opencl_nvidia tasks (both SoG and sah) using so much CPU time? I have specified -use_sleep, but I still seem to need nearly an entire CPU core to feed the GPU.

Examples:

8.13 SoG: https://setiweb.ssl.berkeley.edu/beta/result.php?resultid=24206772
8.14 SoG: https://setiweb.ssl.berkeley.edu/beta/result.php?resultid=24217433
8.14 SoG: https://setiweb.ssl.berkeley.edu/beta/result.php?resultid=24225616
8.14 sah: https://setiweb.ssl.berkeley.edu/beta/result.php?resultid=24225173

Previous versions did not have this problem:

8.12 SoG: https://setiweb.ssl.berkeley.edu/beta/result.php?resultid=24146549
8.12 SoG: https://setiweb.ssl.berkeley.edu/beta/result.php?resultid=23934991
8.12 sah: https://setiweb.ssl.berkeley.edu/beta/result.php?resultid=23868462

I don't like wasting my CPU time just feeding the GPU and by using -use_sleep there were no problems (BOINC would soon select cuda50 over opencl versions), but now I am not so sure.

Tom
ID: 58853 · Report as offensive
Profile Raistmer
Volunteer tester
Avatar

Send message
Joined: 18 Aug 05
Posts: 2423
Credit: 15,878,738
RAC: 0
Russia
Message 58854 - Posted: 2 Jul 2016, 14:42:09 UTC - in response to Message 58853.  

Why are the new (8.13/8.14) opencl_nvidia tasks (both SoG and sah) using so much CPU time? I have specified -use_sleep, but I still seem to need nearly an entire CPU core to feed the GPU.

Add -v 6 to the tuning line, and post a few results with this option on.
News about SETI opt app releases: https://twitter.com/Raistmer
ID: 58854 · Report as offensive
Grumpy Old Man
Volunteer tester
Avatar

Send message
Joined: 10 Mar 12
Posts: 1488
Credit: 8,634,021
RAC: 22,735
Sweden
Message 58855 - Posted: 2 Jul 2016, 18:38:31 UTC
Last modified: 2 Jul 2016, 18:45:18 UTC

Yikes!! We're in the middle of a perfect Arecibo VLAR storm again. No wonder the tasks all of a sudden take more than twice as long as before :-)

This is going to screw up the APR again. SoG's APR will drop during the storm, so just when the Arecibo VLAR storm is over, Non SoG will be chosen as the fastest app, because it never ran any Arecibo VLARs.

Just like in the last Arecibo VLAR storm, one then has to manually abort enough Non SoG tasks that SoG can come into play again. Not good at all when there are so many WUs of one type being split, and not a good mix.
ID: 58855 · Report as offensive
Grumpy Old Man
Volunteer tester
Avatar

Send message
Joined: 10 Mar 12
Posts: 1488
Credit: 8,634,021
RAC: 22,735
Sweden
Message 58856 - Posted: 2 Jul 2016, 21:25:31 UTC
Last modified: 2 Jul 2016, 21:31:47 UTC

Changing <ngpus> on the fly to 0.50 (from 0.33), i.e. dropping from three tasks per GPU to two, for SoG and Non SoG in app_config.xml, because I already know that SoG is the fastest app, no matter the AR.

Just to avoid APR rot, getting slower apps, due to this Arecibo VLAR storm.
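
For anyone wanting to try the same thing, here is a rough sketch that builds an app_config.xml with an <ngpus> value per plan class. The plan class names are the ones mentioned in this thread; the app name "setiathome_v8", the <avg_ncpus> value and the helper names are assumptions for illustration, so check them against your own client before using anything like this.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom


def tasks_per_gpu(ngpus):
    """Number of tasks that fit on one GPU when each task claims `ngpus` of it."""
    return int(1 / ngpus)


def build_app_config(app_name, plan_classes, ngpus, avg_ncpus=1.0):
    """Build an <app_config> tree with one <app_version> per plan class."""
    root = ET.Element("app_config")
    for plan_class in plan_classes:
        version = ET.SubElement(root, "app_version")
        ET.SubElement(version, "app_name").text = app_name
        ET.SubElement(version, "plan_class").text = plan_class
        ET.SubElement(version, "avg_ncpus").text = f"{avg_ncpus:.2f}"  # assumed value
        ET.SubElement(version, "ngpus").text = f"{ngpus:.2f}"
    return root


# "setiathome_v8" is an assumed app name; the plan classes come from this thread.
root = build_app_config(
    "setiathome_v8",
    ["opencl_nvidia_SoG", "opencl_nvidia_sah"],
    ngpus=0.50,
)

# Pretty-print the XML so it can be pasted into app_config.xml.
print(minidom.parseString(ET.tostring(root)).toprettyxml(indent="  "))
print("tasks per GPU at 0.33:", tasks_per_gpu(0.33))
print("tasks per GPU at 0.50:", tasks_per_gpu(0.50))
```

With 0.33 the client fits three tasks on a GPU, and with 0.50 it fits two, which is why each task finishes sooner after the change.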
ID: 58856 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 15 Jun 16
Posts: 43
Credit: 1,836,741
RAC: 0
Australia
Message 58857 - Posted: 2 Jul 2016, 23:25:59 UTC - in response to Message 58856.  

Changing <ngpus> on the fly to 0.50 (from 0.33), i.e. dropping from three tasks per GPU to two, for SoG and Non SoG in app_config.xml, because I already know that SoG is the fastest app, no matter the AR.

Just to avoid APR rot, getting slower apps, due to this Arecibo VLAR storm.

That's what I do to change applications; no need to abort. Just load up the undesirable application with 2, 3, or 4 WUs per GPU and watch its APR plummet, then go back to the usual values when the VLAR or Guppie storm is over and you've got your preferred application back.
Grant
Darwin NT.
ID: 58857 · Report as offensive
Grumpy Old Man
Volunteer tester
Avatar

Send message
Joined: 10 Mar 12
Posts: 1488
Credit: 8,634,021
RAC: 22,735
Sweden
Message 58858 - Posted: 2 Jul 2016, 23:42:20 UTC
Last modified: 2 Jul 2016, 23:43:44 UTC

Well, the best solution would be to have a better mix of ARs here on Beta, like it is on main. Having a full day of only VLARs for Arecibo really screws things up, unless you manually intervene.

Had I not changed the <ngpus> for SoG, I would by now only get the much slower Non SoG. The system had already decided that SoG was the fastest; in fact, I didn't have even one non SoG until this Arecibo VLAR storm started.

And if I had not intervened, the APR of SoG would be so low that I would only get non SoG, and by then the VLAR storm would be over, so the non SoG would not drop its APR. That would of course lead to the system only giving me non SoG, even on normal ARs, until the next VLAR storm, when the APR of non SoG would drop lower than the low APR SoG had from the last VLAR storm.

And then it would start over again...

No, a better mix of Arecibo ARs on Beta, thank you.
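
The cycle described above can be sketched as a toy model. This is not how the BOINC scheduler actually computes or uses APR; it only assumes that the app with the highest APR gets all the work and that only the running app's APR is updated. All speeds, starting APRs and the smoothing factor are invented for illustration.

```python
def simulate(work_stream, speed, initial_apr, alpha=0.5):
    """Toy model of APR-based app selection.

    work_stream: sequence of work kinds, e.g. "vlar" or "normal".
    speed: speed[app][kind] is the rate an app achieves on that kind of work.
    initial_apr: starting APR per app.
    alpha: weight of the newest result in the running average (assumed).
    """
    apr = dict(initial_apr)
    chosen = []
    for kind in work_stream:
        app = max(apr, key=apr.get)  # highest APR gets the task
        chosen.append(app)
        # Only the app that actually ran has its APR updated.
        apr[app] = (1 - alpha) * apr[app] + alpha * speed[app][kind]
    return chosen, apr


# Invented numbers: SoG is faster than nonSoG on both kinds of work.
speed = {
    "SoG": {"normal": 100.0, "vlar": 40.0},
    "nonSoG": {"normal": 60.0, "vlar": 25.0},
}
initial_apr = {"SoG": 100.0, "nonSoG": 60.0}

# A short VLAR storm followed by normal work.
stream = ["vlar"] * 2 + ["normal"] * 6
chosen, final_apr = simulate(stream, speed, initial_apr)
print(chosen)
print(final_apr)
```

In this run the storm drags SoG's APR below nonSoG's, nonSoG takes over just as normal work returns, and SoG never gets a task on which to recover, even though it is the faster app on every kind of work.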
ID: 58858 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 15 Jun 16
Posts: 43
Credit: 1,836,741
RAC: 0
Australia
Message 58859 - Posted: 3 Jul 2016, 1:19:43 UTC - in response to Message 58858.  

And then it would start over again...

Yep.


No, a better mix of Arecibo ARs on Beta, thank you.

Or a way to disable the manager's best application selection when on Beta.
The more applications, and the greater the mix of work that are run on given hardware, the better.
10 WUs for this application, 10 for that application, 10 for the next and so on, then repeat.
Grant
Darwin NT.
ID: 58859 · Report as offensive
Grumpy Old Man
Volunteer tester
Avatar

Send message
Joined: 10 Mar 12
Posts: 1488
Credit: 8,634,021
RAC: 22,735
Sweden
Message 58861 - Posted: 3 Jul 2016, 6:24:51 UTC

Oh man!!!
Is there no end to this suffering?
Thanks for all the VLARs. Can I have something else now?

LOL
ID: 58861 · Report as offensive
BetelgeuseFive
Volunteer tester

Send message
Joined: 3 Jun 12
Posts: 64
Credit: 2,300,431
RAC: 381
Netherlands
Message 58863 - Posted: 3 Jul 2016, 10:16:25 UTC - in response to Message 58854.  

Why are the new (8.13/8.14) opencl_nvidia tasks (both SoG and sah) using so much CPU time? I have specified -use_sleep, but I still seem to need nearly an entire CPU core to feed the GPU.

Add -v 6 to the tuning line, and post a few results with this option on.


First two results (both opencl_nvidia_sah):

https://setiweb.ssl.berkeley.edu/beta/result.php?resultid=24227184
https://setiweb.ssl.berkeley.edu/beta/result.php?resultid=24227503

The second one resulted in a computation error: Disk usage limit exceeded

BOINC manager shows "free, available to BOINC: 226.47 GB".
Is there some limit on the output file size that is exceeded because of using -v 6?

I will post SoG results as soon as they are available.

Tom
ID: 58863 · Report as offensive
Richard Haselgrove
Volunteer tester

Send message
Joined: 3 Jan 07
Posts: 1444
Credit: 3,264,298
RAC: 0
United Kingdom
Message 58864 - Posted: 3 Jul 2016, 10:38:24 UTC - in response to Message 58863.  

The second one resulted in a computation error: Disk usage limit exceeded

That will be the individual workunit limit:

<rsc_disk_bound>33554432.000000</rsc_disk_bound>

or 32 MByte.
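
A quick check of that conversion, as a minimal sketch. It assumes the bound is expressed in bytes and uses binary megabytes; the helper name is just for illustration.

```python
def bytes_to_mbyte(n_bytes):
    """Convert a byte count to binary megabytes (1 MByte = 1024 * 1024 bytes)."""
    return n_bytes / (1024 * 1024)


rsc_disk_bound = 33554432.0  # value quoted from the workunit above
print(bytes_to_mbyte(rsc_disk_bound))  # 32.0
```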
ID: 58864 · Report as offensive
BetelgeuseFive
Volunteer tester

Send message
Joined: 3 Jun 12
Posts: 64
Credit: 2,300,431
RAC: 381
Netherlands
Message 58865 - Posted: 3 Jul 2016, 11:32:17 UTC - in response to Message 58864.  

The second one resulted in a computation error: Disk usage limit exceeded

That will be the individual workunit limit:

<rsc_disk_bound>33554432.000000</rsc_disk_bound>

or 32 MByte.


Hi Richard,

Thanks for the explanation. Can I change this limit and if so, where and how?

Tom
ID: 58865 · Report as offensive
©2018 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.