SETI@home v8 beta to begin on Tuesday


Richard Haselgrove
Volunteer tester

Joined: 3 Jan 07
Posts: 1451
Credit: 3,272,268
RAC: 0
United Kingdom
Message 55224 - Posted: 10 Dec 2015, 19:55:13 UTC
Last modified: 10 Dec 2015, 19:55:53 UTC

I've just replied to the email from Eric that Claggy is alluding to. From my observations through the day, I'd judge that the plain guppi_56520... WUs are the first run, and the guppi_8bit_56520... WUs are the second run. I haven't seen a third run.

As posted in this thread, the problem seems to be with the <beam_width> parameter in the WU header. The first run seemed far too low (by a factor of a million), but as we saw they ran OK with the test app: presumably it also has a million-fold error and they cancel each other out.

The second splitter run has <beam_width> values much closer to the ones we're familiar with from Arecibo, so that seems right: but the application can't handle them. I'm fairly sure a second correction, to the application this time, will be needed before the next run.
ID: 55224 · Report as offensive
Eric J Korpela
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 15 Mar 05
Posts: 1547
Credit: 27,183,456
RAC: 0
United States
Message 55225 - Posted: 10 Dec 2015, 20:01:21 UTC - in response to Message 55224.  

The problem was actually with the receiver_cfg.center_freq being reported in Hz rather than MHz.

New work will be available when the splitter is rebuilt.
ID: 55225 · Report as offensive
Richard Haselgrove
Volunteer tester

Joined: 3 Jan 07
Posts: 1451
Credit: 3,272,268
RAC: 0
United Kingdom
Message 55226 - Posted: 10 Dec 2015, 20:20:20 UTC - in response to Message 55208.  
Last modified: 10 Dec 2015, 20:30:10 UTC

Quoting my own post from earlier today, and adding the matching data for a v7 Arecibo task (also received on Beta today):

This looks wrong. All the header data looks the same (by eye only, no guarantee), _except_:

from the earlier guppi task:

  <name>guppi_56520_1_VOYAGER1_0012.27866.209.20.23.236.vlar</name>
  ...
  <receiver_cfg>
    <s4_id>20</s4_id>
    <name>Green Bank Telescope, Rcvr8_10, Pol 1</name>
    <beam_width>2.4587973414132e-08</beam_width>
    <center_freq>8418749952</center_freq>

and the newer one

  <name>guppi_8bit_56520_VOYAGER1_0012.16007.1.20.23.159.vlar</name>
  ...
  <receiver_cfg>
    <s4_id>20</s4_id>
    <name>Green Bank Telescope, Rcvr8_10, Pol 1</name>
    <beam_width>0.024587973414132</beam_width>
    <center_freq>8418749952</center_freq>

The beam width has changed by six orders of magnitude?

Matching data for a v7 Arecibo task:

  <name>06ap11ag.23142.2112.3.16.31</name>
  ...
  <receiver_cfg>
    <s4_id>3</s4_id>
    <name>Arecibo 1.4GHz Array, Beam 0, Pol 0</name>
    <beam_width>0.0500000007</beam_width>
    <center_freq>1420</center_freq>

I'll try modding a guppi task so that both the <beam_width> and the <center_freq> are in the same ballpark as Arecibo (matching units), and see how it runs.
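A minimal Python sketch of that kind of header tweak follows. It is an illustration only: the closing `</receiver_cfg>` tag, the `rescale_receiver_cfg` helper and both thresholds are assumptions made for this example, not part of the real work unit format or of any SETI@home tool. It simply treats an implausibly large `center_freq` as Hz and an implausibly small `beam_width` as a million times too low.

```python
import xml.etree.ElementTree as ET

# Receiver block from the earlier guppi task quoted above. The closing
# </receiver_cfg> tag is added so the fragment parses on its own.
GUPPI_RECEIVER_CFG = """<receiver_cfg>
  <s4_id>20</s4_id>
  <name>Green Bank Telescope, Rcvr8_10, Pol 1</name>
  <beam_width>2.4587973414132e-08</beam_width>
  <center_freq>8418749952</center_freq>
</receiver_cfg>"""

# Hypothetical sanity thresholds, chosen only for this sketch.
MAX_PLAUSIBLE_MHZ = 1.0e6    # a larger center_freq is assumed to be in Hz
MIN_PLAUSIBLE_BEAM = 1.0e-5  # a smaller beam_width is assumed to be 1e6 too low


def rescale_receiver_cfg(xml_text):
    """Return the receiver block with Arecibo-like units, as an XML string."""
    cfg = ET.fromstring(xml_text)
    freq_node = cfg.find("center_freq")
    beam_node = cfg.find("beam_width")
    center_freq = float(freq_node.text)
    beam_width = float(beam_node.text)
    if center_freq > MAX_PLAUSIBLE_MHZ:
        center_freq /= 1.0e6  # Hz -> MHz
    if beam_width < MIN_PLAUSIBLE_BEAM:
        beam_width *= 1.0e6   # undo the million-fold shrink
    freq_node.text = repr(center_freq)
    beam_node.text = repr(beam_width)
    return ET.tostring(cfg, encoding="unicode")


modified = rescale_receiver_cfg(GUPPI_RECEIVER_CFG)
print(modified)
```

Run against the newer `guppi_8bit` header instead, only `center_freq` would change, since its `beam_width` is already above the assumed threshold.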

Edit - it's got through the 'Optimal function choices' and it's using the normal amount of memory - time will tell, but looking good. (It's not found any signals yet, but it is generating a checkpoint file)
ID: 55226 · Report as offensive
Father Ambrose
Volunteer tester

Joined: 1 May 07
Posts: 556
Credit: 6,470,846
RAC: 0
United Kingdom
Message 55228 - Posted: 11 Dec 2015, 14:58:42 UTC

All v8 WUs have run without fault so far, except the 8bit ones. A few more to complete.
A computer program will always do what you tell it to do, but rarely what you want it to do.
ID: 55228 · Report as offensive
Eric J Korpela
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 15 Mar 05
Posts: 1547
Credit: 27,183,456
RAC: 0
United States
Message 55231 - Posted: 11 Dec 2015, 18:13:17 UTC - in response to Message 55226.  
Last modified: 11 Dec 2015, 18:27:11 UTC

Edited....

The problem was in the generation of the table of Doppler drift rates to be examined. The frequency was a factor of a million too high (given in Hz rather than MHz). The client recomputes the beam width based on the ratio of the work_unit subband center frequency and the receiver center frequency, and got a number that was a million times too large. That number is used to calculate the size of the chirp step, which came out way too small, resulting in a huge table of Doppler drift rates.
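A rough numerical sketch of that chain, in Python. The formulas are assumptions made purely for illustration: the beam width is taken to scale with the ratio of receiver to subband center frequency, the chirp step is taken to be inversely proportional to the rescaled beam width, and the drift-rate range and the subband frequency are arbitrary. None of the function names come from the SETI@home client. The point is only how a factor of a million in `center_freq` inflates the table.

```python
def rescaled_beam_width(beam_width, receiver_center_freq, subband_center_freq):
    # Assumed form: beam width rescaled by the ratio of the two frequencies.
    return beam_width * receiver_center_freq / subband_center_freq


def chirp_step(beam_width, k=1.0):
    # Assumed form: a larger beam width gives a smaller chirp step.
    return k / beam_width


def drift_table_size(max_drift, step):
    # Number of drift rates between -max_drift and +max_drift.
    return int(2 * max_drift / step) + 1


BEAM_WIDTH = 0.024587973414132   # from the guppi_8bit header
SUBBAND_MHZ = 8418.749952        # hypothetical subband center, in MHz
MAX_DRIFT = 100.0                # arbitrary drift-rate range for the sketch

# Receiver center frequency correctly in MHz versus mistakenly in Hz.
good_bw = rescaled_beam_width(BEAM_WIDTH, 8418.749952, SUBBAND_MHZ)
bad_bw = rescaled_beam_width(BEAM_WIDTH, 8418749952.0, SUBBAND_MHZ)

good_n = drift_table_size(MAX_DRIFT, chirp_step(good_bw))
bad_n = drift_table_size(MAX_DRIFT, chirp_step(bad_bw))

print("beam width ratio:", bad_bw / good_bw)
print("table sizes:", good_n, "vs", bad_n)
```

Under these assumptions the table grows roughly in proportion to the unit error, which would be consistent with the memory exhaustion reported on the `_8bit_` tasks.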

Jeff is off dealing with some flooding problems, but will fix it today or tomorrow and fire off a new GBT splitter. He'll also start a splitter on some Arecibo data to make sure it still works there.
ID: 55231 · Report as offensive
Raistmer
Volunteer tester
Joined: 18 Aug 05
Posts: 2423
Credit: 15,878,738
RAC: 0
Russia
Message 55232 - Posted: 12 Dec 2015, 8:40:12 UTC
Last modified: 12 Dec 2015, 8:49:40 UTC

How an out-of-memory crash looks on my PC:



So, the app tried to become "chatty" by showing some dialogue. But it can't be formed completely and disappears on the first mouse click. I suspect it blocks the processing slot until user action though, and that's not good at all.

EDIT: It's even worse! The app's process is killed, and BOINC shows "waiting for memory" for that particular task... But the whole CPU (in my case, 4 cores) is blocked from running tasks from this or other projects (!). Most "funny": GPU tasks continue to run (and their memory consumption for existing SETI apps is even higher than the normal memory consumption of CPU apps). That's completely absurd behavior that just wastes CPU computational resources.
Not a SETI issue though, but a BOINC scheduler one.
News about SETI opt app releases: https://twitter.com/Raistmer
ID: 55232 · Report as offensive
Ananas
Volunteer tester
Joined: 21 Jun 05
Posts: 43
Credit: 155,681
RAC: 0
Germany
Message 55233 - Posted: 12 Dec 2015, 9:09:16 UTC
Last modified: 12 Dec 2015, 9:10:56 UTC

I guess the _8bit_ ones are now known to be faulty but just in case ...

guppi_8bit_56520_VOYAGER1_0012.16007.1.20.23.141.vlar fails on all hosts, on my XP x64 it produced a message on the screen that the application couldn't be started.

p.s.: it is a headless cruncher but next time I will try to catch the message and make a screenshot.
ID: 55233 · Report as offensive
Raistmer
Volunteer tester
Joined: 18 Aug 05
Posts: 2423
Credit: 15,878,738
RAC: 0
Russia
Message 55234 - Posted: 12 Dec 2015, 9:25:23 UTC - in response to Message 55232.  

And more details on this issue.
Look at the memory and CPU graphs at the time of the crash.


The whole memory was consumed quite fast (initial saturation), then the crash happened (and the dialogue box mentioned above appeared).

Quite interesting how slowly (!) the subsequent memory recovery works. The amount of memory in use decreases very gradually (swapping like hell).
What to note next: the flat line in the middle at ~2.5GB of consumed RAM. That's both a SETI and a BOINC problem. Memory consumption remained inappropriately high even after the app had already crashed (!), until some of my (user-required!!) actions while posting pics closed that half-baked VC++ runtime dialog. Only then did memory usage drop. After that the next attempt at running was made, and memory usage became saturated again until the next crash.

And all this happened with the following BOINC client settings:


That is, BOINC completely failed in its guarding duties, and allowed a "rogue app" to bring my system to its knees with constant swapping and mouse freezes in between.

I would say this BOINC area requires heavy reworking....
News about SETI opt app releases: https://twitter.com/Raistmer
ID: 55234 · Report as offensive
Richard Haselgrove
Volunteer tester

Joined: 3 Jan 07
Posts: 1451
Credit: 3,272,268
RAC: 0
United Kingdom
Message 55235 - Posted: 12 Dec 2015, 9:43:02 UTC - in response to Message 55234.  

I would say this BOINC area requires heavy reworking....

Now would be a good time to talk to Rom Walton about this, and put forward your suggestions for better code.

Overnight (10 hours ago) he committed some updates to do with Microsoft C runtime exception handling, like a97b15c20963ab1235b4768ea3b3e3e077a10574

LIB: Explicitly declare a termination function for handling terminate()/unhandled()/abort() CRT calls.

Call DebugBreak() to make our exception handling technology kick in.
ID: 55235 · Report as offensive
Raistmer
Volunteer tester
Joined: 18 Aug 05
Posts: 2423
Credit: 15,878,738
RAC: 0
Russia
Message 55236 - Posted: 12 Dec 2015, 9:54:02 UTC - in response to Message 55235.  

I would say this BOINC area requires heavy reworking....

Now would be a good time to talk to Rom Walton about this, and put forward your suggestions for better code.

Overnight (10 hours ago) he committed some updates to do with Microsoft C runtime exception handling, like a97b15c20963ab1235b4768ea3b3e3e077a10574

LIB: Explicitly declare a termination function for handling terminate()/unhandled()/abort() CRT calls.

Call DebugBreak() to make our exception handling technology kick in.


I sent mail about this to the dev group. Now it's up to them to react properly :)
News about SETI opt app releases: https://twitter.com/Raistmer
ID: 55236 · Report as offensive
Raistmer
Volunteer tester
Joined: 18 Aug 05
Posts: 2423
Credit: 15,878,738
RAC: 0
Russia
Message 55238 - Posted: 12 Dec 2015, 10:53:12 UTC

Since this issue is already understood and described, maybe it's worth issuing a task abort from the server to free those 1434 v8 tasks currently in progress?

To save the environment, to save testers from frozen systems, and to avoid cross-validating new tasks against results already known to be bad and deprecated.
News about SETI opt app releases: https://twitter.com/Raistmer
ID: 55238 · Report as offensive
Zalster
Volunteer tester

Joined: 30 Dec 13
Posts: 258
Credit: 12,340,341
RAC: 0
United States
Message 55240 - Posted: 13 Dec 2015, 3:22:19 UTC

Just got a fresh batch of v8 "_8bit_" work units.

Been running now for 54 minutes without a problem.

Will see how they do.
ID: 55240 · Report as offensive
Eric J Korpela
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 15 Mar 05
Posts: 1547
Credit: 27,183,456
RAC: 0
United States
Message 55241 - Posted: 13 Dec 2015, 6:21:06 UTC - in response to Message 55238.  

Since this issue is already understood and described, maybe it's worth issuing a task abort from the server to free those 1434 v8 tasks currently in progress?

To save the environment, to save testers from frozen systems, and to avoid cross-validating new tasks against results already known to be bad and deprecated.


In theory, I already did that. In practice BOINC doesn't always work the way it does in theory. :(
ID: 55241 · Report as offensive
Zalster
Volunteer tester

Joined: 30 Dec 13
Posts: 258
Credit: 12,340,341
RAC: 0
United States
Message 55242 - Posted: 13 Dec 2015, 7:48:57 UTC - in response to Message 55241.  

Anyone noticing an increase in Temps on their CPU when crunching these work units compared to other work units?

Might just be mine, but thought I should say something.
ID: 55242 · Report as offensive
SusieQ
Volunteer tester

Joined: 12 Nov 10
Posts: 1149
Credit: 32,460,657
RAC: 1
United Kingdom
Message 55245 - Posted: 13 Dec 2015, 10:07:28 UTC

Just noticed I've had an "Error while computing" on a non 8-bit work unit. Only two of us have completed the work unit so far; the other result is "Completed, waiting for validation".

http://setiweb.ssl.berkeley.edu/beta/workunit.php?wuid=7529656

My results (on a WindowsXP PC with BOINC 7.6.9 running as a service):
<core_client_version>7.6.9</core_client_version>
<![CDATA[
<message>
The system cannot find the path specified.
(0x3) - exit code 3 (0x3)

</message>
<stderr_txt>
setiathome_v8 7.99 DevC++/MinGW/g++ 4.8.1
libboinc: 7.7.0


Results from fellow cruncher:
<core_client_version>7.6.18</core_client_version>
<![CDATA[
<stderr_txt>
setiathome_v8 7.99 DevC++/MinGW/g++ 4.8.1
libboinc: 7.7.0
ID: 55245 · Report as offensive
Father Ambrose
Volunteer tester

Joined: 1 May 07
Posts: 556
Credit: 6,470,846
RAC: 0
United Kingdom
Message 55246 - Posted: 13 Dec 2015, 11:05:38 UTC
Last modified: 13 Dec 2015, 11:24:16 UTC

Received a batch of 8 bit tasks; when they start processing depends on the host. Off to a model rail ex for the afternoon. EDIT: six failed, four running at 63%, 26%, 24% and 3%.
A computer program will always do what you tell it to do, but rarely what you want it to do.
ID: 55246 · Report as offensive
Father Ambrose
Volunteer tester

Joined: 1 May 07
Posts: 556
Credit: 6,470,846
RAC: 0
United Kingdom
Message 55247 - Posted: 13 Dec 2015, 11:41:47 UTC

Just curious, been watching the graphics. Two different WUs:

            WU 1     WU 2
power       500      1012
duration    96583    222298
score       1.03     1.05
A computer program will always do what you tell it to do, but rarely what you want it to do.
ID: 55247 · Report as offensive
Raistmer
Volunteer tester
Joined: 18 Aug 05
Posts: 2423
Credit: 15,878,738
RAC: 0
Russia
Message 55250 - Posted: 14 Dec 2015, 0:01:05 UTC

New WUs have been split. When should we expect an update to the app's binaries?
News about SETI opt app releases: https://twitter.com/Raistmer
ID: 55250 · Report as offensive
Cliff Harding
Volunteer tester
Joined: 10 Sep 10
Posts: 21
Credit: 852,516
RAC: 0
United States
Message 55251 - Posted: 14 Dec 2015, 1:54:16 UTC

I'm seeing a vastly underestimated run time for the new batch of tasks.




I don't buy computers, I build them!!
ID: 55251 · Report as offensive
Jeff Buck
Volunteer tester
Joined: 11 Dec 14
Posts: 96
Credit: 1,240,941
RAC: 0
United States
Message 55252 - Posted: 14 Dec 2015, 3:31:48 UTC - in response to Message 55251.  

I'm seeing a vastly underestimated run time for the new batch of tasks.

The initial estimates on mine were also only about a quarter of what they probably should have been, and on one machine they're not readjusting as the tasks progress. That machine is still on BOINC 7.2.33. However, on another machine running BOINC 7.6.6, the remaining estimated times seemed to get recalculated by the time the progress reached about 30%, or perhaps even earlier than that. Now the remaining times look very realistic.
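For illustration, one simple way a client could correct a low initial estimate as a task progresses is to blend the original estimate with an extrapolation from elapsed time and fraction done. This Python sketch is an assumption made for this example, not BOINC's actual estimator; `remaining_estimate` and its parameters are hypothetical names.

```python
def remaining_estimate(initial_estimate_s, elapsed_s, fraction_done):
    """Blend the static estimate with a progress-based extrapolation."""
    if fraction_done <= 0.0:
        return float(initial_estimate_s)  # nothing to extrapolate from yet
    if fraction_done >= 1.0:
        return 0.0
    # Remaining time if the task keeps its current pace.
    extrapolated = elapsed_s / fraction_done - elapsed_s
    # Remaining time according to the original (possibly wrong) estimate.
    static = initial_estimate_s * (1.0 - fraction_done)
    # Trust the extrapolation more as progress accumulates.
    return fraction_done * extrapolated + (1.0 - fraction_done) * static


# Initial estimate of 1 hour for a task that really needs about 4 hours.
at_30_percent = remaining_estimate(3600, 4320, 0.30)
at_90_percent = remaining_estimate(3600, 12960, 0.90)
print(at_30_percent, at_90_percent)
```

With a scheme like this, a fourfold underestimate is only partly corrected early in the run and converges on the realistic figure as the task nears completion.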
ID: 55252 · Report as offensive


 
©2022 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.