AstroPulse v4.33 Errors and reporting

Josef W. Segur
Volunteer tester
Joined: 14 Oct 05
Posts: 1018
Credit: 1,494,997
RAC: 232
Message 34048 - Posted: 29 Jun 2008, 16:01:07 UTC - in response to Message 34044.

...
BoincLogX appears to ignore DCF.

True, once a WU is running the estimate is simply based on progress and CPU time. That works very well for AP WUs, but at the very beginning of setiathome_enhanced WUs it produces some amusingly huge estimates.

BoincLogX does use DCF for estimates of unstarted work, though.
Joe
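
Editor's sketch of the two kinds of estimate Joe describes. This is not BoincLogX's actual code: the function names and the simple linear extrapolation are assumptions made for illustration. It shows why an estimate based only on progress and CPU time can be enormous while the reported progress is still close to zero.

```python
def remaining_running(cpu_seconds, fraction_done):
    """Linear extrapolation from progress so far; DCF is not used (assumed formula)."""
    if fraction_done <= 0.0:
        return float("inf")  # no progress yet, so there is no basis for an estimate
    return cpu_seconds * (1.0 - fraction_done) / fraction_done


def remaining_unstarted(initial_estimate_seconds, dcf):
    """Scale the initial estimate by the duration correction factor (hypothetical helper)."""
    return initial_estimate_seconds * dcf
```

With one minute of CPU time and 0.01% reported progress, remaining_running(60, 0.0001) comes out at roughly 600,000 seconds, which is the kind of early figure that later settles down as progress accumulates.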

Jim Wilkins
Volunteer tester
Joined: 1 Nov 06
Posts: 55
Credit: 344,829
RAC: 0
Message 34049 - Posted: 29 Jun 2008, 20:22:16 UTC

Just got this error after 4 hours running.

Error reading from statefile: wanted 2832 bytes, got 0

Jim

Keith T.
Volunteer tester
Joined: 9 Feb 07
Posts: 129
Credit: 25,809
RAC: 0
Message 34050 - Posted: 29 Jun 2008, 21:28:09 UTC - in response to Message 34049.

Just got this error after 4 hours running.

Error reading from statefile: wanted 2832 bytes, got 0

Jim



You seem to have had more than one of those errors on the same computer:
http://setiweb.ssl.berkeley.edu/beta/result.php?resultid=4086296
http://setiweb.ssl.berkeley.edu/beta/result.php?resultid=3930831

Do you have anything like a virus scan that accesses the BOINC folder regularly?

ulenztest
Volunteer tester
Joined: 13 Mar 06
Posts: 5
Credit: 55,887
RAC: 0
Message 34051 - Posted: 30 Jun 2008, 4:19:07 UTC

No problems with 4.33 until now. Estimated runtime: about 43 hours per wu.
____________

Jim Wilkins
Volunteer tester
Joined: 1 Nov 06
Posts: 55
Credit: 344,829
RAC: 0
Message 34053 - Posted: 30 Jun 2008, 12:26:31 UTC - in response to Message 34050.

I run a virus scan. I'm not sure how to figure out if it looks at the BOINC folder. Even so, why would I get this error? My other BOINC apps don't seem to mind.

Thanks,
Jim


rebest
Volunteer tester
Joined: 10 Jun 08
Posts: 4
Credit: 28,159
RAC: 0
Message 34054 - Posted: 30 Jun 2008, 12:43:44 UTC

Got the first error using 4.33.

1148968 went through as a redundant result. I don't know when exactly this happened. We had a line of storms go through last night and we lost power.
____________
Team NC State University
The Wolfpack howls at the Stars!

Stick
Volunteer tester
Joined: 16 May 06
Posts: 141
Credit: 80,959
RAC: 162
Message 34055 - Posted: 30 Jun 2008, 13:22:09 UTC - in response to Message 34054.
Last modified: 30 Jun 2008, 13:22:54 UTC

Got the first error using 4.33.

1148968 went through as a redundant result. I don't know when exactly this happened. We had a line of storms go through last night and we lost power.


This was an Enhanced v6.02 result - not AP v4.33.
____________

Dotsch
Volunteer tester
Joined: 14 Jun 05
Posts: 103
Credit: 81,859
RAC: 15
Message 34060 - Posted: 30 Jun 2008, 18:38:53 UTC

I got a WU which ran into the checkpoint error. After a restart the WU exited immediately. So I restored from the backup, and the WU continues.
____________

Gary Charpentier
Volunteer tester
Joined: 9 Apr 07
Posts: 956
Credit: 534,898
RAC: 1,316
Message 34061 - Posted: 30 Jun 2008, 19:44:56 UTC - in response to Message 34060.

I got a WU which ran into the checkpoint error. After a restart the WU exited immediately. So I restored from the backup, and the WU continues.


That at least means the checkpoint works sometimes. I'm wondering if there might be a race condition in it. My thinking is that if the OS is going down, it may send signals to both the BOINC Manager and the AP process at the same time. AP may then get another signal from the Manager as it cleans up, and not save correctly.

Gary
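
Editor's sketch of one common way to make a checkpoint robust against being interrupted mid-write, as in the shutdown scenario Gary describes. It is not how AstroPulse writes its state file; write_checkpoint and the temporary file naming are hypothetical. The idea is to write the new state to a temporary file and swap it into place only once it is complete, so an interrupted save leaves the previous checkpoint intact rather than a zero-length file.

```python
import os
import tempfile


def write_checkpoint(path, payload):
    """Write payload (bytes) so the file at path is either the old or the new state."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".state-", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before the swap
        # Atomic on POSIX when both paths are on the same filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        # The previous checkpoint is untouched; only the temporary file is removed.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

On Windows, os.replace can still fail with PermissionError while another process holds the destination open, so this approach narrows the window for damage but does not by itself rule out interference from a scanner.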

____________

rebest
Volunteer tester
Joined: 10 Jun 08
Posts: 4
Credit: 28,159
RAC: 0
Message 34063 - Posted: 30 Jun 2008, 20:02:34 UTC - in response to Message 34055.

Got the first error using 4.33.

1148968 went through as a redundant result. I don't know when exactly this happened. We had a line of storms go through last night and we lost power.


This was an Enhanced v6.02 result - not AP v4.33.


Ugh. First the eyes go and then the brain.... Sorry.

____________
Team NC State University
The Wolfpack howls at the Stars!

Keith T.
Volunteer tester
Joined: 9 Feb 07
Posts: 129
Credit: 25,809
RAC: 0
Message 34065 - Posted: 30 Jun 2008, 21:55:50 UTC - in response to Message 34053.

I run a virus scan. I'm not sure how to figure out if it looks at the BOINC folder. Even so, why would I get this error? My other BOINC apps don't seem to mind.

Thanks,
Jim


Some virus scanners may lock files while they are scanning them.
The AP application probably needs exclusive access to its files.

Which scanner do you use?

In most virus scanners it is possible to exclude certain folders or files from scanning.

Richard U
Volunteer tester
Joined: 6 May 07
Posts: 69
Credit: 496,888
RAC: 44
Message 34068 - Posted: 1 Jul 2008, 6:42:13 UTC
Last modified: 1 Jul 2008, 6:49:00 UTC

I just had my last 2 AP tasks error out on a -4

In ap_fileio.cpp, Statefile::write, statefile is 0'd, trying again: iteration 100
Error calculating fold_level.

3934752
3938469

About this time I had to log off as I couldn't open BOINC Manager and the computer was really sluggish. Processors were at 100%, but then they always are. However, on checking the logs it appears to have crashed during logoff: "The previous system shutdown at 8:25:42 PM on 30/06/2008 was unexpected." Nothing else obvious shows up.

That should have said 3
4086505
____________
Richard U

Jim Wilkins
Volunteer tester
Joined: 1 Nov 06
Posts: 55
Credit: 344,829
RAC: 0
Message 34069 - Posted: 1 Jul 2008, 9:44:46 UTC - in response to Message 34065.

I run a relatively old version of Symantec Antivirus. I'll see if I can exclude files.

I'm not sure I am comfortable with this explanation. Windows apps should be able to retry or somehow get around trying to read a file that is being virus scanned. Admitting that I know nothing about AP, why should it be any different than other Windows BOINC apps that don't have this problem?

Thanks,
Jim
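
Editor's sketch of the retry behaviour Jim has in mind. It is an illustration only, not AstroPulse code: read_state, its parameters and the back-off schedule are hypothetical, and only the wording of the error message is taken from the thread. The function re-opens the file a few times when the read comes up short or the open fails, which is what a temporary lock by another process might look like.

```python
import time


def read_state(path, wanted, attempts=5, delay=0.2):
    """Read exactly `wanted` bytes from path, retrying on short reads or OS errors."""
    problem = "no attempt made"
    for attempt in range(attempts):
        if attempt:
            time.sleep(delay * attempt)  # wait a little longer before each retry
        try:
            with open(path, "rb") as f:
                data = f.read(wanted)
            if len(data) == wanted:
                return data
            problem = f"wanted {wanted} bytes, got {len(data)}"
        except OSError as exc:  # includes PermissionError from a locked file
            problem = str(exc)
    raise OSError(f"Error reading from statefile: {problem}")
```

Retrying only helps if the problem is transient. If the state file really has been truncated to zero bytes, every attempt reads 0 bytes and the only recovery is a backup copy or starting the WU again.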


Richard U
Volunteer tester
Joined: 6 May 07
Posts: 69
Credit: 496,888
RAC: 44
Message 34071 - Posted: 1 Jul 2008, 20:46:18 UTC - in response to Message 34069.
Last modified: 1 Jul 2008, 20:46:48 UTC

I run a relatively old version of Symantec Antivirus. I'll see if I can exclude files.

I'm not sure I am comfortable with this explanation. Windows apps should be able to retry or somehow get around trying to read a file that is being virus scanned. Admitting that I know nothing about AP, why should it be any different than other Windows BOINC apps that don't have this problem?

Thanks,
Jim


These errors look a lot like what I was getting as well. I am also still getting "Can't delete previous state file" and I am running Vista with OneCare.

I still suspect the issue may be BOINC not properly closing files, but I can never catch the file in a locked state.
____________
Richard U

Sirius B
Volunteer tester
Joined: 11 Jun 08
Posts: 16
Credit: 128,146
RAC: 0
Message 34073 - Posted: 2 Jul 2008, 14:52:44 UTC
Last modified: 2 Jul 2008, 15:01:00 UTC

Hi guys, I have successfully completed several WUs and have 2 currently crunching. However, one of them seems to be acting up. On several reboots since the weekend, it has restarted from 0, yet the completion time still states 02:03. It's currently showing 13:03 completed. This hasn't happened with the previous WUs, so should I let this continue or abort?

WU - 02/07/2008 02:39:31|SETI@home Beta Test|Restarting task 23ap08aa.14671.4980.3.11.23_1 using setiathome_enhanced version 602


Also, can someone tell me how to link its URL? - sorry, found it.

WUID
____________

Raistmer
Volunteer tester
Joined: 18 Aug 05
Posts: 954
Credit: 5,341,466
RAC: 12,070
Message 34074 - Posted: 2 Jul 2008, 17:01:41 UTC - in response to Message 34073.


WUID

It's a SETI v6 WU, not an AP one.
____________

JLDun
Volunteer tester
Joined: 23 May 07
Posts: 66
Credit: 17,824
RAC: 0
Message 34082 - Posted: 3 Jul 2008, 6:24:33 UTC - in response to Message 33751.

WU ID 1145159
Result ID 3932932

Example from "stderr out:"


<core_client_version>5.10.45</core_client_version>
<![CDATA[
<stderr_txt>
At ap_graphics_init
Just before client.init()
At Statefile::Read()
1: s.fbll 0 c.fbls 262144 s.dcln 896 s.dcsn 0 s.dn 0 s.dcn 0, s.ds 1

2: s.fbll 0 c.fbls 262144 s.dcln 896 s.dcsn 0 s.dn 0 s.dcn 0, s.ds 1, l 7

At ap_graphics_init
Just before client.init()
At Statefile::Read()
1: s.fbll 61696 c.fbls 262144 s.dcln 896 s.dcsn 0 s.dn 0 s.dcn 1970176, s.ds -1

2: s.fbll 261760 c.fbls 262144 s.dcln 896 s.dcsn 0 s.dn 0 s.dcn 8376320, s.ds 1, l 7

3: s.fbll 262016 c.fbls 262144 s.dcln 896 s.dcsn 0 s.dn 0 s.dcn 8380416, s.ds 1, l 7

4: s.fbll 262144 c.fbls 262144 s.dcln 896 s.dcsn 0 s.dn 0 s.dcn 8380416, s.ds 1, l 7

1: s.fbll 0 c.fbls 262144 s.dcln 1024 s.dcsn 0 s.dn 0 s.dcn 0, s.ds 1

2: s.fbll 0 c.fbls 262144 s.dcln 1024 s.dcsn 0 s.dn 0 s.dcn 0, s.ds 1, l 7

2: s.fbll 261760 c.fbls 262144 s.dcln 1024 s.dcsn 0 s.dn 0 s.dcn 8376320, s.ds 1, l 7

3: s.fbll 262016 c.fbls 262144 s.dcln 1024 s.dcsn 0 s.dn 0 s.dcn 8380416, s.ds 1, l 7

4: s.fbll 262144 c.fbls 262144 s.dcln 1024 s.dcsn 0 s.dn 0 s.dcn 8380416, s.ds 1, l 7

1: s.fbll 0 c.fbls 262144 s.dcln 1152 s.dcsn 0 s.dn 0 s.dcn 0, s.ds 1

2: s.fbll 0 c.fbls 262144 s.dcln 1152 s.dcsn 0 s.dn 0 s.dcn 0, s.ds 1, l 7

2: s.fbll 261760 c.fbls 262144 s.dcln 1152 s.dcsn 0 s.dn 0 s.dcn 8376320, s.ds 1, l 7

3: s.fbll 262016 c.fbls 262144 s.dcln 1152 s.dcsn 0 s.dn 0 s.dcn 8380416, s.ds 1, l 7

4: s.fbll 262144 c.fbls 262144 s.dcln 1152 s.dcsn 0 s.dn 0 s.dcn 8380416, s.ds 1, l 7

1: s.fbll 0 c.fbls 262144 s.dcln 1280 s.dcsn 0 s.dn 0 s.dcn 0, s.ds 1

2: s.fbll 0 c.fbls 262144 s.dcln 1280 s.dcsn 0 s.dn 0 s.dcn 0, s.ds 1, l 7

2: s.fbll 261760 c.fbls 262144 s.dcln 1280 s.dcsn 0 s.dn 0 s.dcn 8376320, s.ds 1, l 7

3: s.fbll 262016 c.fbls 262144 s.dcln 1280 s.dcsn 0 s.dn 0 s.dcn 8380416, s.ds 1, l 7

4: s.fbll 262144 c.fbls 262144 s.dcln 1280 s.dcsn 0 s.dn 0 s.dcn 8380416, s.ds 1, l 7

At ap_graphics_init
Just before client.init()
At Statefile::Read()
1: s.fbll 262144 c.fbls 262144 s.dcln 1280 s.dcsn 64 s.dn 1 s.dcn 5353472, s.ds -1

At ap_graphics_init
Just before client.init()
At Statefile::Read()
1: s.fbll 262144 c.fbls 262144 s.dcln 1280 s.dcsn 64 s.dn 9 s.dcn 6844416, s.ds -1

At ap_graphics_init
Just before client.init()
At Statefile::Read()
1: s.fbll 262144 c.fbls 262144 s.dcln 1280 s.dcsn 64 s.dn 9 s.dcn 6844416, s.ds -1

1: s.fbll 0 c.fbls 262144 s.dcln 1408 s.dcsn 0 s.dn 0 s.dcn 0, s.ds 1

2: s.fbll 0 c.fbls 262144 s.dcln 1408 s.dcsn 0 s.dn 0 s.dcn 0, s.ds 1, l 7

2: s.fbll 261760 c.fbls 262144 s.dcln 1408 s.dcsn 0 s.dn 0 s.dcn 8376320, s.ds 1, l 7

3: s.fbll 262016 c.fbls 262144 s.dcln 1408 s.dcsn 0 s.dn 0 s.dcn 8380416, s.ds 1, l 7

4: s.fbll 262144 c.fbls 262144 s.dcln 1408 s.dcsn 0 s.dn 0 s.dcn 8380416, s.ds 1, l 7

1: s.fbll 0 c.fbls 262144 s.dcln 1536 s.dcsn 0 s.dn 0 s.dcn 0, s.ds 1

2: s.fbll 0 c.fbls 262144 s.dcln 1536 s.dcsn 0 s.dn 0 s.dcn 0, s.ds 1, l 7

2: s.fbll 261760 c.fbls 262144 s.dcln 1536 s.dcsn 0 s.dn 0 s.dcn 8376320, s.ds 1, l 7

3: s.fbll 262016 c.fbls 262144 s.dcln 1536 s.dcsn 0 s.dn 0 s.dcn 8380416, s.ds 1, l 7

4: s.fbll 262144 c.fbls 262144 s.dcln 1536 s.dcsn 0 s.dn 0 s.dcn 8380416, s.ds 1, l 7

At ap_graphics_init
Just before client.init()
At Statefile::Read()
1: s.fbll 262144 c.fbls 262144 s.dcln 1536 s.dcsn 96 s.dn 0 s.dcn 5582848, s.ds 1

At ap_graphics_init
Just before client.init()
At Statefile::Read()
1: s.fbll 262144 c.fbls 262144 s.dcln 1536 s.dcsn 128 s.dn 0 s.dcn 0, s.ds 1

1: s.fbll 0 c.fbls 262144 s.dcln 1664 s.dcsn 0 s.dn 0 s.dcn 0, s.ds 1

2: s.fbll 0 c.fbls 262144 s.dcln 1664 s.dcsn 0 s.dn 0 s.dcn 0, s.ds 1, l 7

2: s.fbll 261760 c.fbls 262144 s.dcln 1664 s.dcsn 0 s.dn 0 s.dcn 8376320, s.ds 1, l 7

3: s.fbll 262016 c.fbls 262144 s.dcln 1664 s.dcsn 0 s.dn 0 s.dcn 8380416, s.ds 1, l 7

4: s.fbll 262144 c.fbls 262144 s.dcln 1664 s.dcsn 0 s.dn 0 s.dcn 8380416, s.ds 1, l 7

At ap_graphics_init
Just before client.init()
At Statefile::Read()
1: s.fbll 262144 c.fbls 262144 s.dcln 1664 s.dcsn 16 s.dn 5 s.dcn 6500352, s.ds 1

At ap_graphics_init
Just before client.init()
At Statefile::Read()
1: s.fbll 262144 c.fbls 262144 s.dcln 1664 s.dcsn 16 s.dn 9 s.dcn 7553024, s.ds 1

1: s.fbll 0 c.fbls 262144 s.dcln 1792 s.dcsn 0 s.dn 0 s.dcn 0, s.ds 1

2: s.fbll 0 c.fbls 262144 s.dcln 1792 s.dcsn 0 s.dn 0 s.dcn 0, s.ds 1, l 7

2: s.fbll 261760 c.fbls 262144 s.dcln 1792 s.dcsn 0 s.dn 0 s.dcn 8376320, s.ds 1, l 7

3: s.fbll 262016 c.fbls 262144 s.dcln 1792 s.dcsn 0 s.dn 0 s.dcn 8380416, s.ds 1, l 7

4: s.fbll 262144 c.fbls 262144 s.dcln 1792 s.dcsn 0 s.dn 0 s.dcn 8380416, s.ds 1, l 7

1: s.fbll 0 c.fbls 262144 s.dcln 1920 s.dcsn 0 s.dn 0 s.dcn 0, s.ds 1

2: s.fbll 0 c.fbls 262144 s.dcln 1920 s.dcsn 0 s.dn 0 s.dcn 0, s.ds 1, l 7

2: s.fbll 261760 c.fbls 262144 s.dcln 1920 s.dcsn 0 s.dn 0 s.dcn 8376320, s.ds 1, l 7

3: s.fbll 262016 c.fbls 262144 s.dcln 1920 s.dcsn 0 s.dn 0 s.dcn 8380416, s.ds 1, l 7

4: s.fbll 262144 c.fbls 262144 s.dcln 1920 s.dcsn 0 s.dn 0 s.dcn 8380416, s.ds 1, l 7

1: s.fbll 0 c.fbls 262144 s.dcln 2048 s.dcsn 0 s.dn 0 s.dcn 0, s.ds 1

2: s.fbll 0 c.fbls 262144 s.dcln 2048 s.dcsn 0 s.dn 0 s.dcn 0, s.ds 1, l 7

2: s.fbll 261760 c.fbls 262144 s.dcln 2048 s.dcsn 0 s.dn 0 s.dcn 8376320, s.ds 1, l 7

3: s.fbll 262016 c.fbls 262144 s.dcln 2048 s.dcsn 0 s.dn 0 s.dcn 8380416, s.ds 1, l 7

4: s.fbll 262144 c.fbls 262144 s.dcln 2048 s.dcsn 0 s.dn 0 s.dcn 8380416, s.ds 1, l 7

1: s.fbll 0 c.fbls 262144 s.dcln 2176 s.dcsn 0 s.dn 0 s.dcn 0, s.ds 1

2: s.fbll 0 c.fbls 262144 s.dcln 2176 s.dcsn 0 s.dn 0 s.dcn 0, s.ds 1, l 7

2: s.fbll 261760 c.fbls 262144 s.dcln 2176 s.dcsn 0 s.dn 0 s.dcn 8376320, s.ds 1, l 7

3: s.fbll 262016 c.fbls 262144 s.dcln 2176 s.dcsn 0 s.dn 0 s.dcn 8380416, s.ds 1, l 7

4: s.fbll 262144 c.fbls 262144 s.dcln 2176 s.dcsn 0 s.dn 0 s.dcn 8380416, s.ds 1, l 7
...

____________

Pappa
Volunteer moderator
Volunteer tester
Joined: 13 Nov 05
Posts: 1724
Credit: 3,121,901
RAC: 0
Message 34099 - Posted: 4 Jul 2008, 1:43:34 UTC - in response to Message 34069.

Jim et al

Various programs that allow multiple users to access the same "file" follow different rules about what happens when the file is open and someone else touches it. Any SETI application follows the common-sense rules. Virus scanners tend to ignore those rules: IF they trap a virus, they want to ensure they have it LOCKED! Then it cannot do any damage. That is one reason why people tell you to exclude BOINC and any project applications and/or directories from scanning.

I hope that makes more sense...

Al


____________
Thanks to Paul and Friends
Please consider a Donation to the Seti Project

Jim Wilkins
Volunteer tester
Joined: 1 Nov 06
Posts: 55
Credit: 344,829
RAC: 0
Message 34106 - Posted: 4 Jul 2008, 17:19:06 UTC - in response to Message 34099.

Pappa,

I think (sometimes it takes me a while to figure out what I want to say) my point is that AP seems to be the ONLY BOINC app that may have this issue. Why would it be different from the others, including SETI and SETI Beta?

Thanks,
Jim


Josef W. Segur
Volunteer tester
Joined: 14 Oct 05
Posts: 1018
Credit: 1,494,997
RAC: 232
Message 34108 - Posted: 4 Jul 2008, 18:57:13 UTC - in response to Message 34106.

I think (sometimes it takes me a while to figure out what I want to say) my point is that AP seems to be the ONLY BOINC app that may have this issue. Why would it be different from the others, including SETI and SETI Beta?

Thanks,
Jim

The primary difference from other apps is that Josh has put the "Error reading from statefile: wanted 2832 bytes, got 0" message in place; _enhanced simply starts over from the beginning of a WU if its state file is empty. That issue has reappeared many times, though it is rare enough that it's impossible to reproduce for testing. See the "Resuming WUs doesn't work reliably" thread from late 2005, for example.

I have hopes Josh will find a fix which can be ported to _enhanced also.
Joe
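
Editor's sketch of the difference Joe points out. Neither branch is taken from the real applications: resume_point and its strict flag are hypothetical, and only the message wording comes from the thread. Both policies see the same empty state file; one reports it as an error, the other quietly restarts the WU from the beginning.

```python
def resume_point(state_bytes, wanted, strict):
    """Return the saved state to resume from, or None to start the WU over."""
    if len(state_bytes) == wanted:
        return state_bytes
    if strict:
        # AP-style policy as described above: surface the bad state file as an error.
        raise ValueError(
            f"Error reading from statefile: wanted {wanted} bytes, got {len(state_bytes)}"
        )
    # _enhanced-style policy as described above: discard the state and start over.
    return None
```

Under the second policy the underlying problem is the same, but it only shows up as lost progress, which may be why it looks as if only AP is affected.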



Copyright © 2013 University of California

AstroPulse is funded in part by the NSF through grant AST-0307956