Experiment for server operations check...

Author · Message
B-Man
Volunteer tester

Joined: 24 Aug 09
Posts: 79
Credit: 26,117
RAC: 0
United States
Message 43695 - Posted: 6 Sep 2012, 22:58:19 UTC

Well I turned on CPU apps on this project to help out. I did not increase resource share so only 1 AP WU per week on average right now. I had been waiting for the OSX GPU app but I did reopen my machine for other tasks. I'm only doing one CPU app at a time due to summer heat issues so work production is 30% or so of normal right now.
ID: 43695 · Report as offensive
Profile cAnDYmanS@H-Beta
Volunteer tester
Joined: 24 May 12
Posts: 38
Credit: 436,379
RAC: 0
Romania
Message 43698 - Posted: 10 Sep 2012, 13:24:42 UTC

Looks like the credits picked up going as high as...25 per valid WU, while time estimates have dropped to approximately 38 minutes, even though the real processing time is somewhere around 2 hours. Still not seeing any over-credited units though.

Signing off my report and crunching on...
Per aspera, ad astra!

ID: 43698 · Report as offensive
Profile Eric J Korpela
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 15 Mar 05
Posts: 1547
Credit: 27,124,700
RAC: 7,003
United States
Message 43699 - Posted: 10 Sep 2012, 17:46:21 UTC - in response to Message 43698.  

I've changed validators to see if that helps fix the low credit problems.
ID: 43699 · Report as offensive
Grumpy Swede
Volunteer tester
Joined: 10 Mar 12
Posts: 1667
Credit: 13,184,273
RAC: 13,712
Sweden
Message 43700 - Posted: 10 Sep 2012, 18:42:36 UTC - in response to Message 43699.  
Last modified: 10 Sep 2012, 18:58:17 UTC

I've changed validators to see if that helps fix the low credit problems.


My 6.04's, regardless of running on ATI or Nvidia, are still getting max 2.xx (often 1.xx) in credits when validated against another 6.04. When validated against a 6.01, it still gets mid to high 600 in credit.

So far it doesn't seem to make any difference with the "new" validators. Let them WU's come, I will crunch 'em.

End of report from a darker and darker Sweden. (I hate autumn and winter)

Edit: On second thought, it may also matter which of the two hosts finishes the WU first in a 6.04/6.01 pair. If the 6.01 finishes first and the 6.04 second, you seem to get the mid-to-high 600s in credit. If the 6.04 finishes first and the 6.01 second, credit may fall back to the lower 1.xx or 2.xx range, the same as when two 6.04s run the WU. This conclusion may be totally wrong, of course, since I do not crunch enough WUs in total for my results to be statistically significant.
ID: 43700 · Report as offensive
Profile Eric J Korpela
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 15 Mar 05
Posts: 1547
Credit: 27,124,700
RAC: 7,003
United States
Message 43701 - Posted: 10 Sep 2012, 21:04:56 UTC - in response to Message 43700.  

I think that, at long last, I've found the bug. New results that are released after this point should have normal credit claims (with maybe a week or so of a settling down period).


ID: 43701 · Report as offensive
BetelgeuseFive
Volunteer tester

Joined: 3 Jun 12
Posts: 64
Credit: 2,516,533
RAC: 642
Netherlands
Message 43702 - Posted: 12 Sep 2012, 15:45:24 UTC - in response to Message 43701.  

I think that, at long last, I've found the bug. New results that are released after this point should have normal credit claims (with maybe a week or so of a settling down period).



Good news !
Was the problem with the time estimates caused by the same bug ?

Tom
ID: 43702 · Report as offensive
Grumpy Swede
Volunteer tester
Joined: 10 Mar 12
Posts: 1667
Credit: 13,184,273
RAC: 13,712
Sweden
Message 43703 - Posted: 12 Sep 2012, 15:58:24 UTC - in response to Message 43701.  

I think that, at long last, I've found the bug. New results that are released after this point should have normal credit claims (with maybe a week or so of a settling down period).



Great news, Eric. I have plenty of old WU's still to crunch though, before I get to the new freshly downloaded WU's. I hate to abort WU's if I don't need to, so I will just let my computers crunch through the cache, and receive low credit, until they get to the newly (after the fix) downloaded WU's.

If I have managed to survive this long on low credits, I'll manage another week too :-)

ID: 43703 · Report as offensive
Profile Eric J Korpela
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 15 Mar 05
Posts: 1547
Credit: 27,124,700
RAC: 7,003
United States
Message 43704 - Posted: 12 Sep 2012, 18:46:43 UTC - in response to Message 43702.  


Was the problem with the time estimates caused by the same bug ?


As far as I can tell, yes.

ID: 43704 · Report as offensive
zombie67 [MM]
Volunteer tester
Joined: 18 May 06
Posts: 280
Credit: 26,477,429
RAC: 65
United States
Message 43708 - Posted: 14 Sep 2012, 0:19:26 UTC - in response to Message 43684.  

Low credit is part of the problem we're trying to debug. Expect it to continue. Credit will be fixed once we're sure we understand the problem and have it fixed.


Is it time now? No rush or anything like that. Just wondering if this latest breakthrough meets the requirement.
Dublin, California
Team: SETI.USA

ID: 43708 · Report as offensive
Profile Eric J Korpela
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 15 Mar 05
Posts: 1547
Credit: 27,124,700
RAC: 7,003
United States
Message 43709 - Posted: 14 Sep 2012, 2:04:33 UTC - in response to Message 43708.  

I need to wait until the app_version pfc_avg returns to normal before I can fix the credits. That's when I'll be sure I've got it fixed. I'm guessing it'll be at least a week.
ID: 43709 · Report as offensive
Profile cAnDYmanS@H-Beta
Volunteer tester
Joined: 24 May 12
Posts: 38
Credit: 436,379
RAC: 0
Romania
Message 43710 - Posted: 14 Sep 2012, 6:31:31 UTC

Hi Eric,

Almost finished returning a batch of brand new units (post-fix) and the credits seem to be OK:
http://setiweb.ssl.berkeley.edu/beta/workunit.php?wuid=4094410
http://setiweb.ssl.berkeley.edu/beta/workunit.php?wuid=4094408
http://setiweb.ssl.berkeley.edu/beta/workunit.php?wuid=4094397

Still got some others in pending and one in the pipe, but I would say we can put that behind us. However, the time estimates are still off (they come in as 34-38 minutes, but they end up taking 2-2.5 hours).

I hope this helps.

Cheers
Per aspera, ad astra!

ID: 43710 · Report as offensive
Grumpy Swede
Volunteer tester
Joined: 10 Mar 12
Posts: 1667
Credit: 13,184,273
RAC: 13,712
Sweden
Message 43712 - Posted: 14 Sep 2012, 20:23:06 UTC
Last modified: 14 Sep 2012, 20:25:34 UTC

I think the credit/time est issue is solved now (some will still have a time est problem until they have reached more than 10 valid tasks that are <10% blanked and don't finish early).

Let me propose the next test, which likely will cause even more problems, tears, screaming, and nervous breakdowns, than this recent bug has :-)

Proposed test (which has been in a waiting state since at least April):

Multiple instances for stock 6.04. That's going to be a nightmare, folks....

And with that I will slip away into the darkness again :-)
ID: 43712 · Report as offensive
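The condition described in the post above (more than 10 valid tasks that are less than 10% blanked and do not finish early) can be written as a small predicate. This is only a sketch of the rule as stated in the post: the `TaskResult` type, its field names, and the function names are hypothetical and do not come from the BOINC or SETI@home code.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class TaskResult:
    """Hypothetical summary of one returned task (not a BOINC structure)."""
    valid: bool              # the result passed validation
    blanked_fraction: float  # share of the data that was blanked, 0.0 to 1.0
    finished_early: bool     # the task exited before processing everything


def counts_toward_estimate(task: TaskResult) -> bool:
    """A task counts if it is valid, under 10% blanked, and ran to completion."""
    return task.valid and task.blanked_fraction < 0.10 and not task.finished_early


def estimates_should_settle(history: Iterable[TaskResult], needed: int = 10) -> bool:
    """True once MORE than `needed` tasks in the history count, per the post."""
    return sum(counts_toward_estimate(t) for t in history) > needed


# Eleven clean tasks plus one heavily blanked task: the threshold is met.
clean = [TaskResult(True, 0.02, False) for _ in range(11)]
print(estimates_should_settle(clean + [TaskResult(True, 0.40, False)]))
```

A host that mostly receives heavily blanked or early-exiting tasks would, under this reading, take longer to reach sane estimates.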
Profile uli
Volunteer tester
Joined: 8 Aug 10
Posts: 299
Credit: 456,514
RAC: 0
Germany
Message 43715 - Posted: 16 Sep 2012, 3:13:57 UTC

Off topic

Is there a way to just clean up the database? I have a ton of them waiting to be resent. I don't care about credits and they may be no longer needed for testing.
ID: 43715 · Report as offensive
Profile Eric J Korpela
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 15 Mar 05
Posts: 1547
Credit: 27,124,700
RAC: 7,003
United States
Message 43716 - Posted: 16 Sep 2012, 23:34:33 UTC - in response to Message 43715.  

Not an easy way. BOINC isn't very good at dealing with things going wrong, even when BOINC is part of the reason they went wrong. Once I've granted credit for this experiment I may wipe the result table of Astropulse entirely.
ID: 43716 · Report as offensive
Profile uli
Volunteer tester
Joined: 8 Aug 10
Posts: 299
Credit: 456,514
RAC: 0
Germany
Message 43721 - Posted: 17 Sep 2012, 4:14:34 UTC

Mine are not AP, just regular WUs. Thank you for responding Eric, it might help some others tho.
ID: 43721 · Report as offensive
Profile John Neale
Volunteer tester
Joined: 2 May 07
Posts: 192
Credit: 492,763
RAC: 275
South Africa
Message 43724 - Posted: 17 Sep 2012, 8:45:02 UTC - in response to Message 43721.  

Mine are not AP, just regular WUs. Thank you for responding Eric, it might help some others tho.


uli, are you referring to the issues which were discussed in this thread?
ID: 43724 · Report as offensive
Josef W. Segur
Volunteer tester

Joined: 14 Oct 05
Posts: 1137
Credit: 1,848,733
RAC: 0
United States
Message 43729 - Posted: 17 Sep 2012, 20:22:54 UTC - in response to Message 43724.  

Mine are not AP, just regular WUs. Thank you for responding Eric, it might help some others tho.


uli, are you referring to the issues which were discussed in this thread?

No, uli is simply suffering from the large amount of MB v7 work which is maintained in ready to send. There's about 150K tasks and only a few hundred being done daily. MB v7 tasks being issued now were created in April (either as initial replication or as a reissue), so there's a delay of several months from creation to issue.

I don't know why the MB splitters are configured to maintain such a large amount; it may be that they're picking up settings intended for the main project, where 150K tasks last only a few hours.
Joe
ID: 43729 · Report as offensive
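The backlog described in message 43729 can be put into rough numbers. The post gives about 150K tasks ready to send and "a few hundred" completed per day; the figure of 300 per day below is an assumption picked only to make the arithmetic concrete, and the function name is illustrative.

```python
def days_to_drain(queued_tasks: int, tasks_per_day: float) -> float:
    """Days needed to work through a queue at a constant completion rate."""
    if tasks_per_day <= 0:
        raise ValueError("tasks_per_day must be positive")
    return queued_tasks / tasks_per_day


QUEUED = 150_000   # "about 150K tasks" from the post
PER_DAY = 300      # assumed stand-in for "a few hundred being done daily"

days = days_to_drain(QUEUED, PER_DAY)
print(f"{days:.0f} days, roughly {days / 30:.0f} months")
```

At that assumed rate the queue would take well over a year to clear, which fits the observation that tasks created in April were only being issued in September.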
Profile Eric J Korpela
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 15 Mar 05
Posts: 1547
Credit: 27,124,700
RAC: 7,003
United States
Message 43731 - Posted: 17 Sep 2012, 21:39:11 UTC - in response to Message 43729.  
Last modified: 18 Sep 2012, 0:44:20 UTC

There's a bug in the splitter that prevents it from recognizing when it's hit the limit. I should just turn it off for a while.

I cancelled the SAH workunits with IDs above the highest that had been sent, but it still leaves 45000 in the queue. I've also changed the SAH/AP ratio from 5:1 back to a more normal 25:1.

On the plus side, if we get a GPU SAHv7 out there, it should crunch through those quickly.
ID: 43731 · Report as offensive
BetelgeuseFive
Volunteer tester

Joined: 3 Jun 12
Posts: 64
Credit: 2,516,533
RAC: 642
Netherlands
Message 43743 - Posted: 21 Sep 2012, 15:38:30 UTC


Hmm, I thought the problem with the time estimates had been solved.
This task failed with a 'Maximum elapsed time exceeded' message.

http://setiweb.ssl.berkeley.edu/beta/result.php?resultid=11172699

Are the estimates too low now instead of being too high ?
I completed several Astropulse GPU tasks after the fix that was announced on September 10. They all took between 9000 and 10000 seconds to complete, but this one failed after less than 5000 seconds.

Any clues ?

Tom
ID: 43743 · Report as offensive
Christoph
Volunteer tester

Joined: 16 Oct 09
Posts: 58
Credit: 662,990
RAC: 0
Germany
Message 43745 - Posted: 21 Sep 2012, 18:57:00 UTC

I did resume with beta after the fix was announced. Still, I had to edit my client_state so that the tasks would complete.
Here is the one result that I tried before editing: http://setiweb.ssl.berkeley.edu/beta/result.php?resultid=11170369
Christoph
ID: 43745 · Report as offensive
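A note on the 'Maximum elapsed time exceeded' failures reported in the two posts above. As far as I understand the BOINC client, it aborts a task once its elapsed time passes a limit derived from the workunit's operation bound (`rsc_fpops_bound`) divided by the speed estimate for the app version on that host. Nothing in this thread confirms that this is what happened here, so treat the following as a sketch under that assumption. The function names and all numbers are hypothetical and were chosen only to show how an inflated speed estimate shrinks the limit.

```python
def max_elapsed_seconds(rsc_fpops_bound: float, flops_estimate: float) -> float:
    """Runtime limit: operation bound divided by estimated speed (ops/second)."""
    if flops_estimate <= 0:
        raise ValueError("flops_estimate must be positive")
    return rsc_fpops_bound / flops_estimate


def would_abort(actual_runtime_s: float, rsc_fpops_bound: float,
                flops_estimate: float) -> bool:
    """True if a task needing `actual_runtime_s` seconds would hit the limit."""
    return actual_runtime_s > max_elapsed_seconds(rsc_fpops_bound, flops_estimate)


# Hypothetical values, not taken from any real workunit.
BOUND = 1e15          # operation bound for the workunit
SANE_SPEED = 5e10     # plausible speed estimate
INFLATED_SPEED = 2.5e11  # speed estimate that is five times too high

print(max_elapsed_seconds(BOUND, SANE_SPEED))      # generous limit
print(max_elapsed_seconds(BOUND, INFLATED_SPEED))  # much tighter limit
print(would_abort(9500, BOUND, INFLATED_SPEED))    # a 9500 s task is cut off
```

Under this reading, a task that normally needs 9000 to 10000 seconds fails early when the speed estimate is too high, and raising the bound in client_state (as Christoph did) lifts the limit for tasks already on the host.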
©2020 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.