BOINC 4 and Win98

Rom Walton (BOINC)
Volunteer tester
Joined: 28 Apr 00
Posts: 579
Credit: 130,733
RAC: 0
United States
Message 20068 - Posted: 31 Aug 2004, 0:21:59 UTC

Okay folks...

There is a lot of information on this thread, so I'm going to go ahead and ask some questions that some of you have already answered.

For those of you that are experiencing crashes, is S@H the only application you are running?

How long have you let BOINC run before terminating it?

The reason I ask the last question is that this is the first release to include the client-side CPU scheduler. I have noticed on a couple of machines that, during the first hour BOINC runs on a machine subscribed to multiple projects, the CPU scheduler switches between the projects rapidly but settles down after an hour.

So could somebody just let it sit in the state where it seems like it is hung for a couple of hours and see if it is the CPU scheduler?

Thanks in advance.

----- Rom
BOINC Development Team, U.C. Berkeley
ID: 20068 · Report as offensive
MPBroida

Joined: 6 Sep 00
Posts: 337
Credit: 16,433
RAC: 0
United States
Message 20071 - Posted: 31 Aug 2004, 0:26:07 UTC - in response to Message 20068.  
Last modified: 31 Aug 2004, 0:29:08 UTC

> For those of you that are experiencing crashes, is S@H the only application
> you are running?
>
> How long have you let BOINC run before terminating it?

See this thread for a lot of info about the circumstances around the problems I hit.

Note that my system didn't "crash"; it slowed to a halt over just a few minutes, then locked up, requiring a reset. This happened several times. The slowdowns/lockups appear to have been during the system "timeouts" for disk accesses while it tried to work with a corrupted FAT. The FAT was fine before the BOINC upgrade; it was corrupted within 30 minutes after the upgrade with NOTHING but BOINC/SETI running.

<And it appears that I'll have to completely wipe the disk and reinstall Windows to recover.>
ID: 20071 · Report as offensive
Rom Walton (BOINC)
Volunteer tester
Joined: 28 Apr 00
Posts: 579
Credit: 130,733
RAC: 0
United States
Message 20074 - Posted: 31 Aug 2004, 0:31:20 UTC - in response to Message 20071.  

> Note that my system didn't "crash"; it slowed to a halt over just a few
> minutes, then locked up, requiring a reset. This happened several times. The
> slowdowns/lockups appear to have been during the system "timeouts" for disk
> accesses while it tried to work with a corrupted FAT. The FAT was
> fine before the BOINC upgrade; it was corrupted within 30 minutes after
> the upgrade with NOTHING but BOINC/SETI running.

I'm working on a response to the other thread.

Are you suggesting that the FAT corruption happened before the hard reset?

----- Rom
BOINC Development Team, U.C. Berkeley
ID: 20074 · Report as offensive
S@NL - EJG
Volunteer tester

Joined: 21 Apr 00
Posts: 64
Credit: 25,162,101
RAC: 0
Netherlands
Message 20089 - Posted: 31 Aug 2004, 0:50:41 UTC - in response to Message 20068.  
Last modified: 31 Aug 2004, 1:13:54 UTC

In this thread I explained in two posts what happened to Boinc on my systems.

To answer your questions:

> For those of you that are experiencing crashes, is S@H the only application
> you are running?

Yes, S@H is the only project application in Boinc. The PC is running no software other than a couple of Norton applications (AntiVirus and CleanSweep).


> How long have you let BOINC run before terminating it?

On my WinME machine, for example: this morning at 8:30 I noticed Boinc was stuck and the PC was very slow. After rebooting the PC I left it alone. When I came back from work (18:00) Boinc was stuck again and the PC was slow. I noticed the clock in the Windows taskbar was still displaying roughly 8:45 in the morning.


Thanks for your help !
ID: 20089 · Report as offensive
Tuvok
Joined: 14 Apr 03
Posts: 16
Credit: 13,753,353
RAC: 0
Canada
Message 20098 - Posted: 31 Aug 2004, 1:03:21 UTC - in response to Message 20068.  

My client only slowed then froze after downloading the upgrade. My computer will not respond to Ctrl+Alt+Del. On my XP laptop, I do not experience any problems. I do not know what to do; my client slows and freezes seconds after it is started, and the only way to try and regain control is a hard reset.

Big J
ID: 20098 · Report as offensive
Tuvok
Joined: 14 Apr 03
Posts: 16
Credit: 13,753,353
RAC: 0
Canada
Message 20100 - Posted: 31 Aug 2004, 1:04:18 UTC - in response to Message 20098.  

This is for my Win98 box.

ID: 20100 · Report as offensive
MPBroida

Joined: 6 Sep 00
Posts: 337
Credit: 16,433
RAC: 0
United States
Message 20101 - Posted: 31 Aug 2004, 1:05:35 UTC - in response to Message 20074.  
Last modified: 31 Aug 2004, 1:15:57 UTC

> I'm working on a response to the other thread.

OK, thanks. :) Try to ignore my frustrated tone there and in other msgs.
I'm struggling with the loss of a lot of my data on that corrupted disk.

> Are you suggesting that the FAT corruption happened before the hard reset?

Yes, I think so (but have no way to be sure).

WinME, worked fine under BOINC 3.19.

I hope I'm remembering this right: The first couple times I was able to stop BOINC cleanly though it didn't quit right away. The next few times, I had to kill it from the CTRL-ALT-DEL list (Win9x doesn't have a full TaskManager) and that took a loooonng time to appear. The last time, I had to hard reset.

I was watching the disk light as the system performance (based on mouse movement since nothing else was running) was getting worse and worse. It seemed that every disk access stopped everything for a short while (getting worse each time I restarted BOINC). I think that time was a filesystem timeout due to the corrupted FAT: when I finally managed to get the system up enough for me to run DiskDoctor (yes, after the hard reset) it behaved the same way for several minutes and then 1) reported a corrupted primary FAT, 2) reported that the primary and secondary FAT didn't match, and 3) then hung trying to fix the FAT. I'm sure that just made the FAT problem worse, of course.

But I'm not an expert on Windows internals, so I could be way off in my thought that it was a filesystem timeout I was seeing. It just seems that that explains what I observed pretty well.

It could be that, as you seem to be getting at, the actual FAT corruption was caused by the hard reset. A few others have reported some data file corruption which could just be open files at the point of reset, but I haven't seen anyone reporting FAT corruption. (I thought I did, but that person clarified that his problem was datafiles, not FAT.)

So, it could just be my bad luck that the datafile corruption happened to screw the FAT itself. And maybe the whole problem is just the BOINC or SETI clients taking over the CPU to the exclusion of all else, and files being corrupted only due to having to kill the application or the system to get it back.

Could there be some Priority problems in the new clients??
ID: 20101 · Report as offensive
Mr. GoodWrench

Joined: 26 Jun 99
Posts: 19
Credit: 8,937,626
RAC: 0
United States
Message 20103 - Posted: 31 Aug 2004, 1:09:11 UTC - in response to Message 20068.  
Last modified: 31 Aug 2004, 1:20:47 UTC

>
> So could somebody just let it sit in the state where it seems like it is hung
> for a couple of hours and see if it is the CPU scheduler?
>
> Thanks in advance.
>
> ----- Rom
> BOINC Development Team, U.C. Berkeley

The problem seems to occur just after a data transmit, and it hangs before processing a new unit. I noticed it last evening, but did a reboot and things seemed to settle down. When I awoke today I noticed that my system clock had stopped at about 7:31 AM (it was about 9:30 at the time). I pressed Ctrl-Alt-Del and it said that Boinc_gui was not responding. I closed it and the system time caught up. At this time the mouse was virtually useless because it moved so slowly. This is on my ME machine (63227).

My other machine (63278) Win98 just now posted another error: Boinc-gui has performed an illegal operation and will be shut down. Invalid page fault in module KERNEL32.DLL at 0177:bff8836d
Registers:
EAX=c00300ec CS=0177 EIP=bff8836d EFLGS=00010212
EBX=006effec SS=017f ESP=005efe98 EBP=005f0010
ECX=00000000 DS=017f ESI=00000000 FS=2f2f
EDX=bff76855 ES=017f EDI=bff79060 GS=0000
Bytes at CS:EIP:
53 56 57 8b 75 10 8b 38 33 db 85 f6 75 2d 8d b5
Stack dump:

Both machines had worked flawlessly up until the upgrade over the weekend. Hope this helps. Could be just S@H4.03 since others have been running BOINC4.05 in other projects without problems.
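
For anyone comparing several of these crash reports, the register block is easier to work with once it is parsed into numbers. The following is only a minimal Python sketch: `DUMP` is the register text copied from the report above, while `parse_registers` and `frame_gap` are hypothetical helper names, not part of any BOINC tool. It turns each NAME=hex pair into an integer and prints the distance between EBP and ESP at the time of the fault.

```python
import re

# Register lines copied from the crash report above.
DUMP = """\
EAX=c00300ec CS=0177 EIP=bff8836d EFLGS=00010212
EBX=006effec SS=017f ESP=005efe98 EBP=005f0010
ECX=00000000 DS=017f ESI=00000000 FS=2f2f
EDX=bff76855 ES=017f EDI=bff79060 GS=0000
"""

def parse_registers(text):
    """Return {register name: integer value} for every NAME=hex pair in text."""
    pairs = re.findall(r"\b([A-Z]+)=([0-9a-fA-F]+)\b", text)
    return {name: int(value, 16) for name, value in pairs}

regs = parse_registers(DUMP)

# Distance between the frame pointer and the stack pointer at the fault.
frame_gap = regs["EBP"] - regs["ESP"]

print(f"EIP=0x{regs['EIP']:08x} ESP=0x{regs['ESP']:08x} EBP=0x{regs['EBP']:08x}")
print(f"EBP - ESP = {frame_gap} bytes")
```

The numbers alone do not say where the stack region begins or ends; that information would have to come from a fuller log such as a DrWatson report.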
ID: 20103 · Report as offensive
Rom Walton (BOINC)
Volunteer tester
Joined: 28 Apr 00
Posts: 579
Credit: 130,733
RAC: 0
United States
Message 20114 - Posted: 31 Aug 2004, 1:24:28 UTC

Okay, in an effort to try and narrow down the cause, could somebody try adding this tag to the account_setiathome.berkeley.edu.xml file between the project preferences tags:

[leave_apps_in_memory/]

NOTE: the [] need to be changed to less than, greater than marks.
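
For anyone who would rather script the change than hand-edit the file, here is a minimal Python sketch. It assumes the preferences element is literally named `project_preferences` and that the file is well-formed XML. The `SAMPLE` text, the `<account>` root element, and the function name `add_leave_apps_in_memory` are hypothetical stand-ins for illustration, not the real file layout, so check your own file and keep a backup before changing it.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the account file; the real layout may differ.
SAMPLE = """<account>
    <project_preferences>
    </project_preferences>
</account>"""

def add_leave_apps_in_memory(xml_text, parent_tag="project_preferences"):
    """Add <leave_apps_in_memory/> under parent_tag unless it is already there."""
    root = ET.fromstring(xml_text)
    # The parent may be the root itself or nested somewhere below it.
    parent = root if root.tag == parent_tag else root.find(f".//{parent_tag}")
    if parent is None:
        raise ValueError(f"no <{parent_tag}> element found")
    if parent.find("leave_apps_in_memory") is None:
        ET.SubElement(parent, "leave_apps_in_memory")
    return ET.tostring(root, encoding="unicode")

updated = add_leave_apps_in_memory(SAMPLE)
print(updated)
```

ElementTree writes the empty element in self-closing form, so the trailing slash is kept. Running the function a second time on its own output leaves the text unchanged, which avoids duplicate tags.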

What does BOINC do after you start it back up?

----- Rom
BOINC Development Team, U.C. Berkeley
ID: 20114 · Report as offensive
Underground Tech
Joined: 4 Jan 00
Posts: 50
Credit: 14,579
RAC: 0
United States
Message 20120 - Posted: 31 Aug 2004, 1:30:39 UTC - in response to Message 20068.  
Last modified: 31 Aug 2004, 2:13:23 UTC

> Okay folks...
>
> There is a lot of information on this thread, so I'm going to go ahead and ask
> some questions that some of you have already answered.
>
> For those of you that are experiencing crashes, is S@H the only application
> you are running?
>
> How long have you let BOINC run before terminating it?
>
> So could somebody just let it sit in the state where it seems like it is hung
> for a couple of hours and see if it is the CPU scheduler?
>


I know I have gone to bed shortly before a WU finished, so I can say it has sat there for several hours in this state. Except for the ATI Remote Wonder controls that run in the background, BOINC is the only app used on my Win98 PC (no anti-virus, etc.).
Of course, the monitor goes into stand-by mode and will not wake up, so I cannot say whether BOINC has given any errors.


After it is restarted, it works fine until the next WU is finished, then locks up again.


ID: 20120 · Report as offensive
grumpy

Joined: 2 Jun 99
Posts: 209
Credit: 152,987
RAC: 0
Canada
Message 20149 - Posted: 31 Aug 2004, 2:04:30 UTC

Had it working on my old Win98 box for 30 minutes....

Nope, it does not work well with Win98. I did not get any FAT32 corruption or error messages.

What happens is that it slows the system down and stalls the GUI.
It does that if you try anything with the GUI, like update, download, etc., while you are processing a unit.

When it's working OK, the CPU cycles for the GUI are low (priority 8, normal) and the CPU
cycles are high for intelx86 (priority 4, idle).

When the GUI performs an update or download, the GUI's cycles go way up and intelx86's go down, and that kills it, which leaves the GUI running alone with high cycles.
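
The priority interaction described above can be illustrated with a toy model. This is only a sketch of a strict-priority scheduler in Python, not how Windows actually schedules threads (the real scheduler has time slices, priority boosts, and more). The priorities 8 and 4 are the values reported above; the function name `run_ticks`, the tick counts, and the "wants the CPU" rules are assumptions made up for the example.

```python
def run_ticks(tasks, ticks):
    """Strict-priority toy scheduler: each tick goes to the highest-priority ready task.

    tasks maps name -> (priority, ready), where ready(tick) says whether the
    task wants the CPU on that tick. Returns {name: ticks received}.
    """
    received = {name: 0 for name in tasks}
    for tick in range(ticks):
        ready = [(prio, name)
                 for name, (prio, is_ready) in tasks.items()
                 if is_ready(tick)]
        if ready:
            _, winner = max(ready)  # highest priority wins the tick
            received[winner] += 1
    return received

# Quiet GUI: wants the CPU one tick in every 100; the science app always does.
quiet = run_ticks({"gui": (8, lambda t: t % 100 == 0),
                   "intelx86": (4, lambda t: True)}, 1000)

# Busy GUI: always wants the CPU, for example while stuck in an update.
busy = run_ticks({"gui": (8, lambda t: True),
                  "intelx86": (4, lambda t: True)}, 1000)

print(quiet)
print(busy)
```

In this model a GUI that never yields leaves the lower-priority task with no CPU time at all, which would look like the GUI "running alone with high cycles". Whether that is what actually happens on Win98 is exactly the open question in this thread.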

Funny thing just happened on my XP machine, could be bad news (GUI running, intelx86 dead also; it was trying to upload). I'll have to check what happened.







ID: 20149 · Report as offensive
Rom Walton (BOINC)
Volunteer tester
Joined: 28 Apr 00
Posts: 579
Credit: 130,733
RAC: 0
United States
Message 20157 - Posted: 31 Aug 2004, 2:14:04 UTC

Oh oh oh, just had another question: does this problem happen if you use the command line interface?

boinc_cli.exe

Does the machine go into limbo if you run the command line interface?

----- Rom
BOINC Development Team, U.C. Berkeley
ID: 20157 · Report as offensive
KW2E
Joined: 18 May 99
Posts: 346
Credit: 104,396,190
RAC: 34
United States
Message 20163 - Posted: 31 Aug 2004, 2:20:56 UTC - in response to Message 20114.  
Last modified: 31 Aug 2004, 2:23:57 UTC

>
> What does BOINC do after you start it back up?
>
> ----- Rom

I tried this on one W98 machine. When I started BOINC it came up and tried to download the 4.03 app again but said it already existed. It then downloaded a few more WUs. When I switched to the Projects page it showed that I was attached to SAH twice. I stopped the GUI, replaced the XML file with a backup and then restarted BOINC. It showed the second SAH attachment under Projects, but with no account name. I detached from that second SAH and let it run.

I am hoping you meant to leave that / in the line you wished us to add.

For testing purposes I was executing the boinc_gui.exe from explorer. That client is now locked up solid.

Rob
ID: 20163 · Report as offensive
Walt Gribben
Volunteer tester

Joined: 16 May 99
Posts: 353
Credit: 304,016
RAC: 0
United States
Message 20165 - Posted: 31 Aug 2004, 2:23:12 UTC - in response to Message 20114.  

> Okay, if an effort to try and narrow down the cause could somebody try adding
> this tag to the account_setiathome.berkeley.edu.xml file between the project
> preferences tag:
>
> [leave_apps_in_memory/]
>
> NOTE: the [] need to be changed to less than, greater than marks.
>
> What does BOINC do after you start it back up?
>
> ----- Rom

After my latest abend, rebooted and added that tag. Will see in around 9 hours.

Do note that the "invalid page fault" errors are actually stack overflows. The ESP register just dropped below the beginning of the stack frame, so the next PUSH bombed. I have several DrWatson logs of the error, if there's some place I can send one. Also put an item in Questions and Problems: Windows on this. See http://setiweb.ssl.berkeley.edu/forum_thread.php?id=3416
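
To make the mechanism concrete, here is a toy Python model of a downward-growing stack. It is only an illustration: the class name `ToyStack`, the addresses, and the 64-byte size are hypothetical and have nothing to do with the real BOINC process. Each PUSH moves ESP down by 4 bytes and writes there; once ESP would fall below the lowest address reserved for the stack, the write lands in memory that is not part of the stack and the access faults.

```python
class ToyStack:
    """Downward-growing stack: PUSH subtracts 4 from ESP, then stores at ESP."""

    def __init__(self, top, size):
        self.esp = top           # stack starts empty, ESP at the top address
        self.limit = top - size  # lowest address that belongs to the stack
        self.memory = {}

    def push(self, value):
        new_esp = self.esp - 4
        if new_esp < self.limit:
            # The write would land outside the stack region: a page fault.
            raise MemoryError(
                f"page fault: write to 0x{new_esp:08x} below 0x{self.limit:08x}")
        self.esp = new_esp
        self.memory[new_esp] = value

# Hypothetical addresses, chosen only for illustration.
stack = ToyStack(top=0x00600000, size=64)
pushes = 0
try:
    while True:
        stack.push(pushes)
        pushes += 1
except MemoryError as fault:
    print(f"{pushes} pushes succeeded, then {fault}")
```

This is why the error surfaces as an "invalid page fault" rather than something that names the stack: the operating system only sees a write to an address it cannot satisfy.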

ID: 20165 · Report as offensive
Petit Soleil
Joined: 17 Feb 03
Posts: 1497
Credit: 70,934
RAC: 0
Canada
Message 20172 - Posted: 31 Aug 2004, 2:32:00 UTC
Last modified: 31 Aug 2004, 14:32:10 UTC

I have a suggestion for you Rom !

You could create a special WIN98 problem thread and announce it on the
main page. I see you are currently going from one post to another. It
would be easier for all of you (dev and users) to discuss and investigate
the problem all in the same place in order to share the same information.

Just an idea.

Friendly
Marc
ID: 20172 · Report as offensive
Rom Walton (BOINC)
Volunteer tester
Joined: 28 Apr 00
Posts: 579
Credit: 130,733
RAC: 0
United States
Message 20174 - Posted: 31 Aug 2004, 2:34:12 UTC - in response to Message 20165.  

> After my latest abend, rebooted and added that tag. Will see in around 9
> hours.
>
> Do note that the "invalid page fault" errors are actually stack overflows.
> The ESP register just dropped below the beginning of the stack frame, so the
> next PUSH bombed. I have several DrWatson logs of the error, if there's some
> place I can send one. Also put an item in Questions and Problems: Windows on
> this. See http://setiweb.ssl.berkeley.edu/forum_thread.php?id=3416

Okay, now we are making some progress. If it is a stack overflow, then the exception handler executes some code that changes ESP to a static location in memory so it can try to dump out the call stack. This code was originally part of the stackwalker exception handler.

Can I build you a private build without that chunk of code, so you can see what the real exception tells you?

----- Rom
BOINC Development Team, U.C. Berkeley
ID: 20174 · Report as offensive
Rom Walton (BOINC)
Volunteer tester
Joined: 28 Apr 00
Posts: 579
Credit: 130,733
RAC: 0
United States
Message 20177 - Posted: 31 Aug 2004, 2:34:47 UTC - in response to Message 20172.  

> I have a suggestion for you Rom !
>
> You could create a special WIN98 problem thread and announce it on the
> main page. I see you are currently going from one post to another. It
> would be easier for all of you (dev and users) to discuss and investigate
> the problem all in the same place in order to share the same information.

Great idea, will do that...

----- Rom
BOINC Development Team, U.C. Berkeley
ID: 20177 · Report as offensive
Walt Gribben
Volunteer tester

Joined: 16 May 99
Posts: 353
Credit: 304,016
RAC: 0
United States
Message 20181 - Posted: 31 Aug 2004, 2:39:07 UTC - in response to Message 20174.  


> Okay, now we are making some progress, so if it is a stack overflow then the
> exception handler executes some code where it changes the ESP to a static
> location in memory so it can try to dump out the callstack. This code was
> originally part of the stackwalker exception handler.
>
> Can I build you a private without that chunk of code and see what the real
> exception tells you?
>
> ----- Rom
> BOINC Development Team, U.C. Berkeley


Sure.


ID: 20181 · Report as offensive
Lloyd

Joined: 22 Jan 02
Posts: 41
Credit: 1,266,000
RAC: 0
United States
Message 20183 - Posted: 31 Aug 2004, 2:42:25 UTC

I changed my prefs to leave the app in memory using an XP machine, then dumped and reinstalled on the 98 machine, and it seems to be working. However, it seemed to be working before, too, until it finished a work unit??? (this makes it MORE fun) :)
ID: 20183 · Report as offensive
Rom Walton (BOINC)
Volunteer tester
Joined: 28 Apr 00
Posts: 579
Credit: 130,733
RAC: 0
United States
Message 20189 - Posted: 31 Aug 2004, 2:54:03 UTC - in response to Message 20181.  

> Sure.

Okay, checking in the remarked-out code, and I'm preparing to build now.

----- Rom
BOINC Development Team, U.C. Berkeley
ID: 20189 · Report as offensive


 
©2025 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.