Task postponed: Waiting to acquire slot directory lock. Another instance may be running.
Author | Message |
---|---|
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
I've got two CPU tasks from the last of my cache that refuse to run. If I exit BOINC and then restart them, they run for 34-35 seconds each time and then shift to postponed. They are the only CPU tasks running. I can't figure out why they won't run. There ISN'T another instance running, which I assume means another instance of BOINC. And just to make sure, they aren't the same task copied to two slots. They are different work units. Anyone care to offer an explanation as to what is happening? Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
OK, I think I just figured it out. I think an existing boinc_lockfile in the slot when the task starts computing is what was causing the tasks to keep getting postponed after 35 seconds. I don't think slot cleanup happened after the last task occupied the slot. I deleted the boinc_lockfile in both of the offending slots, along with the task postponement output file, then restarted BOINC, and the tasks are now computing past 35 seconds. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
To change the thread title. Problem solved. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Joined: 28 Nov 02 Posts: 5126 Credit: 276,046,078 RAC: 462 |
Thank you. I have had this issue within memory and ended up rebooting the system to get it cleared. So that is a lockfile symptom. Tom A proud member of the OFA (Old Farts Association). |
juan BFP Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
I had a similar problem in the past. I'm not sure whether it is related to what you describe. The source was the way Linux kills the crunching process when called by the rescheduler. It eventually kills the crunching process but does not clear the lock file. That happened randomly, though, and I never really caught when or how the error occurred. I fixed it by adding a step that clears all lock files when running the rescheduler. After that I ran some experiments and discovered that if I change the interval at which BOINC auto-saves the task from 120 secs to 180 secs (more time than a task needs to complete, which is up to 130 secs on my GPUs), the problem disappears. What I imagine is something like this: if the kill happens while the autosave is creating the backup, it leaves the file locked. That could explain why the error is rare. Why that happens is well beyond my knowledge. |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Could be. I did reschedule at the last minute yesterday morning before the outage. That could have caught those two CPU tasks with the lockfile present during that one reschedule. Now that I understand it, if it happens again I know the simple and fast fix. Thanks for the insight, Juan. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
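
The fix described in this thread, deleting a leftover boinc_lockfile from each affected slot directory and then restarting BOINC, could be scripted roughly as follows. This is only a minimal sketch, not anything shipped with BOINC: the file name comes from the posts above, while the function names, the assumption that slots are numbered subdirectories of a single slots folder, and the example path are illustrative. It should only be run while BOINC is stopped, since a lock file in a slot with a running task is presumably there for a reason.

```python
from pathlib import Path

# File name as mentioned in the thread.
LOCKFILE_NAME = "boinc_lockfile"


def find_lockfiles(slots_dir):
    """Return lock files found directly inside numbered slot directories.

    Assumption: slots are subdirectories named 0, 1, 2, ... under slots_dir.
    Results are ordered by slot number.
    """
    found = []
    for slot in Path(slots_dir).iterdir():
        if slot.is_dir() and slot.name.isdigit():
            lock = slot / LOCKFILE_NAME
            if lock.is_file():
                found.append(lock)
    return sorted(found, key=lambda p: int(p.parent.name))


def remove_lockfiles(slots_dir, dry_run=True):
    """Report the lock files, and delete them only when dry_run is False."""
    locks = find_lockfiles(slots_dir)
    if not dry_run:
        for lock in locks:
            lock.unlink()
    return locks


if __name__ == "__main__":
    # Hypothetical location; replace with the slots folder of your install.
    slots = Path("/path/to/boinc/slots")
    if slots.is_dir():
        for lock in remove_lockfiles(slots, dry_run=True):
            print("stale lock candidate:", lock)
```

With `dry_run=True` the script only lists what it would delete, which makes it easy to check that only the postponed tasks' slots are affected before calling `remove_lockfiles(slots, dry_run=False)`.
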
©2025 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.