Yesterday morning, I found these two WUs, both running on the same system.
82347095 82343175
Both had run-times of over 6 hours. Neither was using any CPU time, just idling along, preventing other WUs from being processed.
I aborted both. Haven't seen any others on my three active systems running EON.
I will cheerfully provide any supporting information you would like.
Les
Runaway WUs
Re: Runaway WUs
Caught and aborted another one this morning, after one hour and fifty minutes.
WU# 84912243
Same symptoms: run length over an hour and not using any CPU time.
If the admins are aware of this problem and have sufficient samples and info to solve it, just let me know and I'll shut up about it.
Re: Runaway WUs
Yet another. WU# 88214266
Aborted after 2+ hours.
Re: Runaway WUs
Still another. WU# 96019272.
Aborted after 2 hours, 39 minutes.
Just spinning - no work being done.
Cheers.
Re: Runaway WUs
Two more tasks found running for 3+ hours, using no CPU time, just idling.
Getting REALLY tired of having to check all my machines every few hours.
Is there a fix for this or am I just wasting my time reporting it?
I'll CHEERFULLY post WU ID numbers and any other desired information.
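For anyone else tired of checking by hand, here is a rough sketch of how the symptom described above (elapsed time keeps climbing while CPU time stays near zero) could be flagged automatically. It assumes the task list is available as plain "key: value" text, roughly like what `boinccmd --get_tasks` prints. The field names `elapsed task time` and `current CPU time`, the one-hour threshold, and the 1% ratio are all assumptions and may need adjusting for your client version.

```python
from dataclasses import dataclass

# Assumed field names (lower-cased); adjust to match your client's output.
ELAPSED_KEY = "elapsed task time"
CPU_KEY = "current cpu time"


@dataclass
class Task:
    name: str
    elapsed_s: float  # wall-clock run time in seconds
    cpu_s: float      # CPU time consumed in seconds


def parse_tasks(text):
    """Parse 'key: value' lines; each 'name:' line starts a new task."""
    records = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if ":" not in line:
            continue  # separator lines such as '1) -----------'
        key, _, value = line.partition(":")
        key = key.strip().lower()
        value = value.strip()
        if key == "name":
            current = {"name": value}
            records.append(current)
        elif current is not None and key in (ELAPSED_KEY, CPU_KEY):
            try:
                current[key] = float(value)
            except ValueError:
                pass  # ignore values that are not plain numbers
    return [
        Task(r["name"], r.get(ELAPSED_KEY, 0.0), r.get(CPU_KEY, 0.0))
        for r in records
    ]


def find_stalled(tasks, min_elapsed_s=3600.0, max_cpu_ratio=0.01):
    """Return tasks that have run a long time while using almost no CPU."""
    return [
        t for t in tasks
        if t.elapsed_s >= min_elapsed_s
        and t.cpu_s <= max_cpu_ratio * t.elapsed_s
    ]
```

Feeding the captured task list through `find_stalled(parse_tasks(text))` returns the suspects. It only reports them, so the decision to abort stays with you.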
Re: Runaway WUs
Same issue here.
Here is one WU that I aborted after 14 hours:
985262977_14115_189180275
Time was running but CPU was not in use for the WU.
Re: Runaway WUs
I get one of those sometimes, but lately the bigger problem has been the server aborting a lot of WUs and getting overloaded. Many times I get this:
519 eon2 6/16/2011 9:53:32 AM Started upload of 985262977_14643_201901525_0_0
520 eon2 6/16/2011 9:53:32 AM Started upload of 453769022_18500_197012431_0_0
521 eon2 6/16/2011 9:53:34 AM [error] Error reported by file upload server: can't parse config file
522 eon2 6/16/2011 9:53:34 AM [error] Error reported by file upload server: can't parse config file
523 eon2 6/16/2011 9:53:34 AM Temporarily failed upload of 985262977_14643_201901525_0_0: transient upload error