memory leak/OOM kill
Posted: Wed Jun 05, 2013 8:02 pm
Hi, after noticing that my box was running cool and the system was unusually slow (everything had been pushed to swap), I found out that the kernel had OOM-killed all the running eon clients (I have 4 GB of RAM and 2 GB of swap).
kern.log (the excerpt is further down in this post):
1 GB per client seems awfully large. Normally, I think they run at 100-300 MB. My BOINC version is "6.12.40 x86_64-pc-linux-gnu", with eon as "eonclient_5.00_x86_64-pc-linux-gnu". I use the stock downloaded binary for eon. These two have worked flawlessly together for some time now (since 5.0's release), so I guess something is wrong with the workunits.
Task pages:
http://eon.ices.utexas.edu/eon2/result. ... =229032325
http://eon.ices.utexas.edu/eon2/result. ... =229032310
http://eon.ices.utexas.edu/eon2/result. ... =229031981
http://eon.ices.utexas.edu/eon2/result. ... =229031508
This also happened for four other tasks.
In case they get removed from the database, the "error message" on the workunit page says "Too many total results".
EDIT:
After some rough investigation: the client sometimes creeps up to 1-1.3 GiB over a few minutes and then drops back down to a few KB. When it doesn't, everything works as expected. Limiting the amount of memory BOINC is allowed to use prevents the doomsday sluggishness and the OOM kills, but the tasks still error out with code -177 (0xffffffffffffff4f):
<core_client_version>6.12.40</core_client_version>
<![CDATA[
<message>
Maximum memory exceeded
</message>
<stderr_txt>
SIGSEGV: segmentation violation
Stack trace (2 frames):
[0x83ae89e]
[0xf77bb400]
Exiting...
SIGSEGV: segmentation violation
Stack trace (2 frames):
[0x83ae89e]
[0xf772e400]
Exiting...
SIGSEGV: segmentation violation
Stack trace (2 frames):
[0x83ae89e]
[0xf77c8400]
Exiting...
SIGSEGV: segmentation violation
Stack trace (2 frames):
[0x83ae89e]
[0xf77d8400]
Exiting...
</stderr_txt>
]]>
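In case the two forms of the exit code above look unrelated: -177 and 0xffffffffffffff4f are the same value, the hex one being the 64-bit two's-complement bit pattern read as unsigned. A minimal Python sketch of the conversion follows; the helper names `to_signed` and `to_unsigned` are my own for illustration, not anything from BOINC.

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement signed integer."""
    mask = (1 << bits) - 1
    value &= mask
    sign_bit = 1 << (bits - 1)
    # If the sign bit is set, the value wraps around to the negative range.
    return value - (1 << bits) if value & sign_bit else value


def to_unsigned(value: int, bits: int) -> int:
    """Return the unsigned bit pattern of a signed integer truncated to `bits` bits."""
    return value & ((1 << bits) - 1)


print(to_signed(0xffffffffffffff4f, 64))   # -177
print(hex(to_unsigned(-177, 64)))          # 0xffffffffffffff4f
```

So the hex number carries no extra information; it is just how the signed exit code gets printed when treated as an unsigned 64-bit integer.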
kern.log:
Jun 5 12:09:44 cygnus kernel: [2974759.069810] Killed process 6271 (eonclient_5.00_) total-vm:1063100kB, anon-rss:842092kB, file-rss:208kB
Jun 5 12:12:08 cygnus kernel: [2974898.801129] Killed process 6272 (eonclient_5.00_) total-vm:1063100kB, anon-rss:911912kB, file-rss:0kB
Jun 5 12:13:31 cygnus kernel: [2974986.922215] Killed process 6296 (eonclient_5.00_) total-vm:1063100kB, anon-rss:725892kB, file-rss:0kB
Jun 5 12:13:31 cygnus kernel: [2974986.990299] Killed process 6299 (eonclient_5.00_) total-vm:1063100kB, anon-rss:725916kB, file-rss:0kB
Jun 5 12:13:31 cygnus kernel: [2974987.125096] Killed process 6297 (eonclient_5.00_) total-vm:1063100kB, anon-rss:720416kB, file-rss:8kB
Jun 5 12:16:14 cygnus kernel: [2975139.441500] Killed process 6315 (eonclient_5.00_) total-vm:774156kB, anon-rss:514640kB, file-rss:0kB
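For anyone who wants to compare numbers from their own machine, here is a rough Python sketch that pulls the per-process figures out of "Killed process" lines like the ones above and converts them from kB to MiB. The regular expression only assumes the line layout shown in this log (other kernel versions may print extra fields), and the names `KILL_RE` and `parse_oom_kills` are made up for this example.

```python
import re

# Matches the "Killed process ..." portion of the kernel log lines shown above.
KILL_RE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]*)\) "
    r"total-vm:(?P<total_vm>\d+)kB, "
    r"anon-rss:(?P<anon_rss>\d+)kB, "
    r"file-rss:(?P<file_rss>\d+)kB"
)


def parse_oom_kills(lines):
    """Return one dict per OOM-kill line, with sizes converted from kB to MiB."""
    kills = []
    for line in lines:
        m = KILL_RE.search(line)
        if m is None:
            continue  # not an OOM-kill line, skip it
        kills.append({
            "pid": int(m["pid"]),
            "name": m["name"],
            "total_vm_mib": int(m["total_vm"]) / 1024,
            "anon_rss_mib": int(m["anon_rss"]) / 1024,
            "file_rss_mib": int(m["file_rss"]) / 1024,
        })
    return kills


# Two lines copied from the kern.log excerpt above.
sample = [
    "Jun 5 12:09:44 cygnus kernel: [2974759.069810] Killed process 6271 "
    "(eonclient_5.00_) total-vm:1063100kB, anon-rss:842092kB, file-rss:208kB",
    "Jun 5 12:16:14 cygnus kernel: [2975139.441500] Killed process 6315 "
    "(eonclient_5.00_) total-vm:774156kB, anon-rss:514640kB, file-rss:0kB",
]

for kill in parse_oom_kills(sample):
    print(
        f"pid {kill['pid']} ({kill['name']}): "
        f"total-vm {kill['total_vm_mib']:.0f} MiB, "
        f"anon-rss {kill['anon_rss_mib']:.0f} MiB"
    )
```

To run it against a real log, pass an open file instead of `sample`, e.g. `parse_oom_kills(open("/var/log/kern.log"))`; the path is an assumption and depends on the distribution.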