NEB interrupts abnormally
Posted: Mon Aug 02, 2010 3:11 am
by leon_qian
Dear Sir,
When I run c-NEB, the job is interrupted abnormally with the following error message:
p0_8544: p4_error: interrupt SIGx: 15
bm_list_8545: p4_error: interrupt SIGx: 15
p0_8544: (864031.140625) net_send: could not write to fd=4, errno = 32
What's wrong with it?
Many thanks!
Re: NEB interrupts abnormally
Posted: Mon Aug 02, 2010 9:25 pm
by graeme
This looks like an MPI error.
Is the job running normally and then timing out? You can get errors like this if your queueing system starts killing processes when the wall-clock time limit is reached.
Or does it crash at a particular point in the calculation? If it is the latter, and it looks like an issue with our code, I would like to investigate it further.
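For anyone reading the messages above, it may help to decode the numbers they contain. The following is only a rough sketch, assuming the number after "SIGx" is the ordinary POSIX signal number and that "errno" is the standard C errno value; the helper names decode_line and decode_log are made up for illustration and are not part of any MPI or VASP tool. On a typical Linux system, signal 15 is SIGTERM (a termination request, which is what a queueing system or a failing sibling process would normally send) and errno 32 is EPIPE (a write to a connection whose other end has already gone away).

```python
import errno
import re
import signal

# Lines copied from the error output quoted in this thread.
LOG_LINES = [
    "p0_8544: p4_error: interrupt SIGx: 15",
    "bm_list_8545: p4_error: interrupt SIGx: 15",
    "p0_8544: (864031.140625) net_send: could not write to fd=4, errno = 32",
]

# Assumed message layout, based only on the lines above.
SIGNAL_RE = re.compile(r"interrupt SIGx:\s*(\d+)")
ERRNO_RE = re.compile(r"errno\s*=\s*(\d+)")


def decode_line(line):
    """Return a readable note for the signal or errno number in one line, or None."""
    match = SIGNAL_RE.search(line)
    if match:
        number = int(match.group(1))
        try:
            return f"signal {number} = {signal.Signals(number).name}"
        except ValueError:
            return f"signal {number} = unknown on this platform"
    match = ERRNO_RE.search(line)
    if match:
        number = int(match.group(1))
        name = errno.errorcode.get(number, "unknown")
        return f"errno {number} = {name}"
    return None


def decode_log(lines):
    """Decode every line that carries a signal or errno number."""
    return [note for note in map(decode_line, lines) if note is not None]


if __name__ == "__main__":
    for note in decode_log(LOG_LINES):
        print(note)
```

Read this way, the messages say that the processes were asked to terminate and that the remaining ones then failed to write to their peers. They do not say who sent the signal, so the scheduler's own log or the first error in the job output is still the place to look.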
Re: NEB interrupts abnormally
Posted: Tue Aug 03, 2010 11:46 am
by leon_qian
Dear Sir,
Thanks for your kind reply!
Here is another question. Sometimes my NEB job is interrupted after several steps, and the error message is:
rm_l_2_7375: p4_error: interrupt SIGx: 15
rm_l_2_7375: (2015.175781) net_send: could not write to fd=5, errno = 32
bm_list_7370: p4_error: interrupt SIGx: 15
rm_l_1_7374: p4_error: interrupt SIGx: 15
rm_l_1_7374: (2015.691406) net_send: could not write to fd=5, errno = 32
After resubmission, the job sometimes continues for several steps, but sometimes it quits after only one step.
What is wrong, and how can I solve this problem?
Many many thanks!