Job does not terminate/continue after convergence in an image

Vasp transition state theory tools

ckande
Posts: 11
Joined: Wed May 09, 2012 5:58 am

Job does not terminate/continue after convergence in an image

Post by ckande »

I wonder if someone has experienced such a problem.

My job neither terminates nor continues after one image has reached convergence. Here are the brief specifications of my setup.

VTST: Latest available on the website
VASP: 5.2.12
OpenMPI: 1.5.4
Queuing: PBS Torque
Compiler: Ifort, 12.1.1.256 Build 20111011

When I submit a job with, say, 3 images to the queueing system and 01/stdout (../vasp.out) shows that the image has reached convergence, the whole job seems to freeze. Everything in 02 and 03 stops just after the electronic convergence and just before the F= ... E0= ... line is written to stdout. The job is not killed, and it does not proceed to full convergence on all images. I have to kill the job manually, move the CONTCARs to POSCARs, and hope that all images reach convergence at the same time. As you can see, that is a big pain. Has anyone experienced this before, and is there something that I am doing wrong?
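The manual restart described above (moving each CONTCAR to POSCAR) can be scripted. The following is a minimal sketch, not part of VTST or VASP: the function name `restart_from_contcar` and the `POSCAR.bak` backup file are hypothetical choices, and it assumes the usual NEB layout of two-digit image directories (00, 01, 02, ...) under the run directory.

```python
import shutil
from pathlib import Path


def restart_from_contcar(run_dir, keep_backup=True):
    """Copy CONTCAR over POSCAR in every two-digit image directory.

    Returns the names of the image directories that were updated.
    """
    updated = []
    for image_dir in sorted(Path(run_dir).iterdir()):
        # Assumption: NEB image directories are named 00, 01, 02, ...
        if not (image_dir.is_dir()
                and image_dir.name.isdigit()
                and len(image_dir.name) == 2):
            continue
        contcar = image_dir / "CONTCAR"
        poscar = image_dir / "POSCAR"
        # Skip images without a usable CONTCAR (missing or empty file).
        if not contcar.is_file() or contcar.stat().st_size == 0:
            continue
        if keep_backup and poscar.is_file():
            # Hypothetical backup name, kept so the old geometry is not lost.
            shutil.copy2(poscar, image_dir / "POSCAR.bak")
        shutil.copy2(contcar, poscar)
        updated.append(image_dir.name)
    return updated
```

Calling `restart_from_contcar(".")` from the run directory after killing the hung job would update every image that wrote a non-empty CONTCAR and leave the others untouched.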

Thanks in advance.
graeme
Site Admin
Posts: 2260
Joined: Tue Apr 26, 2005 4:25 am

Re: Job does not terminate/continue after convergence in a i

Post by graeme »

All images must reach your specified electronic convergence criterion (EDIFF) before any of them can take an ionic step. If some images take longer, all the others must wait. If this is what you are seeing, it is normal. If I misunderstood, let me know.
ckande
Posts: 11
Joined: Wed May 09, 2012 5:58 am

Re: Job does not terminate/continue after convergence in a i

Post by ckande »

Dear Graeme,

Thanks for the quick replies.

You did misunderstand. After one image has reached convergence, nothing happens with the other images. They do NOT proceed toward the required convergence, which, as you say, would be the normal behavior.

I have a calculation running right now. I will try to post a detailed query later.

Regards.
graeme
Site Admin
Posts: 2260
Joined: Tue Apr 26, 2005 4:25 am

Re: Job does not terminate/continue after convergence in a i

Post by graeme »

Ok. Does the calculation stop when the climbing image reaches convergence? Otherwise, I am at a loss to explain this.
ckande
Posts: 11
Joined: Wed May 09, 2012 5:58 am

Re: Job does not terminate/continue after convergence in a i

Post by ckande »

Please find attached the output from a hung job.

The job hung once image 05 reached convergence. There was no further output from any of the other images (01, 02, 03, 04, 06, 07), and the job did not terminate either. I had to kill it myself.

I have also attached the time stamps of all the files. As you can see, once 05/stdout showed convergence, images {01,02,03,04,06,07} completed their electronic convergence and then hung just before writing the F= ... line.
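The same diagnosis (which images have written their F= line, and when each stdout was last touched) can be gathered with a short script. This is only a sketch under assumptions: the function names are hypothetical, each image is assumed to write its log to a file named `stdout` in a two-digit directory, and a completed ionic step is assumed to appear as one line containing `F=`.

```python
import time
from pathlib import Path


def summarize_images(run_dir, log_name="stdout"):
    """Return {image: (ionic_steps_written, seconds_since_last_write)}."""
    now = time.time()
    summary = {}
    for log in sorted(Path(run_dir).glob(f"[0-9][0-9]/{log_name}")):
        text = log.read_text(errors="replace")
        # Assumption: one line containing "F=" per completed ionic step.
        steps = sum(1 for line in text.splitlines() if "F=" in line)
        summary[log.parent.name] = (steps, now - log.stat().st_mtime)
    return summary


def lagging_images(summary):
    """Images that have written fewer 'F=' lines than the most advanced image."""
    if not summary:
        return []
    most = max(steps for steps, _ in summary.values())
    return [name for name, (steps, _) in summary.items() if steps < most]
```

If `lagging_images(summarize_images("."))` keeps returning the same images while their logs have not been written for a long time, the run is probably hung rather than waiting on a slow image.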

Please let me know if you have any clues.

Thanks in advance.

Attachment: attach.txt (last few lines of vasp.out and */stdout and their timestamps, 4.07 KiB)
ckande
Posts: 11
Joined: Wed May 09, 2012 5:58 am

Re: Job does not terminate/continue after convergence in a i

Post by ckande »

Just a small update: I observe the same behavior on another cluster with a completely different setup. That cluster uses the pgf90 compiler with VASP 5.2.12 and OpenMPI 1.4.4.

Maybe something is wrong with my parameters in the INCAR file. Please find the INCAR below. Please let me know if I am doing something wrong.

I will try to use VASP 4 and see if that will work. Will update later on that.

ENCUT = 400.000000
ENAUG = 645.000000
SIGMA = 0.050000
POTIM = 0.100000
EDIFF = 1.00e-05
PREC = Accurate
ISMEAR = 1
NSW = 120
IBRION = 2
LREAL = Auto
IMAGES = 4
SPRING = -5
LCLIMB = .TRUE.
LWAVE = .FALSE.
LCHARG = .FALSE.
NPAR = 2
zhangqingfan
Posts: 1
Joined: Thu May 24, 2012 12:46 am

Re: Job does not terminate/continue after convergence in a i

Post by zhangqingfan »

Hi,

I have run into the same problem when running NEB on our cluster. It looks like the problem only occurs with version vasp.5.2. Can someone solve this problem?

thx
graeme
Site Admin
Posts: 2260
Joined: Tue Apr 26, 2005 4:25 am

Re: Job does not terminate/continue after convergence in a i

Post by graeme »

I don't know for sure, but I see that you are running the built-in CG optimizer (IBRION=2), which does not work with the NEB. Try running again with IBRION=3 or 1 (or any of our IOPT optimizers). I don't know exactly how the built-in CG in VASP works or how it interacts with the NEB, but an image may decide to quit on its own without consulting the other images.

Alternatively, (and if the first suggestion doesn't work) you can send me a .tar.gz file of the calculation so I can try to reproduce the error here, or look more closely at the output.
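The IBRION suggestion above can be turned into a quick pre-submission check. The sketch below is illustrative only: `parse_incar` and `neb_optimizer_warning` are hypothetical helpers, not part of VASP or VTST, and the parser handles just simple `TAG = value` lines with `#` or `!` comments and `;` separators.

```python
def parse_incar(text):
    """Parse 'TAG = value' lines into a dict with upper-case tag names."""
    tags = {}
    for raw in text.splitlines():
        # Drop trailing comments introduced by '#' or '!'.
        line = raw.split("#", 1)[0].split("!", 1)[0]
        # Several tags may share one line, separated by ';'.
        for part in line.split(";"):
            if "=" in part:
                key, value = part.split("=", 1)
                tags[key.strip().upper()] = value.strip()
    return tags


def neb_optimizer_warning(tags):
    """Return a warning if the built-in CG optimizer is combined with NEB images."""
    if "IMAGES" in tags and tags.get("IBRION") == "2":
        return "IBRION=2 with IMAGES set: try IBRION=1 or 3 (or an IOPT optimizer)."
    return None
```

Running `neb_optimizer_warning(parse_incar(open("INCAR").read()))` on the INCAR posted earlier in this thread would return the warning, since it sets both IMAGES and IBRION = 2.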
ckande
Posts: 11
Joined: Wed May 09, 2012 5:58 am

Re: Job does not terminate/continue after convergence in a i

Post by ckande »

Changing IBRION to 1 or 3 solves the problem. Thanks a lot Graeme. :)