NEB running on multiple nodes

VASP transition state theory tools


ng00822
Posts: 5
Joined: Sun Sep 15, 2013 3:59 am

NEB running on multiple nodes

Post by ng00822 »

Hi guys,

I am currently running an NEB calculation with one image per node, where each node has 16 cores (2 sockets, 8 cores per socket). For ordinary VASP optimization runs, I found that KPAR=2 and NPAR=1 seem to work best, together with the mpirun options "-bysocket -bind-to-socket" (the mpirun options may be superfluous; I am still gathering data). With the same settings in VTST, some images perform nearly equally, while others are much slower (the DAV iterations are hit harder than the RMM ones). I think the problem may lie with the nodes themselves. Is there a way to tell which node each image was assigned to? Do you have any recommendations on MPI-specific parameters to use (or avoid)?
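
For reasoning about which node hosts which image, here is a rough Python sketch. It rests on two assumptions that are not confirmed for any particular VASP build or mpirun version: MPI ranks fill the nodes in hostfile order, one full node at a time, and the ranks are divided among the images in contiguous, equal-sized blocks. The function name `image_to_nodes` and the host names are hypothetical. Image directories are numbered from 01, since 00 holds the fixed initial endpoint.

```python
def image_to_nodes(hosts, slots_per_node, n_images):
    """Map NEB image directory names ("01", "02", ...) to the hosts of their ranks.

    Assumptions (not verified against VASP or mpirun):
      * ranks fill each host completely before moving to the next one
      * each image receives a contiguous, equal-sized block of ranks
    """
    # One entry per MPI rank, in the assumed placement order.
    rank_hosts = [h for h in hosts for _ in range(slots_per_node)]
    total = len(rank_hosts)
    if total % n_images != 0:
        raise ValueError("total ranks must be divisible by the number of images")
    per_image = total // n_images

    mapping = {}
    for i in range(n_images):
        block = rank_hosts[i * per_image:(i + 1) * per_image]
        # Unique hosts for this image, kept in first-seen order.
        mapping[f"{i + 1:02d}"] = list(dict.fromkeys(block))
    return mapping


if __name__ == "__main__":
    # Hypothetical host names; on SGE/UGE the real list would come from the
    # scheduler's host file for the job.
    hosts = ["node-a", "node-b", "node-c", "node-d"]
    for image, nodes in image_to_nodes(hosts, 16, 4).items():
        print(image, nodes)
```

If the real placement differs (for example, ranks distributed round-robin across nodes), the mapping changes accordingly, so it is worth checking the result against what mpirun actually reports.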

Thanks!
ng00822
Posts: 5
Joined: Sun Sep 15, 2013 3:59 am

Re: NEB running on multiple nodes

Post by ng00822 »

I think I have narrowed the problem down to bad behavior by sge_execd, part of the UGE/SGE cluster management system. It is causing a ~30x slowdown during the DAV iterations and a ~8x slowdown during the RMM-DIIS iterations. I need to get that investigated.
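
For anyone wanting to quantify this kind of per-image slowdown, here is a minimal Python sketch that compares timing lines between two images. It assumes the per-routine timing lines in each image's OUTCAR look like `EDDAV:  cpu time  X: real time  Y`. That format is an assumption and may differ between VASP versions. The function names and the file paths passed on the command line are hypothetical.

```python
import re
import sys
from collections import defaultdict

# Assumed timing-line format: "<NAME>:  cpu time <x>: real time <y>"
TIMING_RE = re.compile(
    r"^\s*([A-Z][A-Z0-9+\-]*):\s+cpu time\s+([0-9.]+):\s+real time\s+([0-9.]+)"
)


def mean_real_times(text):
    """Return the mean real (wall-clock) time per routine name found in text."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for line in text.splitlines():
        match = TIMING_RE.match(line)
        if match:
            name = match.group(1)
            sums[name] += float(match.group(3))
            counts[name] += 1
    return {name: sums[name] / counts[name] for name in sums}


def slowdown(reference, other):
    """Ratio other/reference for every routine present in both mappings."""
    return {
        name: other[name] / reference[name]
        for name in reference
        if name in other and reference[name] > 0
    }


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical paths): python compare.py 01/OUTCAR 02/OUTCAR
    with open(sys.argv[1]) as fast, open(sys.argv[2]) as slow:
        ratios = slowdown(mean_real_times(fast.read()), mean_real_times(slow.read()))
    for name, ratio in sorted(ratios.items()):
        print(f"{name:>10s}  {ratio:6.1f}x")
```

Comparing real time rather than cpu time matters here, because interference from another process on the node shows up mainly as wall-clock delay.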