Replica exchange of the disordering of a cluster
Dear Student,
In order to run simulations at high priority, today we will work on the Empa cluster. We have created a personal account for you. Since the cluster is behind a firewall, we must first connect to a gate machine (jumphost) to be allowed to access the cluster. For security reasons, there are two temporary passwords that you should change to a personal password (it can be the same for the gate and for the cluster).
Here are the instructions to connect. Your username and passwords (EMPA-USER, TEMP-PASSW1, TEMP-PASSW2) are listed at the end of this message.
1) Decide on a password (we will call it EMPA-PASSW).
2) Connect to the jumphost: ssh -X EMPA-USER@jump1.empa.ch (password: TEMP-PASSW1)
3) Accept the contract.
4) Set a new password (enter the old password, TEMP-PASSW1, then the new password, EMPA-PASSW).
5) Connect to hypatia: ssh -X hypatia (password: TEMP-PASSW2)
6) Accept the contract.
7) Change your password as in point 4), using TEMP-PASSW2 as the old password and setting EMPA-PASSW.
User-specific information (note: TEMP-PASSW1 is the password for jump1, that is, the FIRST one you need, but it is listed second):
EMPA-USER:TEMP-PASSW2:TEMP-PASSW1
[you@hypatia ~]$ mmm-init
[you@hypatia ~]$ cd /mnt/scratch/YOURUSER/
[you@hypatia ~]$ cp -r /home/psd/exercise_7 .
[you@hypatia ~]$ cd exercise_7
The commands needed to perform the exercise are, in this order:
[you@hypatia ~]$ qsub 00_run
[you@hypatia ~]$ ./01_adapt_files
[you@hypatia ~]$ ./02_reorder
[you@hypatia ~]$ ./03_extract_allaverages
Running the job
The script 00_run contains the directives for the queuing system, including the reservation of 16 cores on one node for the job.
#=== job name:
#PBS -N parallel
#=== wall time limit (h:m:s)
#PBS -l walltime=1:00:00
#choice of the number of nodes and proc. per node
#PBS -l nodes=1:ppn=16
#PBS -q short #which queue
#=== memory usage
##PBS -l mem=1024mb
#=== join stdout and stderr
#PBS -j oe
#======================================
#
# set environment variables
#
module unload mvapich2
module load openmpi
module load lammps/17Nov16/openmpi/2.0.1/gcc/4.9.4
cd $PBS_O_WORKDIR
rm parallel.o* log.* screen*
mpiexec -np 16 lmp_mpi -partition 16x1 -in input
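The last line is the command that runs a parallel LAMMPS job with the input file input. The option -partition 16x1 splits the 16 MPI processes into 16 partitions of one process each, so that each replica runs on its own core.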
The input file for lammps
The file input contains the instructions for the program LAMMPS. Details can be found in the LAMMPS documentation.
It starts with an initialization section that sets the kind of units (see the documentation of the units command), the dimensionality, the boundary conditions, and the atom style.
# Initialization
units metal
dimension 3
boundary p p p
atom_style atomic
In the second part of the input file, a spherical region is defined (to confine the cluster). Then the atoms are read from input.dat. We also assign a mass to atom type 1 (there is just one atom type, argon).
region rs sphere 0 0 0 12.66
read_data input.dat
mass 1 39.948
Then we define the parameters of the Lennard-Jones potential. The units are eV for epsilon and angstrom for sigma. The last number is the cutoff, also in angstrom.
pair_style lj/cut 8.5
pair_coeff 1 1 0.01042 3.405 8.5
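To get a feeling for these numbers, the pair energy can be evaluated by hand. The following is a minimal Python sketch, not part of the exercise scripts; the function name lj_energy is made up for illustration, and it assumes the plain truncated (unshifted) 12-6 form with the epsilon, sigma and cutoff given above.

```python
EPSILON = 0.01042  # eV, well depth (from pair_coeff)
SIGMA = 3.405      # angstrom (from pair_coeff)
CUTOFF = 8.5       # angstrom (from pair_style / pair_coeff)


def lj_energy(r):
    """Truncated, unshifted 12-6 Lennard-Jones pair energy in eV.

    Hypothetical helper for illustration only.
    """
    if r >= CUTOFF:
        return 0.0
    sr6 = (SIGMA / r) ** 6
    return 4.0 * EPSILON * (sr6 * sr6 - sr6)


# The minimum of the 12-6 potential sits at r = 2^(1/6) * sigma,
# where the energy equals -epsilon.
r_min = 2.0 ** (1.0 / 6.0) * SIGMA
print(f"r_min = {r_min:.3f} angstrom, E(r_min) = {lj_energy(r_min):.5f} eV")
```

The cutoff of 8.5 angstrom corresponds to about 2.5 sigma, where the pair energy is already a small fraction of epsilon.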
Then we set up the velocities, the fixes, and the temperature of each replica. The temperatures were generated beforehand with the program t.x, present in the same directory; they are distributed exponentially (that is, with a constant ratio between neighbouring temperatures) between 2 K and 40 K. In LAMMPS, a fix is any operation that is applied to the system during timestepping or minimization. Here we have one fix for controlling the temperature with NVT (a different temperature for each replica), and one fix for applying a harmonic restraint at the boundary of the spherical region confining the cluster. In this way, atoms going beyond this region are elastically pushed back into the sphere.
variable i equal part
variable t world 2.00 2.44 2.98 3.64 4.45 5.43 6.63 8.09 9.88 12.07 14.74 17.99 21.97 26.83 32.76 40.00
velocity all create $t 293288
velocity all zero linear
velocity all zero angular
fix 1 all nvt temp $t $t 0.1
fix 2 all wall/region rs harmonic 2.0 0.0 0.4
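The source of t.x is not shown here, so the following Python sketch is only an assumption about what it does: it builds a geometric ladder (constant ratio between neighbouring temperatures) from 2 K to 40 K with 16 values. The function name geometric_ladder is hypothetical. Rounded to two decimals, the result matches the list in the input file.

```python
def geometric_ladder(t_min, t_max, n):
    """Return n temperatures from t_min to t_max with a constant ratio.

    Hypothetical helper; an assumed stand-in for what t.x generates.
    """
    ratio = (t_max / t_min) ** (1.0 / (n - 1))
    return [t_min * ratio ** k for k in range(n)]


temps = [round(t, 2) for t in geometric_ladder(2.0, 40.0, 16)]
print(" ".join(f"{t:.2f}" for t in temps))
```

A constant ratio is a common choice for replica exchange because it tends to give comparable exchange probabilities between all pairs of neighbouring replicas when the heat capacity does not vary too much.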
The next section writes out the relevant information about temperature and energy every 1000 steps. We also write a restart file at the end of the run (step 5000000) and, every 10000 steps, a structure in xyz format.
thermo 1000
thermo_style custom step temp pe ke etotal
thermo_modify line one
restart 5000000 restart.*
dump 2 all xyz 10000 structure_$i.xyz
dump_modify 2 element "Ar" sort id
Finally, this is the command to run the tempering for 5000000 steps, with an exchange move attempted every 1000 steps of molecular dynamics and an initial temperature $t that differs from replica to replica. The argument 1 is the ID of the thermostat fix whose temperature is exchanged, and the last two numbers are random seeds used for choosing which replicas to exchange and for the Metropolis criterion.
temper 5000000 1000 $t 1 3678 3490
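The Metropolis criterion mentioned above can be written down in a few lines. This is a Python sketch of the standard replica-exchange acceptance rule, not a copy of the LAMMPS source; the function names are hypothetical, energies are potential energies in eV, and temperatures are in K (metal units).

```python
import math
import random

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def swap_probability(e_i, t_i, e_j, t_j):
    """Acceptance probability for exchanging the temperatures of two replicas.

    e_i, e_j: potential energies (eV); t_i, t_j: temperatures (K).
    Standard rule: min(1, exp[(E_i - E_j) * (1/kT_i - 1/kT_j)]).
    """
    delta = (e_i - e_j) * (1.0 / (K_B * t_i) - 1.0 / (K_B * t_j))
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_swap(e_i, t_i, e_j, t_j, rng=random):
    """Draw a random number and decide whether the swap is accepted."""
    return rng.random() < swap_probability(e_i, t_i, e_j, t_j)


# Illustrative energies only: the colder replica has the higher energy,
# so the swap is always accepted.
print(swap_probability(-1.70, 6.63, -1.76, 8.09))
```

When the colder replica already has the lower energy, the probability drops exponentially with the energy difference, which is why neighbouring temperatures must be close enough for their energy distributions to overlap.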
Adapting the output files
We must now do some postprocessing on the output files. The goal is to compute averages at different temperatures. These averages are enhanced by the exchanges performed between the different molecular dynamics replicas. Note that the temperature is set by a thermostat.
The script 01_adapt_files performs the following operations:
- Prunes the log.lammps file, which contains a log of all exchanges between the replicas, keeping only the steps for which we also have a dump of the atomic coordinates.
- For all the log.lammps.* files (one for each replica), keeps only the lines for which we also have a dump of the atomic coordinates. These lines are put in a file *.nxyz, one for each replica. Each line contains temperature, potential energy, etc.
- Computes the q4 order parameter for all structure files and creates *.q4 files, one for each replica.
- Pastes the *.nxyz and the *.q4 files into a file t_epot_q4_etot.*.out containing the temperature, energy and q4 every 10000 steps.
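The pruning step can be pictured as follows. This Python sketch is not the actual 01_adapt_files script (which is not reproduced here); the function name keep_dumped_steps is hypothetical, and it assumes that each data line starts with the step number and that structures are dumped every 10000 steps, as set in the input file.

```python
def keep_dumped_steps(lines, dump_every=10000):
    """Keep only the lines whose first field is a step with a structure dump.

    Header and other non-numeric lines are skipped. Hypothetical helper.
    """
    kept = []
    for line in lines:
        fields = line.split()
        if fields and fields[0].isdigit() and int(fields[0]) % dump_every == 0:
            kept.append(line)
    return kept


# Made-up thermo output: one line every 1000 steps, plus a header line.
sample = [
    "Step Temp PotEng KinEng TotEng",
    "49000 6.5 -1.77 0.03 -1.74",
    "50000 6.7133746 -1.7636174 0.0321075 -1.7315099",
    "51000 6.6 -1.76 0.03 -1.73",
]
print(keep_dumped_steps(sample))
```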
Reordering the replicas: one temperature, one file
At this point, we have a set of t_epot_q4_etot.*.out files, one for each replica (processor). But along each of these files, the temperature changes a lot due to the exchanges. So we use the file exchanges_nxyz.log, which keeps track of the exchanges and tells us which replica has which temperature at a given timestep: we reshuffle the lines of the t_epot_q4_etot.*.out files, and at the end we have one file for each temperature. This is accomplished by the script 02_reorder.
- Consider each file t_epot_q4_etot.*.out (processor by processor). Say you consider number 5 (the 6th replica): t_epot_q4_etot.5.out.
- At step 50000, the file shows the following line:
50000 6.7133746 -1.7636174 0.189 -1.7315099
indicating a temperature of 6.7133746.
- The file exchanges_nxyz.log, at step 50000, gives us the following line:
50000 7 0 3 2 1 6 10 5 12 8 11 4 9 13 15 14
indicating that the 6th replica (column 7, since column 1 is the step) has temperature index 6, counting from 0, which is (see the input file) T=6.63 K. This means that at step 50000 the thermostat is keeping replica 5 around the temperature T=6.63 K.
- This means that this line has to be put in the temperature file number 6.
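The lookup described in this example can be sketched in Python as follows. This is not the 02_reorder script itself; the function names are hypothetical, and the sketch assumes that each exchange line holds the step followed by one temperature index per replica, and that the data lines of every replica are aligned row by row with the exchange lines.

```python
# Temperatures from the input file, indexed from 0.
TEMPS = [2.00, 2.44, 2.98, 3.64, 4.45, 5.43, 6.63, 8.09,
         9.88, 12.07, 14.74, 17.99, 21.97, 26.83, 32.76, 40.00]


def temperature_index(exchange_line, replica):
    """Temperature index held by a replica (counted from 0) on this line."""
    fields = exchange_line.split()
    return int(fields[1 + replica])  # field 0 is the step


def reorder(exchange_lines, replica_lines):
    """Group data lines by temperature index.

    replica_lines[r][row] is the data line of replica r at the step of
    exchange_lines[row]. Returns {temperature index: [lines]}.
    """
    by_temperature = {}
    for row, exchange_line in enumerate(exchange_lines):
        for replica, lines in enumerate(replica_lines):
            k = temperature_index(exchange_line, replica)
            by_temperature.setdefault(k, []).append(lines[row])
    return by_temperature


line = "50000 7 0 3 2 1 6 10 5 12 8 11 4 9 13 15 14"
k = temperature_index(line, 5)
print(k, TEMPS[k])  # prints "6 6.63"
```

Writing each list in the returned dictionary to its own file would give one file per temperature, as described above.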