Updated 2023-05-31

LAMMPS-GPU

Overview

LAMMPS (Large-scale Atomic/Molecular Massively Parallel Simulator) is a classical molecular dynamics code with a focus on materials modeling. It has potentials for solid-state materials (metals, semiconductors), soft matter (biomolecules, polymers), and coarse-grained or mesoscopic systems.

Running LAMMPS-GPU Interactively

Allocating Resources

  • To run LAMMPS-GPU interactively, use the salloc command to request resources, specifying the account, node and task counts, time limit, QOS, and number of GPUs

  • Here is an example salloc command: salloc -A [Account] -N 1 -n 8 -t 15 -q embers -G 1

  • This requests one node with 8 tasks and 1 GPU for 15 minutes under the embers QOS
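
The flags in the salloc example can be read off one at a time. The sketch below is a dry run that only assembles and prints the command, so it can be tried anywhere. The helper name `build_salloc_cmd` and the account `my-account` are hypothetical placeholders; the flag meanings in the comments are those of the standard Slurm options.

```shell
#!/bin/bash
# Dry-run sketch: assemble the salloc command from the example and print it.
# build_salloc_cmd is a hypothetical helper, not part of Slurm.
build_salloc_cmd() {
    local account="$1"        # -A: charge account (placeholder; use your own)
    local nodes="${2:-1}"     # -N: number of nodes
    local tasks="${3:-8}"     # -n: number of tasks
    local minutes="${4:-15}"  # -t: time limit in minutes
    local qos="${5:-embers}"  # -q: quality of service
    local gpus="${6:-1}"      # -G: number of GPUs
    echo "salloc -A ${account} -N ${nodes} -n ${tasks} -t ${minutes} -q ${qos} -G ${gpus}"
}

# Print the command with the defaults used in the example above.
build_salloc_cmd my-account
```

Copying the printed line into the terminal (with a real account name) would make the actual request.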

Using an Example Input File

  • The following example runs LAMMPS interactively using an example input file

  • Here is what the input file lammps-gpu-example should look like:

# This LAMMPS input script simulates LJ particles in a 2D box
# Written by Simon Gravelle (https://simongravelle.github.io/)
# Find more scripts here: https://github.com/simongravelle/lammps-input-files
# LAMMPS tutorials for beginners: https://lammpstutorials.github.io/

# main parameters
units lj
dimension 2
atom_style atomic
pair_style lj/cut 2.5
boundary p p p

# create system and insert atoms
region myreg block -30 30 -30 30 -0.5 0.5
create_box 2 myreg
create_atoms 1 random 1500 341341 myreg
create_atoms 2 random 100 127569 myreg

# atom settings
mass 1 1
mass 2 1
pair_coeff 1 1 1.0 1.0
pair_coeff 2 2 0.5 3.0
neigh_modify every 1 delay 5 check yes

# minimisation
minimize 1.0e-4 1.0e-6 1000 10000
reset_timestep 0

# dynamics
fix mynve all nve
fix mylgv all langevin 1.0 1.0 0.1 1530917
fix myefn all enforce2d
timestep 0.005

# outputs
thermo 1000
dump mydmp all atom 1000 dump.lammpstrj

# run
run 10000
  • To run this file with LAMMPS-GPU, do the following:

  • Load the module: module load lammps-gpu

  • Run the script: srun -n 6 lmp < lammps-gpu-example

  • The output should look something like this:

LAMMPS (7 Jan 2022)
OMP_NUM_THREADS environment is not set. Defaulting to 1 thread. (src/src/comm.cpp:98)
  using 1 OpenMP thread(s) per MPI task
Created orthogonal box = (-30 -30 -0.5) to (30 30 0.5)
  3 by 2 by 1 MPI processor grid
Created 1500 atoms
  using lattice units in orthogonal box = (-30 -30 -0.5) to (30 30 0.5)
  create_atoms CPU = 0.000 seconds
Created 100 atoms
  using lattice units in orthogonal box = (-30 -30 -0.5) to (30 30 0.5)
  create_atoms CPU = 0.000 seconds
WARNING: Using 'neigh_modify every 1 delay 0 check yes' setting during minimization (src/src/min.cpp:187)
  generated 1 of 1 mixed pair_coeff terms from geometric mixing rule

...
...

Setting up cg style minimization ...
  Unit style    : lj
  Current step  : 0
Per MPI rank memory allocation (min/avg/max) = 4.176 | 4.176 | 4.177 Mbytes
Step Temp E_pair E_mol TotEng Press 
       0            0 5.8997404e+14            0 5.8997404e+14 1.5732641e+15 
      81            0   -1.7518285            0   -1.7518285  -0.15730928 
Loop time of 0.0161491 on 6 procs for 81 steps with 1600 atoms

99.2% CPU use with 6 MPI tasks x 1 OpenMP threads

Minimization stats:
  Stopping criterion = energy tolerance
  Energy initial, next-to-last, final = 
       589974040194331  -1.75166415802626  -1.75182852779174
  Force two-norm initial, final = 2.5817498e+20 60.584174
  Force max component initial, final = 1.5160091e+20 11.519543
  Final line search alpha, max atom move = 6.6931485e-05 0.00077102009
  Iterations, force evaluations = 81 197

MPI task timing breakdown:
Section |  min time  |  avg time  |  max time  |%varavg| %total
---------------------------------------------------------------
Pair    | 0.0059371  | 0.0071613  | 0.0088512  |   1.0 | 44.34
Neigh   | 0.0015247  | 0.0017042  | 0.0019472  |   0.3 | 10.55
Comm    | 0.002375   | 0.0041855  | 0.0055143  |   1.4 | 25.92
Output  | 0          | 0          | 0          |   0.0 |  0.00
Modify  | 0          | 0          | 0          |   0.0 |  0.00
Other   |            | 0.003098   |            |       | 19.18

...
...

Total # of neighbors = 8440
Ave neighs/atom = 5.275
Neighbor list builds = 24
Dangerous builds = 0
  generated 1 of 1 mixed pair_coeff terms from geometric mixing rule
Setting up Verlet run ...
  Unit style    : lj
  Current step  : 0
  Time step     : 0.005
Per MPI rank memory allocation (min/avg/max) = 4.063 | 4.063 | 4.063 Mbytes
Step Temp E_pair E_mol TotEng Press 
       0            0   -1.7518285            0   -1.7518285  -0.15730928 
    1000   0.99279852   -1.3476948            0  -0.35551678   0.78484062 
    2000     1.022668   -1.3292054            0  -0.30717657   0.81660226 
    3000    1.0185213   -1.3456334            0  -0.32774866   0.76583719 
    4000   0.96371604   -1.3062798            0  -0.34316611   0.85257195 
    5000   0.96229603   -1.3303442            0  -0.36864964   0.70962141 
    6000   0.94309004   -1.3151242            0  -0.37262362   0.72161759 
    7000   0.99747756   -1.3064984            0  -0.30964422   0.84469381 
    8000    1.0138762   -1.3348616            0  -0.32161914   0.78244061 
    9000    0.9639628   -1.3148769            0  -0.35151653   0.85549524 
   10000    1.0020337   -1.3172927            0  -0.31588535   0.83832425 
Loop time of 0.670191 on 6 procs for 10000 steps with 1600 atoms

Performance: 6445926.779 tau/day, 14921.127 timesteps/s
99.8% CPU use with 6 MPI tasks x 1 OpenMP threads

...
...

Total # of neighbors = 8568
Ave neighs/atom = 5.355
Neighbor list builds = 1152
Dangerous builds = 0
Total wall time: 0:00:00
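
The srun line in this section passes no accelerator switches to lmp. LAMMPS documents the command-line switches `-sf gpu` (apply the gpu suffix to supported styles), `-pk gpu N` (use N GPUs per node), and `-in file` (read the input script from a file instead of stdin). The sketch below is a dry run that only prints such a command. It assumes the lmp binary from the lammps-gpu module was built with the GPU package, which this page does not confirm, and `build_lmp_gpu_cmd` is a hypothetical helper name.

```shell
#!/bin/bash
# Dry-run sketch: print an srun line that adds LAMMPS accelerator switches.
# Assumption: the lmp binary from the lammps-gpu module includes the GPU package.
build_lmp_gpu_cmd() {
    local ntasks="$1"  # MPI tasks passed to srun -n
    local ngpus="$2"   # GPUs per node passed to -pk gpu
    local input="$3"   # LAMMPS input script passed to -in
    echo "srun -n ${ntasks} lmp -sf gpu -pk gpu ${ngpus} -in ${input}"
}

# Same task count and input file as the example, with 1 GPU as in the salloc request.
build_lmp_gpu_cmd 6 1 lammps-gpu-example
```

If the GPU package is not available in the build, LAMMPS would be expected to reject these switches, so it is worth trying the printed command on a short run first.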

Running LAMMPS-GPU in Batch Mode

  • LAMMPS-GPU can also be run in batch mode. Here is an example batch script:
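#!/bin/bash
# Job name, charge account (replace [Account] with your own), and resources:
# 2 nodes with 4 tasks each, a 10-minute time limit, and the embers QOS.
#SBATCH -J SBATCHlammpsTest
#SBATCH -A [Account]
#SBATCH -N 2 --ntasks-per-node=4
#SBATCH -t 10
#SBATCH -q embers
# Write output to Report-<jobid>.out and send email on job begin, end, and failure.
#SBATCH -o Report-%j.out
#SBATCH --mail-type=BEGIN,END,FAIL
#SBATCH --mail-user=gburdell3@gatech.edu

# Run from the submission directory, load the module, and launch 6 MPI tasks.
cd "$SLURM_SUBMIT_DIR"
module load lammps-gpu
srun -n 6 lmp < lammps-gpu-example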
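
Submitting the script with sbatch prints a confirmation line of the form "Submitted batch job <id>". Capturing that ID makes it easy to find the matching report file. The sketch below is a minimal illustration: `job_id_from_sbatch` is a hypothetical helper, the script name `lammps-gpu.sbatch` in the comment is an assumed file name, and a sample confirmation line stands in for real sbatch output so the snippet runs without Slurm.

```shell
#!/bin/bash
# Sketch: pull the job ID out of sbatch's confirmation line.
# job_id_from_sbatch is a hypothetical helper; it reads the line on stdin.
job_id_from_sbatch() {
    awk '/^Submitted batch job/ { print $4 }'
}

# On the cluster this would be: sbatch lammps-gpu.sbatch | job_id_from_sbatch
# Here a sample confirmation line stands in for real sbatch output.
echo "Submitted batch job 1774754" | job_id_from_sbatch
```

Because the script sets `-o Report-%j.out`, a job with ID 1774754 writes its output to Report-1774754.out in the submission directory.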
  • The output should look something like this:
---------------------------------------
Begin Slurm Prolog: Apr-28-2023 01:54:28
Job ID:    1774754
User ID:   gburdell3
Account:   [Account]
Job name:  SBATCHlammpsTest
Partition: cpu-small
QOS:       embers
---------------------------------------

Lmod is automatically replacing "gcc/10.3.0-o57x6h" with "intel/20.0.4".


The following have been reloaded with a version change:
  1) mvapich2/2.3.6-ouywal => mvapich2/2.3.6-z2duuy

LAMMPS (7 Jan 2022)
OMP_NUM_THREADS environment is not set. Defaulting to 1 thread. (src/src/comm.cpp:98)
  using 1 OpenMP thread(s) per MPI task
Created orthogonal box = (-30 -30 -0.5) to (30 30 0.5)
  3 by 2 by 1 MPI processor grid
Created 1500 atoms
  using lattice units in orthogonal box = (-30 -30 -0.5) to (30 30 0.5)
  create_atoms CPU = 0.000 seconds
Created 100 atoms
  using lattice units in orthogonal box = (-30 -30 -0.5) to (30 30 0.5)
  create_atoms CPU = 0.000 seconds
WARNING: Using 'neigh_modify every 1 delay 0 check yes' setting during minimization (src/src/min.cpp:187)
  generated 1 of 1 mixed pair_coeff terms from geometric mixing rule

...
...

Setting up cg style minimization ...
  Unit style    : lj
  Current step  : 0
Per MPI rank memory allocation (min/avg/max) = 4.176 | 4.176 | 4.177 Mbytes
Step Temp E_pair E_mol TotEng Press 
       0            0 5.8997404e+14            0 5.8997404e+14 1.5732641e+15 
      81            0   -1.7518285            0   -1.7518285  -0.15730928 
Loop time of 0.0161491 on 6 procs for 81 steps with 1600 atoms

99.2% CPU use with 6 MPI tasks x 1 OpenMP threads

Minimization stats:
  Stopping criterion = energy tolerance
  Energy initial, next-to-last, final = 
       589974040194331  -1.75166415802626  -1.75182852779174
  Force two-norm initial, final = 2.5817498e+20 60.584174
  Force max component initial, final = 1.5160091e+20 11.519543
  Final line search alpha, max atom move = 6.6931485e-05 0.00077102009
  Iterations, force evaluations = 81 197

MPI task timing breakdown:
Section |  min time  |  avg time  |  max time  |%varavg| %total
---------------------------------------------------------------
Pair    | 0.0059371  | 0.0071613  | 0.0088512  |   1.0 | 44.34
Neigh   | 0.0015247  | 0.0017042  | 0.0019472  |   0.3 | 10.55
Comm    | 0.002375   | 0.0041855  | 0.0055143  |   1.4 | 25.92
Output  | 0          | 0          | 0          |   0.0 |  0.00
Modify  | 0          | 0          | 0          |   0.0 |  0.00
Other   |            | 0.003098   |            |       | 19.18

...
...

Total # of neighbors = 8440
Ave neighs/atom = 5.275
Neighbor list builds = 24
Dangerous builds = 0
  generated 1 of 1 mixed pair_coeff terms from geometric mixing rule
Setting up Verlet run ...
  Unit style    : lj
  Current step  : 0
  Time step     : 0.005
Per MPI rank memory allocation (min/avg/max) = 4.063 | 4.063 | 4.063 Mbytes
Step Temp E_pair E_mol TotEng Press 
       0            0   -1.7518285            0   -1.7518285  -0.15730928 
    1000   0.99279852   -1.3476948            0  -0.35551678   0.78484062 
    2000     1.022668   -1.3292054            0  -0.30717657   0.81660226 
    3000    1.0185213   -1.3456334            0  -0.32774866   0.76583719 
    4000   0.96371604   -1.3062798            0  -0.34316611   0.85257195 
    5000   0.96229603   -1.3303442            0  -0.36864964   0.70962141 
    6000   0.94309004   -1.3151242            0  -0.37262362   0.72161759 
    7000   0.99747756   -1.3064984            0  -0.30964422   0.84469381 
    8000    1.0138762   -1.3348616            0  -0.32161914   0.78244061 
    9000    0.9639628   -1.3148769            0  -0.35151653   0.85549524 
   10000    1.0020337   -1.3172927            0  -0.31588535   0.83832425 
Loop time of 0.670191 on 6 procs for 10000 steps with 1600 atoms

Performance: 6445926.779 tau/day, 14921.127 timesteps/s
99.8% CPU use with 6 MPI tasks x 1 OpenMP threads

...
...

Total # of neighbors = 8568
Ave neighs/atom = 5.355
Neighbor list builds = 1152
Dangerous builds = 0
Total wall time: 0:00:00
---------------------------------------
Begin Slurm Epilog: Apr-28-2023 01:54:32
Job ID:        1774754
Array Job ID:  _4294967294
User ID:      gburdell3
Account:      [Account]
Job name:      SBATCHlammpsTest
Resources:     cpu=8,mem=8G,node=2
Rsrc Used:     cput=00:01:28,vmem=336K,walltime=00:00:11,mem=0,energy_used=0
Partition:     cpu-small
QOS:           embers
Nodes:         atl1-1-02-010-22-2,atl1-1-02-010-23-1
---------------------------------------
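
To compare runs, the throughput figure can be pulled from the report file rather than read by eye. The following is a minimal sketch, assuming the report contains a LAMMPS Performance line in the format shown in the sample output. The helper name `timesteps_per_sec` is hypothetical, and the demo writes that one line to a temporary file so it runs without a real report.

```shell
#!/bin/bash
# Sketch: extract the timesteps/s figure from a LAMMPS log or Slurm report file.
# timesteps_per_sec is a hypothetical helper name.
timesteps_per_sec() {
    # The Performance line ends with "<value> timesteps/s", so the value
    # is the second-to-last whitespace-separated field.
    awk '/^Performance:/ { print $(NF-1) }' "$1"
}

# Demo on a temporary file holding the Performance line from the sample output.
report="$(mktemp)"
echo "Performance: 6445926.779 tau/day, 14921.127 timesteps/s" > "$report"
timesteps_per_sec "$report"
rm -f "$report"
```

On the cluster, the same function could be pointed at the job's report file, for example Report-1774754.out for the job shown above.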