
LAMMPS Features and Capabilities

Steve Plimpton
Sandia National Labs
sjplimp@sandia.gov

LAMMPS Users and Developers Workshop


International Centre for Theoretical Physics (ICTP)
March 2014 - Trieste, Italy

Presentation: SAND2014-2239C

Classical MD in a nutshell

LAMMPS from 10,000 meters


Large-scale Atomic/Molecular Massively Parallel Simulator
http://lammps.sandia.gov
Classical MD code
Open source, portable C++
3-legged stool: soft matter, solids, mesoscale

Particle simulator at varying length and time scales: electrons, atomistic, coarse-grained, continuum
Spatial-decomposition of simulation domain for parallelism
MD, non-equilibrium MD, energy minimization
GPU and OpenMP enhanced
Can be coupled to other scales: QM, kMC, FE, CFD, ...

Reasons to use LAMMPS

Versatile
bio, materials, mesoscale
atomistic, coarse-grained, continuum
use with other codes, e.g. multiscale models


Good parallel performance


Easy to extend
Tuesday AM - Modifying & Extending LAMMPS
Wednesday PM - Hands-on: Writing new code for LAMMPS

Well documented
extensive web site
1300-page manual

Active and supportive user community
45K postings to mail list, 1500 subscribers
quick turnaround on questions posted to the mail list

Resources for learning LAMMPS

Examples: about 35 sub-directories under examples/ in the distribution

Manual: doc/Manual.html
Intro, Commands, Packages, Accelerating
Howto, Modifying, Errors

Alphabetized command list: one doc page per command
doc/Section_commands.html, section 3.5

Web site: http://lammps.sandia.gov


Pictures, Movies - examples of others' work
Papers - find a paper similar to what you want to model
Workshops - slides from LAMMPS simulation talks

Mail list: search it, post to it


http://lammps.sandia.gov/mail.html

These slides (more info than I can probably present!)

Structure of typical input scripts

1. Units and atom style
2. Create simulation box and atoms
   region, create_box, create_atoms commands
   lattice command vs box units
   read_data command
     data file is a text file
     look at examples/micelle/data.micelle
     see read_data doc page for full syntax
3. Define groups
4. Set attributes of atoms: mass, velocity
5. Pair style for atom interactions
6. Fixes for time integration and constraints
7. Computes for diagnostics
8. Output: thermo, dump, restart
9. Run or minimize
10. Rinse and repeat (script executed one command at a time)
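A minimal sketch of this structure, in the spirit of bench/in.lj (values are illustrative):

units lj
atom_style atomic
# create box and atoms (steps 1-2)
lattice fcc 0.8442
region box block 0 10 0 10 0 10
create_box 1 box
create_atoms 1 box
# attributes and interactions (steps 3-5)
mass 1 1.0
velocity all create 1.44 87287
pair_style lj/cut 2.5
pair_coeff 1 1 1.0 1.0 2.5
# integration, output, run (steps 6-9)
fix 1 all nve
thermo 100
dump 1 all atom 500 dump.melt
run 1000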

Debugging an input script


LAMMPS tries hard to flag many kinds of errors and warnings

If an input command generates an error, echo the script as it is read to see which command failed:
% lmp_linux -echo screen < in.polymer
re-read the doc page for the command

For input, setup, run-time errors ...

search doc/Section_errors.html for the text of the error message
also for warnings, they are usually important
if a specific input command causes problems,
look for IMPORTANT NOTE info on its doc page
look in the source code file at the line number

Search the mail list, others may have hit a similar problem

google for: lammps-users fix npt, or the error message

Remember: an input script is like a program
start with small systems
start with one processor
turn on complexity one command at a time
monitor thermo output, visualize the results (use dump image)

Debug by examining screen output

LAMMPS (15 Aug 2013)


Lattice spacing in x,y,z = 1.28436 2.22457 1.28436
Created orthogonal box = (0 0 -0.321089) to (51.3743 22.2457 0.321089)
4 by 1 by 1 MPI processor grid
Created 840 atoms
120 atoms in group lower
120 atoms in group upper
240 atoms in group boundary
600 atoms in group flow
Setting atom values ...
120 settings made for type
Setting atom values ...
120 settings made for type
Deleted 36 atoms, new total = 804
Deleted 35 atoms, new total = 769

Thermodynamic output
Look for blow-ups or NaNs, print every step if necessary
WARNING: Temperature for thermo pressure is not for group all (../thermo.cpp:436)
Setting up run ...
Memory usage per processor = 2.23494 Mbytes
Step Temp E_pair E_mol TotEng Press Volume
0 1.0004177 0 0 0.68689281 0.46210058 1143.0857
1000 1 -0.32494012 0 0.36166587 1.2240503 1282.5239
2000 1 -0.37815616 0 0.30844982 1.0642877 1312.5691
...
...
...
24000 1 -0.36649381 0 0.32011217 0.98366691 1451.5444
25000 1 -0.38890426 0 0.29770172 0.95284427 1455.9361
Loop time of 1.76555 on 4 procs for 25000 steps with 769 atoms

Timing info

Loop time of 1.76555 on 4 procs for 25000 steps with 769 atoms
Pair time (%) = 0.14617 (8.27903)
Neigh time (%) = 0.0467809 (2.64966)
Comm time (%) = 0.307951 (17.4422)
Outpt time (%) = 0.674575 (38.2078)
Other time (%) = 0.590069 (33.4213)

Run statistics

Per-processor values at end of run


Nlocal: 192.25 ave 242 max 159 min
Histogram: 2 0 0 0 0 1 0 0 0 1
Nghost: 43 ave 45 max 39 min
Histogram: 1 0 0 0 0 0 0 0 2 1
Neighs: 414 ave 588 max 284 min
Histogram: 2 0 0 0 0 0 1 0 0 1
Total # of neighbors = 1656
Ave neighs/atom = 2.15345
Neighbor list builds = 1641
Dangerous builds = 1

Debug by visualization - what does your system do?

Dump image for instant JPGs (e.g. image.16500.jpg)
view with ImageMagick display or Mac Preview

Make/view a movie:
ImageMagick: convert *.jpg image.gif
open in browser: open -a Safari image.gif
Mac QuickTime: open image sequence
Windows Media Player

VMD, AtomEye, ...
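For example, a dump image sketch (interval and options are illustrative; see the dump image doc page for the full set):

# JPG snapshot every 500 steps, atoms colored and sized by type
dump img all image 500 image.*.jpg type type zoom 1.6
dump_modify img pad 6   # zero-pad the timestep in file names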

Defining variables in input scripts

Styles: index, loop, equal, atom, ...

variable x index run1 run2 run3 run4
variable x loop 100
variable x equal trap(f_JJ[3])*${scale}
variable x atom -(c_p[1]+c_p[2]+c_p[3])/(3*vol)

Formulas can be complex


see doc/variable.html
thermo keywords (temp, press, ...)
math operators & functions (sqrt, log, cos, ...)
group and region functions (count, xcm, fcm, ...)
various special functions (min, ave, trap, stride, stagger, ...)
per-atom vectors (x, vx, fx, ...)
output from computes, fixes, other variables

Formulas can be time- and/or spatially-dependent

Using variables in input scripts

Substitute in any command via $x or ${myVar}


Can define them as command-line arguments
% lmp_linux -v myTemp 350.0 < in.polymer

Use in next command to increment a variable


with jump command to create loops

Many commands allow them as arguments

fix addforce 0.0 v_fy 1.0
dump_modify every v_count
region sphere 0.0 0.0 0.0 v_radius

Allows time- and spatially-dependent commands
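For instance, a sketch of a spatially-dependent force via an atom-style variable (group and coefficient are illustrative):

# per-atom y-direction force that varies linearly with height
variable fy atom -0.05*y
fix pull all addforce 0.0 v_fy 0.0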

Power tools for input scripts

Filename options:
dump.*.% for per-snapshot or per-processor output
read_data data.protein.gz
read_restart old.restart.*

If/then/else via if command

Insert another script via include command
useful for a long list of parameters

Looping via next and jump commands


easy to run incrementally and stop when condition met
see examples on jump command doc page

Invoke a shell command or external program

shell cd subdir1
shell my_analyze out.file $n ${param}

Various ways to run multiple simulations from one script


see Section howto 6.4 of manual

Example script for multiple runs

variable r equal random(1,1000000000,58798)
variable a loop 8
variable t index 0.8 0.85 0.9 0.95 1.0 1.05 1.1 1.15
log log.$a
read_data data.polymer
velocity all create $t $r
fix 1 all nvt $t $t 1.0
dump 1 all atom 1000 dump.$a.*
run 100000
next t
next a
jump in.polymer

Run 8 simulations on 3 partitions until finished:
change a,t to universe-style variables
% mpirun -np 12 lmp_linux -p 3x4 -in in.polymer

Building systems: a pre-processing task

In general, can be a hard problem!
Molecular topology is an input to LAMMPS
get it from a builder, massage into LAMMPS format
auto-magical assignment of force fields is also hard

LAMMPS includes some basic pre-processors (tools dir)
bead-spring chain builder
ch2lmp = PDB to LAMMPS converter
amber2lmp = AMBER to LAMMPS converter
msi2lmp = Accelrys to LAMMPS converter

3rd party builders and force-field generators


VMD TopoTools, Avogadro, PackMol, Moltemplate
Votca for CG force-field generation
http://lammps.sandia.gov/prepost.html

Monte Carlo builders and force-field assignment


Towhee (configurational bias) & others

Be willing to write system-building scripts yourself

Moltemplate

http://www.moltemplate.org (Andrew Jewett, UCSB)


Bundled with LAMMPS, designed to work with it
Scripting language to build monomers/chains/systems
hierarchically

Provide atom charges & bond list


Moltemplate generates angles, dihedrals, etc
Also assigns force field params (only OPLS-AA currently)

More complex geometries with Moltemplate

Pair styles

LAMMPS lingo for interaction potentials

A pair style can be true pair-wise or many-body
LJ, Coulombic, Buckingham, Morse, Yukawa, ...
EAM, Tersoff, REBO, ReaxFF, ...

Bond/angle/dihedral/improper styles = permanent bonds

Variants optimized for GPU and many-core
GPU, USER-CUDA, USER-OMP packages
lj/cut, lj/cut/gpu, lj/cut/cuda, lj/cut/omp
see doc/Section_accelerate.html

Coulomb interactions included in pair style


lj/cut, lj/cut/coul/cut, lj/cut/coul/wolf, lj/cut/coul/long
done to optimize inner loop

Categories of potentials (pair styles) in LAMMPS

All-atom: OPLS, CHARMM, AMBER, etc
charged systems: pair lj/cut/coul/cut, lj/cut/coul/long + kspace_style

United-atom (UA): pair lj, pair coul, bond/angle/dihedral harmonic, etc

Coarse-grained:
FENE, DPD, SDK, granular, SPH, peri, colloid, lubricate, brownian, FLD

Aspherical:
gayberne, resquared, line, tri

Tabulated (e.g. force matching):
pair table, bond table, angle table, etc

Reactive: ReaxFF, COMB, AIREBO, other bond-order models

Hybrid systems: pair hybrid and hybrid/overlay (see the sketch below)
polymers on metal surface
polymers with nano-particles
solid-solid interface between 2 materials
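A pair hybrid sketch for such a mixed system (the potential file and LJ coefficients are illustrative):

# metal (type 1) via EAM, polymer beads (type 2) via LJ
pair_style hybrid eam lj/cut 10.0
pair_coeff 1 1 eam Cu_u3.eam
pair_coeff 2 2 lj/cut 1.0 1.0
pair_coeff 1 2 lj/cut 1.5 0.8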

Pair styles

See doc/Section_commands.html for the full list, doc/pair_style.html for one-line descriptions
Many come in accelerated flavors: omp, gpu, cuda

Relative computational cost of different potentials

See lammps.sandia.gov/bench.html#potentials

Estimate CPU cost for system size & timesteps you need
Assume good parallel scalability if have 1000+ atoms/core

Moore's Law for potentials

[Figure: cost in core-sec/atom-timestep vs year published (1980-2010). Cheapest (~10^-6): SPC/E, EAM, Stillinger-Weber, Tersoff, REBO, EIM, BOP; intermediate: eFF, MEAM, CHARMM, AIREBO; most expensive (~10^-3 to 10^-2): ReaxFF, COMB, GAP]

Neighbor lists in LAMMPS

Problem: how to efficiently find neighbors within the cutoff?
testing each atom against all others is O(N^2)

Verlet lists:
Verlet, Phys Rev, 159, p 98 (1967)
R_neigh = R_force + skin
build list once every few timesteps
other timesteps: scan larger list for neighbors within force cutoff
rebuild when any atom moves more than 1/2 the skin

Link-cells (bins):
Hockney et al, J Comp Phys, 14, p 148 (1974)
grid domain: bins of size R_force
each step: search 27 bins for neighbors (or 14 bins using pair symmetry)

Neighbor lists (continued)

Verlet list is a ~6x savings over bins
V_sphere = (4/3) pi r^3 ~ 4.2 r^3, vs V_cube = (3r)^3 = 27 r^3 for the 27-bin stencil

LAMMPS does both
link-cell to build Verlet list
use Verlet list on non-build timesteps
O(N) in CPU and memory
constant-density assumption
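In an input script these settings are exposed via the neighbor and neigh_modify commands, e.g.:

neighbor 0.3 bin          # skin distance, link-cell (bin) list builds
neigh_modify every 1 delay 5 check yes
# attempt rebuilds after a 5-step delay, then only when an atom
# has moved more than half the skin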

Bond styles (also angle, dihedral, improper)

Used for molecules with fixed bonds
fix bond/break and bond_style quartic can break them
fix bond/create can add them (e.g. cross-linking)

To learn what bond styles LAMMPS has ... where would you look?
doc/Section_commands.html or doc/bond_style.html

Long-range Coulombics

KSpace style in LAMMPS lingo, see doc/kspace_style.html

Options:
traditional Ewald, scales as O(N^(3/2))
PPPM (like PME), scales as O(N log(N))
MSM, scales as O(N), lj/cut/coul/msm

Additional options:
non-periodic: PPPM (z) vs MSM (xyz)
long-range dispersion (LJ)

PPPM is the fastest choice for most systems
FFTs can scale poorly for large processor counts
MSM can be faster for low accuracy or large proc counts

Ways to speed up long-range calculations:
see doc/Section_accelerate.html
cutoff & accuracy settings adjust real-space vs k-space work
kspace_style pppm/stagger for PPPM
kspace_modify diff ad for smoothed PPPM
run_style verlet/split
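A typical pairing of a long-range pair style with a KSpace solver (cutoff and tolerance are illustrative):

pair_style lj/cut/coul/long 10.0   # real-space part, 10 Angstrom cutoff
kspace_style pppm 1.0e-4           # reciprocal-space part, relative accuracy 1e-4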

PPPM (particle-particle particle-mesh) in LAMMPS

Hockney & Eastwood, Comp Sim Using Particles (1988)
Darden et al, J Chem Phys, 98, p 10089 (1993)
Like Ewald, except the sum over periodic images is evaluated by:
interpolate atomic charge to 3d mesh
solve Poisson's equation on mesh (4 FFTs)
interpolate E-fields back to atoms

User-specified accuracy + cutoff determine the Ewald G parameter + mesh size
scales as N log(N), whether the cutoff is grown with N or held fixed

Parallel FFTs in LAMMPS

3d FFT is 3 sets of 1d FFTs
in parallel, the 3d grid is distributed across procs
1d FFTs on-processor
native library or FFTW (www.fftw.org)
multiple transposes of 3d grid
data transfer can be costly

FFTs for PPPM can scale poorly
on large # of procs and on clusters

Good news: cost of PPPM is only ~2x that of an 8-10 Angstrom cutoff

Fixes

Most flexible feature in LAMMPS
Allows control of what happens when within each timestep

Loop over timesteps:
fix initial: NVE, NVT, NPT, rigid-body integration
communicate ghost atoms
fix neighbor: insert particles
build neighbor list (once in a while)
compute forces
communicate ghost forces
fix force: SHAKE, langevin drag, wall, spring, gravity
fix final: NVE, NVT, NPT, rigid-body integration
fix end: volume & T rescaling, diagnostics
output to screen and files

Fixes

100+ fixes in LAMMPS
You choose what group of atoms to apply a fix to
Already saw some in the obstacle example:

fix 1 all nve
fix 2 flow temp/rescale 200 1.0 1.0 0.02 1.0
fix 3 lower setforce 0.0 0.0 0.0
fix 5 upper aveforce 0.0 -0.5 0.0
fix 6 flow addforce 1.0 0.0 0.0

To learn what fix styles LAMMPS has ... where would you look?
doc/Section_commands.html or doc/fix.html

If you familiarize yourself with fixes, you'll know many things LAMMPS can do
Many fixes store output accessible by other commands:
rigid body COM
thermostat energy
forces before modified

Computes

75 computes in LAMMPS
Calculate some property of the system, in parallel
Always for the current timestep
To learn what compute styles LAMMPS has ...
doc/Section_commands.html or doc/compute.html

Key point:
computes store their results
other commands invoke them and use the results
e.g. thermo output, dumps, fixes

Output of computes (discussion in section 6.15 of manual):
global vs per-atom vs local
scalar vs vector vs array
extensive vs intensive values

Examples:
temp & pressure = global scalar or vector
pe/atom = potential energy per atom (vector)
displace/atom = displacement per atom (array)
pair/local & bond/local = per-neighbor or per-bond info

Many computes are useful with averaging fixes:


fix ave/time, fix ave/spatial, fix ave/atom
fix ave/histo, fix ave/correlate
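A sketch of a compute feeding an averaging fix (IDs and intervals are illustrative):

compute myT flow temp                    # temperature of the flow group
fix 10 all ave/time 100 5 1000 c_myT file temp.profile
# sample c_myT every 100 steps, average 5 samples, write every 1000 steps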

Thermo output

One line of output every N timesteps to screen and log file
See doc/thermo_style.html

Any scalar can be output:
dozens of keywords: temp, pyy, eangle, lz, cpu
any output of a compute or fix: c_ID, f_ID[N], c_ID[N][M]
fix ave/time stores time-averaged quantities
equal-style variable: v_MyVar
one value from an atom-style variable: v_xx[N]
any property of one atom: q, fx, quat, etc
useful for debugging or post-analysis

Post-process via:
tools/python/logplot.py log.lammps X Y (via GnuPlot)
tools/python/log2txt.py log.lammps data.txt X Y ...
Pizza.py log tool
can read thermo output across multiple runs
tools/xmgrace: README, one-liner scripts, auto-plotter
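For example (the keywords are standard; the trailing variable is illustrative):

thermo 100
thermo_style custom step temp pe etotal press vol v_myVar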

Dump output

Snapshot of per-atom values every N timesteps
See doc/dump.html

Styles:
atom, custom (both native LAMMPS)
VMD will auto-read if file named *.lammpstrj
xyz for coords only
cfg for AtomEye
DCD, XTC for CHARMM, NAMD, GROMACS
useful for back-and-forth runs and analysis

Two additional styles:
local: per-neighbor, per-bond, etc info
image: instant JPG/PPM picture, rendered in parallel

Dump output (continued)

Any per-atom quantity can be output
dozens of keywords: id, type, x, xs, xu, mux, omegax, ...
any output of a compute or fix: f_ID, c_ID[M]
atom-style variable: v_foo

Additional options:
control which atoms by group or region
control which atoms by threshold:
dump_modify thresh c_pe > 3.0
text or binary or gzipped
one big file or per snapshot or per proc
see dump_modify fileper or nfile

Post-run conversion
tools/python/dump2cfg.py, dump2pdb.py, dump2xyz.py
Pizza.py dump, cfg, ensight, pdb, svg, vtk, xyz tools
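A sketch combining these options (IDs and the threshold value are illustrative):

compute pe all pe/atom
dump 2 all custom 1000 dump.hot.* id type x y z c_pe
dump_modify 2 thresh c_pe > 3.0   # keep only high-energy atoms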

Classical MD in parallel

MD is inherently parallel
forces on each atom can be computed simultaneously
X and V can be updated simultaneously

Nearly all MD codes are parallelized
distributed-memory message-passing (MPI) between nodes
MPI or threads (OpenMP, GPU) within node

MPI = message-passing interface


MPICH or OpenMPI
assembly-language of parallel computing
lowest-common denominator
most portable
runs on all parallel machines, even on multi- and many-core
more scalable than shared-memory parallel

Goals for parallel algorithms


Scalable
short-range MD scales as N
optimal parallel scaling is N/P
even on clusters with higher communication costs

Good for short-range forces
typically ~80% of CPU time
long-range Coulombics have a short-range component

Fast for small systems, not just large
nano, polymer, bio systems require long timescales
1M steps of 10K atoms is more useful than 10K steps of 1M atoms
Efficient at finding neighbors


liquid state, polymer melts, small-molecule diffusion
neighbors change rapidly
atoms on a fixed lattice is simpler to parallelize

Parallel algorithms for MD

Plimpton, J Comp Phys, 117, p 1 (1995)

3 classes of algorithms, used by all MD codes:
1. atom-decomposition = split and replicate atoms
2. force-decomposition = partition forces
3. spatial-decomposition = geometric split of simulation box

All 3 methods balance computation optimally as N/P


Differ in organization of inter-particle force computation,
other tasks can be done within any of 3 algorithms
molecular forces
time integration (NVE/NVT/NPT)
thermodynamics, diagnostics, ...

Differ in issues affecting parallel scalability


communication costs
load-balance

LAMMPS is parallelized via spatial-decomposition

Physical domain divided into 3d bricks


One brick per MPI task
Compute forces on atoms in box
using ghost info from nearby bricks
Atoms carry properties &
topology as they migrate
Comm of ghost atoms within cutoff
6-way local stencil

Short-range forces
CPU cost scales as O(N/P)

Parallel performance

See http://lammps.sandia.gov/bench.html

Useful exercise: run bench/in.lj, change N and P, check whether it is O(N/P)
% mpirun -np 2 lmp_linux < in.lj
% lmp_linux -v x 2 -v y 2 -v z 2 < in.lj

How to speed-up your simulations

See doc/Section accelerate.html of manual

Many ideas for long-range Coulombics:
PPPM with 2 vs 4 FFTs
PPPM with staggered grid
run_style verlet/split command
adjust processor layout via processors command

How to speed-up your simulations

GPU and USER-CUDA and USER-OMP packages

GPU:
pair style and neighbor list build on GPU
can use multiple cores per GPU
39 supported pair styles, PPPM

USER-CUDA:
fixes and computes run on GPU (many timesteps)
one core per GPU
30 pair styles, 15 fixes, 4 computes, PPPM

USER-OMP:
threading via OpenMP, run 1 or 2 MPI tasks/node
95 pair styles, 29 fixes, many PPPM variants

GPU benchmark data at


http://lammps.sandia.gov/bench.html
desktop and Titan (ORNL)

How to speed-up your simulations

Increase time scale via timestep size
fix shake for rigid bonds (2 fs)
run_style respa for hierarchical timesteps (4 fs)
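For example, a sketch for an all-atom system in real units (fix ID and tolerances are illustrative):

timestep 2.0                            # 2 fs, enabled by constraining bonds
fix rigidH all shake 0.0001 20 0 b 1 a 1
# constrain bond type 1 and angle type 1 (e.g. O-H bonds and the H-O-H angle)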

Increase length scale via coarse graining
all-atom vs united-atom vs bead-spring
also increases time scale
mesoscale models:
ASPHERE, BODY, COLLOID, FLD packages
GRANULAR, PERI, RIGID, SRD packages
see doc/Section_packages.html for details

Quick tour of more advanced topics

See http://lammps.sandia.gov/features.html

Units
see doc/units.html
LJ, real, metal, cgs, si, micro, nano
all input/output in one unit system

Ensembles
see doc/Section_howto.html 6.16
one or more thermostats (by group)
single barostat
rigid body dynamics (RIGID package)

Hybrid models
pair_style hybrid and hybrid/overlay
atom_style hybrid sphere bond ...
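For instance, a Nose/Hoover NPT sketch in metal units (damping values are illustrative):

fix 1 all npt temp 300.0 300.0 0.1 iso 1.0 1.0 1.0
# thermostat at 300 K (0.1 ps damping), barostat at 1 bar (1 ps damping)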

Quick tour of more advanced topics

Aspherical particles
see doc/Section_howto.html 6.14
ellipsoidal, lines, triangles, rigid bodies
ASPHERE package

Mesoscale and continuum models


COLLOID, FLD, SRD packages for nanoparticles and colloids
PERI package for Peridynamics
USER-ATC package for atom-to-continuum (FE)
GRANULAR package for granular media
add-on LIGGGHTS package for DEM
www.liggghts.com and www.cfdem.com

Quick tour of more advanced topics

Multi-replica modeling
see doc/Section_howto.html 6.14
parallel tempering via temper command
PRD, TAD, NEB in REPLICA package

Load balancing
balance command for static LB
fix balance command for dynamic LB
work by adjusting proc dividers in 3d brick grid

Quick tour of more advanced topics

Energy minimization

Via dynamics to un-overlap particles
pair_style soft with time-dependent push-off
fix nve/limit and fix viscous

Via gradient-based minimization
min_style cg, hftn, sd

Via damped-dynamics minimization
min_style quickmin and fire
used for nudged elastic band (NEB)
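A gradient-based sketch (tolerances are illustrative):

min_style cg
minimize 1.0e-4 1.0e-6 1000 10000
# stop at energy tol 1e-4 or force tol 1e-6, max 1000 iterations / 10000 force evals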

Use LAMMPS as a library or from Python

doc/Section_howto.html 6.10 and 6.19
C-style interface (callable from C, C++, Fortran, Python)
examples/COUPLE dir
python and python/examples directories
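A minimal sketch of driving LAMMPS from Python via the bundled wrapper (assumes the LAMMPS shared library is built and importable):

from lammps import lammps

lmp = lammps()                  # instantiate LAMMPS
lmp.file("in.polymer")          # run an input script
print("natoms =", lmp.get_natoms())
lmp.command("run 1000")         # issue further commands one at a time
lmp.close()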

Coupling MD to other scales

Multi-physics or multi-scale models often lead to a numerical or software coupling interface between two methods or two codes
LAMMPS can call other codes as libraries
write a simple fix to wrap the library

Another code can instantiate LAMMPS (one or more times)


LAMMPS is really a library (single C++ class)
C interface also provided
enables LAMMPS to be called from C, Fortran, Python

Examples of MD in multi-scale context

MD + DFT: dynamics with quantum forces


MD + on-lattice kinetic MC: stress-driven grain growth
MD + FE: thermal/mechanical coupling to continuum
MD + CFD (OpenFoam): fluidized granular bed
MD + Navier-Stokes: flowing biomolecules

AtC package for atomistic to continuum coupling


Reese Jones, Jon Zimmerman, Jeremy Templeton,
Greg Wagner (Sandia)

Particles in parallel, FE solution in serial


Different PDEs can be solved: thermal, deformation, etc

Thermal coupling with AtC package [figures]

Mechanical coupling with AtC package: elasto-dynamic response [figures]

What have people done with LAMMPS?

Pictures: http://lammps.sandia.gov/pictures.html
Movies: http://lammps.sandia.gov/movies.html

Papers: http://lammps.sandia.gov/papers.html
authors, titles, abstracts for 3600 papers

Modifying LAMMPS (advert for tomorrow)

LAMMPS is designed to be easy to extend
90% of LAMMPS is customized add-on classes, via styles
Write a new derived class, drop into src, re-compile
Tuesday AM - Modifying & Extending LAMMPS
Wednesday PM - Hands-on: Writing new code for LAMMPS
Resources:
doc/PDF/Developer.pdf
class hierarchy & timestep structure

doc/Section_modify.html

Please contribute your code to the LAMMPS distro!

Links and thanks

LAMMPS: http://lammps.sandia.gov
post a question: http://lammps.sandia.gov/mail.html
my email: sjplimp@sandia.gov
Thanks to LAMMPS developers at Sandia and elsewhere:
Aidan Thompson, Paul Crozier
Stan Moore, Ray Shan, Christian Trott
Axel Kohlmeyer (ICTP & Temple Univ)
http://lammps.sandia.gov/authors.html
