For compilation instructions, check out https://groups.google.com/d/msg/plumed-users/Tx29XNNRq8o/xeAu7RNaBAAJ

For a while we've been preparing to run some simulations on this machine, hosted by Oak Ridge National Lab. Every cluster is a bit different, and that's definitely true for TITAN: each box has 16 CPU cores (arranged in 2 NUMA domains) and one NVIDIA K20 GPU. There is no usual MPI-over-InfiniBand stack; the interconnect is some other Cray-specific beast.

Here is an example submission script (submit.sh) for a non-replica-exchange simulation:

```bash
#!/bin/bash
module add gromacs/5.0.2
cd $PBS_O_WORKDIR

# important - allow the single GPU per node to be shared between MPI ranks
export CRAY_CUDA_MPS=1

mpirun=`which aprun`
application=`which mdrun_mpi`
options="-v -maxh 0.2 -s tpr/topol0.tpr "

# one '0' per PP rank per node; only 12 PP ranks need a GPU id, discard last '0000'
gpu_id=000000000000

# 32 ranks total, 16 per node (i.e. 2 nodes)
$mpirun -n 32 -N 16 $application -gpu_id $gpu_id $options
```

Submit with

```
$ qsub -l walltime=1:00:00 -l nodes=2 submit.sh
```

...
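Since the script above is the non-replica-exchange case, here is a minimal sketch of how the same setup could be extended to replica exchange with mdrun's -multi and -replex flags. The replica count (8), the exchange interval (1000 steps), the script name submit_re.sh, and the assumption that the same per-node -gpu_id mapping carries over are mine, not from the original post; treat this as a starting point rather than a tested TITAN recipe.

```bash
#!/bin/bash
module add gromacs/5.0.2
cd $PBS_O_WORKDIR

# same trick as above: share each node's GPU between MPI ranks
export CRAY_CUDA_MPS=1

mpirun=`which aprun`
application=`which mdrun_mpi`

# -multi appends the replica index to -s, so this picks up
# tpr/topol0.tpr ... tpr/topol7.tpr; -replex 1000 attempts an
# exchange every 1000 MD steps (both values are assumptions)
options="-v -maxh 0.2 -s tpr/topol.tpr -multi 8 -replex 1000"

# assumed: the same per-node PP-rank-to-GPU mapping as above
gpu_id=000000000000

# 8 replicas x 16 ranks = 128 ranks, 16 per node, i.e. 8 nodes
$mpirun -n 128 -N 16 $application -gpu_id $gpu_id $options
```

which would be submitted with `qsub -l walltime=1:00:00 -l nodes=8 submit_re.sh`; the requested node count must match what -n and -N imply.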