

Using the code for replica exchange simulations

One major application for the ability to combine simulations run at different temperatures is the analysis of replica exchange simulations, and if the email I've gotten over the last couple of years is any indication, it's a pretty common one. My code can be used for replica exchange, but I should start by admitting that it wasn't designed with it in mind, and may seem a bit clumsy.

First, the metadata file format changed as of the November 2007 release of the code. If you want to specify temperatures in the metadata file, you also have to specify the number of Monte Carlo points to use (if you're not using bootstrapping, you can safely set this to any integer). See section 4.1.2 for details.

In order to use wham with time series collected at different temperatures, the first thing to do is to follow the instructions given in section 4.1.2 regarding the format of the metadata and time series files, while setting the spring constants to 0. Indeed, for simple circumstances involving small systems this may be enough for you to make a successful calculation.

However, for large systems this simple approach will almost certainly get you nothing but a bunch of NaNs in your output. If this happens, the most likely culprit is an overflow or underflow in the probability histograms. The reason is that the temperature-sensitive version of the code increments the histogram by $\exp(-E/k_B T)$ for each point (as opposed to counting each point as 1). Since the potential energies for condensed-phase molecular dynamics systems using standard force fields are typically of order -50,000 kcal/mol, this means we'd be taking the exponential of a very large number, which is a Bad Thing numerically.
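To get a feel for the size of the problem, here is a minimal Python sketch (not part of the wham code itself) that evaluates the weight $\exp(-E/k_B T)$ for a single point. The function name `boltzmann_weight`, the 300 K temperature, and the choice to report overflow as infinity are all assumptions made for illustration.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def boltzmann_weight(energy, temperature):
    """Return exp(-E / (k_B T)), or math.inf if a double cannot hold it.

    Hypothetical helper for illustration; energy is in kcal/mol.
    """
    exponent = -energy / (K_B * temperature)
    try:
        return math.exp(exponent)
    except OverflowError:
        # math.exp raises once the exponent exceeds roughly 709
        return math.inf

# Typical condensed-phase potential energy at 300 K (illustrative values)
exponent = 50000.0 / (K_B * 300.0)            # of order 80,000
weight = boltzmann_weight(-50000.0, 300.0)    # far beyond double precision
```

Since a double overflows once the exponent passes roughly 709, an exponent of order 80,000 has no chance of being represented.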

However, in many circumstances one can work around this easily by shifting the location of zero energy. The simplest procedure is to locate the lowest energy in any of the trajectories, and shift all of the energies in all of the trajectories such that the lowest (most negative) value is now zero. This will eliminate the overflows, since the largest contribution from an individual data point will now be 1.
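The shift itself is easy to script before handing the data to wham. The sketch below assumes the energies from each trajectory have already been read into Python lists; the function name `shift_to_zero` and the sample numbers are made up for illustration, and reading or rewriting the actual time series files is left out.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def shift_to_zero(trajectories):
    """Shift every energy so that the global minimum becomes zero.

    trajectories: list of lists of potential energies (kcal/mol).
    Returns the shifted trajectories and the minimum that was subtracted.
    """
    e_min = min(min(traj) for traj in trajectories)
    shifted = [[e - e_min for e in traj] for traj in trajectories]
    return shifted, e_min

# Two illustrative trajectories (made-up energies)
raw = [[-50010.0, -50000.0], [-49990.0, -49950.0]]
shifted, e_min = shift_to_zero(raw)

# After the shift every energy is >= 0, so no weight can exceed exp(0) = 1
largest = max(math.exp(-e / (K_B * 300.0)) for traj in shifted for e in traj)
```

The same constant has to be subtracted from every trajectory; shifting each trajectory by its own minimum would change the relative weights between simulations.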

However, shifting the energies upward can lead to a different set of problems, where a given simulation appears to have no probability associated with it, i.e. the sum of $\exp(-E/k_B T)$ for the trajectory underflows and is effectively zero. This can occur if the energies in the simulation are significantly higher than those in the lowest energy trajectories, which is expected for condensed phase systems at high temperatures. Underflow in itself isn't a problem, but if that simulation is the only one which contributes to a bin in the histogram (or more generally if all of the simulations which sample a given bin have zero overall weight), the result will be a division by zero, causing the probability to be NaN or Inf.
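One way to diagnose this before running the full calculation is to check which histogram bins are visited only by trajectories whose summed weight has underflowed to zero. The following is a rough sketch under several assumptions: the energies are already shifted to be non-negative, each trajectory's bins are supplied as a set of integer indices, and each trajectory's own temperature is used in the exponent. The names `total_weight` and `unsupported_bins` are hypothetical and do not come from the wham code.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def total_weight(energies, temperature):
    """Sum exp(-E / (k_B T)) over a trajectory of non-negative energies."""
    return sum(math.exp(-e / (K_B * temperature)) for e in energies)

def unsupported_bins(bins_by_traj, weights):
    """Return bins sampled only by trajectories with zero total weight."""
    visited, supported = set(), set()
    for bins, weight in zip(bins_by_traj, weights):
        visited.update(bins)
        if weight > 0.0:
            supported.update(bins)
    return sorted(visited - supported)

# Illustrative data: a low-temperature run near the energy minimum and a
# high-temperature run roughly 2000 kcal/mol above it
energies = [[0.0, 5.0], [2000.0, 2100.0]]
temperatures = [300.0, 400.0]
bins_by_traj = [{0, 1}, {1, 2}]

weights = [total_weight(e, t) for e, t in zip(energies, temperatures)]
problem_bins = unsupported_bins(bins_by_traj, weights)
```

In this made-up example the high-temperature trajectory underflows to zero weight, so the bin that only it visits is the one that would produce the division by zero.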

Solving this problem is sometimes quite simple: reshift the energies by a few kcal/mol, such that the lowest energy is slightly negative instead of zero (say -5 kcal/mol). If the problem is just numerical underflow, a small shift may be sufficient to make the problem numerically well-behaved. However, if the relevant portion of the histogram really is inaccessible except at high temperature, then there may be no way to fix the problem, short of running an additional umbrella-sampled trajectory.
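As a rough illustration of how much a small reshift buys you, the sketch below moves the lowest energy to -5 kcal/mol and compares one weight before and after. The function name `shift_to_minimum`, the 300 K temperature, and the sample energies are assumptions chosen so that the point sits right at the edge of double-precision underflow.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def shift_to_minimum(trajectories, target_min=-5.0):
    """Shift all energies so the global minimum equals target_min (kcal/mol)."""
    offset = min(min(traj) for traj in trajectories) - target_min
    return [[e - offset for e in traj] for traj in trajectories]

def weight(energy, temperature=300.0):
    """exp(-E / (k_B T)) for a single point; 300 K is an assumed default."""
    return math.exp(-energy / (K_B * temperature))

# One point at the global minimum, one point 446 kcal/mol above it
zeroed = [[0.0], [446.0]]
reshifted = shift_to_minimum(zeroed, target_min=-5.0)

w_before = weight(zeroed[1][0])     # exponent near -748: underflows to 0.0
w_after = weight(reshifted[1][0])   # exponent near -740: tiny but nonzero
w_largest = weight(reshifted[0][0]) # exponent near +8.4: still finite
```

At 300 K a 5 kcal/mol shift only moves the exponent by about 8, which is why this helps when a trajectory is just barely underflowing and does nothing when it sits thousands of kcal/mol above the minimum.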


Alan Grossfield 2010-06-20