PMF & WHAM¶

The Potential of Mean Force (PMF) is the primary output of AdsPro. It quantifies the free energy G(ξ) as a function of the reaction coordinate ξ — the distance between the protein center of mass (COM) and the NP surface. The minimum of the PMF at ξ_min (adsorbed state) relative to the plateau at large ξ (bulk reference) gives the adsorption free energy:

\[ \Delta G_\text{ads} = G(\xi_\text{min}) - G(\xi_\to\infty) \]

A negative ΔG_ads means adsorption is thermodynamically favorable.

Why Standard MD Cannot Compute ΔG_ads¶

Once a protein adsorbs onto a nanoparticle surface, it effectively becomes trapped: the adsorption energy (tens of kJ/mol) is many times kBT (~2.6 kJ/mol at 310 K), so spontaneous desorption is an extremely rare event on simulation timescales. Standard NVT MD launched from the adsorbed state never samples the bulk reference state; MD launched from solution rarely samples the adsorbed state for strongly adsorbing proteins.

This is a rare event problem, and umbrella sampling is the standard solution.

Steered Molecular Dynamics¶

Phase 2 begins by running constant-velocity steered MD (SMD), pulling the protein COM from its adsorbed position ξ_ads toward the bulk along the surface normal.

The pulling force is:

\[ \mathbf{F}_\text{pull}(t) = k_\text{pull} \left[\xi_\text{target}(t) - \xi(t)\right] \hat{\mathbf{n}} \]

where: - \(\xi_\text{target}(t) = \xi_\text{ads} + v_\text{pull} \cdot t\) — the moving target - \(v_\text{pull}\) = pulling velocity (default 0.01 nm/ps) - \(k_\text{pull}\) = stiff spring constant (default 500 kJ/mol/nm²) - \(\hat{\mathbf{n}}\) — unit vector from NP center to protein COM

The protein is pulled at constant velocity while the surface beads remain fixed. At each step, the snapshot (ξ, positions) is saved — these form the trajectory used by the smart window detection algorithm.

The steered MD does not directly give ΔG_ads — it generates the initial configurations for umbrella windows and probes the range of ξ over which the protein interacts with the surface.

Smart Window Detection¶

The Problem with Fixed Window Range¶

A naive approach sets the umbrella sampling range to cover the entire protein diameter:

\[ \xi_\text{max}^\text{naive} = \xi_\text{ads} + 2.5 \times d_\text{protein} \approx 7.5 \text{ nm for Lysozyme} \]

This wastes most of the simulation time on the flat bulk region (G = const) where there is no useful information. For silica, where only ARG and LYS contribute (both with Morse cutoff 1.5 nm), the protein-surface interaction is essentially zero beyond ξ ≈ 4 nm.

The Smart Algorithm¶

After steered MD, AdsPro scans the trajectory snapshots to detect the last distance at which the protein meaningfully interacts with the surface:

Sample up to 100 snapshots uniformly from the steered MD trajectory
For each snapshot at distance ξ, compute the Morse energy using the active zone
Compare to the thermal noise threshold \(\frac{1}{2} k_B T \approx 1.29\) kJ/mol at 310 K
Record the last ξ where \(|E_\text{Morse}| > \frac{1}{2}k_BT\)

Call this distance \(\xi_\text{last}\). The window maximum is then:

\[ \xi_\text{max} = \xi_\text{last} + \lambda_D + 0.5 \text{ nm} \]

The Debye length buffer \(\lambda_D\) ensures the electrostatic tail is also covered (DH decays as \(e^{-r/\lambda_D}\), so at \(r = \xi_\text{last} + \lambda_D\), the DH contribution is reduced to \(e^{-1} \approx 37\%\) of its value at contact — negligible compared to the PMF well depth). The 0.5 nm flat buffer provides a bulk reference region.

A hard floor \(\xi_\text{max} \geq \xi_\text{ads} + 1.5\) nm prevents degenerate ranges for strongly adsorbed proteins.

For Lysozyme on silica at 20 mM NaCl: - Old formula: \(\xi_\text{max} = 7.45\) nm → ~21 hours - Smart detection: \(\xi_\text{last} = 3.21\) nm, \(\lambda_D = 2.19\) nm → \(\xi_\text{max} = 5.90\) nm → ~16 hours

Umbrella Sampling¶

Umbrella sampling solves the rare event problem by dividing the reaction coordinate into N overlapping windows and applying a harmonic biasing potential in each window to force the system to sample that region of ξ.

Window Placement¶

N windows are placed at evenly spaced positions along \([\xi_\text{min}, \xi_\text{max}]\):

\[ \xi_0^{(k)} = \xi_\text{min} + k \cdot \frac{\xi_\text{max} - \xi_\text{min}}{N - 1}, \quad k = 0, 1, \ldots, N-1 \]

Biasing Potential¶

In window k, the simulation adds a harmonic bias:

\[ V_k^\text{bias}(\xi) = \frac{k_\text{window}}{2} \left(\xi - \xi_0^{(k)}\right)^2 \]

This restrains the protein COM near \(\xi_0^{(k)}\), forcing sampling of the entire reaction coordinate range.

Window Overlap Criterion¶

For WHAM to work correctly, adjacent windows must overlap in their sampled ξ distributions. The standard deviation of ξ in window k is:

\[ \sigma_k = \sqrt{\frac{k_B T}{k_\text{window}}} \]

With \(k_\text{window} = 50\) kJ/(mol·nm²) at 310 K:

\[ \sigma_k = \sqrt{\frac{2.58}{50}} \approx 0.23 \text{ nm} \]

For adequate overlap, the window spacing should satisfy \(\sigma_k > \Delta\xi / 2\):

\[ \Delta\xi = \frac{\xi_\text{max} - \xi_\text{min}}{N-1} < 2\sigma_k \approx 0.46 \text{ nm} \]

AdsPro prints the overlap ratio \(\sigma_k / \Delta\xi\) after smart window detection. A value > 0.5 is required; > 1.0 is excellent.

Window Initialization¶

Each window is initialized from the steered MD snapshot closest to \(\xi_0^{(k)}\), followed by steps_equil_window equilibration steps (not collected for histograms). This removes the non-equilibrium bias introduced by the pulling force.

Sampling¶

After equilibration, the window runs for steps_per_window steps, recording ξ at every step. This builds a biased probability histogram \(P_k^\text{biased}(\xi)\).

WHAM: Weighted Histogram Analysis Method¶

The biased histograms from all N windows are combined by WHAM to recover the unbiased free energy profile G(ξ).

WHAM Equations¶

WHAM solves a set of self-consistent equations. Let: - \(n_k(\xi)\) = count of samples in window k at value ξ - \(N_k\) = total samples in window k - \(f_k\) = free energy offset for window k (the WHAM unknowns)

The unbiased probability density is:

\[ \rho(\xi) = \frac{\displaystyle\sum_k n_k(\xi)}{\displaystyle\sum_k N_k \, e^{-(V_k^\text{bias}(\xi) - f_k)/k_BT}} \]

The free energy offsets are updated iteratively:

\[ e^{-f_k/k_BT} = \int \rho(\xi) \, e^{-V_k^\text{bias}(\xi)/k_BT} \, d\xi \]

These two equations are solved by alternating updates until convergence:

\[ \max_k |f_k^{(n+1)} - f_k^{(n)}| < \varepsilon_\text{WHAM} \]

AdsPro uses \(\varepsilon_\text{WHAM} = 10^{-4}\) kJ/mol and a maximum of 500 iterations.

PMF from WHAM¶

Once WHAM has converged:

\[ G(\xi) = -k_B T \ln \rho(\xi) \]

The bulk reference is taken as the mean G over the rightmost 10% of populated ξ bins — the region far from the surface where G is flat.

The adsorption free energy is:

\[ \Delta G_\text{ads} = G(\xi_\text{min}) - G_\text{bulk} \]

where \(\xi_\text{min}\) is the position of the PMF minimum (adsorbed state).

Convergence Diagnostics¶

WHAM prints whether it converged (WHAM converged: Yes/No) and the number of iterations required. If WHAM does not converge:

Poor overlap — reduce k_window (σ_k grows as \(1/\sqrt{k_\text{window}}\)) or add more windows
Insufficient sampling — increase steps_per_window
Incomplete coverage — check whether steered MD covered the full ξ range

Multiple Independent Runs (n_runs)¶

To reduce sampling noise, AdsPro can run N independent simulations (n_runs > 1). Each run uses a different Phase 1 random seed, so the protein may settle into a slightly different adsorbed conformation with a slightly different ξ_ads. The window range \([\xi_\text{ads}, \xi_\text{max}]\) shifts accordingly.

Results are reported as mean ± SD. The run whose ΔG_ads is closest to the median is selected as the representative. This selection is robust to outliers — a single failed run does not dominate the result.

Rule of thumb: SD > 10 kJ/mol across runs signals insufficient sampling. Increase steps_per_window or max_steps.

Interpreting ΔG_ads¶

ΔG_ads (kJ/mol)	Interpretation
> 0	Unfavorable — protein does not adsorb spontaneously
0 to −10	Weakly adsorbing — reversible, concentration-dependent
−10 to −40	Moderate — typical for many globular proteins
−40 to −100	Strong — likely irreversible on experimental timescales
< −100	Very strong — hard corona component

The PMF-derived ΔG_ads is directly comparable to experimental measurements from isothermal titration calorimetry (ITC), surface plasmon resonance (SPR), or quartz crystal microbalance with dissipation (QCM-D), all of which measure the thermodynamic free energy of the same adsorption process.