openfe_analysis.rmsd

Functions

gather_rms_data(pdb_topology, dataset[, skip])

Compute structural RMSD-based metrics for a multistate BFE simulation.

make_Universe(top, trj, state)

Construct an MDAnalysis Universe from a MultiState NetCDF trajectory and apply standard analysis transformations.

twoD_RMSD(positions, w)

Compute a flattened 2D RMSD matrix from a trajectory.

openfe_analysis.rmsd.make_Universe(top: Path, trj: Dataset, state: int) → MDAnalysis.Universe

Construct an MDAnalysis Universe from a MultiState NetCDF trajectory and apply standard analysis transformations.

The Universe is created using the custom FEReader to extract a single state from a multistate simulation.

Parameters:
  • top (pathlib.Path or Topology) – Path to a topology file (e.g. PDB) or an already-loaded MDAnalysis topology object.

  • trj (nc.Dataset) – Open NetCDF dataset produced by openmmtools.multistate.MultiStateReporter.

  • state (int) – Thermodynamic state index to extract from the multistate trajectory.

Returns:

A Universe with trajectory transformations applied.

Return type:

MDAnalysis.Universe

Notes

Identifies two AtomGroups:

  • Protein, defined as residues with standard amino acid names, then restricted to CA atoms

  • Ligand, defined as “resname UNK”

Depending on whether a protein is present, a sequence of trajectory transformations is applied:

If a protein is present:

  • Unwraps the protein and ligand atoms so that each molecule is made whole

  • Shifts protein chains and the ligand to the image closest to the first protein chain (ClosestImageShift)

  • Aligns the entire system to minimize the protein RMSD (Aligner)

If only a ligand is present:

  • Prevents the ligand from jumping between periodic images

  • Aligns the ligand to minimize its RMSD
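The closest-image step above can be illustrated with a small NumPy sketch. It is not the ClosestImageShift implementation; it only shows the underlying minimum-image idea, assuming an orthorhombic box. The function name shift_to_closest_image and the example coordinates are hypothetical.

```python
import numpy as np


def shift_to_closest_image(position, reference, box_lengths):
    """Return the periodic image of `position` that lies nearest to `reference`.

    Assumption: the box is orthorhombic with edge lengths `box_lengths`.
    """
    position = np.asarray(position, dtype=float)
    reference = np.asarray(reference, dtype=float)
    box_lengths = np.asarray(box_lengths, dtype=float)

    delta = position - reference
    # Remove the whole number of box lengths that separates the two points,
    # leaving each component of the separation within half a box length.
    delta -= box_lengths * np.round(delta / box_lengths)
    return reference + delta


# Hypothetical example: the ligand sits on the far side of a 10 A box.
protein_com = np.array([1.0, 1.0, 1.0])
ligand_com = np.array([9.5, 1.0, 1.0])
box = np.array([10.0, 10.0, 10.0])

shifted = shift_to_closest_image(ligand_com, protein_com, box)
```

After the shift the ligand is 1.5 A from the protein along x rather than 8.5 A, so a subsequent alignment and RMSD calculation is not distorted by the periodic wrap.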

openfe_analysis.rmsd.gather_rms_data(pdb_topology: Path, dataset: Path, skip: int | None = None) → dict[str, list[float]]

Compute structural RMSD-based metrics for a multistate BFE simulation.

Parameters:
  • pdb_topology (pathlib.Path) – Path to the PDB file defining system topology.

  • dataset (pathlib.Path) – Path to the NetCDF trajectory file produced by a multistate simulation.

  • skip (int, optional) – Frame stride for analysis. If None, a stride is chosen such that approximately 500 frames are analyzed per state.
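One plausible way to derive such a default stride is sketched below. The exact rounding rule used by gather_rms_data is not stated here, so the integer division and the helper name choose_stride are assumptions.

```python
def choose_stride(n_frames, target_frames=500):
    """Pick a frame stride so that roughly `target_frames` frames are analyzed.

    Assumption: plain integer division, with a floor of 1 so that short
    trajectories are analyzed in full.
    """
    return max(1, n_frames // target_frames)


# Hypothetical trajectory lengths.
stride = choose_stride(5000)
n_analyzed = len(range(0, 5000, stride))
short_stride = choose_stride(120)
```

Passing skip explicitly overrides this kind of automatic choice, which is useful when a fixed stride is needed to compare runs of different lengths.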

Returns:

Dictionary containing per-state analysis results with keys: protein_RMSD, ligand_RMSD, ligand_wander, protein_2D_RMSD, and time(ps).

Return type:

dict[str, list]

Notes

For each thermodynamic state (lambda), this function:

  • Loads the trajectory using FEReader

  • Applies standard PBC-handling and alignment transformations

  • Computes protein and ligand structural metrics over time

The following analyses are produced per state:

  • 1D protein CA RMSD time series

  • 1D ligand RMSD time series

  • Ligand center-of-mass displacement from its initial position (ligand_wander)

  • Flattened 2D protein RMSD matrix (pairwise RMSD between frames)
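As an illustration of the ligand_wander metric, the sketch below computes a center-of-mass displacement from the first frame with NumPy. It assumes the coordinates have already been unwrapped and aligned, and that positions are stored as an (n_frames, n_atoms, 3) array; the function name com_displacement and the example data are hypothetical, not part of openfe_analysis.

```python
import numpy as np


def com_displacement(positions, masses):
    """Distance of the center of mass from its position in the first frame.

    positions: array of shape (n_frames, n_atoms, 3)
    masses:    array of shape (n_atoms,)
    Returns an array of shape (n_frames,).
    """
    positions = np.asarray(positions, dtype=float)
    masses = np.asarray(masses, dtype=float)

    # Mass-weighted average over the atom axis gives one COM per frame.
    com = np.einsum("fad,a->fd", positions, masses) / masses.sum()
    # Displacement of each frame's COM relative to frame 0.
    return np.linalg.norm(com - com[0], axis=1)


# Hypothetical two-atom ligand that translates rigidly by (3, 4, 0).
frame0 = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
frame1 = frame0 + np.array([3.0, 4.0, 0.0])
wander = com_displacement(np.stack([frame0, frame1]), masses=[12.0, 1.0])
```

A rigid translation moves the center of mass by the same vector regardless of the masses, so the second frame reports a displacement of 5.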

openfe_analysis.rmsd.twoD_RMSD(positions, w: numpy.ndarray | None) → list[float]

Compute a flattened 2D RMSD matrix from a trajectory.

For all unique frame pairs (i, j) with i < j, this function computes the RMSD between atomic coordinates after optimal alignment.

Parameters:
  • positions (np.ndarray) – Atomic coordinates for all frames in the trajectory.

  • w (np.ndarray, optional) – Per-atom weights to use in the RMSD calculation. If None, all atoms are weighted equally.

Returns:

Flattened list of RMSD values corresponding to all frame pairs (i, j) with i < j.

Return type:

list of float
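A minimal sketch of this kind of calculation, using the Kabsch algorithm for the optimal superposition, is given below. It is not the twoD_RMSD implementation: the helper names, the (n_frames, n_atoms, 3) layout, and the ordering of the flattened list (i in the outer loop, j in the inner loop) are assumptions.

```python
import numpy as np


def kabsch_rmsd(a, b, weights=None):
    """Weighted RMSD between two (n_atoms, 3) arrays after optimal superposition."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    w = np.ones(len(a)) if weights is None else np.asarray(weights, dtype=float)
    w = w / w.sum()

    # Remove the weighted centroid from each structure.
    a0 = a - w @ a
    b0 = b - w @ b

    # Singular values of the weighted covariance matrix.
    u, s, vt = np.linalg.svd((a0 * w[:, None]).T @ b0)
    # Flip the smallest singular value if the best fit would be a reflection.
    if np.linalg.det(u @ vt) < 0:
        s[-1] = -s[-1]

    # Minimal weighted mean squared deviation over all proper rotations.
    msd = w @ (a0**2).sum(axis=1) + w @ (b0**2).sum(axis=1) - 2.0 * s.sum()
    return float(np.sqrt(max(msd, 0.0)))


def pairwise_rmsd_flat(positions, weights=None):
    """Flattened RMSD values for all frame pairs (i, j) with i < j."""
    n = len(positions)
    return [
        kabsch_rmsd(positions[i], positions[j], weights)
        for i in range(n)
        for j in range(i + 1, n)
    ]


# Hypothetical three-frame trajectory of four atoms.
frame0 = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]])
rot_z = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
frame1 = frame0 @ rot_z.T + np.array([5.0, -2.0, 1.0])  # rigid copy of frame0
frame2 = frame0.copy()
frame2[3] = [0.0, 0.0, 4.0]  # one atom displaced

flat = pairwise_rmsd_flat(np.stack([frame0, frame1, frame2]))
```

For n frames the flattened list holds n(n-1)/2 values. Here the pair (0, 1) gives an RMSD of essentially zero because frame1 is a rotated and translated copy of frame0, while the pairs (0, 2) and (1, 2) give the same nonzero value.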