Water bridge finder (WatFinder)

This module provides the WatFinder toolkit, which detects, predicts and analyzes water bridges.

prody.proteins.waterbridges.calcBridgingResiduesHistogram(frames, **kwargs)[source]

Calculates, plots and returns the number of frames in which each residue is involved in making water bridges, sorted by value.

Parameters:
  • frames (list) – list of water bridges from calcWaterBridgesTrajectory(), output=’atomic’

  • clip (int) – maximum number of residues shown on the graph; set to None to show all. Default is 20

  • use_segname (bool) – whether to use segname to label residues. Default is False, because the labels then get long
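
The counting behind such a histogram can be sketched in plain Python. This is only an illustration under stated assumptions, not ProDy's implementation: the hypothetical helper `count_bridging_frames` takes each frame as a list of bridges and each bridge as a tuple of residue labels (a simplified stand-in for the atomic output), and it sorts in descending order, which the documentation does not specify.

```python
from collections import Counter


def count_bridging_frames(frames, clip=20):
    """Count, per residue label, the frames in which it takes part in a bridge.

    `frames` is assumed to be a list with one entry per frame; each entry is a
    list of bridges and each bridge a tuple of residue labels (hypothetical
    layout, not ProDy's data structure).
    """
    counts = Counter()
    for bridges in frames:
        # A residue counts once per frame, however many bridges it joins.
        residues_in_frame = set()
        for bridge in bridges:
            residues_in_frame.update(bridge)
        counts.update(residues_in_frame)
    # Sort by frame count, highest first (assumed order); ties by label.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    # clip=None keeps every residue, mirroring the documented option.
    return ranked[:clip]


example_frames = [
    [("ASP12", "LYS40")],
    [("ASP12", "GLU7"), ("ASP12", "LYS40")],
    [],
]
print(count_bridging_frames(example_frames, clip=2))
```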

prody.proteins.waterbridges.calcWaterBridgeMatrix(data, metric)[source]

Returns a matrix that has the chosen metric as values and residue ids as axis indices.

Parameters:
  • data (dict) – dictionary returned by calcWaterBridgesStatistics, output=’indices’

  • metric ('percentage' | 'distAvg' | 'distStd') – dict key from data
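
As a rough illustration of the idea, the sketch below builds such a matrix from a dictionary keyed by residue-id pairs. The layout `{(i, j): {metric: value}}`, the symmetric fill and the helper name `bridge_matrix` are assumptions made for this example; the actual structure returned by calcWaterBridgesStatistics may differ.

```python
def bridge_matrix(data, metric):
    """Build a symmetric square matrix from {(i, j): {metric: value}}.

    Residue ids are used directly as row and column indices, so the matrix
    has max(id) + 1 rows (assumed layout, for illustration only).
    """
    size = max(max(pair) for pair in data) + 1 if data else 0
    matrix = [[0.0] * size for _ in range(size)]
    for (i, j), values in data.items():
        # A bridge between i and j is also a bridge between j and i.
        matrix[i][j] = matrix[j][i] = values[metric]
    return matrix


example = {(1, 3): {"percentage": 50.0, "distAvg": 4.2, "distStd": 0.3}}
for row in bridge_matrix(example, "percentage"):
    print(row)
```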

prody.proteins.waterbridges.calcWaterBridges(atoms, **kwargs)[source]

Compute water bridges for a protein that has water molecules.

Parameters:
  • atoms (Atomic) – Atomic object from which atoms are considered

  • method (string 'cluster' | 'chain') – cluster or chain, where chain finds the shortest water-bridging path between two protein atoms. Default is ‘chain’

  • distDA (int, float) – maximal distance between water/protein donor and acceptor. Default is 3.5

  • distWR (int, float) – maximal distance between considered water and any residue. Default is 4

  • anglePDWA ((int, int)) – angle range where protein is donor and water is acceptor. Default is (100, 200)

  • anglePAWD – angle range where protein is acceptor and water is donor. Default is (100, 140)

  • angleWW ((int, int)) – angle between water donor/acceptor. Default is (140, 180)

  • maxDepth (int, None) – maximum number of waters in chain/depth of residues in cluster. Default is 2

  • maxNumRes (int, None) – maximum number of water+protein residues in cluster. Default is None

  • donors (list) – which atoms to count as donors. Default is [‘N’, ‘O’, ‘S’, ‘F’]

  • acceptors (list) – which atoms to count as acceptors. Default is [‘N’, ‘O’, ‘S’, ‘F’]

  • output ('info' | 'atomic' | 'indices') – return information arrays, (protein atoms, water atoms), or just atom indices per bridge. Default is ‘atomic’

  • isInfoLog – whether to log information. Default is True

  • selstr (str) – selection string for focusing the analysis. The default of None focuses on everything

  • expand_selection (bool) – whether to expand the selection with selectSurroundingsBox(), selecting a box surrounding it. Default is False

  • considered_atoms_sel (str) – selection string for which atoms to consider. Default is “protein”
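
The distance and angle parameters act as geometric filters on candidate donor/acceptor pairs. The sketch below shows one way such a filter could look, using the documented defaults for distDA and anglePDWA. It is an assumption-laden illustration: the helper names are hypothetical, and taking the angle at the hydrogen between donor and acceptor is a guess at the geometry, not a statement of how ProDy measures it.

```python
import math


def distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def angle_deg(a, b, c):
    """Angle in degrees at vertex b for the points a-b-c."""
    v1 = [a[k] - b[k] for k in range(3)]
    v2 = [c[k] - b[k] for k in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.sqrt(sum(x * x for x in v1)) * math.sqrt(sum(y * y for y in v2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))


def passes_criteria(donor, hydrogen, acceptor, dist_da=3.5, angle_range=(100, 200)):
    """Check a donor-acceptor distance cutoff and an angle window.

    The angle is taken at the hydrogen (assumption made for this sketch).
    """
    close_enough = distance(donor, acceptor) <= dist_da
    angle = angle_deg(donor, hydrogen, acceptor)
    return close_enough and angle_range[0] <= angle <= angle_range[1]


print(passes_criteria((0, 0, 0), (1, 0, 0), (2.8, 0, 0)))
```

Keeping the cutoff and the angle window as separate arguments mirrors the way the parameters are exposed above, so each criterion can be tightened independently.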

prody.proteins.waterbridges.calcWaterBridgesDistribution(frames, res_a, res_b=None, **kwargs)[source]

Returns the distribution for a given metric and plots it if possible.

Parameters:
  • res_a – name of first residue

  • res_b – name of second residue. Default is None

  • metric ('residues' | 'waters' | 'distance' | 'location') – ‘residues’ returns names and frame count of residues interacting with res_a; ‘waters’ returns the water count for each bridge between res_a and res_b; ‘distance’ returns the distance between each pair of protein atoms involved in a bridge between res_a and res_b; ‘location’ returns a dictionary with backbone/sidechain count information

  • output ('dict' | 'indices') – return 2D matrices or a dictionary whose keys are residue info. Default is ‘dict’

  • trajectory – DCD file, necessary for the distance distribution

prody.proteins.waterbridges.calcWaterBridgesStatistics(frames, trajectory, **kwargs)[source]

Returns statistics. The value is the percentage of frames in which a bridge appears for each residue.

Parameters:
  • frames (list) – list of water bridges from calcWaterBridgesTrajectory(), output=’atomic’

  • output ('info' | 'indices') – return a dictionary whose keys are tuples of resnames or resids. Default is ‘indices’

  • filename (string) – name of the file in which to save the statistics, if wanted. Default is None

  • considered_atoms_sel (str) – selection string for which atoms to consider. Default is “protein”
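
The kind of per-pair statistics described here (the 'percentage', 'distAvg' and 'distStd' keys used by the matrix functions) can be sketched as follows. The input layout, one dictionary of `{(res_a, res_b): distance}` per frame, and the helper name `bridge_statistics` are hypothetical, and the population standard deviation is used only because the documentation does not say which estimator applies.

```python
from statistics import mean, pstdev


def bridge_statistics(frames):
    """Summarize bridge occurrence and distance per residue pair.

    `frames` is assumed to be a list of {(res_a, res_b): distance} dicts,
    one per frame (illustrative layout, not ProDy's).
    """
    observed = {}
    for frame in frames:
        for pair, dist in frame.items():
            # Treat (a, b) and (b, a) as the same bridge.
            observed.setdefault(tuple(sorted(pair)), []).append(dist)
    n_frames = len(frames)
    return {
        pair: {
            "percentage": 100.0 * len(dists) / n_frames,
            "distAvg": mean(dists),
            "distStd": pstdev(dists),  # population std; an assumption
        }
        for pair, dists in observed.items()
    }


example = [{(10, 25): 4.0}, {(25, 10): 6.0}, {}, {(10, 30): 5.0}]
for pair, values in bridge_statistics(example).items():
    print(pair, values)
```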

prody.proteins.waterbridges.calcWaterBridgesTrajectory(atoms, trajectory, **kwargs)[source]

Computes water bridges for a given trajectory. Kwargs for options are the same as in calcWaterBridges.

Parameters:
  • atoms (Atomic) – Atomic object from which atoms are considered

  • trajectory (Trajectory, Ensemble, Atomic) – trajectory data coming from a DCD, ensemble or multi-model PDB file.

  • start_frame (int) – frame to start from

  • stop_frame (int) – frame to stop

  • max_proc (int) – maximum number of processes to use. Default is half of the number of CPUs

  • selstr (str) – selection string for focusing the analysis. The default of None focuses on everything

  • expand_selection (bool) – whether to expand the selection with selectSurroundingsBox(), selecting a box surrounding it. Default is False

If selstr is provided, a common selection will be found across all frames, combining the selections satisfying the criteria in each.

Parameters:
  • return_selection (bool) – whether to return the combined common selection. Default is False to keep expected behaviour. However, this output is required when using selstr.

  • considered_atoms_sel (str) – selection string for which atoms to consider. Default is “protein”

prody.proteins.waterbridges.filterStructuresWithoutWater(structures, min_water=0, filenames=None)[source]

Filters out the entries of structures that have no water or fewer water molecules than min_water.

Parameters:
  • structures (list) – list of Atomic structures to be filtered

  • min_water (int) – minimum number of water O atoms, default is 0

  • filenames (list) – an optional list of filenames to filter too. This is an output argument

prody.proteins.waterbridges.findClusterCenters(file_pattern, **kwargs)[source]

Finds molecules that form clusters in 3D space.

Parameters:
  • file_pattern (str) – file pattern for analysis; it can include ‘*’. Example: ’file_*.pdb’ will analyze file_1.pdb, file_2.pdb, etc.

  • selection (str) – selection string; by default ‘water and name “O.*”’ is used

  • distC (int, float) – distance to other molecules. Default is 0.3

  • numC (int) – minimum number of molecules in a cluster. Default is 3

  • filename (str) – filename for the output PDB file with clusters. The default of None leads to ‘clusters_’+file_pattern.split(“*”)[0]+’.pdb’
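
To make the roles of distC and numC concrete, here is a deliberately naive clustering sketch over 3D points: points within `dist_c` of a seed are grouped, and a group is reported only if it has at least `num_c` members. The greedy seed-based strategy and the helper name `find_cluster_centers` are assumptions for illustration and should not be read as the algorithm ProDy uses.

```python
import math


def find_cluster_centers(points, dist_c=0.3, num_c=3):
    """Greedy clustering: group points within dist_c of a seed point.

    Returns the centroid of every group with at least num_c members.
    """
    unused = list(points)
    centers = []
    while unused:
        seed = unused[0]
        # The seed is always its own neighbour, so the loop always shrinks.
        members = [
            p for p in unused
            if math.sqrt(sum((a - b) ** 2 for a, b in zip(p, seed))) <= dist_c
        ]
        if len(members) >= num_c:
            centers.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
        unused = [p for p in unused if p not in members]
    return centers


waters = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.2, 0.0, 0.0), (5.0, 5.0, 5.0)]
print(find_cluster_centers(waters))
```

Because the result of a greedy pass depends on the order of the points, a production analysis would more likely rely on an established clustering method.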

prody.proteins.waterbridges.findCommonSelectionTraj(atoms, traj, selstr, **kwargs)[source]

Select selstr within atoms for each frame in traj using a bounding box with optional padding.

Parameters:

expand_selection (bool) – whether to expand selections with selectSurroundingsBox(). Default False

Returns the common selection and corresponding indices and optionally the corresponding selstr if return_selstr is True

prody.proteins.waterbridges.getWaterBridgeStatInfo(stats, atoms, **kwargs)[source]

Converts the indices output of calcWaterBridgesStatistics to info output.

Parameters:
  • stats (dictionary) – statistics returned by calcWaterBridgesStatistics, output=’indices’

  • atoms (Atomic) – Atomic object from which atoms are considered

prody.proteins.waterbridges.getWaterBridgesInfoOutput(waterBridgesAtomic)[source]

Converts single frame/trajectory atomic output from calcWaterBridges/Trajectory to info output.

Parameters:

waterBridgesAtomic (list) – water bridges from calcWaterBridges/Trajectory

prody.proteins.waterbridges.parseWaterBridges(filename, atoms)[source]

Parses water bridges from a .wb file saved by saveWaterBridges and returns them in atomic form.

Parameters:
  • filename (string) – path of file where bridges are stored

  • atoms (Atomic) – Atomic object on which calcWaterBridges was performed

prody.proteins.waterbridges.savePDBWaterBridges(bridges, atoms, filename)[source]

Saves a single PDB with occupancy on protein atoms and waters involved in bridges.

Parameters:
  • bridges (list) – atomic output from calcWaterBridges

  • atoms (Atomic) – Atomic object from which atoms are considered

  • filename (string) – name of file to be saved

prody.proteins.waterbridges.savePDBWaterBridgesTrajectory(bridgeFrames, atoms, filename, trajectory=None, max_proc=1)[source]

Saves one PDB per frame with occupancy and beta on protein atoms and waters forming bridges in frame.

Parameters:
  • bridgeFrames (list) – atomic output from calcWaterBridgesTrajectory

  • atoms (Atomic) – Atomic object from which atoms are considered

  • filename (string) – name of file to be saved; must end in .pdb

  • trajectory (Trajectory, Ensemble, Atomic) – trajectory data (not needed for multi-model PDB)

prody.proteins.waterbridges.saveWaterBridges(atomicBridges, filename)[source]

Save water bridges as information (.txt) or WaterBridges (.wb) parsable file.

Parameters:
  • atomicBridges (list) – atomic output from calcWaterBridges/Trajectory

  • filename (string) – path where file should be saved

prody.proteins.waterbridges.selectSurroundingsBox(atoms, select, **kwargs)[source]

Select the surroundings of select within atoms using a bounding box with optional padding.

Parameters:

return_selstr (bool) – whether to return the final selstr Default False
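
The bounding-box idea can be illustrated without ProDy: take the minimum and maximum coordinates of the selected atoms, widen the box by a padding, and keep every atom that falls inside. In this sketch the function name, the plain coordinate lists and the `padding` argument name are all hypothetical; the real function works on Atomic objects and selection strings.

```python
def select_surroundings_box(coords, selected, padding=0.0):
    """Return indices of all points inside the padded bounding box of `selected`.

    `coords` is a list of (x, y, z) tuples and `selected` a list of indices
    into it (illustrative stand-ins for atoms and a selection).
    """
    box_min = [min(coords[i][k] for i in selected) - padding for k in range(3)]
    box_max = [max(coords[i][k] for i in selected) + padding for k in range(3)]
    return [
        index
        for index, xyz in enumerate(coords)
        if all(box_min[k] <= xyz[k] <= box_max[k] for k in range(3))
    ]


atoms_xyz = [(0, 0, 0), (2, 2, 2), (1, 1, 1), (3, 3, 3), (-1, 0, 0)]
print(select_surroundings_box(atoms_xyz, selected=[0, 1]))
print(select_surroundings_box(atoms_xyz, selected=[0, 1], padding=1.0))
```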

prody.proteins.waterbridges.showWaterBridgeMatrix(data, metric)[source]

Shows a matrix that has the percentage or average distance as values and residue ids as axis indices.

Parameters:
  • data (dict) – dictionary returned by calcWaterBridgesStatistics, output=’indices’

  • metric ('percentage' | 'distAvg' | 'distStd') – dict key from data