mmCIF File

This module defines functions for parsing mmCIF files.

prody.proteins.ciffile.parseCIF(pdb, **kwargs)

Returns an AtomGroup and/or a StarDict containing header data parsed from an mmCIF file. If not found, the mmCIF file will be downloaded from the PDB. It will be downloaded in uncompressed format regardless of the compressed keyword.

This function extends parseMMCIFStream().

Parameters:

pdb (str) – a PDB identifier or a filename. If needed, mmCIF files are downloaded using the fetchPDB() function.
title (str) – title of the AtomGroup instance, default is the PDB filename or PDB identifier
chain (str) – chain identifiers for parsing specific chains, e.g. chain='A', chain='B', chain='DE', by default all chains are parsed
segment (str) – segment identifiers for parsing specific segments, e.g. segment='A', segment='B', segment='DE', by default all segments are parsed
subset (str) – a predefined keyword to parse a subset of atoms, valid keywords are 'calpha' ('ca'), 'backbone' ('bb'), or None (read all atoms), e.g. subset='bb'
model (int, list) – model index or None (read all models), e.g. model=10
altloc (str) – if a location indicator such as 'A' or 'B' is passed, only the indicated alternate locations are parsed, as the single coordinate set of the AtomGroup. If altloc is set to 'all', all alternate locations are parsed and each is appended as a distinct coordinate set. Default is "A"
unite_chains (bool) – unite chains with the same segment name (auth_asym_id), so that chain ids are taken from auth_asym_id instead of label_asym_id. This can be helpful in some cases, e.g. alignments, but can also cause problems. For example, calling buildBiomolecules() afterwards requires the original chain ids (label_asym_id). Using biomol=True inside parseMMCIF is fine. Default is False
packmol (bool) – whether to renumber chains like packmol, default is False

prody.proteins.ciffile.parseMMCIF(pdb, **kwargs)

Returns an AtomGroup and/or a StarDict containing header data parsed from an mmCIF file. If not found, the mmCIF file will be downloaded from the PDB. It will be downloaded in uncompressed format regardless of the compressed keyword.

This function extends parseMMCIFStream().

Parameters:

pdb (str) – a PDB identifier or a filename. If needed, mmCIF files are downloaded using the fetchPDB() function.
title (str) – title of the AtomGroup instance, default is the PDB filename or PDB identifier
chain (str) – chain identifiers for parsing specific chains, e.g. chain='A', chain='B', chain='DE', by default all chains are parsed
segment (str) – segment identifiers for parsing specific segments, e.g. segment='A', segment='B', segment='DE', by default all segments are parsed
subset (str) – a predefined keyword to parse a subset of atoms, valid keywords are 'calpha' ('ca'), 'backbone' ('bb'), or None (read all atoms), e.g. subset='bb'
model (int, list) – model index or None (read all models), e.g. model=10
altloc (str) – if a location indicator such as 'A' or 'B' is passed, only the indicated alternate locations are parsed, as the single coordinate set of the AtomGroup. If altloc is set to 'all', all alternate locations are parsed and each is appended as a distinct coordinate set. Default is "A"
unite_chains (bool) – unite chains with the same segment name (auth_asym_id), so that chain ids are taken from auth_asym_id instead of label_asym_id. This can be helpful in some cases, e.g. alignments, but can also cause problems. For example, calling buildBiomolecules() afterwards requires the original chain ids (label_asym_id). Using biomol=True inside parseMMCIF is fine. Default is False
packmol (bool) – whether to renumber chains like packmol, default is False
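
As a rough usage sketch, the documented keywords can be collected into a dictionary and passed to parseMMCIF(). The helper names build_parse_options and parse_structure are hypothetical and not part of ProDy; only the parseMMCIF() call and the keywords listed above come from this page. ProDy is imported lazily, so the option-building part works without it.

```python
# Keywords documented above for ``subset``; None means "read all atoms".
VALID_SUBSETS = {"calpha", "ca", "backbone", "bb", None}


def build_parse_options(chain=None, subset=None, altloc="A", unite_chains=False):
    """Collect documented parseMMCIF keywords into a dict (hypothetical helper)."""
    if subset not in VALID_SUBSETS:
        raise ValueError("subset must be 'calpha', 'ca', 'backbone', 'bb' or None")
    options = {"altloc": altloc, "unite_chains": unite_chains}
    # Leave out chain and subset when unset so the parser defaults apply.
    if chain is not None:
        options["chain"] = chain
    if subset is not None:
        options["subset"] = subset
    return options


def parse_structure(pdb, **options):
    """Call ProDy's parseMMCIF with validated options (hypothetical wrapper)."""
    from prody import parseMMCIF  # lazy import: requires ProDy to be installed
    return parseMMCIF(pdb, **build_parse_options(**options))
```

With ProDy installed, a call such as parse_structure('1ubq', chain='A', subset='ca') would be expected to parse only the alpha carbons of chain A, downloading the file first if it is not found locally. The identifier '1ubq' is used purely as an example.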

prody.proteins.ciffile.parseMMCIFStream(stream, **kwargs)

Returns an AtomGroup and/or a StarDict containing header data parsed from a stream of CIF lines.

Parameters:

stream – Anything that implements the method readlines (e.g. file, buffer, stdin)
title (str) – title of the AtomGroup instance, default is the PDB filename or PDB identifier
chain (str) – chain identifiers for parsing specific chains, e.g. chain='A', chain='B', chain='DE', by default all chains are parsed
segment (str) – segment identifiers for parsing specific segments, e.g. segment='A', segment='B', segment='DE', by default all segments are parsed
subset (str) – a predefined keyword to parse a subset of atoms, valid keywords are 'calpha' ('ca'), 'backbone' ('bb'), or None (read all atoms), e.g. subset='bb'
model (int, list) – model index or None (read all models), e.g. model=10
altloc (str) – if a location indicator such as 'A' or 'B' is passed, only the indicated alternate locations are parsed, as the single coordinate set of the AtomGroup. If altloc is set to 'all', all alternate locations are parsed and each is appended as a distinct coordinate set. Default is "A"
unite_chains (bool) – unite chains with the same segment name (auth_asym_id), so that chain ids are taken from auth_asym_id instead of label_asym_id. This can be helpful in some cases, e.g. alignments, but can also cause problems. For example, calling buildBiomolecules() afterwards requires the original chain ids (label_asym_id). Using biomol=True inside parseMMCIF is fine. Default is False
packmol (bool) – whether to renumber chains like packmol, default is False
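
To illustrate what a readlines-capable stream looks like and why unite_chains changes the chain ids, the following toy reader pulls label_asym_id and auth_asym_id from a small hand-written _atom_site loop. It is not ProDy's parser: it assumes one whitespace-separated row per atom with no quoting, and both the data and the function read_chain_ids are made up for illustration.

```python
import io

# A tiny, hand-written _atom_site loop (illustrative data, not a real entry).
CIF_TEXT = """\
data_demo
loop_
_atom_site.group_PDB
_atom_site.id
_atom_site.label_atom_id
_atom_site.label_asym_id
_atom_site.auth_asym_id
ATOM 1 CA A A
ATOM 2 CA B A
HETATM 3 O C A
"""


def read_chain_ids(stream):
    """Return (label_asym_id, auth_asym_id) pairs from an _atom_site loop.

    Toy reader: assumes one whitespace-separated row per atom, no quoting.
    """
    columns, pairs = [], []
    for line in stream.readlines():
        line = line.strip()
        if line.startswith("_atom_site."):
            # Header line: remember the column name after the category prefix.
            columns.append(line.split(".", 1)[1])
        elif columns and line and not line.startswith(("#", "loop_", "_")):
            # Data row: map each value to its column name.
            row = dict(zip(columns, line.split()))
            pairs.append((row["label_asym_id"], row["auth_asym_id"]))
    return pairs


stream = io.StringIO(CIF_TEXT)  # any object with readlines() would do
pairs = read_chain_ids(stream)
label_ids = sorted({label for label, _ in pairs})  # ['A', 'B', 'C']
auth_ids = sorted({auth for _, auth in pairs})     # ['A']
```

Here three label_asym_id chains share a single auth_asym_id, which is the situation in which unite_chains=True would be expected to merge them under one chain id. A real mmCIF file opened with open() exposes the same readlines() method and can be handed to parseMMCIFStream() in the same way; a loop as minimal as this one may lack fields the real parser expects.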

prody.proteins.ciffile.writeMMCIF(filename, atoms, csets=None, autoext=True, **kwargs)

Write atoms in mmCIF format to a file with name filename and return filename. If filename ends with .gz, a compressed file will be written.

Parameters:

atoms (Atomic) – an object with atom and coordinate data
csets – coordinate set indices, default is all coordinate sets
autoext – if filename does not already have the .cif extension, append it; default is True

Keyword Arguments:

header – header data to write to the file as well
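
A minimal sketch of writing a structure follows. It builds the output name explicitly and passes autoext=False, so it relies only on the documented rule that a filename ending in .gz produces a compressed file. The helpers mmcif_filename and save_as_mmcif are hypothetical names, not part of ProDy; the writeMMCIF() call uses the signature shown above.

```python
def mmcif_filename(stem, compress=False):
    """Build an explicit output name, e.g. 'model' -> 'model.cif' (hypothetical helper)."""
    name = stem if stem.lower().endswith(".cif") else stem + ".cif"
    # A trailing .gz asks writeMMCIF for a compressed file, per the description above.
    return name + ".gz" if compress else name


def save_as_mmcif(atoms, stem, compress=False, csets=None):
    """Write atoms with ProDy's writeMMCIF and return the filename it reports."""
    from prody import writeMMCIF  # lazy import: requires ProDy to be installed
    filename = mmcif_filename(stem, compress)
    # autoext=False because the extension has already been added explicitly.
    return writeMMCIF(filename, atoms, csets=csets, autoext=False)
```

For an AtomGroup named atoms, save_as_mmcif(atoms, 'model', compress=True) would then be expected to write model.cif.gz and return that name.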