mmCIF File
This module defines functions for parsing mmCIF files.
- prody.proteins.ciffile.parseCIF(pdb, **kwargs)
Returns an AtomGroup and/or a StarDict containing header data parsed from an mmCIF file. If not found, the mmCIF file will be downloaded from the PDB. It will be downloaded in uncompressed format regardless of the compressed keyword.
This function extends parseMMCIFStream().
- Parameters:
pdb (str) – a PDB identifier or a filename. If needed, mmCIF files are downloaded using the fetchPDB() function.
title (str) – title of the AtomGroup instance; default is the PDB filename or PDB identifier
chain (str) – chain identifiers for parsing specific chains, e.g. chain='A', chain='B', chain='DE'; by default all chains are parsed
segment (str) – segment identifiers for parsing specific segments, e.g. segment='A', segment='B', segment='DE'; by default all segments are parsed
subset (str) – a predefined keyword to parse a subset of atoms; valid keywords are 'calpha' ('ca'), 'backbone' ('bb'), or None (read all atoms), e.g. subset='bb'
model (int, list) – model index or None (read all models), e.g. model=10
altloc (str) – if a location indicator such as 'A' or 'B' is passed, only the indicated alternate locations are parsed as the single coordinate set of the AtomGroup; if altloc is set to 'all', all alternate locations are parsed and each is appended as a distinct coordinate set; default is "A"
unite_chains (bool) – unite chains with the same segment name (auth_asym_id), so that chain ids are auth_asym_id instead of label_asym_id. This can be helpful in some cases, e.g. alignments, but can also cause problems. For example, using buildBiomolecules() afterwards requires the original chain id (label_asym_id). Using biomol=True inside parseMMCIF is fine. Default is False
packmol (bool) – whether to renumber chains like packmol; default is False
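To make the unite_chains option concrete, the sketch below shows how chain ids differ depending on which mmCIF column they are taken from. It is a plain-Python illustration of the idea, not ProDy's implementation; the AtomRow type, the chain_ids helper and the sample rows are hypothetical.

```python
from collections import namedtuple

# Hypothetical, simplified stand-in for one row of the mmCIF _atom_site table.
AtomRow = namedtuple("AtomRow", ["name", "label_asym_id", "auth_asym_id"])

# Illustrative rows: a protein atom, a ligand atom and a water oxygen that
# share one author-assigned chain but have separate label_asym_id values.
ROWS = [
    AtomRow("CA", "A", "A"),
    AtomRow("C1", "B", "A"),
    AtomRow("O", "C", "A"),
]


def chain_ids(rows, unite_chains=False):
    """Return the chain id each row would receive under the documented option."""
    if unite_chains:
        # Chains sharing an auth_asym_id collapse into one chain id.
        return [row.auth_asym_id for row in rows]
    # Default: keep the original label_asym_id as the chain id.
    return [row.label_asym_id for row in rows]


separate = chain_ids(ROWS)
united = chain_ids(ROWS, unite_chains=True)
print(separate, united)
```

With unite_chains left at False, the three rows keep three distinct chain ids, which is the form that buildBiomolecules() expects according to the parameter description above.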
- prody.proteins.ciffile.parseMMCIF(pdb, **kwargs)
Returns an AtomGroup and/or a StarDict containing header data parsed from an mmCIF file. If not found, the mmCIF file will be downloaded from the PDB. It will be downloaded in uncompressed format regardless of the compressed keyword.
This function extends parseMMCIFStream().
- Parameters:
pdb (str) – a PDB identifier or a filename. If needed, mmCIF files are downloaded using the fetchPDB() function.
title (str) – title of the AtomGroup instance; default is the PDB filename or PDB identifier
chain (str) – chain identifiers for parsing specific chains, e.g. chain='A', chain='B', chain='DE'; by default all chains are parsed
segment (str) – segment identifiers for parsing specific segments, e.g. segment='A', segment='B', segment='DE'; by default all segments are parsed
subset (str) – a predefined keyword to parse a subset of atoms; valid keywords are 'calpha' ('ca'), 'backbone' ('bb'), or None (read all atoms), e.g. subset='bb'
model (int, list) – model index or None (read all models), e.g. model=10
altloc (str) – if a location indicator such as 'A' or 'B' is passed, only the indicated alternate locations are parsed as the single coordinate set of the AtomGroup; if altloc is set to 'all', all alternate locations are parsed and each is appended as a distinct coordinate set; default is "A"
unite_chains (bool) – unite chains with the same segment name (auth_asym_id), so that chain ids are auth_asym_id instead of label_asym_id. This can be helpful in some cases, e.g. alignments, but can also cause problems. For example, using buildBiomolecules() afterwards requires the original chain id (label_asym_id). Using biomol=True inside parseMMCIF is fine. Default is False
packmol (bool) – whether to renumber chains like packmol; default is False
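A minimal usage sketch for the keyword arguments listed above. The helper names mmcif_options and load_structure are hypothetical and not part of ProDy; the identifier "1ubi" is only an example. The import of parseMMCIF is deferred so that the option-building part runs without ProDy installed, and the actual parse call is left commented out because it may download a file.

```python
# Subset keywords listed in the parameter description above.
VALID_SUBSETS = (None, "calpha", "ca", "backbone", "bb")


def mmcif_options(chain=None, subset=None, model=None, altloc="A"):
    """Hypothetical helper: collect the documented keyword arguments in a dict."""
    if subset not in VALID_SUBSETS:
        raise ValueError("unknown subset keyword: %r" % (subset,))
    options = {"altloc": altloc}
    # Only pass on what the caller actually set; parser defaults cover the rest.
    if chain is not None:
        options["chain"] = chain
    if subset is not None:
        options["subset"] = subset
    if model is not None:
        options["model"] = model
    return options


def load_structure(pdb, **options):
    """Parse a PDB identifier or filename; requires ProDy and possibly network access."""
    # Imported lazily so that mmcif_options stays usable without ProDy.
    from prody.proteins.ciffile import parseMMCIF
    return parseMMCIF(pdb, **options)


backbone_of_a = mmcif_options(chain="A", subset="bb")
print(backbone_of_a)
# Example call, commented out because it may trigger a download:
# atoms = load_structure("1ubi", **backbone_of_a)
```

Validating the subset keyword before the call is a design choice of this sketch; it surfaces a typo immediately instead of relying on how the parser treats an unknown keyword.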
- prody.proteins.ciffile.parseMMCIFStream(stream, **kwargs)
Returns an AtomGroup and/or a StarDict containing header data parsed from a stream of CIF lines.
- Parameters:
stream – anything that implements the method readlines (e.g. file, buffer, stdin)
title (str) – title of the AtomGroup instance; default is the PDB filename or PDB identifier
chain (str) – chain identifiers for parsing specific chains, e.g. chain='A', chain='B', chain='DE'; by default all chains are parsed
segment (str) – segment identifiers for parsing specific segments, e.g. segment='A', segment='B', segment='DE'; by default all segments are parsed
subset (str) – a predefined keyword to parse a subset of atoms; valid keywords are 'calpha' ('ca'), 'backbone' ('bb'), or None (read all atoms), e.g. subset='bb'
model (int, list) – model index or None (read all models), e.g. model=10
altloc (str) – if a location indicator such as 'A' or 'B' is passed, only the indicated alternate locations are parsed as the single coordinate set of the AtomGroup; if altloc is set to 'all', all alternate locations are parsed and each is appended as a distinct coordinate set; default is "A"
unite_chains (bool) – unite chains with the same segment name (auth_asym_id), so that chain ids are auth_asym_id instead of label_asym_id. This can be helpful in some cases, e.g. alignments, but can also cause problems. For example, using buildBiomolecules() afterwards requires the original chain id (label_asym_id). Using biomol=True inside parseMMCIF is fine. Default is False
packmol (bool) – whether to renumber chains like packmol; default is False
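Since the stream only needs a readlines method, a text-mode file handle, a gzip handle opened in text mode, or an io.StringIO object should all qualify. The sketch below demonstrates this with the standard library; open_cif_stream and parse_stream are hypothetical helper names, the two-line CIF_TEXT is a placeholder rather than a complete mmCIF file, and the ProDy call is wrapped in a function that is not executed here.

```python
import gzip
import io
import os
import tempfile

# Placeholder content for the demonstration; not a complete mmCIF file.
CIF_TEXT = "data_demo\n_entry.id demo\n"


def open_cif_stream(path):
    """Return a text stream with a readlines method for a plain or gzipped file."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def parse_stream(stream, **kwargs):
    """Hand a stream to ProDy; requires ProDy to be installed."""
    # Imported lazily so the stream helpers work without ProDy.
    from prody.proteins.ciffile import parseMMCIFStream
    return parseMMCIFStream(stream, **kwargs)


# An in-memory stream exposes readlines directly.
memory_lines = io.StringIO(CIF_TEXT).readlines()

# A gzipped file opened in text mode yields the same lines.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.cif.gz")
    with gzip.open(path, "wt") as handle:
        handle.write(CIF_TEXT)
    with open_cif_stream(path) as stream:
        gz_lines = stream.readlines()

print(memory_lines == gz_lines)
```

With ProDy installed and a real mmCIF file at hand, a call such as parse_stream(open_cif_stream(path), title="demo", subset="ca") would pass the documented keyword arguments through unchanged.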
- prody.proteins.ciffile.writeMMCIF(filename, atoms, csets=None, autoext=True, **kwargs)
Write atoms in mmCIF format to a file with name filename and return filename. If filename ends with .gz, a compressed file will be written.
- Parameters:
atoms (Atomic) – an object with atom and coordinate data
csets – coordinate set indices; default is all coordinate sets
autoext – when not present, append extension .cif to filename
- Keyword Arguments:
header – header data to write as well
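A small sketch of how a call to writeMMCIF might be wrapped so that it can be exercised without ProDy or disk access. The save_mmcif wrapper, its writer parameter and the recording_writer stand-in are hypothetical; only the argument order (filename, atoms) and the csets keyword come from the signature above.

```python
def save_mmcif(filename, atoms, writer=None, **kwargs):
    """Hypothetical wrapper: forward arguments to writeMMCIF or a substitute."""
    if writer is None:
        # Imported lazily; requires ProDy to be installed.
        from prody.proteins.ciffile import writeMMCIF as writer
    return writer(filename, atoms, **kwargs)


calls = []


def recording_writer(filename, atoms, **kwargs):
    """Stand-in used for this demonstration only; it writes nothing to disk."""
    calls.append((filename, atoms, kwargs))
    return filename


# "placeholder" takes the place of an Atomic object in this dry run.
result = save_mmcif("model.cif.gz", "placeholder", writer=recording_writer, csets=[0])
print(result)
```

In real use the writer argument would be omitted, atoms would be an Atomic instance such as a parsed AtomGroup, and the returned value would be the filename that was written.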