Atom Flags
This module defines atom flags that are used in Atom Selections.
You can read this page in interactive sessions using help(flags).
Flag labels can be used in atom selections:
Flag labels can be combined with dot operator as follows to make selections:
Flag labels can be prefixed with 'is' to check whether all atoms in
an Atomic instance are flagged the same way:
Flag labels can also be used to make quick atom counts:
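The four usages listed above (selection by label, chaining with the dot operator, the 'is' prefix, and quick counts) can be illustrated without a structure file. The sketch below is a toy stand-in, not ProDy code: `ToyAtoms`, `FLAGS`, `AMINO`, and `WATER` are hypothetical names invented here, and only three flags with heavily reduced definitions are modeled.

```python
# Toy model of flag labels. ToyAtoms and FLAGS are hypothetical helpers,
# not ProDy classes, and the residue sets are deliberately tiny.
AMINO = {"ALA", "GLY"}
WATER = {"HOH", "DOD"}

FLAGS = {
    "protein": lambda atom: atom["resname"] in AMINO,
    "water": lambda atom: atom["resname"] in WATER,
    "calpha": lambda atom: atom["name"] == "CA" and atom["resname"] in AMINO,
}


class ToyAtoms:
    def __init__(self, atoms):
        self.atoms = list(atoms)

    def select(self, flag):
        """Selection by flag label: keep atoms for which the flag holds."""
        return ToyAtoms(a for a in self.atoms if FLAGS[flag](a))

    def count(self, flag):
        """Quick atom count for a flag label."""
        return len(self.select(flag).atoms)

    def __getattr__(self, label):
        # Reached only when normal attribute lookup fails.
        if label in FLAGS:  # dot operator, e.g. atoms.protein
            return self.select(label)
        if label.startswith("is") and label[2:] in FLAGS:  # e.g. atoms.isprotein
            return all(FLAGS[label[2:]](a) for a in self.atoms)
        raise AttributeError(label)


atoms = ToyAtoms([
    {"name": "N", "resname": "ALA"},
    {"name": "CA", "resname": "ALA"},
    {"name": "O", "resname": "HOH"},
])
print(atoms.count("protein"))            # atoms carrying the protein flag
print(len(atoms.protein.calpha.atoms))   # chained dot-operator selection
print(atoms.isprotein)                   # False: the water atom is not protein
print(atoms.protein.isprotein)           # True: every selected atom is protein
```

The point of the design is that a flag is a per-atom boolean, so selecting, testing with 'is', and counting are all views of the same boolean array.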
Protein
- protein
- aminoacid
indicates the twenty standard amino acids (stdaa) and some non-standard amino acids (nonstdaa) described below. A residue must also have an atom named 'CA' in addition to having a qualifying residue name.
- stdaa
indicates the standard amino acid residues: ALA, ARG, ASN, ASP, CYS, GLN, GLU, GLY, HIS, ILE, LEU, LYS, MET, PHE, PRO, SER, THR, TRP, TYR, and VAL
- nonstdaa
indicates one of the following residues:
ASX (B)   asparagine or aspartic acid
GLX (Z)   glutamine or glutamic acid
CSO (C)   S-hydroxycysteine
HIP (H)   ND1-phosphohistidine
HSD (H)   prototropic tautomer of histidine, H on ND1 (CHARMM)
HSE (H)   prototropic tautomer of histidine, H on NE2 (CHARMM)
HSP (H)   protonated histidine
          selenomethionine
SEC (U)   selenocysteine
SEP (S)   phosphoserine
TPO (T)   phosphothreonine
PTR (Y)   O-phosphotyrosine
XLE (J)   leucine or isoleucine
XAA (X)   unspecified or unknown
You can modify the list of non-standard amino acids using addNonstdAminoacid(), delNonstdAminoacid(), and listNonstdAAProps().
- calpha
- ca
Cα atoms of protein residues, same as selection 'name CA and protein'
- backbone
- bb
non-hydrogen backbone atoms of protein residues, same as selection 'name CA C O N and protein'
- backbonefull
- bbfull
backbone atoms of protein residues, same as selection 'name CA C O N H H1 H2 H3 OXT and protein'
- sidechain
- sc
side-chain atoms of protein residues, same as selection 'protein and not backbonefull'
- acidic
residues ASH, ASP, GLH, GLU, HISP, HSP, PHD, PTR, SEP, TPO
- acyclic
residues ALA, ARG, ARN, ASH, ASN, ASP, ASX, CME, CSO, CYS, CYX, GLH, GLN, GLU, GLX, GLY, ILE, LEU, LYN, LYS, MET, MSE, PHD, SEC, SEP, SER, THR, TPO, VAL, XLE
- aliphatic
residues ALA, GLY, ILE, LEU, PRO, VAL, XLE
- aromatic
residues HIS, PHE, PTR, TRP, TYM, TYR
- basic
residues ARG, HID, HIE, HIP, HIS, HISD, HISE, HSD, HSE, LYS, TYM
- buried
residues ALA, CME, CYS, CYX, ILE, LEU, MET, MSE, PHE, SEC, TRP, VAL, XLE
- charged
residues ARG, ASP, GLU, HIS, LYS
- cyclic
residues HID, HIE, HIP, HIS, HISD, HISE, HISP, HSD, HSE, HSP, PHE, PRO, PTR, TRP, TYM, TYR
- hydrophobic
residues ALA, ILE, LEU, MET, PHE, PRO, TRP, VAL, XLE
- large
residues ARG, ARN, CME, GLH, GLN, GLU, GLX, HID, HIE, HIP, HIS, HISD, HISE, HISP, HSD, HSE, HSP, ILE, LEU, LYN, LYS, MET, MSE, PHD, PHE, PTR, SEP, TPO, TRP, TYM, TYR, XLE
- medium
residues ASH, ASN, ASP, ASX, CSO, CYS, CYX, PRO, SEC, THR, VAL
- neutral
residues ALA, ARN, ASN, CME, CSO, CYS, CYX, GLN, GLY, ILE, LEU, LYN, MET, MSE, PHE, PRO, SEC, SER, THR, TRP, TYR, VAL
- polar
residues ARG, ARN, ASH, ASN, ASP, ASX, CSO, CYS, CYX, GLH, GLN, GLU, GLX, GLY, HID, HIE, HIP, HIS, HISD, HISE, HISP, HSD, HSE, HSP, LYN, LYS, PHD, PTR, SEC, SEP, SER, THR, TPO, TYM, TYR
- small
residues ALA, GLY, SER
- surface
residues ARG, ARN, ASH, ASN, ASP, ASX, CSO, GLH, GLN, GLU, GLX, GLY, HID, HIE, HIP, HIS, HISD, HISE, HISP, HSD, HSE, HSP, LYN, LYS, PHD, PRO, PTR, SEP, SER, THR, TPO, TYM, TYR
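Several protein flags in this section are defined by two simple tests: a residue-name test plus the presence of a 'CA' atom (aminoacid), and an atom-name test (calpha, backbone, backbonefull, sidechain). The following is a minimal sketch of those rules, using only the names quoted in the definitions above. The helper names `is_aminoacid` and `protein_atom_flags` are hypothetical and this is not how ProDy implements the flags.

```python
# Name sets copied from the definitions in this section.
STDAA = set(
    "ALA ARG ASN ASP CYS GLN GLU GLY HIS ILE "
    "LEU LYS MET PHE PRO SER THR TRP TYR VAL".split()
)
BACKBONE = {"CA", "C", "O", "N"}
BACKBONE_FULL = BACKBONE | {"H", "H1", "H2", "H3", "OXT"}


def is_aminoacid(resname, atom_names, nonstdaa=frozenset()):
    """Qualifying residue name AND an atom named 'CA' in the residue."""
    return (resname in STDAA or resname in nonstdaa) and "CA" in atom_names


def protein_atom_flags(name):
    """Flag labels for an atom name, assuming the atom is in a protein residue."""
    labels = set()
    if name == "CA":
        labels |= {"calpha", "ca"}
    if name in BACKBONE:
        labels |= {"backbone", "bb"}
    if name in BACKBONE_FULL:
        labels |= {"backbonefull", "bbfull"}
    else:
        labels |= {"sidechain", "sc"}  # 'protein and not backbonefull'
    return labels


print(is_aminoacid("ALA", ["N", "CA", "C", "O", "CB"]))  # qualifying name with CA
print(is_aminoacid("CA", ["CA"]))                         # calcium ion, not a residue name
for atom_name in ("CA", "OXT", "CB"):
    print(atom_name, sorted(protein_atom_flags(atom_name)))
```

Note that 'OXT' and the backbone hydrogens belong to backbonefull but not to backbone, which is why sidechain is defined against backbonefull.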
Nucleic
- nucleic
indicates nucleobase, nucleotide, and some nucleoside derivatives that are described below, so it is the same as 'nucleobase or nucleotide or nucleoside'.
- nucleobase
indicates ADE (adenine), GUN (guanine), CYT (cytosine), THY (thymine), and URA (uracil).
- nucleotide
indicates residues with the following names:
2’-deoxyadenosine-5’-monophosphate
2’-deoxycytidine-5’-monophosphate
2’-deoxyguanosine-5’-monophosphate
2’-deoxythymidine-5’-monophosphate
2’-deoxyuridine-5’-monophosphate
adenosine-5’-monophosphate
cytidine-5’-monophosphate
guanosine-5’-monophosphate
2’-deoxythymidine-5’-monophosphate
uridine-5’-monophosphate
- nucleoside
indicates the following nucleosides and their derivatives that are recognized by PDB:
adenosine
adenosine-5’-monophosphate
adenosine-5’-diphosphate
adenosine-5’-triphosphate
adenosine-5’-triphosphate-gamma-S
cyclic adenosine-3’,5’-monophosphate
adenosine-2’,5’-diphosphate
adenosine-3’,5’-diphosphate
cytidine
cytidine-2’-monophosphate
cytidine-3’-monophosphate
cytidine-5’-monophosphate
cytidine-5’-diphosphate
cytidine-5’-triphosphate
guanosine
guanosine-5’-monophosphate
guanosine-5’-diphosphate
guanosine-5’-triphosphate
thymidine
thymidine-5’-monophosphate
thymidine-5’-diphosphate
thymidine-5’-triphosphate
uridine (uracil plus ribose)
2’-deoxyuridine 5’-monophosphate
uridine 5’-diphosphate
uridine 5’-triphosphate
- at
same as selection 'resname ADE A THY T'
- cg
same as selection 'resname CYT C GUN G'
- purine
same as selection 'resname ADE A GUN G'
- pyrimidine
same as selection 'resname CYT C THY T URA U'
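The at, cg, purine, and pyrimidine flags are plain residue-name sets, so their relationships can be checked with set operations. The sketch below copies the names from the selection strings above. `base_flags` is a hypothetical helper written for this illustration, not a ProDy function.

```python
# Residue-name sets copied from the selection strings above.
AT = {"ADE", "A", "THY", "T"}
CG = {"CYT", "C", "GUN", "G"}
PURINE = {"ADE", "A", "GUN", "G"}
PYRIMIDINE = {"CYT", "C", "THY", "T", "URA", "U"}

TABLE = {"at": AT, "cg": CG, "purine": PURINE, "pyrimidine": PYRIMIDINE}


def base_flags(resname):
    """Return the base-pairing and ring-type labels that apply to a residue name."""
    return sorted(label for label, names in TABLE.items() if resname in names)


print(base_flags("ADE"))
print(base_flags("URA"))
# Purines and pyrimidines never overlap.
print(PURINE & PYRIMIDINE)
# Uracil is the only base covered by purine/pyrimidine but not by at/cg.
print(sorted((PURINE | PYRIMIDINE) - (AT | CG)))
```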
Heteros
- hetero
indicates anything other than a protein or a nucleic residue, i.e. 'not (protein or nucleic)'.
- hetatm
is available when atomic data is parsed from a PDB or similar format file and indicates atoms that are marked 'HETATM' in the file.
- water
indicates HOH and DOD recognized by PDB and also WAT, TIP3, H2O, OH2, TIP, TIP2, TIP4, and SPC recognized by molecular dynamics (MD) force fields.
Previously used water types HH0, OHH, and SOL conflict with other compounds in the PDB. HH0 and OHH are therefore removed from the definition of this flag, while SOL has been restored because the compound SOL (L-sorbose) is only used 3 times.
- ion
indicates the following ions, most of which are recognized by the PDB and others by MD force fields.
ID     Name               PDB   Source   Conflict
       aluminum           Yes
       barium             Yes
       calcium            Yes
       cadmium            Yes
       chloride           Yes
       cobalt (ii)        Yes
       cesium             Yes
       copper (ii)        Yes
       copper (i)         Yes
       dinuclear copper   Yes
       mercury (ii)       Yes
       indium (iii)       Yes
       iodide             Yes
       potassium          Yes
       magnesium          Yes
       manganese (iii)    Yes
       manganese (ii)     Yes
       sodium             Yes
       lead (ii)          Yes
       platinum (ii)      Yes
       rubidium           Yes
       terbium (iii)      Yes
       thallium (i)       Yes
       tungstate (vi)     Yes
       ytterbium (iii)    Yes
       zinc               Yes
CAL    calcium            No    CHARMM   Yes
CES    cesium             No    CHARMM   Yes
CLA    chloride           No    CHARMM   Yes
POT    potassium          No    CHARMM   Yes
SOD    sodium             No    CHARMM   Yes
ZN2    zinc               No    CHARMM   No
CU2P   copper (ii)        No    CHARMM   No
CU2    copper (ii)        No    CHARMM   No
Ion identifiers that are obsoleted by PDB (MO3, MO4, MO5, MO6, NAW, OC7, and ZN1) are removed from this definition.
CU2 is what CU2P becomes when a PDB file is parsed without long_resname, or when it is written out again (writing always trims residue names to 3 characters).
- lipid
indicates GPE, LPP, OLA, SDS, and STE from PDB, and also POPE, STEA, PALM, OLEO, DMPC, DLPE, PCGL, LPPC, POPC from CHARMM force field.
- sugar
indicates BGC, GLC, and GLO from PDB, and also AGLC from CHARMM.
- heme
indicates 1FH, 2FH, DDH, DHE, HAS, HDD, HDE, HDM, HEA, HEB, HEC, HEM, HEO, HES, HEV, NTE, SRM, and VER from PDB, and also HEMO and HEMR from CHARMM.
- pdbter
is available when atomic data is parsed from a PDB format file and indicates atoms that were followed by a 'TER' record.
- selpdbter
is available when atomic data is parsed from a PDB format file and then a selection is made and indicates selected atoms that should be followed by a 'TER' record.
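Three of the rules in this section are easy to state as code: hetero is the complement of protein and nucleic, water is a residue-name set, and CU2 appears when CU2P is cut to three characters. The following sketch restates them under stated assumptions: the helper names `is_hetero` and `trim_resname` are hypothetical, and modeling the trimming as a simple slice is an assumption made here for illustration, not a description of ProDy's parser or writer.

```python
# Water residue names listed in the water flag definition (SOL was restored).
WATER = {"HOH", "DOD", "WAT", "TIP3", "H2O", "OH2", "TIP", "TIP2", "TIP4", "SPC", "SOL"}


def is_hetero(is_protein, is_nucleic):
    """hetero is defined as 'not (protein or nucleic)'."""
    return not (is_protein or is_nucleic)


def trim_resname(resname, long_resname=False):
    """Assumed model of 3-character trimming: keep the first three characters
    unless long residue names are requested."""
    return resname if long_resname else resname[:3]


print(is_hetero(is_protein=False, is_nucleic=False))  # e.g. a water or ion atom
print("SOL" in WATER, "OHH" in WATER)                 # restored vs. removed name
print(trim_resname("CU2P"))                           # why CU2 is in the ion list
print(trim_resname("CU2P", long_resname=True))
```

Under this model, water, ion, lipid, sugar, and heme atoms all carry the hetero flag as well, since none of them is protein or nucleic.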
Elements
The following elements found in proteins are recognized by applying regular expressions to atom names:
- carbon
carbon atoms, same as 'name "C.*" and not ion'
- nitrogen
nitrogen atoms, same as 'name "N.*" and not ion'
- oxygen
oxygen atoms, same as 'name "O.*" and not ion'
- sulfur
sulfur atoms, same as 'name "S.*" and not ion'
- hydrogen
hydrogen atoms, same as 'name "[1-9]?H.*" and not ion'
- noh
- heavy
non-hydrogen atoms, same as 'not hydrogen'
'not ion' is appended to the above definitions to avoid conflicts with ion atoms.
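The effect of the 'not ion' clause is easiest to see on atom names that collide with element symbols of ions, such as 'CA' (Cα versus calcium) or 'HG' (a hydrogen versus mercury). Below is a small sketch that applies the regular expressions quoted above with Python's `re` module. The helpers `element_flags` and `is_heavy` are hypothetical, and matching the whole atom name is an assumption of this sketch.

```python
import re

# Patterns copied from the element flag definitions above.
ELEMENT_PATTERNS = {
    "carbon": "C.*",
    "nitrogen": "N.*",
    "oxygen": "O.*",
    "sulfur": "S.*",
    "hydrogen": "[1-9]?H.*",
}


def element_flags(name, is_ion=False):
    """Element labels for an atom name; ion atoms get none ('and not ion')."""
    if is_ion:
        return []
    return [
        label
        for label, pattern in ELEMENT_PATTERNS.items()
        if re.fullmatch(pattern, name)
    ]


def is_heavy(name, is_ion=False):
    """noh / heavy: same as 'not hydrogen'."""
    return "hydrogen" not in element_flags(name, is_ion)


for name, ion in [("CA", False), ("CA", True), ("1HB", False), ("NH1", False), ("HG", True)]:
    print(name, ion, element_flags(name, ion), is_heavy(name, ion))
```

Because the patterns look only at the leading characters of the atom name, a mercury atom named 'HG' would count as hydrogen without the ion exclusion. With it, the atom is correctly treated as heavy.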
Structure
The following secondary structure flags are defined, but secondary structure assignments must be made before they can be used.
- extended
extended conformation, same as 'secondary E'
- helix
α-helix conformation, same as 'secondary H'
- helix310
3_10-helix conformation, same as 'secondary G'
- helixpi
π-helix conformation, same as 'secondary I'
- turn
hydrogen bonded turn conformation, same as 'secondary T'
- bridge
isolated beta-bridge conformation, same as 'secondary B'
- bend
bend conformation, same as 'secondary S'
- coil
not in one of the above conformations, same as 'secondary C'
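The structure flags are a one-to-one mapping from one-letter secondary structure codes to labels. The lookup table below restates that mapping and uses it to tally a made-up assignment string. `SECONDARY_FLAGS` and `flag_counts` are hypothetical names and the input string is invented for the example.

```python
from collections import Counter

# One-letter code -> flag label, as listed above.
SECONDARY_FLAGS = {
    "E": "extended",
    "H": "helix",
    "G": "helix310",
    "I": "helixpi",
    "T": "turn",
    "B": "bridge",
    "S": "bend",
    "C": "coil",
}


def flag_counts(assignments):
    """Count residues per flag label for a string of one-letter codes."""
    return Counter(SECONDARY_FLAGS[code] for code in assignments)


counts = flag_counts("CCHHHHTTEEEC")  # invented assignment string
for label, n in sorted(counts.items()):
    print(label, n)
```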
Others
- all
indicates all atoms, returns a new view of the instance
- none
indicates no atoms, returns None
- dummy
indicates dummy atoms in an AtomMap
- mapped
indicates mapped atoms in an AtomMap
Functions
The following functions can be used to customize flag definitions:
- prody.atomic.flags.addNonstdAminoacid(resname, *properties)
Add non-standard amino acid resname with properties selected from:
Default set of non-standard amino acids can be restored as follows:
- prody.atomic.flags.delNonstdAminoacid(resname)
Delete non-standard amino acid resname.
Default set of non-standard amino acids can be restored as follows:
- prody.atomic.flags.flagDefinition(*arg, **kwarg)
Learn, change, or reset Atom Flags definitions.
Learn a definition
Calling this function with no arguments will return a list of flag names whose definitions you can learn:
Passing a flag name will return its definition:
Change a definition
Calling the function with the editable=True argument will return the names of flags whose definitions can be edited:
Pass an editable flag name with its new definition:
Note that the type of the new definition must be the same as the type of the old definition. Flags with editable definitions are: at, backbone, backbonefull, bb, bbfull, carbon, cg, heme, hydrogen, ion, lipid, nitrogen, nucleobase, nucleoside, nucleotide, oxygen, purine, pyrimidine, sugar, sulfur, and water.
Reset definitions
Pass reset keyword as follows to restore all default definitions of editable flags and also non-standard amino acids.
Or, pass a specific editable flag label to restore its definition:
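The learn, change, and reset behavior described above, including the rule that a new definition must have the same type as the old one, can be modeled with a small registry. The code below is a toy counterpart written for illustration, not ProDy's flagDefinition(): `toy_flag_definition` is a hypothetical function, only two flags are modeled, and the use of the value 'all' to reset every flag is an assumption of this sketch.

```python
import copy

# Hypothetical defaults; values are taken from definitions on this page.
_DEFAULTS = {
    "backbone": ["CA", "C", "O", "N"],  # list of atom names
    "carbon": "C.*",                    # regular expression string
}
_definitions = copy.deepcopy(_DEFAULTS)


def toy_flag_definition(*arg, **kwarg):
    """Toy model of learning, changing, and resetting flag definitions."""
    kwarg.pop("editable", None)  # every toy flag is editable, so the listing is the same
    reset = kwarg.pop("reset", None)
    if arg:  # learn one definition
        return copy.copy(_definitions[arg[0]])
    if reset is not None:  # restore one flag, or all (assumed keyword value)
        labels = list(_DEFAULTS) if reset == "all" else [reset]
        for label in labels:
            _definitions[label] = copy.deepcopy(_DEFAULTS[label])
        return None
    if kwarg:  # change definitions, enforcing the same-type rule
        for label, new in kwarg.items():
            old = _definitions[label]
            if type(new) is not type(old):
                raise TypeError(f"{label!r} needs a {type(old).__name__}")
            _definitions[label] = new
        return None
    return sorted(_definitions)  # no arguments: list flag names


print(toy_flag_definition())
print(toy_flag_definition("carbon"))
toy_flag_definition(backbone=["N", "CA", "C", "O", "OXT"])
print(toy_flag_definition("backbone"))
toy_flag_definition(reset="backbone")
try:
    toy_flag_definition(carbon=["C"])  # list offered where a string is expected
except TypeError as err:
    print("rejected:", err)
```

Keeping an untouched copy of the defaults is what makes a per-flag reset possible without reloading the module.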
- prody.atomic.flags.getNonstdProperties(resname)
Deprecated for removal in v1.4, use listNonstdAAProps() instead.