Atom Flags

This module defines atom flags that are used in Atom Selections. You can read this page in interactive sessions using help(flags).

Flag labels can be used directly in atom selection strings. They can also be combined with the dot operator to make selections, prefixed with 'is' to check whether all atoms in an Atomic instance are flagged the same way, and used to make quick atom counts.
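The four uses of flag labels described above can be pictured with a small stand-alone model. This is only a sketch of the semantics: the FlaggedAtoms class, its num_atoms method, and the sample atoms are hypothetical names made up for illustration, and none of this is ProDy's own implementation.

```python
class FlaggedAtoms:
    """Toy stand-in for an atom container (hypothetical, not a ProDy class)."""

    def __init__(self, names, flags):
        self.names = list(names)
        # Each flag label maps to one boolean per atom.
        self.flags = {label: list(values) for label, values in flags.items()}

    def select(self, label):
        """Return a new container holding only the atoms flagged `label`."""
        keep = [i for i, flagged in enumerate(self.flags[label]) if flagged]
        return FlaggedAtoms(
            [self.names[i] for i in keep],
            {key: [values[i] for i in keep] for key, values in self.flags.items()},
        )

    def num_atoms(self, label=None):
        """Count all atoms, or only those flagged `label`."""
        if label is None:
            return len(self.names)
        return sum(self.flags[label])

    def __getattr__(self, attr):
        # Reached only when normal attribute lookup fails.
        flags = self.__dict__.get("flags", {})
        if attr in flags:
            # Dot operator: atoms.protein behaves like atoms.select('protein').
            return self.select(attr)
        if attr.startswith("is") and attr[2:] in flags:
            # 'is' prefix: True only if every atom carries the flag.
            return all(flags[attr[2:]])
        raise AttributeError(attr)


# Four atoms of one amino acid residue plus one water oxygen (made-up data).
atoms = FlaggedAtoms(
    names=["N", "CA", "C", "O", "O"],
    flags={
        "protein": [True, True, True, True, False],
        "calpha": [False, True, False, False, False],
        "water": [False, False, False, False, True],
    },
)

protein = atoms.select("protein")      # flag label used as a selection
calpha = atoms.protein.calpha          # flag labels chained with the dot operator
mixed = atoms.isprotein                # False: the water atom is not protein
water_count = atoms.num_atoms("water") # quick atom count
```

In this model, chaining labels narrows the selection step by step, and an 'is' check on a mixed set of atoms is False because at least one atom lacks the flag.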

Protein

protein
aminoacid

indicates the twenty standard amino acids (stdaa) and some non-standard amino acids (nonstdaa) described below. A residue must also have an atom named 'CA' in addition to having a qualifying residue name.

stdaa

indicates the standard amino acid residues: ALA, ARG, ASN, ASP, CYS, GLN, GLU, GLY, HIS, ILE, LEU, LYS, MET, PHE, PRO, SER, THR, TRP, TYR, and VAL

nonstdaa

indicates one of the following residues:

ASX (B)

asparagine or aspartic acid

GLX (Z)

glutamine or glutamic acid

CSO (C)

S-hydroxycysteine

HIP (H)

ND1-phosphohistidine

HSD (H)

prototropic tautomer of histidine, H on ND1 (CHARMM)

HSE (H)

prototropic tautomer of histidine, H on NE2 (CHARMM)

HSP (H)

protonated histidine

MSE

selenomethionine

SEC (U)

selenocysteine

SEP (S)

phosphoserine

TPO (T)

phosphothreonine

PTR (Y)

O-phosphotyrosine

XLE (J)

leucine or isoleucine

XAA (X)

unspecified or unknown

You can modify the list of non-standard amino acids using addNonstdAminoacid() and delNonstdAminoacid(), and look up the properties of a non-standard amino acid using listNonstdAAProps().
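The protein rule stated above (a qualifying residue name plus an atom named 'CA') can be sketched in plain Python. The function name is_protein_residue is hypothetical, the residue sets are copied from the stdaa and nonstdaa lists on this page, and the check is an illustration of the rule rather than ProDy's code.

```python
# Residue names copied from the stdaa and nonstdaa definitions on this page.
STDAA = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}
NONSTDAA = {
    "ASX", "GLX", "CSO", "HIP", "HSD", "HSE", "HSP", "MSE", "SEC", "SEP",
    "TPO", "PTR", "XLE", "XAA",
}


def is_protein_residue(resname, atom_names):
    """Apply both conditions: qualifying residue name and a 'CA' atom."""
    return resname in (STDAA | NONSTDAA) and "CA" in atom_names


complete_alanine = is_protein_residue("ALA", ["N", "CA", "C", "O", "CB"])
alanine_without_ca = is_protein_residue("ALA", ["N", "C", "O", "CB"])
calcium_ion = is_protein_residue("CA", ["CA"])
selenomethionine = is_protein_residue("MSE", ["N", "CA", "C", "O", "SE"])
```

The calcium case shows why both conditions matter: the ion has an atom named 'CA', but its residue name is not an amino acid name, so it is not flagged as protein.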

calpha
ca

Cα atoms of protein residues, same as selection 'name CA and protein'

backbone
bb

non-hydrogen backbone atoms of protein residues, same as selection 'name CA C O N and protein'

backbonefull
bbfull

backbone atoms of protein residues, same as selection 'name CA C O N H H1 H2 H3 OXT and protein'

sidechain
sc

side-chain atoms of protein residues, same as selection 'protein and not backbonefull'
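As an illustration of how the backbone, backbonefull, and sidechain definitions above partition the atoms of one protein residue, here is a minimal sketch. The function split_residue_atoms and the serine atom list are made up for this example, and the input is assumed to come from a residue that is already flagged as protein.

```python
# Atom names copied from the backbone and backbonefull definitions above.
BACKBONE = {"CA", "C", "O", "N"}
BACKBONE_FULL = BACKBONE | {"H", "H1", "H2", "H3", "OXT"}


def split_residue_atoms(atom_names):
    """Group atom names of a protein residue by the three flag definitions."""
    groups = {"backbone": [], "backbonefull": [], "sidechain": []}
    for name in atom_names:
        if name in BACKBONE:
            groups["backbone"].append(name)
        if name in BACKBONE_FULL:
            groups["backbonefull"].append(name)
        else:
            # sidechain is 'protein and not backbonefull'
            groups["sidechain"].append(name)
    return groups


# Hypothetical atom names for a serine residue with hydrogens.
serine = ["N", "H", "CA", "HA", "C", "O", "CB", "HB2", "HB3", "OG", "HG"]
groups = split_residue_atoms(serine)
```

Note that, going strictly by the definitions as written, an atom named 'HA' is not listed in backbonefull and so lands in sidechain in this sketch.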

acidic

residues ASH, ASP, GLH, GLU, HISP, HSP, PHD, PTR, SEP, TPO

acyclic

residues ALA, ARG, ARN, ASH, ASN, ASP, ASX, CME, CSO, CYS, CYX, GLH, GLN, GLU, GLX, GLY, ILE, LEU, LYN, LYS, MET, MSE, PHD, SEC, SEP, SER, THR, TPO, VAL, XLE

aliphatic

residues ALA, GLY, ILE, LEU, PRO, VAL, XLE

aromatic

residues HIS, PHE, PTR, TRP, TYM, TYR

basic

residues ARG, HID, HIE, HIP, HIS, HISD, HISE, HSD, HSE, LYS, TYM

buried

residues ALA, CME, CYS, CYX, ILE, LEU, MET, MSE, PHE, SEC, TRP, VAL, XLE

charged

residues ARG, ASP, GLU, HIS, LYS

cyclic

residues HID, HIE, HIP, HIS, HISD, HISE, HISP, HSD, HSE, HSP, PHE, PRO, PTR, TRP, TYM, TYR

hydrophobic

residues ALA, ILE, LEU, MET, PHE, PRO, TRP, VAL, XLE

large

residues ARG, ARN, CME, GLH, GLN, GLU, GLX, HID, HIE, HIP, HIS, HISD, HISE, HISP, HSD, HSE, HSP, ILE, LEU, LYN, LYS, MET, MSE, PHD, PHE, PTR, SEP, TPO, TRP, TYM, TYR, XLE

medium

residues ASH, ASN, ASP, ASX, CSO, CYS, CYX, PRO, SEC, THR, VAL

neutral

residues ALA, ARN, ASN, CME, CSO, CYS, CYX, GLN, GLY, ILE, LEU, LYN, MET, MSE, PHE, PRO, SEC, SER, THR, TRP, TYR, VAL

polar

residues ARG, ARN, ASH, ASN, ASP, ASX, CSO, CYS, CYX, GLH, GLN, GLU, GLX, GLY, HID, HIE, HIP, HIS, HISD, HISE, HISP, HSD, HSE, HSP, LYN, LYS, PHD, PTR, SEC, SEP, SER, THR, TPO, TYM, TYR

small

residues ALA, GLY, SER

surface

residues ARG, ARN, ASH, ASN, ASP, ASX, CSO, GLH, GLN, GLU, GLX, GLY, HID, HIE, HIP, HIS, HISD, HISE, HISP, HSD, HSE, HSP, LYN, LYS, PHD, PRO, PTR, SEP, SER, THR, TPO, TYM, TYR
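The residue property flags above overlap, so one residue name can carry several of them. The sketch below looks a residue name up in a handful of the categories. The helper categories_of is a hypothetical name, only seven of the categories are included to keep the example short, and the residue sets are copied from this page.

```python
# A subset of the residue property flags listed above.
CATEGORIES = {
    "acidic": {"ASH", "ASP", "GLH", "GLU", "HISP", "HSP", "PHD", "PTR", "SEP", "TPO"},
    "aliphatic": {"ALA", "GLY", "ILE", "LEU", "PRO", "VAL", "XLE"},
    "aromatic": {"HIS", "PHE", "PTR", "TRP", "TYM", "TYR"},
    "basic": {"ARG", "HID", "HIE", "HIP", "HIS", "HISD", "HISE", "HSD", "HSE", "LYS", "TYM"},
    "charged": {"ARG", "ASP", "GLU", "HIS", "LYS"},
    "hydrophobic": {"ALA", "ILE", "LEU", "MET", "PHE", "PRO", "TRP", "VAL", "XLE"},
    "small": {"ALA", "GLY", "SER"},
}


def categories_of(resname):
    """Return the sorted labels of all included categories containing resname."""
    return sorted(label for label, residues in CATEGORIES.items() if resname in residues)


histidine = categories_of("HIS")        # ['aromatic', 'basic', 'charged']
alanine = categories_of("ALA")          # ['aliphatic', 'hydrophobic', 'small']
phosphotyrosine = categories_of("PTR")  # ['acidic', 'aromatic']
```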

Nucleic

nucleic

indicates nucleobases, nucleotides, and some nucleoside derivatives that are described below, so it is the same as 'nucleobase or nucleotide or nucleoside'.

nucleobase

indicates ADE (adenine), GUN (guanine), CYT (cytosine), THY (thymine), and URA (uracil).

nucleotide

indicates residues with the following names:

DA

2’-deoxyadenosine-5’-monophosphate

DC

2’-deoxycytidine-5’-monophosphate

DG

2’-deoxyguanosine-5’-monophosphate

DT

2’-deoxythymidine-5’-monophosphate

DU

2’-deoxyuridine-5’-monophosphate

A

adenosine-5’-monophosphate

C

cytidine-5’-monophosphate

G

guanosine-5’-monophosphate

T

2’-deoxythymidine-5’-monophosphate

U

uridine-5’-monophosphate

nucleoside

indicates the following nucleosides and their derivatives that are recognized by PDB:

ADN

adenosine

AMP

adenosine-5’-monophosphate

ADP

adenosine-5’-diphosphate

ATP

adenosine-5’-triphosphate

AGS

adenosine-5’-triphosphate-gamma-S

CMP

cyclic adenosine-3’,5’-monophosphate

A2P

adenosine-2’,5’-diphosphate

A3P

adenosine-3’,5’-diphosphate

CTN

cytidine

C2P

cytidine-2’-monophosphate

C3P

cytidine-3’-monophosphate

C5P

cytidine-5’-monophosphate

CDP

cytidine-5’-diphosphate

CTP

cytidine-5’-triphosphate

GMP

guanosine

5GP

guanosine-5’-monophosphate

GDP

guanosine-5’-diphosphate

GTP

guanosine-5’-triphosphate

THM

thymidine

TMP

thymidine-5’-monophosphate

TPP

thymidine-5’-diphosphate

TTP

thymidine-5’-triphosphate

URI

uridine (uracil plus ribose)

UMP

2’-deoxyuridine 5’-monophosphate

UDP

uridine 5’-diphosphate

UTP

uridine 5’-triphosphate

at

same as selection 'resname ADE A THY T'

cg

same as selection 'resname CYT C GUN G'

purine

same as selection 'resname ADE A GUN G'

pyrimidine

same as selection 'resname CYT C THY T URA U'
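The four residue-name groupings above (at, cg, purine, pyrimidine) can be compared with ordinary set operations. In this sketch the helper base_flags is a hypothetical name and the sets are copied from the selection strings above.

```python
# Residue names copied from the four selection strings above.
AT = {"ADE", "A", "THY", "T"}
CG = {"CYT", "C", "GUN", "G"}
PURINE = {"ADE", "A", "GUN", "G"}
PYRIMIDINE = {"CYT", "C", "THY", "T", "URA", "U"}

GROUPS = {"at": AT, "cg": CG, "purine": PURINE, "pyrimidine": PYRIMIDINE}


def base_flags(resname):
    """Return the sorted labels of the groupings that contain resname."""
    return sorted(label for label, names in GROUPS.items() if resname in names)


adenine = base_flags("A")       # ['at', 'purine']
guanine = base_flags("GUN")     # ['cg', 'purine']
uracil = base_flags("U")        # ['pyrimidine']
deoxyadenosine = base_flags("DA")  # []

# Names covered by purine or pyrimidine but by neither at nor cg.
only_in_ring_groups = (PURINE | PYRIMIDINE) - (AT | CG)
```

Two things follow from the definitions as written: uracil (URA, U) is reachable through pyrimidine but not through at or cg, and the deoxy residue names such as DA are not part of any of these four groupings.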

Heteros

hetero

indicates anything other than a protein or a nucleic residue, i.e. 'not (protein or nucleic)'.

hetatm

is available when atomic data is parsed from a PDB or similar format file and indicates atoms that are marked 'HETATM' in the file.

water

indicates HOH and DOD recognized by PDB and also WAT, TIP3, H2O, OH2, TIP, TIP2, TIP4 and SPC recognized by molecular dynamics (MD) force fields.

Previously used water types HH0, OHH, and SOL conflict with other compounds in the PDB, so HH0 and OHH are removed from the definition of this flag. SOL was removed as well but has been restored, because the conflicting compound SOL (L-sorbose) is used only 3 times.

ion

indicates the following ions, most of which are recognized by the PDB and others by MD force fields.

ID      Name                PDB    Source    Conflict
AL      aluminum            Yes
BA      barium              Yes
CA      calcium             Yes
CD      cadmium             Yes
CL      chloride            Yes
CO      cobalt (ii)         Yes
CS      cesium              Yes
CU      copper (ii)         Yes
CU1     copper (i)          Yes
CUA     dinuclear copper    Yes
HG      mercury (ii)        Yes
IN      indium (iii)        Yes
IOD     iodide              Yes
K       potassium           Yes
MG      magnesium           Yes
MN3     manganese (iii)     Yes
MN      manganese (ii)      Yes
NA      sodium              Yes
PB      lead (ii)           Yes
PT      platinum (ii)       Yes
RB      rubidium            Yes
TB      terbium (iii)       Yes
TL      thallium (i)        Yes
WO4     tungstate (vi)      Yes
YB      ytterbium (iii)     Yes
ZN      zinc                Yes
CAL     calcium             No     CHARMM    Yes
CES     cesium              No     CHARMM    Yes
CLA     chloride            No     CHARMM    Yes
POT     potassium           No     CHARMM    Yes
SOD     sodium              No     CHARMM    Yes
ZN2     zinc                No     CHARMM    No
CU2P    copper (ii)         No     CHARMM    No
CU2     copper (ii)         No     CHARMM    No

Ion identifiers that are obsoleted by PDB (MO3, MO4, MO5, MO6, NAW, OC7, and ZN1) are removed from this definition.

CU2 arises from CU2P when PDB files are parsed without long_resname, or when they are written out again, because writing always trims residue names to 3 characters.

lipid

indicates GPE, LPP, OLA, SDS, and STE from PDB, and also POPE, STEA, PALM, OLEO, DMPC, DLPE, PCGL, LPPC, POPC from CHARMM force field.

sugar

indicates BGC, GLC, and GLO from PDB, and also AGLC from CHARMM.

heme

indicates 1FH, 2FH, DDH, DHE, HAS, HDD, HDE, HDM, HEA, HEB, HEC, HEM, HEO, HES, HEV, NTE, SRM, and VER from PDB, and also HEMO and HEMR from CHARMM.

pdbter

is available when atomic data is parsed from a PDB format file and indicates atoms that were followed by a 'TER' record.

selpdbter

is available when atomic data is parsed from a PDB format file and a selection is then made; it indicates selected atoms that should be followed by a 'TER' record.

Elements

The following elements found in proteins are recognized by applying regular expressions to atom names:

carbon

carbon atoms, same as 'name "C.*" and not ion'

nitrogen

nitrogen atoms, same as 'name "N.*" and not ion'

oxygen

oxygen atoms, same as 'name "O.*" and not ion'

sulfur

sulfur atoms, same as 'name "S.*" and not ion'

hydrogen

hydrogen atoms, same as 'name "[1-9]?H.*" and not ion'

noh
heavy

non-hydrogen atoms, same as 'not hydrogen'

'not ion' is appended to above definitions to avoid conflicts with ion atoms.
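The name-based element rules above, and the reason 'not ion' is appended, can be tried out with Python's re module. This is a sketch under two assumptions: that each pattern must match the whole atom name (re.fullmatch is used for that), and that whether an atom belongs to an ion is already known and passed in. The function element_flags is a hypothetical helper.

```python
import re

# Patterns copied from the element definitions above.
ELEMENT_PATTERNS = {
    "carbon": r"C.*",
    "nitrogen": r"N.*",
    "oxygen": r"O.*",
    "sulfur": r"S.*",
    "hydrogen": r"[1-9]?H.*",
}


def element_flags(atom_name, is_ion=False):
    """Return the element flags an atom name would receive, given ion status."""
    if is_ion:
        # 'and not ion' excludes ion atoms from every element flag.
        return []
    return sorted(
        label
        for label, pattern in ELEMENT_PATTERNS.items()
        if re.fullmatch(pattern, atom_name)
    )


alpha_carbon = element_flags("CA")               # ['carbon']
calcium_ion = element_flags("CA", is_ion=True)   # []
numbered_hydrogen = element_flags("1HB")         # ['hydrogen']
serine_hydrogen = element_flags("HG")            # ['hydrogen']
mercury_ion = element_flags("HG", is_ion=True)   # []
```

Atom names such as 'CA', 'HG', and 'NA' are shared between protein atoms and ions, which is the conflict the 'not ion' clause avoids.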

Structure

The following secondary structure flags are defined, but secondary structure assignments must be made before they can be used.

extended

extended conformation, same as 'secondary E'

helix

α-helix conformation, same as 'secondary H'

helix310

3_10-helix conformation, same as 'secondary G'

helixpi

π-helix conformation, same as 'secondary I'

turn

hydrogen bonded turn conformation, same as 'secondary T'

bridge

isolated beta-bridge conformation, same as 'secondary B'

bend

bend conformation, same as 'secondary S'

coil

not in one of the above conformations, same as 'secondary C'
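Each secondary structure flag above corresponds to a single one-letter code. The sketch below maps flag labels to codes and picks matching positions out of a per-residue assignment string. The helper residues_with and the assignment string are made up for illustration.

```python
# Flag label to secondary structure code, as listed above.
SECONDARY_CODES = {
    "extended": "E",
    "helix": "H",
    "helix310": "G",
    "helixpi": "I",
    "turn": "T",
    "bridge": "B",
    "bend": "S",
    "coil": "C",
}


def residues_with(flag, assignment):
    """Return zero-based positions whose assigned code matches the flag."""
    code = SECONDARY_CODES[flag]
    return [index for index, letter in enumerate(assignment) if letter == code]


# Hypothetical assignment, one code per residue.
assignment = "CEEEETTHHHHC"
helix_positions = residues_with("helix", assignment)        # [7, 8, 9, 10]
extended_positions = residues_with("extended", assignment)  # [1, 2, 3, 4]
coil_positions = residues_with("coil", assignment)          # [0, 11]
```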

Others

all

indicates all atoms, returns a new view of the instance

none

indicates no atoms, returns None

dummy

indicates dummy atoms in an AtomMap

mapped

indicates mapped atoms in an AtomMap

Functions

The following functions can be used to customize flag definitions:

prody.atomic.flags.addNonstdAminoacid(resname, *properties)[source]

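Add non-standard amino acid resname with properties selected from the residue property labels listed in the Protein section above, such as acidic, aromatic, cyclic, large, polar, or surface.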

The default set of non-standard amino acids can be restored using flagDefinition() with its reset keyword.

prody.atomic.flags.delNonstdAminoacid(resname)[source]

Delete non-standard amino acid resname.

The default set of non-standard amino acids can be restored using flagDefinition() with its reset keyword.

prody.atomic.flags.flagDefinition(*arg, **kwarg)[source]

Learn, change, or reset Atom Flags definitions.

Learn a definition

Calling this function with no arguments returns the list of flag names whose definitions you can learn. Passing a flag name returns its definition.

Change a definition

Calling the function with the editable=True argument returns the names of flags whose definitions can be edited.

To change a definition, pass an editable flag name as a keyword argument with its new definition as the value.

Note that the type of the new definition must be the same as the type of the old definition. Flags with editable definitions are: at, backbone, backbonefull, bb, bbfull, carbon, cg, heme, hydrogen, ion, lipid, nitrogen, nucleobase, nucleoside, nucleotide, oxygen, purine, pyrimidine, sugar, sulfur, and water.

Reset definitions

Passing the reset keyword restores all default definitions of editable flags and also of non-standard amino acids. Passing a specific editable flag label with it restores only that flag's definition.
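The change and reset rules described above can be pictured with a small stand-alone model. This is only a sketch: the FlagRegistry class and its methods are hypothetical names, ProDy's flagDefinition() is not called, and the idea that some definitions are lists of names while others are regular expression strings is an assumption based on the definitions shown on this page.

```python
import copy


class FlagRegistry:
    """Toy registry of editable flag definitions (hypothetical, not ProDy code)."""

    def __init__(self, defaults):
        self._defaults = copy.deepcopy(defaults)
        self._current = copy.deepcopy(defaults)

    def definition(self, label):
        """Return a copy of the current definition of an editable flag."""
        return copy.deepcopy(self._current[label])

    def change(self, label, new_definition):
        """Replace a definition, requiring the same type as the old one."""
        old = self._current[label]
        if type(new_definition) is not type(old):
            raise TypeError(
                "new definition for %r must be of type %s"
                % (label, type(old).__name__)
            )
        self._current[label] = copy.deepcopy(new_definition)

    def reset(self, label=None):
        """Restore one flag, or every flag when no label is given."""
        labels = list(self._defaults) if label is None else [label]
        for name in labels:
            self._current[name] = copy.deepcopy(self._defaults[name])


# Two sample definitions taken from this page; their types are assumed.
registry = FlagRegistry({
    "backbone": ["CA", "C", "O", "N"],
    "hydrogen": "[1-9]?H.*",
})

registry.change("backbone", ["CA", "C", "O", "N", "H"])
changed = registry.definition("backbone")

try:
    registry.change("hydrogen", ["H"])  # a list cannot replace a string
    rejected = False
except TypeError:
    rejected = True

registry.reset("backbone")
restored = registry.definition("backbone")
```

The type check is the point of the model: a replacement that changes the kind of definition is refused, and a reset returns the flag to the definition it started with.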

prody.atomic.flags.getNonstdProperties(resname)[source]

Deprecated for removal in v1.4, use listNonstdAAProps() instead.

prody.atomic.flags.listNonstdAAProps(resname)[source]

Returns properties of non-standard amino acid resname.