STAR File
This module defines functions for parsing STAR (Self-defining Text Archiving and Retrieval) files. These include metadata files from cryo-EM image analysis programs such as RELION and XMIPP, as well as the Crystallographic Information File (CIF) format used for much of the PDB.
- prody.proteins.starfile.parseImagesFromSTAR(particlesSTAR, **kwargs)
Parses particle images using data from a STAR file containing information about them.
- Parameters:
particlesSTAR (str) – a filename for a STAR file.
block_indices (list, ndarray) – indices for data blocks containing rows corresponding to images of interest. The indexing scheme is similar to that for numpy arrays. Default behavior is to use all data blocks about images.
row_indices (list, ndarray) – indices for rows corresponding to images of interest. The indexing scheme is similar to that for numpy arrays. row_indices should be a 1D or 2D array-like. 2D row_indices should contain an entry for each relevant loop. If a 1D array-like is given, the same row indices will be applied to all loops. Default behavior is to use all rows about images.
particle_indices (list, ndarray) – indices for particles regardless of STAR structure. Default is to take all particles. Please note: this acts after block_indices and row_indices.
saveImageArrays (bool) – whether to save the numpy array for each image to file. Default is False.
saveDirectory (str) – directory where numpy image arrays are saved. Default is None, which means save to the current working directory.
rotateImages (bool) – whether to apply in-plane translations and rotations using the provided psi and origin data. Default is True.
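The 1D versus 2D behavior of row_indices described above can be sketched in plain Python. This is only an illustration of the documented rule, not ProDy's implementation: the helper name expand_row_indices and the n_loops argument are hypothetical, and plain lists stand in for numpy arrays.

```python
def expand_row_indices(row_indices, n_loops):
    """Return one list of row indices per loop (hypothetical helper)."""
    rows = list(row_indices)
    # Treat the input as 2D when every entry is itself a list or tuple
    is_2d = len(rows) > 0 and all(isinstance(r, (list, tuple)) for r in rows)
    if is_2d:
        # 2D input: there must be one entry for each relevant loop
        if len(rows) != n_loops:
            raise ValueError("2D row_indices needs one entry per loop")
        return [list(r) for r in rows]
    # 1D input: the same row indices are applied to all loops
    return [list(rows) for _ in range(n_loops)]


per_loop_1d = expand_row_indices([0, 2], n_loops=3)
per_loop_2d = expand_row_indices([[0, 1], [2], [0, 3]], n_loops=3)
```

Here per_loop_1d repeats rows 0 and 2 for each of the three loops, while per_loop_2d keeps a separate selection for each loop.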
- prody.proteins.starfile.parseSTAR(filename, **kwargs)
Returns a dictionary containing data parsed from a STAR file.
- Parameters:
filename (str) – a filename. The .star extension can be omitted.
start (int, None) – line number for starting. Default is None, meaning start at the beginning.
stop (int, None) – line number for stopping. Default is None, meaning don’t stop.
shlex (bool) – whether to use shlex for splitting lines so as to preserve quoted substrings. Default is False.
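To see why the shlex option matters, the following sketch compares ordinary whitespace splitting with the standard library's shlex.split on a line containing a quoted value. The sample line is made up for illustration, and how parseSTAR calls shlex internally is not shown in this reference, so treat this as a picture of the general effect only.

```python
import shlex

# Hypothetical data row from a STAR loop with a quoted, space-containing value
line = "1 'Gatan K3' 0.83"

# Plain whitespace splitting breaks the quoted value into two tokens
plain_tokens = line.split()

# shlex splitting keeps the quoted substring together and drops the quotes
shlex_tokens = shlex.split(line)

print(plain_tokens)
print(shlex_tokens)
```

Plain splitting yields four tokens here, whereas shlex splitting yields three, with the quoted value preserved as a single field.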
- prody.proteins.starfile.parseSTARSection(lines, key, report=True)
Parse a section of data from lines from a STAR file corresponding to a key (part before the dot). This can be a loop or data block.
Returns data encapsulated in a list and the associated fields.
- Parameters:
report (bool) – whether to report warnings about not finding data. Default is True.
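As a rough illustration of what "the part before the dot" means, the sketch below splits mmCIF-style data names into a key and a field and groups the fields by key. The helper split_data_name is hypothetical and is not part of ProDy; it only shows the naming convention that the key argument refers to.

```python
def split_data_name(name):
    """Split a data name such as '_atom_site.id' into (key, field)."""
    key, _, field = name.partition(".")
    return key, field


names = ["_atom_site.group_PDB", "_atom_site.id", "_cell.length_a"]

# Group field names under the key they belong to
sections = {}
for name in names:
    key, field = split_data_name(name)
    sections.setdefault(key, []).append(field)

print(sections)
```

With these names, a key of _atom_site would correspond to the fields group_PDB and id, and _cell to length_a.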