Bio.PDB.PICIO module
PICIO: read and write Protein Internal Coordinate (.pic) data files.
- Bio.PDB.PICIO.read_PIC(file: TextIO, verbose: bool = False, quick: bool = False, defaults: bool = False) Bio.PDB.Structure.Structure
Load Protein Internal Coordinate (.pic) data from file.
- PIC file format:
comment lines start with #
- (optional) PDB HEADER record
idcode and deposition date recommended but optional
deposition date in PDB format or as changed by Biopython
(optional) PDB TITLE record
- repeat:
Biopython Residue Full ID - sets residue IDs of returned structure
(optional) PDB N, CA, C ATOM records for chain start
(optional) PIC Hedra records for residue
(optional) PIC Dihedra records for residue
(optional) BFAC records listing AtomKeys and b-factors
An improvement would define relative positions for HOH (water) entries.
Defaults will be supplied for any value if defaults=True. Default values are supplied in ic_data.py, but structures degrade quickly with any deviation from true coordinates. Experiment with
Bio.PDB.internal_coords.IC_Residue.pic_flags
options towrite_PIC()
to verify this.N.B. dihedron (i-1)C-N-CA-CB is ignored in assembly if O exists.
C-beta is by default placed using O-C-CA-CB, but O is missing in some PDB file residues, which means the sidechain cannot be placed. The alternate CB path (i-1)C-N-CA-CB is provided to circumvent this, but if this is needed then it must be adjusted in conjunction with PHI ((i-1)C-N-CA-C) as they overlap (see
bond_set()
andbond_rotate()
to handle this automatically).- Parameters
file (Bio.File) –
as_handle()
file name or handleverbose (bool) – complain when lines not as expected
quick (bool) – don’t check residues for all dihedra (no default values)
defaults (bool) – create di/hedra as needed from reference database. Amide proton created if ‘H’ is in IC_Residue.accept_atoms
- Returns
Biopython Structure object, Residues with .internal_coord attributes but no coordinates except for chain start N, CA, C atoms if supplied, OR None on parse fail (silent unless verbose=True)
- Bio.PDB.PICIO.read_PIC_seq(seqRec: Bio.SeqRecord.SeqRecord, pdbid: Optional[str] = None, title: Optional[str] = None, chain: Optional[str] = None) Bio.PDB.Structure.Structure
Read
SeqRecord
into Structure with default internal coords.
- Bio.PDB.PICIO.enumerate_atoms(entity)
Ensure all atoms in entity have serial_number set.
- Bio.PDB.PICIO.pdb_date(datestr: str) str
Convert yyyy-mm-dd date to dd-month-yy.
- Bio.PDB.PICIO.write_PIC(entity, file, pdbid=None, chainid=None, picFlags: int = IC_Residue.picFlagsDefault, hCut: Optional[float] = None, pCut: Optional[float] = None)
Write Protein Internal Coordinates (PIC) to file.
See
read_PIC()
for file format. SeeIC_Residue.pic_accuracy
to vary numeric accuracy. Recurses to lower entity levels (M, C, R).- Parameters
entity (Entity) – Biopython PDB Entity object: S, M, C or R
file (Bio.File) –
as_handle()
file name or handlepdbid (str) – PDB idcode, read from entity if not supplied
chainid (char) – PDB Chain ID, set from C level entity.id if needed
picFlags (int) –
boolean flags controlling output, defined in
Bio.PDB.internal_coords.IC_Residue.pic_flags
”psi”,
”omg”,
”phi”,
”tau”, # tau hedron (N-Ca-C)
”chi1”,
”chi2”,
”chi3”,
”chi4”,
”chi5”,
”pomg”, # proline omega
”chi”, # chi1 through chi5
”classic_b”, # psi | phi | tau | pomg
”classic”, # classic_b | chi
”hedra”, # all hedra including bond lengths
”primary”, # all primary dihedra
”secondary”, # all secondary dihedra (fixed angle from primary dihedra)
”all”, # hedra | primary | secondary
”initAtoms”, # XYZ coordinates of initial Tau (N-Ca-C)
”bFactors”
default is everything:
picFlagsDefault = ( pic_flags.all | pic_flags.initAtoms | pic_flags.bFactors )
Usage in your code:
# just primary dihedra and all hedra picFlags = ( IC_Residue.pic_flags.primary | IC_Residue.pic_flags.hedra ) # no B-factors: picFlags = IC_Residue.picFlagsDefault picFlags &= ~IC_Residue.pic_flags.bFactors
read_PIC()
with (defaults=True) will use default values for anything left outhCut (float) – default None only write hedra with ref db angle std dev greater than this value
pCut (float) – default None only write primary dihedra with ref db angle std dev greater than this value
Default values:
Data averaged from Sep 2019 Dunbrack cullpdb_pc20_res2.2_R1.0.
Please see
PISCES: A Protein Sequence Culling Server
‘G. Wang and R. L. Dunbrack, Jr. PISCES: a protein sequence culling server. Bioinformatics, 19:1589-1591, 2003.’
‘primary’ and ‘secondary’ dihedra are defined in ic_data.py. Specifically, secondary dihedra can be determined as a fixed rotation from another known angle, for example N-Ca-C-O can be estimated from N-Ca-C-N (psi).
Standard deviations are listed in <biopython distribution>/Bio/PDB/ic_data.py for default values, and can be used to limit which hedra and dihedra are defaulted vs. output exact measurements from structure (see hCut and pCut above). Default values for primary dihedra (psi, phi, omega, chi1, etc.) are chosen as the most common integer value, not an average.
- Raises
PDBException – if entity level is A (Atom)
Exception – if entity does not have .level attribute