Bio.PDB.PICIO module

PICIO: read and write Protein Internal Coordinate (.pic) data files.

Bio.PDB.PICIO.read_PIC(file: TextIO, verbose: bool = False, quick: bool = False, defaults: bool = False) → Bio.PDB.Structure.Structure

Load Protein Internal Coordinate (.pic) data from file.

PIC file format:
  • comment lines start with #

  • (optional) PDB HEADER record
    • idcode and deposition date recommended but optional

    • deposition date in PDB format or as changed by Biopython

  • (optional) PDB TITLE record

  • repeat:
    • Biopython Residue Full ID - sets residue IDs of returned structure

    • (optional) PDB N, CA, C ATOM records for chain start

    • (optional) PIC Hedra records for residue

    • (optional) PIC Dihedra records for residue

    • (optional) BFAC records listing AtomKeys and b-factors

An improvement would define relative positions for HOH (water) entries.

If defaults=True, default values are supplied for any value missing from the file. The defaults come from ic_data.py, but structures degrade quickly with any deviation from true coordinates. To verify this, experiment with the Bio.PDB.internal_coords.IC_Residue.pic_flags options to write_PIC().

N.B. dihedron (i-1)C-N-CA-CB is ignored in assembly if O exists.

C-beta is by default placed using O-C-CA-CB, but O is missing in some PDB file residues, which means the sidechain cannot be placed. The alternate CB path (i-1)C-N-CA-CB is provided to circumvent this, but if this is needed then it must be adjusted in conjunction with PHI ((i-1)C-N-CA-C) as they overlap (see bond_set() and bond_rotate() to handle this automatically).

Parameters
  • file (Bio.File) – as_handle() file name or handle

  • verbose (bool) – complain when lines are not as expected

  • quick (bool) – don’t check residues for all dihedra (no default values)

  • defaults (bool) – create di/hedra as needed from reference database. Amide proton created if ‘H’ is in IC_Residue.accept_atoms

Returns

Biopython Structure object whose Residues have .internal_coord attributes but no atom coordinates, except for the chain start N, CA, C atoms if supplied; or None on parse failure (silent unless verbose=True)

Bio.PDB.PICIO.read_PIC_seq(seqRec: Bio.SeqRecord.SeqRecord, pdbid: Optional[str] = None, title: Optional[str] = None, chain: Optional[str] = None) → Bio.PDB.Structure.Structure

Read SeqRecord into Structure with default internal coords.
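As a minimal sketch of how read_PIC_seq() might be used, the following builds a structure from a short sequence and then computes atom coordinates from the default internal coordinates. The sequence, id and description are placeholders chosen for illustration, and the call to internal_to_atom_coordinates() assumes a Biopython version that provides read_PIC_seq().

```python
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord
from Bio.PDB.PICIO import read_PIC_seq

# Short illustrative peptide; the id and description are placeholders.
record = SeqRecord(Seq("GAVLI"), id="1RND", description="example peptide")

# Build a Structure whose residues carry default internal coordinates.
structure = read_PIC_seq(record)

# Compute Cartesian atom coordinates from the internal coordinates.
structure.internal_to_atom_coordinates()

# One residue is created per sequence letter.
residue_names = [res.get_resname() for res in structure.get_residues()]
print(residue_names)
```

Because every value comes from the reference defaults, the result is a generic starting conformation rather than a prediction of the real fold.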

Bio.PDB.PICIO.enumerate_atoms(entity)

Ensure all atoms in entity have serial_number set.

Bio.PDB.PICIO.pdb_date(datestr: str) → str

Convert yyyy-mm-dd date to dd-month-yy.
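The conversion can be illustrated with a standalone sketch. The function iso_to_pdb_date() below is a hypothetical stand-in written for this example, not the library implementation; it assumes the PDB HEADER convention of an upper-case three-letter month and returns unrecognised input unchanged.

```python
import re

# Month abbreviations indexed by month number (index 0 is a placeholder).
_MONTHS = [
    "XXX", "JAN", "FEB", "MAR", "APR", "MAY", "JUN",
    "JUL", "AUG", "SEP", "OCT", "NOV", "DEC",
]


def iso_to_pdb_date(datestr: str) -> str:
    """Hypothetical stand-in for pdb_date(): yyyy-mm-dd -> dd-MON-yy."""
    match = re.fullmatch(r"(\d{4})-(\d{2})-(\d{2})", datestr)
    if match is None:
        return datestr  # assumption: leave unrecognised input untouched
    year, month, day = match.groups()
    if not 1 <= int(month) <= 12:
        return datestr  # assumption: leave out-of-range months untouched
    return f"{day}-{_MONTHS[int(month)]}-{year[-2:]}"


print(iso_to_pdb_date("2019-09-15"))  # 15-SEP-19
```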

Bio.PDB.PICIO.write_PIC(entity, file, pdbid=None, chainid=None, picFlags: int = IC_Residue.picFlagsDefault, hCut: Optional[float] = None, pCut: Optional[float] = None)

Write Protein Internal Coordinates (PIC) to file.

See read_PIC() for file format. Recurses to lower entity levels (M, C, R).

Parameters
  • entity (Entity) – Biopython PDB Entity object: S, M, C or R

  • file (Bio.File) – as_handle() file name or handle

  • pdbid (str) – PDB idcode, read from entity if not supplied

  • chainid (char) – PDB Chain ID, set from C level entity.id if needed

  • picFlags (int) –

    boolean flags controlling output, defined in Bio.PDB.internal_coords.IC_Residue.pic_flags

    • "psi"

    • "omg"

    • "phi"

    • "tau" – tau hedron (N-Ca-C)

    • "chi1"

    • "chi2"

    • "chi3"

    • "chi4"

    • "chi5"

    • "pomg" – proline omega

    • "chi" – chi1 through chi5

    • "classic_b" – psi | phi | tau | pomg

    • "classic" – classic_b | chi

    • "hedra" – all hedra including bond lengths

    • "primary" – all primary dihedra

    • "secondary" – all secondary dihedra

    • "all" – hedra | primary | secondary

    • "initAtoms" – XYZ coordinates of initial Tau (N-Ca-C)

    • "bFactors"

    default is everything:

    picFlagsDefault = (
        pic_flags.all | pic_flags.initAtoms | pic_flags.bFactors
    )
    

    Usage in your code:

    # just primary dihedra and all hedra
    picFlags = (
        IC_Residue.pic_flags.primary | IC_Residue.pic_flags.hedra
    )
    
    # no B-factors:
    picFlags = IC_Residue.picFlagsDefault
    picFlags &= ~IC_Residue.pic_flags.bFactors
    

    read_PIC() with (defaults=True) will use default values for anything left out

  • hCut (float) – default None; only write hedra whose reference database angle standard deviation is greater than this value

  • pCut (float) – default None; only write primary dihedra whose reference database angle standard deviation is greater than this value
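The picFlags usage above is ordinary bitwise masking, which can be checked in isolation. The sketch below uses a hypothetical stand-in tuple with illustrative bit values; the real values are defined in IC_Residue.pic_flags and may differ.

```python
from collections import namedtuple

# Hypothetical stand-in for IC_Residue.pic_flags; bit values are illustrative.
_Flags = namedtuple(
    "_Flags", ["hedra", "primary", "secondary", "all", "initAtoms", "bFactors"]
)
flags = _Flags(hedra=1, primary=2, secondary=4, all=7, initAtoms=8, bFactors=16)

# Mirrors the documented default: all | initAtoms | bFactors.
flags_default = flags.all | flags.initAtoms | flags.bFactors

# Clear a single bit: everything except B-factors.
no_bfactors = flags_default & ~flags.bFactors

# Select a subset: just primary dihedra and all hedra.
subset = flags.primary | flags.hedra

print(bool(no_bfactors & flags.bFactors))  # False
print(bool(subset & flags.secondary))      # False
```

Testing a flag with `&` before writing a record is the same pattern a writer would use to decide whether a given kind of record is emitted.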

Default values:

Data averaged from Sep 2019 Dunbrack cullpdb_pc20_res2.2_R1.0.

Please see PISCES: A Protein Sequence Culling Server:

G. Wang and R. L. Dunbrack, Jr. PISCES: a protein sequence culling server. Bioinformatics, 19:1589-1591, 2003.

‘primary’ and ‘secondary’ dihedra are defined in ic_data.py. Specifically, a secondary dihedron can be determined as a fixed rotation from another known angle; for example, N-Ca-C-O can be estimated from N-Ca-C-N (psi).

Standard deviations for the default values are listed in <biopython distribution>/Bio/PDB/ic_data.py. They can be used to choose which hedra and dihedra are left to default values and which are written as exact measurements from the structure (see hCut and pCut above). Default values for primary dihedra (psi, phi, omega, chi1, etc.) are chosen as the most common integer value, not an average.
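The cutoff idea can be sketched independently of the library: a measurement is written only when the reference standard deviation for its class exceeds the cutoff, since tightly distributed classes are well represented by their default. The table, class names and numbers below are hypothetical and are not taken from ic_data.py; classes_to_write() is an illustrative helper, not part of Biopython.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical reference table: class name -> (default angle, standard deviation).
# Names and numbers are illustrative only, not taken from ic_data.py.
REFERENCE: Dict[str, Tuple[float, float]] = {
    "rigid_angle": (110.0, 1.0),
    "flexible_angle": (120.0, 6.5),
}


def classes_to_write(measured: Dict[str, float], cut: Optional[float]) -> List[str]:
    """Return the classes whose measured value should be written explicitly."""
    keep = []
    for name in measured:
        ref = REFERENCE.get(name)
        # With no cutoff, or no reference entry, always write the measurement.
        if cut is None or ref is None or ref[1] > cut:
            keep.append(name)
    return keep


measured = {"rigid_angle": 110.4, "flexible_angle": 127.2}
print(classes_to_write(measured, cut=None))  # both classes written
print(classes_to_write(measured, cut=2.0))   # only the variable class written
```

Anything omitted this way would be filled back in from the defaults when the file is read with defaults=True.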

Raises
  • PDBException – if entity level is A (Atom)

  • Exception – if entity does not have .level attribute