Bio.PDB.internal_coords module

Classes to support internal coordinates for protein structures.

Internal coordinates comprise Psi, Omega and Phi dihedral angles along the protein backbone, Chi angles along the sidechains, and all 3-atom angles and bond lengths defining a protein chain. These routines can compute internal coordinates from atom XYZ coordinates, and compute atom XYZ coordinates from internal coordinates.

Secondary benefits include the ability to align and compare residue environments in 3D structures, support for 2D atom distance plots, converting a distance plot plus chirality information to a structure, generating an OpenSCAD description of a structure for 3D printing, and reading/writing structures as internal coordinate data files.

Usage:

from Bio.PDB.PDBParser import PDBParser
from Bio.PDB.Chain import Chain
from Bio.PDB.internal_coords import *
from Bio.PDB.PICIO import write_PIC, read_PIC, read_PIC_seq
from Bio.PDB.ic_rebuild import write_PDB, IC_duplicate, structure_rebuild_test
from Bio.PDB.SCADIO import write_SCAD
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord
from Bio.PDB.PDBIO import PDBIO
import numpy as np

# load a structure as normal, get first chain
parser = PDBParser()
myProtein = parser.get_structure("7rsa", "pdb7rsa.ent")
myChain = myProtein[0]["A"]

# compute bond lengths, angles, dihedral angles
myChain.atom_to_internal_coordinates(verbose=True)

# check myChain makes sense (can get angles and rebuild same structure)
resultDict = structure_rebuild_test(myChain)
assert resultDict['pass'] == True

# get residue 1 chi2 angle
r1 = next(myChain.get_residues())
r1chi2 = r1.internal_coord.get_angle("chi2")

# rotate residue 1 chi2 angle by 120 degrees (loops w/in +/-180)
r1.internal_coord.set_angle("chi2", r1chi2 + 120.0)
# or
r1.internal_coord.bond_rotate("chi2", 120.0)
# update myChain XYZ coordinates with chi2 changed
myChain.internal_to_atom_coordinates()
# write new conformation with PDBIO
write_PDB(myProtein, "myChain.pdb")
# or just the ATOM records without headers:
io = PDBIO()
io.set_structure(myProtein)
io.save("myChain2.pdb")

# write chain as 'protein internal coordinates' (.pic) file
write_PIC(myProtein, "myChain.pic")
# read .pic file
myProtein2 = read_PIC("myChain.pic")

# create default structure for random sequence by reading as .pic file
myProtein3 = read_PIC_seq(
    SeqRecord(
        Seq("GAVLIMFPSTCNQYWDEHKR"),
        id="1RND",
        description="my random sequence",
    )
)
myProtein3.internal_to_atom_coordinates()
write_PDB(myProtein3, "myRandom.pdb")

# access the all-dihedrals array for the chain, e.g. residue 1 chi2 angle:
r1chi2_obj = r1.internal_coord.pick_angle("chi2")
# or same thing: r1chi2_obj = r1.internal_coord.pick_angle("CA:CB:CG:CD")
r1chi2_key = r1chi2_obj.atomkeys
# r1chi2_key is tuple of AtomKeys (1_K_CA, 1_K_CB, 1_K_CG, 1_K_CD)
r1chi2_index = myChain.internal_coord.dihedraNdx[r1chi2_key]
# or same thing: r1chi2_index = r1chi2_obj.ndx
r1chi2_value = myChain.internal_coord.dihedraAngle[r1chi2_index]
# also true: r1chi2_obj == myChain.internal_coord.dihedra[r1chi2_index]

# access the array of all atoms for the chain, e.g. residue 1 C-beta
r1_cBeta_index = myChain.internal_coord.atomArrayIndex[AtomKey("1_K_CB")]
r1_cBeta_coords = myChain.internal_coord.atomArray[r1_cBeta_index]
# r1_cBeta_coords = [ x, y, z, 1.0 ]

# the Biopython Atom coord array is now a view into atomArray, so
assert r1_cBeta_coords[1] == r1["CB"].coord[1]
r1_cBeta_coords[1] += 1.0  # change the Y coord 1 angstrom
assert r1_cBeta_coords[1] == r1["CB"].coord[1]
# they are always the same (they share the same memory)
r1_cBeta_coords[1] -= 1.0  # restore

# create a selector to filter just the C-alpha atoms from the all atom array
atmNameNdx = AtomKey.fields.atm
atomArrayIndex = myChain.internal_coord.atomArrayIndex
CaSelect = [
    atomArrayIndex.get(k) for k in atomArrayIndex.keys() if k.akl[atmNameNdx] == "CA"
]
# now the ordered array of C-alpha atom coordinates is:
CA_coords = myChain.internal_coord.atomArray[CaSelect]
# note this uses Numpy fancy indexing, so CA_coords is a new copy

# create a C-alpha distance plot
caDistances = myChain.internal_coord.distance_plot(CaSelect)
# display with e.g. MatPlotLib:
# import matplotlib.pyplot as plt
# plt.imshow(caDistances, cmap="hot", interpolation="nearest")
# plt.show()

# build structure from distance plot:
## create the all-atom distance plot
distances = myChain.internal_coord.distance_plot()
## get the sign of the dihedral angles
chirality = myChain.internal_coord.dihedral_signs()
## get new, empty data structure : copy data structure from myChain
myChain2 = IC_duplicate(myChain)[0]["A"]
cic2 = myChain2.internal_coord
## clear the new atomArray and di/hedra value arrays, just for proof
cic2.atomArray = np.zeros((cic2.AAsiz, 4), dtype=np.float64)
cic2.dihedraAngle[:] = 0.0
cic2.hedraAngle[:] = 0.0
cic2.hedraL12[:] = 0.0
cic2.hedraL23[:] = 0.0
## copy just the first N-Ca-C coords so structures will superimpose:
cic2.copy_initNCaCs(myChain.internal_coord)
## copy distances to chain arrays:
cic2.distplot_to_dh_arrays(distances, chirality)
## compute angles and dihedral angles from distances:
cic2.distance_to_internal_coordinates()
## generate XYZ coordinates from internal coordinates:
myChain2.internal_to_atom_coordinates()
## confirm result atomArray matches original structure:
assert np.allclose(cic2.atomArray, myChain.internal_coord.atomArray)

# superimpose all phe-phe pairs - quick hack just to demonstrate concept
# for analyzing pairwise residue interactions.  Generates PDB ATOM records
# placing each PHE at origin and showing all other PHEs in environment
## shorthand for key variables:
cic = myChain.internal_coord
resNameNdx = AtomKey.fields.resname
aaNdx = cic.atomArrayIndex
## select just PHE atoms:
pheAtomSelect = [aaNdx.get(k) for k in aaNdx.keys() if k.akl[resNameNdx] == "F"]
aaF = cic.atomArray[ pheAtomSelect ]  # numpy fancy indexing makes COPY not view

for ric in cic.ordered_aa_ic_list:  # internal_coords version of get_residues()
    if ric.rbase[2] == "F":  # if PHE, get transform matrices for chi1 dihedral
        chi1 = ric.pick_angle("N:CA:CB:CG")  # chi1 space has C-alpha at origin
        cst = np.transpose(chi1.cst)  # transform TO chi1 space
        # rcst = np.transpose(chi1.rcst)  # transform FROM chi1 space
        cic.atomArray[pheAtomSelect] = aaF.dot(cst)  # transform just the PHEs
        for res in myChain.get_residues():  # print PHEs in new coordinate space
            if res.resname in ["PHE"]:
                print(res.internal_coord.pdb_residue_string())
        cic.atomArray[pheAtomSelect] = aaF  # restore coordinate space from copy

# write OpenSCAD program of spheres and cylinders to 3d print myChain backbone
## set atom load filter to accept backbone only:
IC_Residue.accept_atoms = IC_Residue.accept_backbone
## delete existing data to force re-read of all atoms:
myChain.internal_coord = None
write_SCAD(myChain, "myChain.scad", scale=10.0)

See the ‘’Internal coordinates module’’ section of the Biopython Tutorial and Cookbook for further discussion.

Terms and key data structures: Internal coordinates are defined on sequences of atoms which span residues or follow accepted nomenclature along sidechains. To manage these sequences and support Biopython’s disorder mechanisms, AtomKey specifiers are implemented to capture residue, atom and variant identification in a single object. A Hedron object is specified as three sequential AtomKeys, comprising two bond lengths and the bond angle between them. A Dihedron consists of four sequential AtomKeys, linking two Hedra with a dihedral angle between them.

Algorithmic overview: The Internal Coordinates module combines a specification of connected atoms as hedra and dihedra in the ic_data file with routines here to transform XYZ coordinates of these atom sets between a local coordinate system and the world coordinates supplied in e.g. a PDB or mmCif data file. The local coordinate system places the center atom of a hedron at the origin (0,0,0), one leg on the +Z axis, and the other leg on the XZ plane (see Hedron). Measurement and creation or manipulation of hedra and dihedra in the local coordinate space is straightforward, and the calculated transformation matrices enable assembling these subunits into a protein chain starting from supplied (PDB) coordinates for the initial N-Ca-C atoms.

Psi and Phi angles are defined on atoms from adjacent residues in a protein chain, see e.g. pick_angle() and ic_data for the relevant mapping between residues and backbone dihedral angles.

Transforms to and from the dihedron local coordinate space described above are accessible via IC_Chain.dCoordSpace and Dihedron attributes .cst and .rcst, and may be applied in the alignment and comparison of residues and their environments with code along the lines of:

chi1 = ric0.pick_angle("chi1") # chi1 space defined with CA at origin
cst = np.transpose(chi1.cst) # transform TO chi1 local space
newAtomCoords = oldAtomCoords.dot(cst)

The core algorithms were developed independently during 1993-4 for ‘’Development and Application of a Three-dimensional Description of Amino Acid Environments in Protein,’’ Miller, Douthart, and Dunker, Advances in Molecular Bioinformatics, IOS Press, 1994, ISBN 90 5199 172 x, pp. 9-30.

A Protein Internal Coordinate (.pic) file format is defined to capture sufficient detail to reproduce a PDB file from chain starting coordinates (first residue N, Ca, C XYZ coordinates) and remaining internal coordinates. These files are used internally to verify that a given structure can be regenerated from its internal coordinates. See PICIO for reading and writing .pic files and structure_rebuild_test() to determine if a specific PDB or mmCif datafile has sufficient information to interconvert between cartesian and internal coordinates.

Internal coordinates may also be exported as OpenSCAD data arrays for generating 3D printed protein models. OpenSCAD software is provided as a starting point and proof-of-concept for generating such models. See SCADIO and this Thingiverse project for a more advanced example.

Refer to distance_plot() and distance_to_internal_coordinates() for converting structure data to/from 2D distance plots.

The following classes comprise the core functionality for processing internal coordinates and are sufficiently related and coupled to place them together in this module:

IC_Chain: Extends Biopython Chain on .internal_coord attribute.: Manages connected sequence of residues and chain breaks; holds numpy arrays for all atom coordinates and bond geometries. For ‘parallel’ processing IC_Chain methods operate on these arrays with single numpy commands.
IC_Residue: Extends Biopython Residue on .internal_coord attribute.: Access for per residue views on internal coordinates and methods for serial (residue by residue) assembly.
Dihedron: four joined atoms forming a dihedral angle.: Dihedral angle, homogeneous atom coordinates in local coordinate space, references to relevant Hedra and IC_Residue. Getter methods for residue dihedral angles, bond angles and bond lengths.
Hedron: three joined atoms forming a plane.: Contains homogeneous atom coordinates in local coordinate space as well as bond lengths and angle between them.
Edron: base class for Hedron and Dihedron classes.: Tuple of AtomKeys comprising child, string ID, mainchain membership boolean and other routines common for both Hedra and Dihedra. Implements rich comparison.
AtomKey: keys (dictionary and string) for referencing atom sequences.: Capture residue and disorder/occupancy information, provides a no-whitespace key for .pic files, and implements rich comparison.

Custom exception classes: HedronMatchError and MissingAtomError

class Bio.PDB.internal_coords.IC_Chain(parent, verbose: bool = False)

Bases: object

Class to extend Biopython Chain with internal coordinate data.

Attributes:

chain: object reference: The Biopython Bio.PDB.Chain.Chain object this extends
MaxPeptideBond: float: Class attribute to detect chain breaks. Override for fully contiguous chains with some very long bonds - e.g. for 3D printing (OpenSCAD output) a structure with missing residues. MaxPeptideBond
ParallelAssembleResidues: bool: Class attribute affecting internal_to_atom_coords. Short (50 residue and less) chains are faster to assemble without the overhead of creating numpy arrays, and the algorithm is easier to understand and trace processing a single residue at a time. Clearing (set to False) this flag will switch to the serial algorithm
ordered_aa_ic_list: list: IC_Residue objects internal_coords algorithms can process (e.g. no waters)
initNCaC: List of N, Ca, C AtomKey tuples (NCaCKeys).: NCaCKeys start chain segments (first residue or after chain break). These 3 atoms define the coordinate space for a contiguous chain segment, as initially specified by PDB or mmCIF file.
AAsiz = int: AtomArray size, number of atoms in this chain
atomArray: numpy array: homogeneous atom coords ([x,, y, z, 1.0]) for every atom in chain
atomArrayIndex: dict: maps AtomKeys to atomArray indexes
hedra: dict: Hedra forming residues in this chain; indexed by 3-tuples of AtomKeys.
hedraLen: int: length of hedra dict
hedraNdx: dict: maps hedra AtomKeys to numeric index into hedra data arrays e.g. hedraL12 below
a2ha_map: [hedraLen x 3]: atom indexes in hedraNdx order
dihedra: dict: Dihedra forming residues in this chain; indexed by 4-tuples of AtomKeys.
dihedraLen: int: length of dihedra dict
dihedraNdx: dict: maps dihedra AtomKeys to dihedra data arrays e.g. dihedraAngle
a2da_map[dihedraLen x 4]: AtomNdx’s in dihedraNdx order
d2a_map[dihedraLen x [4]]: AtomNdx’s for each dihedron (reshaped a2da_map)
Numpy arrays for vector processing of chain di/hedra:
hedraL12: numpy array: bond length between hedron 1st and 2nd atom
hedraAngle: numpy array: bond angle for each hedron, in degrees
hedraL23: numpy array: bond length between hedron 2nd and 3rd atom
id3_dh_index: dict: maps hedron key to list of dihedra starting with hedron, used by assemble and bond_rotate to find dihedra with h1 key
id32_dh_index: dict: like id3_dh_index, find dihedra from h2 key
hAtoms: numpy array: homogeneous atom coordinates (3x4) of hedra, central atom at origin
hAtomsR: numpy array: hAtoms in reverse orientation
hAtoms_needs_update: numpy array of bool: indicates whether hAtoms represent hedraL12/A/L23
dihedraAngle: numpy array: dihedral angles (degrees) for each dihedron
dAtoms: numpy array: homogeneous atom coordinates (4x4) of dihedra, second atom at origin
dAtoms_needs_update: numpy array of bool: indicates whether dAtoms represent dihedraAngle
dCoordSpace: numpy array: forward and reverse transform matrices standardising positions of first hedron. See dCoordSpace.
dcsValid: bool: indicates dCoordSpace up to date
See also attributes generated by :meth:`build_edraArrays` for indexing
di/hedra data elements.

Methods

internal_to_atom_coordinates:	Process ic data to Residue/Atom coordinates; calls assemble_residues()
assemble_residues:	Generate IC_Chain atom coords from internal coordinates (parallel)
assemble_residues_ser:	Generate IC_Residue atom coords from internal coordinates (serial)
atom_to_internal_coordinates:	Calculate dihedrals, angles, bond lengths (internal coordinates) for Atom data
write_SCAD:	Write OpenSCAD matrices for internal coordinate data comprising chain; this is a support routine, see `SCADIO.write_SCAD()` to generate OpenSCAD description of a protein chain.
distance_plot:	Generate 2D plot of interatomic distances with optional filter
distance_to_internal_coordinates:	Compute internal coordinates from distance plot and array of dihedral angle signs.
make_extended:	Arbitrarily sets all psi and phi backbone angles to 123 and -104 degrees.

MaxPeptideBond = 1.4: Larger C-N distance than this will be chain break

ParallelAssembleResidues = True: Enable parallel internal_to_atom algorithm, is slower for short chains

AAsiz = 0: Number of atoms in this chain (size of atomArray)

atomArray: array = None: AAsiz x [4] of float np.float64 homogeneous atom coordinates, all atoms in chain.

dCoordSpace = None: [2][dihedraLen][4][4] : 2 arrays of 4x4 coordinate space transforms for each dihedron. The first [0] converts TO standard space with first atom on the XZ plane, the second atom at the origin, the third on the +Z axis, and the fourth placed according to the dihedral angle. The second [1] transform returns FROM the standard space to world coordinates (PDB file input or whatever is current). Also accessible as .cst (forward transform) and .rcst (reverse transform) in Dihedron.

dcsValid = None: True if dCoordSpace is up to date. Use update_dCoordSpace() if needed.

__init__(parent, verbose: bool = False) → None

Initialize IC_Chain object, with or without residue/Atom data.

Parameters:: parent (Bio.PDB.Chain) – Biopython Chain object Chain object this extends

__deepcopy__(memo) → IC_Chain: Implement deepcopy for IC_Chain.

clear_ic(): Clear residue internal_coord settings for this chain.

build_atomArray() → None

Build IC_Chain numpy coordinate array from biopython atoms.

See also init_edra() for more complete initialization of IC_Chain.

Inputs:

self.aksetset: AtomKey s in this chain

Generates:

self.AAsizint: number of atoms in chain (len(akset))
self.aktupleAAsiz x AtomKeys: sorted akset AtomKeys
self.atomArrayIndex[AAsiz] of int: numerical index for each AtomKey in aktuple
self.atomArrayValidAAsiz x bool: atomArray coordinates current with internal coordinates if True
self.atomArrayAAsiz x np.float64[4]: homogeneous atom coordinates; Biopython Atom coordinates are view into this array after execution
rak_cachedict: lookup cache for AtomKeys for each residue

build_edraArrays() → None

Build chain level hedra and dihedra arrays.

Used by init_edra() and _hedraDict2chain(). Should be private method but exposed for documentation.

Inputs:

self.dihedraLenint: number of dihedra needed
self.hedraLenint: number of hedra needed
self.AAsizint: length of atomArray
self.hedraNdxdict: maps hedron keys to range(hedraLen)
self.dihedraNdxdict: maps dihedron keys to range(dihedraLen)
self.hedradict: maps Hedra keys to Hedra for chain
self.atomArrayAAsiz x np.float64[4]: homogeneous atom coordinates for chain
self.atomArrayIndexdict: maps AtomKeys to atomArray
self.atomArrayValidAAsiz x bool: indicates coord is up-to-date

Generates:

self.dCoordSpace[2][dihedraLen][4][4]: transforms to/from dihedron coordinate space
self.dcsValiddihedraLen x bool: indicates dCoordSpace is current
self.hAtomshedraLen x 3 x np.float64[4]: atom coordinates in hCoordSpace
self.hAtomsRhedraLen x 3 x np.float64[4]: hAtoms in reverse order (trading space for time)
self.hAtoms_needs_updatehedraLen x bool: indicates hAtoms, hAtoms current
self.a2h_mapAAsiz x [int …]: maps atomArrayIndex to hedraNdx’s with that atom
self.a2ha_map[hedraLen x 3]: AtomNdx’s in hedraNdx order
self.h2aahedraLen x [int …]: maps hedraNdx to atomNdx’s in hedron (reshaped later)
Hedron.ndxint: self.hedraNdx value stored inside Hedron object
self.dRevdihedraLen x bool: dihedron reversed if true
self.dH1ndx, dH2ndx[dihedraLen]: hedraNdx’s for 1st and 2nd hedra
self.h1d_maphedraLen x []: hedraNdx -> [dihedra using hedron]
Dihedron.h1key, h2key[AtomKey …]: hedron keys for dihedron, reversed as needed
Dihedron.hedron1, hedron2Hedron: references inside dihedron to hedra
Dihedron.ndxint: self.dihedraNdx info inside Dihedron object
Dihedron.cst, rcstnp.float64p4][4]: dCoordSpace references inside Dihedron
self.a2da_map[dihedraLen x 4]: AtomNdx’s in dihedraNdx order
self.d2a_map[dihedraLen x [4]]: AtomNdx’s for each dihedron (reshaped a2da_map)
self.dFwdbool: dihedron is not Reversed if True
self.a2d_mapAAsiz x [[dihedraNdx]: [atom ndx 0-3 of atom in dihedron]], maps atom indexes to dihedra and atoms in them
self.dAtoms_needs_updatedihedraLen x bool: atoms in h1, h2 are current if False

assemble_residues(verbose: bool = False) → None

Generate atom coords from internal coords (vectorised).

This is the ‘Numpy parallel’ version of assemble_residues_ser().

Starting with dihedra already formed by init_atom_coords(), transform each from dihedron local coordinate space into protein chain coordinate space. Iterate until all dependencies satisfied.

Does not update dCoordSpace as assemble_residues_ser() does. Call update_dCoordSpace() if needed. Faster to do in single operation once all atom coordinates finished.

Parameters:: verbose (bool) – default False. Report number of iterations to compute changed dihedra

generates:

self.dSet: AAsiz x dihedraLen x 4: maps atoms in dihedra to atomArray
self.dSetValid[dihedraLen][4] of bool: map of valid atoms into dihedra to detect 3 or 4 atoms valid

Output coordinates written to atomArray. Biopython Bio.PDB.Atom coordinates are a view on this data.

assemble_residues_ser(verbose: bool = False, start: int | None = None, fin: int | None = None) → None

Generate IC_Residue atom coords from internal coordinates (serial).

See assemble_residues() for ‘numpy parallel’ version.

Filter positions between start and fin if set, find appropriate start coordinates for each residue and pass to assemble()

Parameters:

verbose (bool) – default False. Describe runtime problems
start,fin (int) – default None. Sequence position for begin, end of subregion to generate coords for.

init_edra(verbose: bool = False) → None

Create chain and residue di/hedra structures, arrays, atomArray.

Inputs:

self.ordered_aa_ic_list : list of IC_Residue

Generates:

edra objects, self.di/hedra (executes _create_edra())
atomArray and support (executes build_atomArray())
self.hedraLen : number of hedra in structure
hedraL12 : numpy arrays for lengths, angles (empty)
hedraAngle ..
hedraL23 ..
self.hedraNdx : dict mapping hedrakeys to hedraL12 etc
self.dihedraLen : number of dihedra in structure
dihedraAngle ..
dihedraAngleRads : np arrays for angles (empty)
self.dihedraNdx : dict mapping dihedrakeys to dihedraAngle

init_atom_coords() → None

Set chain level di/hedra initial coords from angles and distances.

Initializes atom coordinates in local coordinate space for hedra and dihedra, will be transformed appropriately later by dCoordSpace matrices for assembly.

update_dCoordSpace(workSelector: ndarray | None = None) → None

Compute/update coordinate space transforms for chain dihedra.

Requires all atoms updated so calls assemble_residues() (returns immediately if all atoms already assembled).

Parameters:: workSelector ([bool]) – Optional mask to select dihedra for update

propagate_changes() → None: Track through di/hedra to invalidate dependent atoms.

internal_to_atom_coordinates(verbose: bool = False, start: int | None = None, fin: int | None = None) → None

Process IC data to Residue/Atom coords.

Parameters:

verbose (bool) – default False. Describe runtime problems
start,fin (int) – Optional sequence positions for begin, end of subregion to process.

Note

Setting start or fin activates serial assemble_residues_ser() instead of (Numpy parallel) assemble_residues(). Start C-alpha will be at origin.

See also

distance_to_internal_coordinates(), which requires the default all atom distance plot.

dihedral_signs() → ndarray

Get sign array (+1/-1) for each element of chain dihedraAngle array.

Required for distance_to_internal_coordinates()

distplot_to_dh_arrays(distplot: ndarray, dihedra_signs: ndarray) → None

Load di/hedra distance arrays from distplot.

Fill IC_Chain arrays hedraL12, L23, L13 and dihedraL14 distance value arrays from input distplot, dihedra_signs array from input dihedra_signs. Distplot and di/hedra distance arrays must index according to AtomKey mappings in IC_Chain .hedraNdx and .dihedraNdx (created in IC_Chain.init_edra())

Call atom_to_internal_coordinates() (or at least init_edra()) to generate a2ha_map and d2a_map before running this.

Explicitly removed from distance_to_internal_coordinates() so user may populate these chain di/hedra arrays by other methods.

distance_to_internal_coordinates(resetAtoms: bool | None = True) → None

Compute chain di/hedra from from distance and chirality data.

Distance properties on hedra L12, L23, L13 and dihedra L14 configured by distplot_to_dh_arrays() or alternative loader.

dihedraAngles result is multiplied by dihedra_signs at final step recover chirality information lost in distance plot (mirror image of structure has same distances but opposite sign dihedral angles).

Note that chain breaks will cause errors in rebuilt structure, use copy_initNCaCs() to avoid this

Based on Blue, the Hedronometer’s answer to The dihedral angles of a tetrahedron in terms of its edge lengths on math.stackexchange.com. See also: “Heron-like Hedronometric Results for Tetrahedral Volume”.

Other values from that analysis included here as comments for completeness:

oa = hedron1 L12 if reverse else hedron1 L23
ob = hedron1 L23 if reverse else hedron1 L12
ac = hedron2 L12 if reverse else hedron2 L23
ab = hedron1 L13 = law of cosines on OA, OB (hedron1 L12, L23)
oc = hedron2 L13 = law of cosines on OA, AC (hedron2 L12, L23)
bc = dihedron L14

target is OA, the dihedral angle along edge oa.

Parameters:: resetAtoms (bool) – default True. Mark all atoms in di/hedra and atomArray for updating by internal_to_atom_coordinates(). Alternatively set this to False and manipulate atomArrayValid, dAtoms_needs_update and hAtoms_needs_update directly to reduce computation.

copy_initNCaCs(other: IC_Chain) → None

Copy atom coordinates for initNCaC atoms from other IC_Chain.

Copies the coordinates and sets atomArrayValid flags True for initial NCaC and after any chain breaks.

Needed for distance_to_internal_coordinates() if target has chain breaks (otherwise each fragment will start at origin).

Also useful if copying internal coordinates from another chain.

N.B. IC_Residue.set_angle() and IC_Residue.set_length() invalidate their relevant atoms, so apply them before calling this function.

make_extended(): Set all psi and phi angles to extended conformation (123, -104).

__annotations__ = {'atomArray': <built-in function array>, 'atomArrayIndex': 'dict[AtomKey, int]', 'bpAtomArray': 'list[Atom]', 'ordered_aa_ic_list': 'list[IC_Residue]'}

class Bio.PDB.internal_coords.IC_Residue(parent: Residue)

Bases: object

Class to extend Biopython Residue with internal coordinate data.

Parameters:

parent: biopython Residue object this class extends

Attributes:

no_altloc: bool default False

Class variable, disable processing of ALTLOC atoms if True, use only selected atoms.

accept_atoms: tuple

Class variable accept_atoms, list of PDB atom names to use when generating internal coordinates. Default is:

accept_atoms = accept_mainchain + accept_hydrogens

to exclude hydrogens in internal coordinates and generated PDB files, override as:

IC_Residue.accept_atoms = IC_Residue.accept_mainchain

to get only mainchain atoms plus amide proton, use:

IC_Residue.accept_atoms = IC_Residue.accept_mainchain + ('H',)

to convert D atoms to H, set AtomKey.d2h = True and use:

IC_Residue.accept_atoms = (
    accept_mainchain + accept_hydrogens + accept_deuteriums
)

Note that accept_mainchain = accept_backbone + accept_sidechain. Thus to generate sequence-agnostic conformational data for e.g. structure alignment in dihedral angle space, use:

IC_Residue.accept_atoms = accept_backbone

or set gly_Cbeta = True and use:

IC_Residue.accept_atoms = accept_backbone + ('CB',)

Changing accept_atoms will cause the default structure_rebuild_test in ic_rebuild to fail if some atoms are filtered (obviously). Use the quick=True option to test only the coordinates of filtered atoms to avoid this.

There is currently no option to output internal coordinates with D instead of H.

accept_resnames: tuple

Class variable accept_resnames, list of 3-letter residue names for HETATMs to accept when generating internal coordinates from atoms. HETATM sidechain will be ignored, but normal backbone atoms (N, CA, C, O, CB) will be included. Currently only CYG, YCM and UNK; override at your own risk. To generate sidechain, add appropriate entries to ic_data_sidechains in ic_data and support in IC_Chain.atom_to_internal_coordinates().

gly_Cbeta: bool default False

Class variable gly_Cbeta, override to True to generate internal coordinates for glycine CB atoms in IC_Chain.atom_to_internal_coordinates()

IC_Residue.gly_Cbeta = True

pic_accuracy: str default “17.13f”

Class variable pic_accuracy sets accuracy for numeric values (angles, lengths) in .pic files. Default set high to support mmCIF file accuracy in rebuild tests. If you find rebuild tests fail with ‘ERROR -COORDINATES-’ and verbose=True shows only small discrepancies, try raising this value (or lower it to 9.5 if only working with PDB format files).

IC_Residue.pic_accuracy = "9.5f"

residue: Biopython Residue object reference

The Residue object this extends

hedra: dict indexed by 3-tuples of AtomKeys

Hedra forming this residue

dihedra: dict indexed by 4-tuples of AtomKeys

Dihedra forming (overlapping) this residue

rprev, rnext: lists of IC_Residue objects

References to adjacent (bonded, not missing, possibly disordered) residues in chain

atom_coords: AtomKey indexed dict of numpy [4] arrays

removed Use AtomKeys and atomArrayIndex to build if needed

ak_set: set of AtomKeys in dihedra

AtomKeys in all dihedra overlapping this residue (see __contains__())

alt_ids: list of char

AltLoc IDs from PDB file

bfactors: dict

AtomKey indexed B-factors as read from PDB file

NCaCKey: List of tuples of AtomKeys

List of tuples of N, Ca, C backbone atom AtomKeys; usually only 1 but more if backbone altlocs.

is20AA: bool

True if residue is one of 20 standard amino acids, based on Residue resname

isAccept: bool

True if is20AA or in accept_resnames below

rbase: tuple

residue position, insert code or none, resname (1 letter if standard amino acid)

cic: IC_Chain default None

parent chain IC_Chain object

scale: optional float

used for OpenSCAD output to generate gly_Cbeta bond length

Methods

assemble(atomCoordsIn, resetLocation, verbose)	Compute atom coordinates for this residue from internal coordinates
get_angle()	Return angle for passed key
get_length()	Return bond length for specified pair
pick_angle()	Find Hedron or Dihedron for passed key
pick_length()	Find hedra for passed AtomKey pair
set_angle()	Set angle for passed key (no position updates)
set_length()	Set bond length in all relevant hedra for specified pair
bond_rotate(delta)	adjusts related dihedra angles by delta, e.g. rotating psi (N-Ca-C-N) will adjust the adjacent N-Ca-C-O by the same amount to avoid clashes
bond_set(angle)	uses bond_rotate to set specified dihedral to angle and adjust related dihedra accordingly
rak(atom info)	cached AtomKeys for this residue

accept_resnames = ('CYG', 'YCM', 'UNK'): Add 3-letter residue name here for non-standard residues with normal backbone. CYG included for test case 4LGY (1305 residue contiguous chain). Safe to add more names for N-CA-C-O backbones, any more complexity will need additions to accept_atoms, ic_data_sidechains in ic_data and support in IC_Chain.atom_to_internal_coordinates()

no_altloc: bool = False: Set True to filter altloc atoms on input and only work with Biopython default Atoms

gly_Cbeta: bool = False

Create beta carbons on all Gly residues.

Setting this to True will generate internal coordinates for Gly C-beta carbons in atom_to_internal_coordinates().

Data averaged from Sep 2019 Dunbrack cullpdb_pc20_res2.2_R1.0 restricted to structures with amide protons. Please see

PISCES: A Protein Sequence Culling Server

‘G. Wang and R. L. Dunbrack, Jr. PISCES: a protein sequence culling server. Bioinformatics, 19:1589-1591, 2003.’

Ala avg rotation of OCCACB from NCACO query:

select avg(g.rslt) as avg_rslt, stddev(g.rslt) as sd_rslt, count(*)
from
(select f.d1d, f.d2d,
(case when f.rslt > 0 then f.rslt-360.0 else f.rslt end) as rslt
from (select d1.angle as d1d, d2.angle as d2d,
(d2.angle - d1.angle) as rslt from dihedron d1,
dihedron d2 where d1.re_class='AOACACAACB' and
d2.re_class='ANACAACAO' and d1.pdb=d2.pdb and d1.chn=d2.chn
and d1.res=d2.res) as f) as g

results:

| avg_rslt          | sd_rslt          | count   |
| -122.682194862932 | 5.04403040513919 | 14098   |

pic_accuracy: str = '17.13f'

accept_backbone = ('N', 'CA', 'C', 'O', 'OXT')

accept_sidechain = ('CB', 'CG', 'CG1', 'OG1', 'OG', 'SG', 'CG2', 'CD', 'CD1', 'SD', 'OD1', 'ND1', 'CD2', 'ND2', 'CE', 'CE1', 'NE', 'OE1', 'NE1', 'CE2', 'OE2', 'NE2', 'CE3', 'CZ', 'NZ', 'CZ2', 'CZ3', 'OD2', 'OH', 'CH2', 'NH1', 'NH2')

accept_mainchain = ('N', 'CA', 'C', 'O', 'OXT', 'CB', 'CG', 'CG1', 'OG1', 'OG', 'SG', 'CG2', 'CD', 'CD1', 'SD', 'OD1', 'ND1', 'CD2', 'ND2', 'CE', 'CE1', 'NE', 'OE1', 'NE1', 'CE2', 'OE2', 'NE2', 'CE3', 'CZ', 'NZ', 'CZ2', 'CZ3', 'OD2', 'OH', 'CH2', 'NH1', 'NH2')

accept_hydrogens = ('H', 'H1', 'H2', 'H3', 'HA', 'HA2', 'HA3', 'HB', 'HB1', 'HB2', 'HB3', 'HG2', 'HG3', 'HD2', 'HD3', 'HE2', 'HE3', 'HZ1', 'HZ2', 'HZ3', 'HG11', 'HG12', 'HG13', 'HG21', 'HG22', 'HG23', 'HZ', 'HD1', 'HE1', 'HD11', 'HD12', 'HD13', 'HG', 'HG1', 'HD21', 'HD22', 'HD23', 'NH1', 'NH2', 'HE', 'HH11', 'HH12', 'HH21', 'HH22', 'HE21', 'HE22', 'HE2', 'HH', 'HH2')

accept_deuteriums = ('D', 'D1', 'D2', 'D3', 'DA', 'DA2', 'DA3', 'DB', 'DB1', 'DB2', 'DB3', 'DG2', 'DG3', 'DD2', 'DD3', 'DE2', 'DE3', 'DZ1', 'DZ2', 'DZ3', 'DG11', 'DG12', 'DG13', 'DG21', 'DG22', 'DG23', 'DZ', 'DD1', 'DE1', 'DD11', 'DD12', 'DD13', 'DG', 'DG1', 'DD21', 'DD22', 'DD23', 'ND1', 'ND2', 'DE', 'DH11', 'DH12', 'DH21', 'DH22', 'DE21', 'DE22', 'DE2', 'DH', 'DH2')

accept_atoms = ('N', 'CA', 'C', 'O', 'OXT', 'CB', 'CG', 'CG1', 'OG1', 'OG', 'SG', 'CG2', 'CD', 'CD1', 'SD', 'OD1', 'ND1', 'CD2', 'ND2', 'CE', 'CE1', 'NE', 'OE1', 'NE1', 'CE2', 'OE2', 'NE2', 'CE3', 'CZ', 'NZ', 'CZ2', 'CZ3', 'OD2', 'OH', 'CH2', 'NH1', 'NH2', 'H', 'H1', 'H2', 'H3', 'HA', 'HA2', 'HA3', 'HB', 'HB1', 'HB2', 'HB3', 'HG2', 'HG3', 'HD2', 'HD3', 'HE2', 'HE3', 'HZ1', 'HZ2', 'HZ3', 'HG11', 'HG12', 'HG13', 'HG21', 'HG22', 'HG23', 'HZ', 'HD1', 'HE1', 'HD11', 'HD12', 'HD13', 'HG', 'HG1', 'HD21', 'HD22', 'HD23', 'NH1', 'NH2', 'HE', 'HH11', 'HH12', 'HH21', 'HH22', 'HE21', 'HE22', 'HE2', 'HH', 'HH2'): Change accept_atoms to restrict atoms processed. See IC_Residue for usage.

__init__(parent: Residue) → None

Initialize IC_Residue with parent Biopython Residue.

Parameters:: parent (Residue) – Biopython Residue object. The Biopython Residue this object extends

__deepcopy__(memo): Deep copy implementation for IC_Residue.

__contains__(ak: AtomKey) → bool: Return True if atomkey is in this residue.

rak(atm: str | Atom) → AtomKey: Cache calls to AtomKey for this residue.

__repr__() → str: Print string is parent Residue ID.

pretty_str() → str: Nice string for residue ID.

set_flexible() → None

For OpenSCAD, mark N-CA and CA-C bonds to be flexible joints.

See SCADIO.write_SCAD()

set_hbond() → None

For OpenSCAD, mark H-N and C-O bonds to be hbonds (magnets).

See SCADIO.write_SCAD()

clear_transforms()

Invalidate dihedra coordinate space attributes before assemble().

Coordinate space attributes are Dihedron.cst and .rcst, and IC_Chain.dCoordSpace

assemble(resetLocation: bool = False, verbose: bool = False) → dict[AtomKey, array] | dict[tuple[AtomKey, AtomKey, AtomKey], array] | None

Compute atom coordinates for this residue from internal coordinates.

This is the IC_Residue part of the assemble_residues_ser() serial version, see assemble_residues() for numpy vectorized approach which works at the IC_Chain level.

Join prepared dihedra starting from N-CA-C and N-CA-CB hedrons, computing protein space coordinates for backbone and sidechain atoms

Sets forward and reverse transforms on each Dihedron to convert from protein coordinates to dihedron space coordinates for first three atoms (see IC_Chain.dCoordSpace)

Call init_atom_coords() to update any modified di/hedra before coming here, this only assembles dihedra into protein coordinate space.

Algorithm

Form double-ended queue, start with c-ca-n, o-c-ca, n-ca-cb, n-ca-c.

if resetLocation=True, use initial coords from generating dihedron for n-ca-c initial positions (result in dihedron coordinate space)

while queue not empty

get 3-atom hedron key

for each dihedron starting with hedron key (1st hedron of dihedron)

if have coordinates for all 4 atoms already
add 2nd hedron key to back of queue

else if have coordinates for 1st 3 atoms
compute forward and reverse transforms to take 1st 3 atoms to/from dihedron initial coordinate space

use reverse transform to get position of 4th atom in current coordinates from dihedron initial coordinates

add 2nd hedron key to back of queue

else
ordering failed, put hedron key at back of queue and hope next time we have 1st 3 atom positions (should not happen)

loop terminates (queue drains) as hedron keys which do not start any dihedra are removed without action

Parameters:

resetLocation (bool) – default False. - Option to ignore start location and orient so initial N-Ca-C hedron at origin.

Returns:

Dict of AtomKey -> homogeneous atom coords for residue in protein space relative to previous residue

Also directly updates IC_Chain.atomArray as assemble_residues() does.

split_akl(lst: tuple[AtomKey, ...] | list[AtomKey], missingOK: bool = False) → list[tuple[AtomKey, ...]]

Get AtomKeys for this residue (ak_set) for generic list of AtomKeys.

Changes and/or expands a list of ‘generic’ AtomKeys (e.g. ‘N, C, C’) to be specific to this Residue’s altlocs etc., e.g. ‘(N-Ca_A_0.3-C, N-Ca_B_0.7-C)’

Given a list of AtomKeys for a Hedron or Dihedron,

return:

list of matching atomkeys that have id3_dh in this residue (ak may change if occupancy != 1.00)

or: multiple lists of matching atomkeys expanded for all atom altlocs
or: empty list if any of atom_coord(ak) missing and not missingOK

Parameters:

lst (list) – list[3] or [4] of AtomKeys. Non-altloc AtomKeys to match to specific AtomKeys for this residue
missingOK (bool) – default False, see above.

atom_sernum = None

atom_chain = None

pdb_residue_string() → str

Generate PDB ATOM records for this residue as string.

Convenience method for functionality not exposed in PDBIO.py. Increments IC_Residue.atom_sernum if not None

Parameters:

IC_Residue.atom_sernum – Class variable default None. Override and increment atom serial number if not None
IC_Residue.atom_chain – Class variable. Override atom chain id if not None

Todo

move to PDBIO

pic_flags = (1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 496, 525, 1021, 1024, 2048, 4096, 7168, 8192, 16384): Used by PICIO.write_PIC() to control classes of values to be defaulted.

picFlagsDefault = 31744: Default is all dihedra + initial tau atoms + bFactors.

picFlagsDict = {'all': 7168, 'bFactors': 16384, 'chi': 496, 'chi1': 16, 'chi2': 32, 'chi3': 64, 'chi4': 128, 'chi5': 256, 'classic': 1021, 'classic_b': 525, 'hedra': 1024, 'initAtoms': 8192, 'omg': 2, 'phi': 4, 'pomg': 512, 'primary': 2048, 'psi': 1, 'secondary': 4096, 'tau': 8}: Dictionary of pic_flags values to use as needed.

pick_angle(angle_key: tuple[AtomKey, AtomKey, AtomKey] | tuple[AtomKey, AtomKey, AtomKey, AtomKey] | str) → Hedron | Dihedron | None

Get Hedron or Dihedron for angle_key.

Parameters:

angle_key –

tuple of 3 or 4 AtomKeys
string of atom names (‘CA’) separated by :’s
string of [-1, 0, 1]<atom name> separated by ‘:’s. -1 is previous residue, 0 is this residue, 1 is next residue
psi, phi, omg, omega, chi1, chi2, chi3, chi4, chi5
tau (N-CA-C angle) see Richardson1981
tuples of AtomKeys is only access for alternate disordered atoms

Observe that a residue’s phi and omega dihedrals, as well as the hedra comprising them (including the N:Ca:C tau hedron), are stored in the n-1 di/hedra sets; this overlap is handled here, but may be an issue if accessing directly.

The following print commands are equivalent (except for sidechains with non-carbon atoms for chi2):

ric = r.internal_coord
print(
    r,
    ric.get_angle("psi"),
    ric.get_angle("phi"),
    ric.get_angle("omg"),
    ric.get_angle("tau"),
    ric.get_angle("chi2"),
)
print(
    r,
    ric.get_angle("N:CA:C:1N"),
    ric.get_angle("-1C:N:CA:C"),
    ric.get_angle("-1CA:-1C:N:CA"),
    ric.get_angle("N:CA:C"),
    ric.get_angle("CA:CB:CG:CD"),
)

See ic_data.py for detail of atoms in the enumerated sidechain angles and the backbone angles which do not span the peptide bond. Using ‘s’ for current residue (‘self’) and ‘n’ for next residue, the spanning (overlapping) angles are:

(sN, sCA, sC, nN)   # psi
(sCA, sC, nN, nCA)  # omega i+1
(sC, nN, nCA, nC)   # phi i+1
(sCA, sC, nN)
(sC, nN, nCA)
(nN, nCA, nC)       # tau i+1

Returns:: Matching Hedron, Dihedron, or None.

get_angle(angle_key: tuple[AtomKey, AtomKey, AtomKey] | tuple[AtomKey, AtomKey, AtomKey, AtomKey] | str) → float | None

Get dihedron or hedron angle for specified key.

See pick_angle() for key specifications.

set_angle(angle_key: tuple[AtomKey, AtomKey, AtomKey] | tuple[AtomKey, AtomKey, AtomKey, AtomKey] | str, v: float, overlap=True)

Set dihedron or hedron angle for specified key.

If angle is a Dihedron and overlap is True (default), overlapping dihedra are also changed as appropriate. The overlap is a result of protein chain definitions in ic_data and _create_edra() (e.g. psi overlaps N-CA-C-O).

The default overlap=True is probably what you want for: set_angle(“chi1”, val)

The default is probably NOT what you want when processing all dihedrals in a chain or residue (such as copying from another structure), as the overlapping dihedra will likely be in the set as well.

N.B. setting e.g. PRO chi2 is permitted without error or warning!

See pick_angle() for angle_key specifications. See bond_rotate() to change a dihedral by a number of degrees

Parameters:

angle_key – angle identifier.
v (float) – new angle in degrees (result adjusted to +/-180).
overlap (bool) – default True. Modify overlapping dihedra as needed

bond_rotate(angle_key: tuple[AtomKey, AtomKey, AtomKey] | tuple[AtomKey, AtomKey, AtomKey, AtomKey] | str, delta: float)

Rotate set of overlapping dihedrals by delta degrees.

Changes a dihedral angle by a given delta, i.e. new_angle = current_angle + delta Values are adjusted so new_angle will be within +/-180.

Changes overlapping dihedra as in set_angle()

See pick_angle() for key specifications.

bond_set(angle_key: tuple[AtomKey, AtomKey, AtomKey] | tuple[AtomKey, AtomKey, AtomKey, AtomKey] | str, val: float)

Set dihedron to val, update overlapping dihedra by same amount.

Redundant to set_angle(), retained for compatibility. Unlike set_angle() this is for dihedra only and no option to not update overlapping dihedra.

See pick_angle() for key specifications.

pick_length(ak_spec: str | tuple[AtomKey, AtomKey]) → tuple[list[Hedron] | None, tuple[AtomKey, AtomKey] | None]

Get list of hedra containing specified atom pair.

Parameters:

ak_spec –

tuple of two AtomKeys
string: two atom names separated by ‘:’, e.g. ‘N:CA’ with optional position specifier relative to self, e.g. ‘-1C:N’ for preceding peptide bond. Position specifiers are -1, 0, 1.

The following are equivalent:

ric = r.internal_coord
print(
    r,
    ric.get_length("0C:1N"),
)
print(
    r,
    None
    if not ric.rnext
    else ric.get_length((ric.rak("C"), ric.rnext[0].rak("N"))),
)

If atom not found on current residue then will look on rprev[0] to handle cases like Gly N:CA. For finer control please access IC_Chain.hedra directly.

Returns:: list of hedra containing specified atom pair as tuples of AtomKeys

get_length(ak_spec: str | tuple[AtomKey, AtomKey]) → float | None

Get bond length for specified atom pair.

See pick_length() for ak_spec and details.

set_length(ak_spec: str | tuple[AtomKey, AtomKey], val: float) → None

Set bond length for specified atom pair.

See pick_length() for ak_spec.

applyMtx(mtx: array) → None: Apply matrix to atom_coords for this IC_Residue.

__annotations__ = {'_AllBonds': <class 'bool'>, 'ak_set': 'set[AtomKey]', 'akc': 'dict[str | Atom, AtomKey]', 'alt_ids': 'list[str] | None', 'bfactors': 'dict[str, float]', 'cic': 'IC_Chain', 'dihedra': 'dict[DKT, Dihedron]', 'gly_Cbeta': <class 'bool'>, 'hedra': 'dict[HKT, Hedron]', 'no_altloc': <class 'bool'>, 'pic_accuracy': <class 'str'>, 'rnext': 'list[IC_Residue]', 'rprev': 'list[IC_Residue]'}

class Bio.PDB.internal_coords.Edron(*args: list[AtomKey] | tuple[AtomKey, AtomKey, AtomKey] | tuple[AtomKey, AtomKey, AtomKey, AtomKey], **kwargs: str)

Bases: object

Base class for Hedron and Dihedron classes.

Supports rich comparison based on lists of AtomKeys.

Attributes:

atomkeys: tuple: 3 (hedron) or 4 (dihedron) AtomKey s defining this di/hedron
id: str: ‘:’-joined string of AtomKeys for this di/hedron
needs_update: bool: indicates di/hedron local atom_coords do NOT reflect current di/hedron angle and length values in hedron local coordinate space
e_class: str: sequence of atoms (no position or residue) comprising di/hedron for statistics
re_class: str: sequence of residue, atoms comprising di/hedron for statistics
cre_class: str: sequence of covalent radii classes comprising di/hedron for statistics
edron_re: compiled regex (Class Attribute): A compiled regular expression matching string IDs for Hedron and Dihedron objects
cic: IC_Chain reference: Chain internal coords object containing this hedron
ndx: int: index into IC_Chain level numpy data arrays for di/hedra. Set in IC_Chain.init_edra()
rc: int: number of residues involved in this edron

Methods

gen_key([AtomKey, …] or AtomKey, …) (Static Method)	generate a ‘:’-joined string of AtomKey Ids
is_backbone()	Return True if all atomkeys atoms are N, Ca, C or O

edron_re = re.compile('^(?P<pdbid>\\w+)?\\s(?P<chn>[\\w|\\s])?\\s(?P<a1>[\\w\\-\\.]+):(?P<a2>[\\w\\-\\.]+):(?P<a3>[\\w\\-\\.]+)(:(?P<a4>[\\w\\-\\.]+))?\\s+(((?P<len12>\\S+)\\s+(?P<angle>\\S+)\\s+(?P<len23>\\S+)\\s*$)|((?P<): A compiled regular expression matching string IDs for Hedron and Dihedron objects

static gen_key(lst: list[AtomKey]) → str

Generate string of ‘:’-joined AtomKey strings from input.

Generate ‘2_A_C:3_P_N:3_P_CA’ from (2_A_C, 3_P_N, 3_P_CA) :param list lst: list of AtomKey objects

static gen_tuple(akstr: str) → tuple

Generate AtomKey tuple for ‘:’-joined AtomKey string.

Generate (2_A_C, 3_P_N, 3_P_CA) from ‘2_A_C:3_P_N:3_P_CA’ :param str akstr: string of ‘:’-separated AtomKey strings

__init__(*args: list[AtomKey] | tuple[AtomKey, AtomKey, AtomKey] | tuple[AtomKey, AtomKey, AtomKey, AtomKey], **kwargs: str) → None

Initialize Edron with sequence of AtomKeys.

Acceptable input:

[ AtomKey, … ] : list of AtomKeys AtomKey, … : sequence of AtomKeys as args {‘a1’: str, ‘a2’: str, … } : dict of AtomKeys as ‘a1’, ‘a2’ …

__deepcopy__(memo): Deep copy implementation for Edron.

__contains__(ak: AtomKey) → bool: Return True if atomkey is in this edron.

is_backbone() → bool: Report True for contains only N, C, CA, O, H atoms.

__repr__() → str: Tuple of AtomKeys is default repr string.

__hash__() → int: Hash calculated at init from atomkeys tuple.

__eq__(other: object) → bool: Test for equality.

__ne__(other: object) → bool: Test for inequality.

__gt__(other: object) → bool: Test greater than.

__ge__(other: object) → bool: Test greater or equal.

__lt__(other: object) → bool: Test less than.

__le__(other: object) → bool: Test less or equal.

class Bio.PDB.internal_coords.Hedron(*args: list[AtomKey] | tuple[AtomKey, AtomKey, AtomKey], **kwargs: str)

Bases: Edron

Class to represent three joined atoms forming a plane.

Contains atom coordinates in local coordinate space: central atom at origin, one terminal atom on XZ plane, and the other on the +Z axis. Stored in two orientations, with the 3rd (forward) or first (reversed) atom on the +Z axis. See Dihedron for use of forward and reverse orientations.

Attributes:

len12: float: distance between first and second atoms
len23: float: distance between second and third atoms
angle: float: angle (degrees) formed by three atoms in hedron
xrh_class: string: only for hedron spanning 2 residues, will have ‘X’ for residue contributing only one atom

Methods

get_length()	get bond length for specified atom pair
set_length()	set bond length for specified atom pair
angle(), len12(), len23()	setters for relevant attributes (angle in degrees)

__init__(*args: list[AtomKey] | tuple[AtomKey, AtomKey, AtomKey], **kwargs: str) → None

Initialize Hedron with sequence of AtomKeys, kwargs.

Acceptable input:: As for Edron, plus optional ‘len12’, ‘angle’, ‘len23’ keyworded values.

__repr__() → str: Print string for Hedron object.

property angle: float: Get this hedron angle.

property len12: Get first length for Hedron.

property len23: float: Get second length for Hedron.

get_length(ak_tpl: tuple[AtomKey, AtomKey]) → float | None

Get bond length for specified atom pair.

Parameters:: ak_tpl (tuple) – tuple of AtomKeys. Pair of atoms in this Hedron

set_length(ak_tpl: tuple[AtomKey, AtomKey], newLength: float)

Set bond length for specified atom pair; sets needs_update.

Parameters:: .ak_tpl (tuple) – tuple of AtomKeys Pair of atoms in this Hedron

__annotations__ = {}

class Bio.PDB.internal_coords.Dihedron(*args: list[AtomKey] | tuple[AtomKey, AtomKey, AtomKey, AtomKey], **kwargs: str)

Bases: Edron

Class to represent four joined atoms forming a dihedral angle.

Attributes:

angle: float: Measurement or specification of dihedral angle in degrees; prefer IC_Residue.bond_set() to set
hedron1, hedron2: Hedron object references: The two hedra which form the dihedral angle
h1key, h2key: tuples of AtomKeys: Hash keys for hedron1 and hedron2
id3,id32: tuples of AtomKeys: First 3 and second 3 atoms comprising dihedron; hxkey orders may differ
ric: IC_Residue object reference: IC_Residue object containing this dihedral
reverse: bool: Indicates order of atoms in dihedron is reversed from order of atoms in hedra
primary: bool: True if this is psi, phi, omega or a sidechain chi angle
pclass: string (primary angle class): re_class with X for adjacent residue according to nomenclature (psi, omega, phi)
cst, rcst: numpy [4][4] arrays: transformations to (cst) and from (rcst) Dihedron coordinate space defined with atom 2 (Hedron 1 center atom) at the origin. Views on IC_Chain.dCoordSpace.

Methods

angle()	getter/setter for dihdral angle in degrees; prefer `IC_Residue.bond_set()`
bits()	return `IC_Residue.pic_flags` bitmask for dihedron psi, omega, etc

__init__(*args: list[AtomKey] | tuple[AtomKey, AtomKey, AtomKey, AtomKey], **kwargs: str) → None

Init Dihedron with sequence of AtomKeys and optional dihedral angle.

Acceptable input:: As for Edron, plus optional ‘dihedral’ keyworded angle value.

__repr__() → str: Print string for Dihedron object.

property angle: float: Get dihedral angle.

static angle_dif(a1: float | ndarray, a2: float | ndarray)

Get angle difference between two +/- 180 angles.

https://stackoverflow.com/a/36001014/2783487

static angle_avg(alst: list, in_rads: bool = False, out_rads: bool = False)

Get average of list of +/-180 angles.

Parameters:

alst (List) – list of angles to average
in_rads (bool) – input values are in radians
out_rads (bool) – report result in radians

static angle_pop_sd(alst: list, avg: float)

Get population standard deviation for list of +/-180 angles.

should be sample std dev but avoid len(alst)=1 -> div by 0

difference(other: Dihedron) → float: Get angle difference between this and other +/- 180 angles.

bits() → int: Get IC_Residue.pic_flags bitmasks for self is psi, omg, phi, pomg, chiX.

__annotations__ = {}

class Bio.PDB.internal_coords.AtomKey(*args: IC_Residue | Atom | list | dict | str, **kwargs: str)

Bases: object

Class for dict keys to reference atom coordinates.

AtomKeys capture residue and disorder information together, and provide a no-whitespace string key for .pic files.

Supports rich comparison and multiple ways to instantiate.

AtomKeys contain:: residue position (respos), insertion code (icode), 1 or 3 char residue name (resname), atom name (atm), altloc (altloc), and occupancy (occ)

Use AtomKey.fields to get the index to the component of interest by name:

Get C-alpha atoms from IC_Chain atomArray and atomArrayIndex with AtomKeys:

atmNameNdx = internal_coords.AtomKey.fields.atm
CaSelection = [
    atomArrayIndex.get(k)
    for k in atomArrayIndex.keys()
    if k.akl[atmNameNdx] == "CA"
]
AtomArrayCa = atomArray[CaSelection]

Get all phenylalanine atoms in a chain:

resNameNdx = internal_coords.AtomKey.fields.resname
PheSelection = [
    atomArrayIndex.get(k)
    for k in atomArrayIndex.keys()
    if k.akl[resNameNdx] == "F"
]
AtomArrayPhe = atomArray[PheSelection]

‘resname’ will be the uppercase 1-letter amino acid code if one of the 20 standard residues, otherwise the supplied 3-letter code. Supplied as input or read from .rbase attribute of IC_Residue.

Attributes:

akl: tuple: All six fields of AtomKey
fieldNames: tuple (Class Attribute): Mapping of key index positions to names
fields: namedtuple (Class Attribute): Mapping of field names to index positions.
id: str: ‘_’-joined AtomKey fields, excluding ‘None’ fields
atom_re: compiled regex (Class Attribute): A compiled regular expression matching the string form of the key
d2h: bool (Class Attribute) default False: Convert D atoms to H on input if True; must also modify IC_Residue.accept_atoms
missing: bool default False: AtomKey __init__’d from string is probably missing, set this flag to note the issue. Set by IC_Residue.rak()
ric: IC_Residue default None: If initialised with IC_Residue, this references the IC_residue

Methods

altloc_match(other)	Returns True if this AtomKey matches other AtomKey excluding altloc and occupancy fields
is_backbone()	Returns True if atom is N, CA, C, O or H
atm()	Returns atom name, e.g. N, CA, CB, etc.
cr_class()	Returns covalent radii class e.g. Csb

atom_re = re.compile('^(?P<respos>-?\\d+)(?P<icode>[A-Za-z])?_(?P<resname>[a-zA-Z]+)_(?P<atm>[A-Za-z0-9]+)(?:_(?P<altloc>\\w))?(?:_(?P<occ>-?\\d\\.\\d+?))?$'): Pre-compiled regular expression to match an AtomKey string.

fieldNames = ('respos', 'icode', 'resname', 'atm', 'altloc', 'occ')

fields = (0, 1, 2, 3, 4, 5): Use this namedtuple to access AtomKey fields. See AtomKey

d2h = False: Set True to convert D Deuterium to H Hydrogen on input.

__init__(*args: IC_Residue | Atom | list | dict | str, **kwargs: str) → None

Initialize AtomKey with residue and atom data.

Examples of acceptable input:

(<IC_Residue>, 'CA', ...)    : IC_Residue with atom info
(<IC_Residue>, <Atom>)       : IC_Residue with Biopython Atom
([52, None, 'G', 'CA', ...])  : list of ordered data fields
(52, None, 'G', 'CA', ...)    : multiple ordered arguments
({respos: 52, icode: None, atm: 'CA', ...}) : dict with fieldNames
(respos: 52, icode: None, atm: 'CA', ...) : kwargs with fieldNames
52_G_CA, 52B_G_CA, 52_G_CA_0.33, 52_G_CA_B_0.33  : id strings

__deepcopy__(memo): Deep copy implementation for AtomKey.

__repr__() → str: Repr string from id.

__hash__() → int: Hash calculated at init from akl tuple.

altloc_match(other: AtomKey) → bool: Test AtomKey match to other discounting occupancy and altloc.

is_backbone() → bool: Return True if is N, C, CA, O, or H.

atm() → str: Return atom name : N, CA, CB, O etc.

cr_class() → str | None: Return covalent radii class for atom or None.

__ne__(other: object) → bool: Test for inequality.

__eq__(other: object) → bool: Test for equality.

__gt__(other: object) → bool: Test greater than.

__ge__(other: object) → bool: Test greater or equal.

__lt__(other: object) → bool: Test less than.

__le__(other: object) → bool: Test less or equal.

Bio.PDB.internal_coords.set_accuracy_95(num: float) → float

Reduce floating point accuracy to 9.5 (xxxx.xxxxx).

Used by IC_Residue class writing PIC and SCAD files.

Parameters:: num (float) – input number
Returns:: float with specified accuracy

exception Bio.PDB.internal_coords.HedronMatchError

Bases: Exception

Cannot find hedron in residue for given key.

exception Bio.PDB.internal_coords.MissingAtomError

Bases: Exception

Missing atom coordinates for hedron or dihedron.