Bio.PDB.internal_coords module

Classes to support internal coordinates for protein structures.

Internal coordinates comprise Psi, Omega and Phi dihedral angles along the protein backbone, Chi angles along the sidechains, and all 3-atom angles and bond lengths defining a protein chain. These routines can compute internal coordinates from atom XYZ coordinates, and compute atom XYZ coordinates from internal coordinates.

Secondary benefits include the ability to align and compare residue environments in 3D structures, support for 2D atom distance plots, converting a distance plot plus chirality information to a structure, generating an OpenSCAD description of a structure for 3D printing, and reading/writing structures as internal coordinate data files.

Usage:

from Bio.PDB.PDBParser import PDBParser
from Bio.PDB.Chain import Chain
from Bio.PDB.internal_coords import *
from Bio.PDB.PICIO import write_PIC, read_PIC, read_PIC_seq
from Bio.PDB.ic_rebuild import write_PDB, IC_duplicate, structure_rebuild_test
from Bio.PDB.SCADIO import write_SCAD
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord
from Bio.PDB.PDBIO import PDBIO
import numpy as np

# load a structure as normal, get first chain
parser = PDBParser()
myProtein = parser.get_structure("7rsa", "pdb7rsa.ent")
myChain = myProtein[0]["A"]

# compute bond lengths, angles, dihedral angles
myChain.atom_to_internal_coordinates(verbose=True)

# check myChain makes sense (can get angles and rebuild same structure)
resultDict = structure_rebuild_test(myChain)
assert resultDict['pass'] == True

# get residue 1 chi2 angle
r1 = next(myChain.get_residues())
r1chi2 = r1.internal_coord.get_angle("chi2")

# rotate residue 1 chi2 angle by 120 degrees (loops w/in +/-180)
r1.internal_coord.set_angle("chi2", r1chi2 + 120.0)
# or
r1.internal_coord.bond_rotate("chi2", 120.0)
# update myChain XYZ coordinates with chi2 changed
myChain.internal_to_atom_coordinates()
# write new conformation with PDBIO
write_PDB(myProtein, "myChain.pdb")
# or just the ATOM records without headers:
io = PDBIO()
io.set_structure(myProtein)
io.save("myChain2.pdb")

# write chain as 'protein internal coordinates' (.pic) file
write_PIC(myProtein, "myChain.pic")
# read .pic file
myProtein2 = read_PIC("myChain.pic")

# create default structure for random sequence by reading as .pic file
myProtein3 = read_PIC_seq(
    SeqRecord(
        Seq("GAVLIMFPSTCNQYWDEHKR"),
        id="1RND",
        description="my random sequence",
    )
)
myProtein3.internal_to_atom_coordinates()
write_PDB(myProtein3, "myRandom.pdb")

# access the all-dihedrals array for the chain, e.g. residue 1 chi2 angle:
r1chi2_obj = r1.internal_coord.pick_angle("chi2")
# or same thing: r1chi2_obj = r1.internal_coord.pick_angle("CA:CB:CG:CD")
r1chi2_key = r1chi2_obj.atomkeys
# r1chi2_key is tuple of AtomKeys (1_K_CA, 1_K_CB, 1_K_CG, 1_K_CD)
r1chi2_index = myChain.internal_coord.dihedraNdx[r1chi2_key]
# or same thing: r1chi2_index = r1chi2_obj.ndx
r1chi2_value = myChain.internal_coord.dihedraAngle[r1chi2_index]
# also true: r1chi2_obj == myChain.internal_coord.dihedra[r1chi2_index]

# access the array of all atoms for the chain, e.g. residue 1 C-beta
r1_cBeta_index = myChain.internal_coord.atomArrayIndex[AtomKey("1_K_CB")]
r1_cBeta_coords = myChain.internal_coord.atomArray[r1_cBeta_index]
# r1_cBeta_coords = [ x, y, z, 1.0 ]

# the Biopython Atom coord array is now a view into atomArray, so
assert r1_cBeta_coords[1] == r1["CB"].coord[1]
r1_cBeta_coords[1] += 1.0  # change the Y coord 1 angstrom
assert r1_cBeta_coords[1] == r1["CB"].coord[1]
# they are always the same (they share the same memory)
r1_cBeta_coords[1] -= 1.0  # restore

# create a selector to filter just the C-alpha atoms from the all atom array
atmNameNdx = AtomKey.fields.atm
atomArrayIndex = myChain.internal_coord.atomArrayIndex
CaSelect = [
    atomArrayIndex.get(k) for k in atomArrayIndex.keys() if k.akl[atmNameNdx] == "CA"
]
# now the ordered array of C-alpha atom coordinates is:
CA_coords = myChain.internal_coord.atomArray[CaSelect]
# note this uses Numpy fancy indexing, so CA_coords is a new copy

# create a C-alpha distance plot
caDistances = myChain.internal_coord.distance_plot(CaSelect)
# display with e.g. MatPlotLib:
# import matplotlib.pyplot as plt
# plt.imshow(caDistances, cmap="hot", interpolation="nearest")
# plt.show()

# build structure from distance plot:
## create the all-atom distance plot
distances = myChain.internal_coord.distance_plot()
## get the sign of the dihedral angles
chirality = myChain.internal_coord.dihedral_signs()
## get new, empty data structure : copy data structure from myChain
myChain2 = IC_duplicate(myChain)[0]["A"]
cic2 = myChain2.internal_coord
## clear the new atomArray and di/hedra value arrays, just for proof
cic2.atomArray = np.zeros((cic2.AAsiz, 4), dtype=np.float64)
cic2.dihedraAngle[:] = 0.0
cic2.hedraAngle[:] = 0.0
cic2.hedraL12[:] = 0.0
cic2.hedraL23[:] = 0.0
## copy just the first N-Ca-C coords so structures will superimpose:
cic2.copy_initNCaCs(myChain.internal_coord)
## copy distances to chain arrays:
cic2.distplot_to_dh_arrays(distances, chirality)
## compute angles and dihedral angles from distances:
cic2.distance_to_internal_coordinates()
## generate XYZ coordinates from internal coordinates:
myChain2.internal_to_atom_coordinates()
## confirm result atomArray matches original structure:
assert np.allclose(cic2.atomArray, myChain.internal_coord.atomArray)

# superimpose all phe-phe pairs - quick hack just to demonstrate concept
# for analyzing pairwise residue interactions.  Generates PDB ATOM records
# placing each PHE at origin and showing all other PHEs in environment
## shorthand for key variables:
cic = myChain.internal_coord
resNameNdx = AtomKey.fields.resname
aaNdx = cic.atomArrayIndex
## select just PHE atoms:
pheAtomSelect = [aaNdx.get(k) for k in aaNdx.keys() if k.akl[resNameNdx] == "F"]
aaF = cic.atomArray[ pheAtomSelect ]  # numpy fancy indexing makes COPY not view

for ric in cic.ordered_aa_ic_list:  # internal_coords version of get_residues()
    if ric.rbase[2] == "F":  # if PHE, get transform matrices for chi1 dihedral
        chi1 = ric.pick_angle("N:CA:CB:CG")  # chi1 space has C-alpha at origin
        cst = np.transpose(chi1.cst)  # transform TO chi1 space
        # rcst = np.transpose(chi1.rcst)  # transform FROM chi1 space
        cic.atomArray[pheAtomSelect] = aaF.dot(cst)  # transform just the PHEs
        for res in myChain.get_residues():  # print PHEs in new coordinate space
            if res.resname in ["PHE"]:
                print(res.internal_coord.pdb_residue_string())
        cic.atomArray[pheAtomSelect] = aaF  # restore coordinate space from copy

# write OpenSCAD program of spheres and cylinders to 3d print myChain backbone
## set atom load filter to accept backbone only:
IC_Residue.accept_atoms = IC_Residue.accept_backbone
## delete existing data to force re-read of all atoms:
myChain.internal_coord = None
write_SCAD(myChain, "myChain.scad", scale=10.0)

See the ‘’Internal coordinates module’’ section of the Biopython Tutorial and Cookbook for further discussion.

Terms and key data structures: Internal coordinates are defined on sequences of atoms which span residues or follow accepted nomenclature along sidechains. To manage these sequences and support Biopython’s disorder mechanisms, AtomKey specifiers are implemented to capture residue, atom and variant identification in a single object. A Hedron object is specified as three sequential AtomKeys, comprising two bond lengths and the bond angle between them. A Dihedron consists of four sequential AtomKeys, linking two Hedra with a dihedral angle between them.

Algorithmic overview: The Internal Coordinates module combines a specification of connected atoms as hedra and dihedra in the ic_data file with routines here to transform XYZ coordinates of these atom sets between a local coordinate system and the world coordinates supplied in e.g. a PDB or mmCif data file. The local coordinate system places the center atom of a hedron at the origin (0,0,0), one leg on the +Z axis, and the other leg on the XZ plane (see Hedron). Measurement and creation or manipulation of hedra and dihedra in the local coordinate space is straightforward, and the calculated transformation matrices enable assembling these subunits into a protein chain starting from supplied (PDB) coordinates for the initial N-Ca-C atoms.

Psi and Phi angles are defined on atoms from adjacent residues in a protein chain, see e.g. pick_angle() and ic_data for the relevant mapping between residues and backbone dihedral angles.

Transforms to and from the dihedron local coordinate space described above are accessible via IC_Chain.dCoordSpace and Dihedron attributes .cst and .rcst, and may be applied in the alignment and comparison of residues and their environments with code along the lines of:

chi1 = ric0.pick_angle("chi1") # chi1 space defined with CA at origin
cst = np.transpose(chi1.cst) # transform TO chi1 local space
newAtomCoords = oldAtomCoords.dot(cst)

The core algorithms were developed independently during 1993-4 for ‘’Development and Application of a Three-dimensional Description of Amino Acid Environments in Protein,’’ Miller, Douthart, and Dunker, Advances in Molecular Bioinformatics, IOS Press, 1994, ISBN 90 5199 172 x, pp. 9-30.

A Protein Internal Coordinate (.pic) file format is defined to capture sufficient detail to reproduce a PDB file from chain starting coordinates (first residue N, Ca, C XYZ coordinates) and remaining internal coordinates. These files are used internally to verify that a given structure can be regenerated from its internal coordinates. See PICIO for reading and writing .pic files and structure_rebuild_test() to determine if a specific PDB or mmCif datafile has sufficient information to interconvert between cartesian and internal coordinates.

Internal coordinates may also be exported as OpenSCAD data arrays for generating 3D printed protein models. OpenSCAD software is provided as a starting point and proof-of-concept for generating such models. See SCADIO and this Thingiverse project for a more advanced example.

Refer to distance_plot() and distance_to_internal_coordinates() for converting structure data to/from 2D distance plots.

The following classes comprise the core functionality for processing internal coordinates and are sufficiently related and coupled to place them together in this module:

IC_Chain: Extends Biopython Chain on .internal_coord attribute.

Manages connected sequence of residues and chain breaks; holds numpy arrays for all atom coordinates and bond geometries. For ‘parallel’ processing IC_Chain methods operate on these arrays with single numpy commands.

IC_Residue: Extends Biopython Residue on .internal_coord attribute.

Access for per residue views on internal coordinates and methods for serial (residue by residue) assembly.

Dihedron: four joined atoms forming a dihedral angle.

Dihedral angle, homogeneous atom coordinates in local coordinate space, references to relevant Hedra and IC_Residue. Getter methods for residue dihedral angles, bond angles and bond lengths.

Hedron: three joined atoms forming a plane.

Contains homogeneous atom coordinates in local coordinate space as well as bond lengths and angle between them.

Edron: base class for Hedron and Dihedron classes.

Tuple of AtomKeys comprising child, string ID, mainchain membership boolean and other routines common for both Hedra and Dihedra. Implements rich comparison.

AtomKey: keys (dictionary and string) for referencing atom sequences.

Capture residue and disorder/occupancy information, provides a no-whitespace key for .pic files, and implements rich comparison.

Custom exception classes: HedronMatchError and MissingAtomError

class Bio.PDB.internal_coords.IC_Chain(parent, verbose: bool = False)

Bases: object

Class to extend Biopython Chain with internal coordinate data.

Attributes:
chain: object reference

The Biopython Bio.PDB.Chain.Chain object this extends

MaxPeptideBond: float

Class attribute to detect chain breaks. Override for fully contiguous chains with some very long bonds - e.g. for 3D printing (OpenSCAD output) a structure with missing residues. MaxPeptideBond

ParallelAssembleResidues: bool

Class attribute affecting internal_to_atom_coords. Short (50 residue and less) chains are faster to assemble without the overhead of creating numpy arrays, and the algorithm is easier to understand and trace processing a single residue at a time. Clearing (set to False) this flag will switch to the serial algorithm

ordered_aa_ic_list: list

IC_Residue objects internal_coords algorithms can process (e.g. no waters)

initNCaC: List of N, Ca, C AtomKey tuples (NCaCKeys).

NCaCKeys start chain segments (first residue or after chain break). These 3 atoms define the coordinate space for a contiguous chain segment, as initially specified by PDB or mmCIF file.

AAsiz = int

AtomArray size, number of atoms in this chain

atomArray: numpy array

homogeneous atom coords ([x,, y, z, 1.0]) for every atom in chain

atomArrayIndex: dict

maps AtomKeys to atomArray indexes

hedra: dict

Hedra forming residues in this chain; indexed by 3-tuples of AtomKeys.

hedraLen: int

length of hedra dict

hedraNdx: dict

maps hedra AtomKeys to numeric index into hedra data arrays e.g. hedraL12 below

a2ha_map: [hedraLen x 3]

atom indexes in hedraNdx order

dihedra: dict

Dihedra forming residues in this chain; indexed by 4-tuples of AtomKeys.

dihedraLen: int

length of dihedra dict

dihedraNdx: dict

maps dihedra AtomKeys to dihedra data arrays e.g. dihedraAngle

a2da_map[dihedraLen x 4]

AtomNdx’s in dihedraNdx order

d2a_map[dihedraLen x [4]]

AtomNdx’s for each dihedron (reshaped a2da_map)

Numpy arrays for vector processing of chain di/hedra:
hedraL12: numpy array

bond length between hedron 1st and 2nd atom

hedraAngle: numpy array

bond angle for each hedron, in degrees

hedraL23: numpy array

bond length between hedron 2nd and 3rd atom

id3_dh_index: dict

maps hedron key to list of dihedra starting with hedron, used by assemble and bond_rotate to find dihedra with h1 key

id32_dh_index: dict

like id3_dh_index, find dihedra from h2 key

hAtoms: numpy array

homogeneous atom coordinates (3x4) of hedra, central atom at origin

hAtomsR: numpy array

hAtoms in reverse orientation

hAtoms_needs_update: numpy array of bool

indicates whether hAtoms represent hedraL12/A/L23

dihedraAngle: numpy array

dihedral angles (degrees) for each dihedron

dAtoms: numpy array

homogeneous atom coordinates (4x4) of dihedra, second atom at origin

dAtoms_needs_update: numpy array of bool

indicates whether dAtoms represent dihedraAngle

dCoordSpace: numpy array

forward and reverse transform matrices standardising positions of first hedron. See dCoordSpace.

dcsValid: bool

indicates dCoordSpace up to date

See also attributes generated by :meth:`build_edraArrays` for indexing
di/hedra data elements.

Methods

internal_to_atom_coordinates:

Process ic data to Residue/Atom coordinates; calls assemble_residues()

assemble_residues:

Generate IC_Chain atom coords from internal coordinates (parallel)

assemble_residues_ser:

Generate IC_Residue atom coords from internal coordinates (serial)

atom_to_internal_coordinates:

Calculate dihedrals, angles, bond lengths (internal coordinates) for Atom data

write_SCAD:

Write OpenSCAD matrices for internal coordinate data comprising chain; this is a support routine, see SCADIO.write_SCAD() to generate OpenSCAD description of a protein chain.

distance_plot:

Generate 2D plot of interatomic distances with optional filter

distance_to_internal_coordinates:

Compute internal coordinates from distance plot and array of dihedral angle signs.

make_extended:

Arbitrarily sets all psi and phi backbone angles to 123 and -104 degrees.

MaxPeptideBond = 1.4

Larger C-N distance than this will be chain break

ParallelAssembleResidues = True

Enable parallel internal_to_atom algorithm, is slower for short chains

AAsiz = 0

Number of atoms in this chain (size of atomArray)

atomArray: array = None

AAsiz x [4] of float np.float64 homogeneous atom coordinates, all atoms in chain.

dCoordSpace = None

[2][dihedraLen][4][4] : 2 arrays of 4x4 coordinate space transforms for each dihedron. The first [0] converts TO standard space with first atom on the XZ plane, the second atom at the origin, the third on the +Z axis, and the fourth placed according to the dihedral angle. The second [1] transform returns FROM the standard space to world coordinates (PDB file input or whatever is current). Also accessible as .cst (forward transform) and .rcst (reverse transform) in Dihedron.

dcsValid = None

True if dCoordSpace is up to date. Use update_dCoordSpace() if needed.

__init__(parent, verbose: bool = False) None

Initialize IC_Chain object, with or without residue/Atom data.

Parameters:

parent (Bio.PDB.Chain) – Biopython Chain object Chain object this extends

__deepcopy__(memo) IC_Chain

Implement deepcopy for IC_Chain.

clear_ic()

Clear residue internal_coord settings for this chain.

build_atomArray() None

Build IC_Chain numpy coordinate array from biopython atoms.

See also init_edra() for more complete initialization of IC_Chain.

Inputs:
self.aksetset

AtomKey s in this chain

Generates:
self.AAsizint

number of atoms in chain (len(akset))

self.aktupleAAsiz x AtomKeys

sorted akset AtomKeys

self.atomArrayIndex[AAsiz] of int

numerical index for each AtomKey in aktuple

self.atomArrayValidAAsiz x bool

atomArray coordinates current with internal coordinates if True

self.atomArrayAAsiz x np.float64[4]

homogeneous atom coordinates; Biopython Atom coordinates are view into this array after execution

rak_cachedict

lookup cache for AtomKeys for each residue

build_edraArrays() None

Build chain level hedra and dihedra arrays.

Used by init_edra() and _hedraDict2chain(). Should be private method but exposed for documentation.

Inputs:
self.dihedraLenint

number of dihedra needed

self.hedraLenint

number of hedra needed

self.AAsizint

length of atomArray

self.hedraNdxdict

maps hedron keys to range(hedraLen)

self.dihedraNdxdict

maps dihedron keys to range(dihedraLen)

self.hedradict

maps Hedra keys to Hedra for chain

self.atomArrayAAsiz x np.float64[4]

homogeneous atom coordinates for chain

self.atomArrayIndexdict

maps AtomKeys to atomArray

self.atomArrayValidAAsiz x bool

indicates coord is up-to-date

Generates:
self.dCoordSpace[2][dihedraLen][4][4]

transforms to/from dihedron coordinate space

self.dcsValiddihedraLen x bool

indicates dCoordSpace is current

self.hAtomshedraLen x 3 x np.float64[4]

atom coordinates in hCoordSpace

self.hAtomsRhedraLen x 3 x np.float64[4]

hAtoms in reverse order (trading space for time)

self.hAtoms_needs_updatehedraLen x bool

indicates hAtoms, hAtoms current

self.a2h_mapAAsiz x [int …]

maps atomArrayIndex to hedraNdx’s with that atom

self.a2ha_map[hedraLen x 3]

AtomNdx’s in hedraNdx order

self.h2aahedraLen x [int …]

maps hedraNdx to atomNdx’s in hedron (reshaped later)

Hedron.ndxint

self.hedraNdx value stored inside Hedron object

self.dRevdihedraLen x bool

dihedron reversed if true

self.dH1ndx, dH2ndx[dihedraLen]

hedraNdx’s for 1st and 2nd hedra

self.h1d_maphedraLen x []

hedraNdx -> [dihedra using hedron]

Dihedron.h1key, h2key[AtomKey …]

hedron keys for dihedron, reversed as needed

Dihedron.hedron1, hedron2Hedron

references inside dihedron to hedra

Dihedron.ndxint

self.dihedraNdx info inside Dihedron object

Dihedron.cst, rcstnp.float64p4][4]

dCoordSpace references inside Dihedron

self.a2da_map[dihedraLen x 4]

AtomNdx’s in dihedraNdx order

self.d2a_map[dihedraLen x [4]]

AtomNdx’s for each dihedron (reshaped a2da_map)

self.dFwdbool

dihedron is not Reversed if True

self.a2d_mapAAsiz x [[dihedraNdx]

[atom ndx 0-3 of atom in dihedron]], maps atom indexes to dihedra and atoms in them

self.dAtoms_needs_updatedihedraLen x bool

atoms in h1, h2 are current if False

assemble_residues(verbose: bool = False) None

Generate atom coords from internal coords (vectorised).

This is the ‘Numpy parallel’ version of assemble_residues_ser().

Starting with dihedra already formed by init_atom_coords(), transform each from dihedron local coordinate space into protein chain coordinate space. Iterate until all dependencies satisfied.

Does not update dCoordSpace as assemble_residues_ser() does. Call update_dCoordSpace() if needed. Faster to do in single operation once all atom coordinates finished.

Parameters:

verbose (bool) – default False. Report number of iterations to compute changed dihedra

generates:
self.dSet: AAsiz x dihedraLen x 4

maps atoms in dihedra to atomArray

self.dSetValid[dihedraLen][4] of bool

map of valid atoms into dihedra to detect 3 or 4 atoms valid

Output coordinates written to atomArray. Biopython Bio.PDB.Atom coordinates are a view on this data.

assemble_residues_ser(verbose: bool = False, start: int | None = None, fin: int | None = None) None

Generate IC_Residue atom coords from internal coordinates (serial).

See assemble_residues() for ‘numpy parallel’ version.

Filter positions between start and fin if set, find appropriate start coordinates for each residue and pass to assemble()

Parameters:
  • verbose (bool) – default False. Describe runtime problems

  • start,fin (int) – default None. Sequence position for begin, end of subregion to generate coords for.

init_edra(verbose: bool = False) None

Create chain and residue di/hedra structures, arrays, atomArray.

Inputs:

self.ordered_aa_ic_list : list of IC_Residue

Generates:
  • edra objects, self.di/hedra (executes _create_edra())

  • atomArray and support (executes build_atomArray())

  • self.hedraLen : number of hedra in structure

  • hedraL12 : numpy arrays for lengths, angles (empty)

  • hedraAngle ..

  • hedraL23 ..

  • self.hedraNdx : dict mapping hedrakeys to hedraL12 etc

  • self.dihedraLen : number of dihedra in structure

  • dihedraAngle ..

  • dihedraAngleRads : np arrays for angles (empty)

  • self.dihedraNdx : dict mapping dihedrakeys to dihedraAngle

init_atom_coords() None

Set chain level di/hedra initial coords from angles and distances.

Initializes atom coordinates in local coordinate space for hedra and dihedra, will be transformed appropriately later by dCoordSpace matrices for assembly.

update_dCoordSpace(workSelector: ndarray | None = None) None

Compute/update coordinate space transforms for chain dihedra.

Requires all atoms updated so calls assemble_residues() (returns immediately if all atoms already assembled).

Parameters:

workSelector ([bool]) – Optional mask to select dihedra for update

propagate_changes() None

Track through di/hedra to invalidate dependent atoms.

internal_to_atom_coordinates(verbose: bool = False, start: int | None = None, fin: int | None = None) None

Process IC data to Residue/Atom coords.

Parameters:
  • verbose (bool) – default False. Describe runtime problems

  • start,fin (int) – Optional sequence positions for begin, end of subregion to process.

Note

Setting start or fin activates serial assemble_residues_ser() instead of (Numpy parallel) assemble_residues(). Start C-alpha will be at origin.

atom_to_internal_coordinates(verbose: bool = False) None

Calculate dihedrals, angles, bond lengths for Atom data.

Generates atomArray (through init_edra), value arrays for hedra and dihedra, and coordinate space transforms for dihedra.

Generates Gly C-beta if specified, see IC_Residue.gly_Cbeta

Parameters:

verbose (bool) – default False. describe runtime problems

distance_plot(filter: ndarray | None = None) ndarray

Generate 2D distance plot from atomArray.

Default is to calculate distances for all atoms. To generate the classic C-alpha distance plot, pass a boolean mask array like:

atmNameNdx = internal_coords.AtomKey.fields.atm
CaSelect = [
    atomArrayIndex.get(k)
    for k in atomArrayIndex.keys()
    if k.akl[atmNameNdx] == "CA"
]
plot = cic.distance_plot(CaSelect)

Alternatively, this will select all backbone atoms:

backboneSelect = [
    atomArrayIndex.get(k)
    for k in atomArrayIndex.keys()
    if k.is_backbone()
]
Parameters:

filter ([bool]) – restrict atoms for calculation

See also

distance_to_internal_coordinates(), which requires the default all atom distance plot.

dihedral_signs() ndarray

Get sign array (+1/-1) for each element of chain dihedraAngle array.

Required for distance_to_internal_coordinates()

distplot_to_dh_arrays(distplot: ndarray, dihedra_signs: ndarray) None

Load di/hedra distance arrays from distplot.

Fill IC_Chain arrays hedraL12, L23, L13 and dihedraL14 distance value arrays from input distplot, dihedra_signs array from input dihedra_signs. Distplot and di/hedra distance arrays must index according to AtomKey mappings in IC_Chain .hedraNdx and .dihedraNdx (created in IC_Chain.init_edra())

Call atom_to_internal_coordinates() (or at least init_edra()) to generate a2ha_map and d2a_map before running this.

Explicitly removed from distance_to_internal_coordinates() so user may populate these chain di/hedra arrays by other methods.

distance_to_internal_coordinates(resetAtoms: bool | None = True) None

Compute chain di/hedra from from distance and chirality data.

Distance properties on hedra L12, L23, L13 and dihedra L14 configured by distplot_to_dh_arrays() or alternative loader.

dihedraAngles result is multiplied by dihedra_signs at final step recover chirality information lost in distance plot (mirror image of structure has same distances but opposite sign dihedral angles).

Note that chain breaks will cause errors in rebuilt structure, use copy_initNCaCs() to avoid this

Based on Blue, the Hedronometer’s answer to The dihedral angles of a tetrahedron in terms of its edge lengths on math.stackexchange.com. See also: “Heron-like Hedronometric Results for Tetrahedral Volume”.

Other values from that analysis included here as comments for completeness:

  • oa = hedron1 L12 if reverse else hedron1 L23

  • ob = hedron1 L23 if reverse else hedron1 L12

  • ac = hedron2 L12 if reverse else hedron2 L23

  • ab = hedron1 L13 = law of cosines on OA, OB (hedron1 L12, L23)

  • oc = hedron2 L13 = law of cosines on OA, AC (hedron2 L12, L23)

  • bc = dihedron L14

target is OA, the dihedral angle along edge oa.

Parameters:

resetAtoms (bool) – default True. Mark all atoms in di/hedra and atomArray for updating by internal_to_atom_coordinates(). Alternatively set this to False and manipulate atomArrayValid, dAtoms_needs_update and hAtoms_needs_update directly to reduce computation.

copy_initNCaCs(other: IC_Chain) None

Copy atom coordinates for initNCaC atoms from other IC_Chain.

Copies the coordinates and sets atomArrayValid flags True for initial NCaC and after any chain breaks.

Needed for distance_to_internal_coordinates() if target has chain breaks (otherwise each fragment will start at origin).

Also useful if copying internal coordinates from another chain.

N.B. IC_Residue.set_angle() and IC_Residue.set_length() invalidate their relevant atoms, so apply them before calling this function.

make_extended()

Set all psi and phi angles to extended conformation (123, -104).

__annotations__ = {'atomArray': <built-in function array>, 'atomArrayIndex': "dict['AtomKey', int]", 'bpAtomArray': "list['Atom']", 'ordered_aa_ic_list': 'list[IC_Residue]'}
class Bio.PDB.internal_coords.IC_Residue(parent: Residue)

Bases: object

Class to extend Biopython Residue with internal coordinate data.

Parameters:
parent: biopython Residue object this class extends
Attributes:
no_altloc: bool default False

Class variable, disable processing of ALTLOC atoms if True, use only selected atoms.

accept_atoms: tuple

Class variable accept_atoms, list of PDB atom names to use when generating internal coordinates. Default is:

accept_atoms = accept_mainchain + accept_hydrogens

to exclude hydrogens in internal coordinates and generated PDB files, override as:

IC_Residue.accept_atoms = IC_Residue.accept_mainchain

to get only mainchain atoms plus amide proton, use:

IC_Residue.accept_atoms = IC_Residue.accept_mainchain + ('H',)

to convert D atoms to H, set AtomKey.d2h = True and use:

IC_Residue.accept_atoms = (
    accept_mainchain + accept_hydrogens + accept_deuteriums
)

Note that accept_mainchain = accept_backbone + accept_sidechain. Thus to generate sequence-agnostic conformational data for e.g. structure alignment in dihedral angle space, use:

IC_Residue.accept_atoms = accept_backbone

or set gly_Cbeta = True and use:

IC_Residue.accept_atoms = accept_backbone + ('CB',)

Changing accept_atoms will cause the default structure_rebuild_test in ic_rebuild to fail if some atoms are filtered (obviously). Use the quick=True option to test only the coordinates of filtered atoms to avoid this.

There is currently no option to output internal coordinates with D instead of H.

accept_resnames: tuple

Class variable accept_resnames, list of 3-letter residue names for HETATMs to accept when generating internal coordinates from atoms. HETATM sidechain will be ignored, but normal backbone atoms (N, CA, C, O, CB) will be included. Currently only CYG, YCM and UNK; override at your own risk. To generate sidechain, add appropriate entries to ic_data_sidechains in ic_data and support in IC_Chain.atom_to_internal_coordinates().

gly_Cbeta: bool default False

Class variable gly_Cbeta, override to True to generate internal coordinates for glycine CB atoms in IC_Chain.atom_to_internal_coordinates()

IC_Residue.gly_Cbeta = True
pic_accuracy: str default “17.13f”

Class variable pic_accuracy sets accuracy for numeric values (angles, lengths) in .pic files. Default set high to support mmCIF file accuracy in rebuild tests. If you find rebuild tests fail with ‘ERROR -COORDINATES-’ and verbose=True shows only small discrepancies, try raising this value (or lower it to 9.5 if only working with PDB format files).

IC_Residue.pic_accuracy = "9.5f"
residue: Biopython Residue object reference

The Residue object this extends

hedra: dict indexed by 3-tuples of AtomKeys

Hedra forming this residue

dihedra: dict indexed by 4-tuples of AtomKeys

Dihedra forming (overlapping) this residue

rprev, rnext: lists of IC_Residue objects

References to adjacent (bonded, not missing, possibly disordered) residues in chain

atom_coords: AtomKey indexed dict of numpy [4] arrays

removed Use AtomKeys and atomArrayIndex to build if needed

ak_set: set of AtomKeys in dihedra

AtomKeys in all dihedra overlapping this residue (see __contains__())

alt_ids: list of char

AltLoc IDs from PDB file

bfactors: dict

AtomKey indexed B-factors as read from PDB file

NCaCKey: List of tuples of AtomKeys

List of tuples of N, Ca, C backbone atom AtomKeys; usually only 1 but more if backbone altlocs.

is20AA: bool

True if residue is one of 20 standard amino acids, based on Residue resname

isAccept: bool

True if is20AA or in accept_resnames below

rbase: tuple

residue position, insert code or none, resname (1 letter if standard amino acid)

cic: IC_Chain default None

parent chain IC_Chain object

scale: optional float

used for OpenSCAD output to generate gly_Cbeta bond length

Methods

assemble(atomCoordsIn, resetLocation, verbose)

Compute atom coordinates for this residue from internal coordinates

get_angle()

Return angle for passed key

get_length()

Return bond length for specified pair

pick_angle()

Find Hedron or Dihedron for passed key

pick_length()

Find hedra for passed AtomKey pair

set_angle()

Set angle for passed key (no position updates)

set_length()

Set bond length in all relevant hedra for specified pair

bond_rotate(delta)

adjusts related dihedra angles by delta, e.g. rotating psi (N-Ca-C-N) will adjust the adjacent N-Ca-C-O by the same amount to avoid clashes

bond_set(angle)

uses bond_rotate to set specified dihedral to angle and adjust related dihedra accordingly

rak(atom info)

cached AtomKeys for this residue

accept_resnames = ('CYG', 'YCM', 'UNK')

Add 3-letter residue name here for non-standard residues with normal backbone. CYG included for test case 4LGY (1305 residue contiguous chain). Safe to add more names for N-CA-C-O backbones, any more complexity will need additions to accept_atoms, ic_data_sidechains in ic_data and support in IC_Chain.atom_to_internal_coordinates()

no_altloc: bool = False

Set True to filter altloc atoms on input and only work with Biopython default Atoms

gly_Cbeta: bool = False

Create beta carbons on all Gly residues.

Setting this to True will generate internal coordinates for Gly C-beta carbons in atom_to_internal_coordinates().

Data averaged from Sep 2019 Dunbrack cullpdb_pc20_res2.2_R1.0 restricted to structures with amide protons. Please see

PISCES: A Protein Sequence Culling Server

‘G. Wang and R. L. Dunbrack, Jr. PISCES: a protein sequence culling server. Bioinformatics, 19:1589-1591, 2003.’

Ala avg rotation of OCCACB from NCACO query:

select avg(g.rslt) as avg_rslt, stddev(g.rslt) as sd_rslt, count(*)
from
(select f.d1d, f.d2d,
(case when f.rslt > 0 then f.rslt-360.0 else f.rslt end) as rslt
from (select d1.angle as d1d, d2.angle as d2d,
(d2.angle - d1.angle) as rslt from dihedron d1,
dihedron d2 where d1.re_class='AOACACAACB' and
d2.re_class='ANACAACAO' and d1.pdb=d2.pdb and d1.chn=d2.chn
and d1.res=d2.res) as f) as g

results:

| avg_rslt          | sd_rslt          | count   |
| -122.682194862932 | 5.04403040513919 | 14098   |
pic_accuracy: str = '17.13f'
accept_backbone = ('N', 'CA', 'C', 'O', 'OXT')
accept_sidechain = ('CB', 'CG', 'CG1', 'OG1', 'OG', 'SG', 'CG2', 'CD', 'CD1', 'SD', 'OD1', 'ND1', 'CD2', 'ND2', 'CE', 'CE1', 'NE', 'OE1', 'NE1', 'CE2', 'OE2', 'NE2', 'CE3', 'CZ', 'NZ', 'CZ2', 'CZ3', 'OD2', 'OH', 'CH2', 'NH1', 'NH2')
accept_mainchain = ('N', 'CA', 'C', 'O', 'OXT', 'CB', 'CG', 'CG1', 'OG1', 'OG', 'SG', 'CG2', 'CD', 'CD1', 'SD', 'OD1', 'ND1', 'CD2', 'ND2', 'CE', 'CE1', 'NE', 'OE1', 'NE1', 'CE2', 'OE2', 'NE2', 'CE3', 'CZ', 'NZ', 'CZ2', 'CZ3', 'OD2', 'OH', 'CH2', 'NH1', 'NH2')
accept_hydrogens = ('H', 'H1', 'H2', 'H3', 'HA', 'HA2', 'HA3', 'HB', 'HB1', 'HB2', 'HB3', 'HG2', 'HG3', 'HD2', 'HD3', 'HE2', 'HE3', 'HZ1', 'HZ2', 'HZ3', 'HG11', 'HG12', 'HG13', 'HG21', 'HG22', 'HG23', 'HZ', 'HD1', 'HE1', 'HD11', 'HD12', 'HD13', 'HG', 'HG1', 'HD21', 'HD22', 'HD23', 'NH1', 'NH2', 'HE', 'HH11', 'HH12', 'HH21', 'HH22', 'HE21', 'HE22', 'HE2', 'HH', 'HH2')
accept_deuteriums = ('D', 'D1', 'D2', 'D3', 'DA', 'DA2', 'DA3', 'DB', 'DB1', 'DB2', 'DB3', 'DG2', 'DG3', 'DD2', 'DD3', 'DE2', 'DE3', 'DZ1', 'DZ2', 'DZ3', 'DG11', 'DG12', 'DG13', 'DG21', 'DG22', 'DG23', 'DZ', 'DD1', 'DE1', 'DD11', 'DD12', 'DD13', 'DG', 'DG1', 'DD21', 'DD22', 'DD23', 'ND1', 'ND2', 'DE', 'DH11', 'DH12', 'DH21', 'DH22', 'DE21', 'DE22', 'DE2', 'DH', 'DH2')
accept_atoms = ('N', 'CA', 'C', 'O', 'OXT', 'CB', 'CG', 'CG1', 'OG1', 'OG', 'SG', 'CG2', 'CD', 'CD1', 'SD', 'OD1', 'ND1', 'CD2', 'ND2', 'CE', 'CE1', 'NE', 'OE1', 'NE1', 'CE2', 'OE2', 'NE2', 'CE3', 'CZ', 'NZ', 'CZ2', 'CZ3', 'OD2', 'OH', 'CH2', 'NH1', 'NH2', 'H', 'H1', 'H2', 'H3', 'HA', 'HA2', 'HA3', 'HB', 'HB1', 'HB2', 'HB3', 'HG2', 'HG3', 'HD2', 'HD3', 'HE2', 'HE3', 'HZ1', 'HZ2', 'HZ3', 'HG11', 'HG12', 'HG13', 'HG21', 'HG22', 'HG23', 'HZ', 'HD1', 'HE1', 'HD11', 'HD12', 'HD13', 'HG', 'HG1', 'HD21', 'HD22', 'HD23', 'NH1', 'NH2', 'HE', 'HH11', 'HH12', 'HH21', 'HH22', 'HE21', 'HE22', 'HE2', 'HH', 'HH2')

Change accept_atoms to restrict atoms processed. See IC_Residue for usage.

__init__(parent: Residue) None

Initialize IC_Residue with parent Biopython Residue.

Parameters:

parent (Residue) – Biopython Residue object. The Biopython Residue this object extends

__deepcopy__(memo)

Deep copy implementation for IC_Residue.

__contains__(ak: AtomKey) bool

Return True if atomkey is in this residue.

rak(atm: str | Atom) AtomKey

Cache calls to AtomKey for this residue.

__repr__() str

Print string is parent Residue ID.

pretty_str() str

Nice string for residue ID.

set_flexible() None

For OpenSCAD, mark N-CA and CA-C bonds to be flexible joints.

See SCADIO.write_SCAD()

set_hbond() None

For OpenSCAD, mark H-N and C-O bonds to be hbonds (magnets).

See SCADIO.write_SCAD()

clear_transforms()

Invalidate dihedra coordinate space attributes before assemble().

Coordinate space attributes are Dihedron.cst and .rcst, and IC_Chain.dCoordSpace

assemble(resetLocation: bool = False, verbose: bool = False) dict[AtomKey, array] | dict[tuple[AtomKey, AtomKey, AtomKey], array] | None

Compute atom coordinates for this residue from internal coordinates.

This is the IC_Residue part of the assemble_residues_ser() serial version, see assemble_residues() for numpy vectorized approach which works at the IC_Chain level.

Join prepared dihedra starting from N-CA-C and N-CA-CB hedrons, computing protein space coordinates for backbone and sidechain atoms

Sets forward and reverse transforms on each Dihedron to convert from protein coordinates to dihedron space coordinates for first three atoms (see IC_Chain.dCoordSpace)

Call init_atom_coords() to update any modified di/hedra before coming here, this only assembles dihedra into protein coordinate space.

Algorithm

Form double-ended queue, start with c-ca-n, o-c-ca, n-ca-cb, n-ca-c.

if resetLocation=True, use initial coords from generating dihedron for n-ca-c initial positions (result in dihedron coordinate space)

while queue not empty

get 3-atom hedron key

for each dihedron starting with hedron key (1st hedron of dihedron)

if have coordinates for all 4 atoms already

add 2nd hedron key to back of queue

else if have coordinates for 1st 3 atoms

compute forward and reverse transforms to take 1st 3 atoms to/from dihedron initial coordinate space

use reverse transform to get position of 4th atom in current coordinates from dihedron initial coordinates

add 2nd hedron key to back of queue

else

ordering failed, put hedron key at back of queue and hope next time we have 1st 3 atom positions (should not happen)

loop terminates (queue drains) as hedron keys which do not start any dihedra are removed without action

Parameters:

resetLocation (bool) – default False. - Option to ignore start location and orient so initial N-Ca-C hedron at origin.

Returns:

Dict of AtomKey -> homogeneous atom coords for residue in protein space relative to previous residue

Also directly updates IC_Chain.atomArray as assemble_residues() does.

split_akl(lst: tuple[AtomKey, ...] | list[AtomKey], missingOK: bool = False) list[tuple[AtomKey, ...]]

Get AtomKeys for this residue (ak_set) for generic list of AtomKeys.

Changes and/or expands a list of ‘generic’ AtomKeys (e.g. ‘N, C, C’) to be specific to this Residue’s altlocs etc., e.g. ‘(N-Ca_A_0.3-C, N-Ca_B_0.7-C)’

Given a list of AtomKeys for a Hedron or Dihedron,
return:

list of matching atomkeys that have id3_dh in this residue (ak may change if occupancy != 1.00)

or

multiple lists of matching atomkeys expanded for all atom altlocs

or

empty list if any of atom_coord(ak) missing and not missingOK

Parameters:
  • lst (list) – list[3] or [4] of AtomKeys. Non-altloc AtomKeys to match to specific AtomKeys for this residue

  • missingOK (bool) – default False, see above.

atom_sernum = None
atom_chain = None
pdb_residue_string() str

Generate PDB ATOM records for this residue as string.

Convenience method for functionality not exposed in PDBIO.py. Increments IC_Residue.atom_sernum if not None

Parameters:
  • IC_Residue.atom_sernum – Class variable default None. Override and increment atom serial number if not None

  • IC_Residue.atom_chain – Class variable. Override atom chain id if not None

Todo

move to PDBIO

pic_flags = (1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 496, 525, 1021, 1024, 2048, 4096, 7168, 8192, 16384)

Used by PICIO.write_PIC() to control classes of values to be defaulted.

picFlagsDefault = 31744

Default is all dihedra + initial tau atoms + bFactors.

picFlagsDict = {'all': 7168, 'bFactors': 16384, 'chi': 496, 'chi1': 16, 'chi2': 32, 'chi3': 64, 'chi4': 128, 'chi5': 256, 'classic': 1021, 'classic_b': 525, 'hedra': 1024, 'initAtoms': 8192, 'omg': 2, 'phi': 4, 'pomg': 512, 'primary': 2048, 'psi': 1, 'secondary': 4096, 'tau': 8}

Dictionary of pic_flags values to use as needed.

pick_angle(angle_key: tuple[AtomKey, AtomKey, AtomKey] | tuple[AtomKey, AtomKey, AtomKey, AtomKey] | str) Hedron | Dihedron | None

Get Hedron or Dihedron for angle_key.

Parameters:

angle_key

  • tuple of 3 or 4 AtomKeys

  • string of atom names (‘CA’) separated by :’s

  • string of [-1, 0, 1]<atom name> separated by ‘:’s. -1 is previous residue, 0 is this residue, 1 is next residue

  • psi, phi, omg, omega, chi1, chi2, chi3, chi4, chi5

  • tau (N-CA-C angle) see Richardson1981

  • tuples of AtomKeys is only access for alternate disordered atoms

Observe that a residue’s phi and omega dihedrals, as well as the hedra comprising them (including the N:Ca:C tau hedron), are stored in the n-1 di/hedra sets; this overlap is handled here, but may be an issue if accessing directly.

The following print commands are equivalent (except for sidechains with non-carbon atoms for chi2):

ric = r.internal_coord
print(
    r,
    ric.get_angle("psi"),
    ric.get_angle("phi"),
    ric.get_angle("omg"),
    ric.get_angle("tau"),
    ric.get_angle("chi2"),
)
print(
    r,
    ric.get_angle("N:CA:C:1N"),
    ric.get_angle("-1C:N:CA:C"),
    ric.get_angle("-1CA:-1C:N:CA"),
    ric.get_angle("N:CA:C"),
    ric.get_angle("CA:CB:CG:CD"),
)

See ic_data.py for detail of atoms in the enumerated sidechain angles and the backbone angles which do not span the peptide bond. Using ‘s’ for current residue (‘self’) and ‘n’ for next residue, the spanning (overlapping) angles are:

(sN, sCA, sC, nN)   # psi
(sCA, sC, nN, nCA)  # omega i+1
(sC, nN, nCA, nC)   # phi i+1
(sCA, sC, nN)
(sC, nN, nCA)
(nN, nCA, nC)       # tau i+1
Returns:

Matching Hedron, Dihedron, or None.

get_angle(angle_key: tuple[AtomKey, AtomKey, AtomKey] | tuple[AtomKey, AtomKey, AtomKey, AtomKey] | str) float | None

Get dihedron or hedron angle for specified key.

See pick_angle() for key specifications.

set_angle(angle_key: tuple[AtomKey, AtomKey, AtomKey] | tuple[AtomKey, AtomKey, AtomKey, AtomKey] | str, v: float, overlap=True)

Set dihedron or hedron angle for specified key.

If angle is a Dihedron and overlap is True (default), overlapping dihedra are also changed as appropriate. The overlap is a result of protein chain definitions in ic_data and _create_edra() (e.g. psi overlaps N-CA-C-O).

The default overlap=True is probably what you want for: set_angle(“chi1”, val)

The default is probably NOT what you want when processing all dihedrals in a chain or residue (such as copying from another structure), as the overlapping dihedra will likely be in the set as well.

N.B. setting e.g. PRO chi2 is permitted without error or warning!

See pick_angle() for angle_key specifications. See bond_rotate() to change a dihedral by a number of degrees

Parameters:
  • angle_key – angle identifier.

  • v (float) – new angle in degrees (result adjusted to +/-180).

  • overlap (bool) – default True. Modify overlapping dihedra as needed

bond_rotate(angle_key: tuple[AtomKey, AtomKey, AtomKey] | tuple[AtomKey, AtomKey, AtomKey, AtomKey] | str, delta: float)

Rotate set of overlapping dihedrals by delta degrees.

Changes a dihedral angle by a given delta, i.e. new_angle = current_angle + delta Values are adjusted so new_angle will be within +/-180.

Changes overlapping dihedra as in set_angle()

See pick_angle() for key specifications.

bond_set(angle_key: tuple[AtomKey, AtomKey, AtomKey] | tuple[AtomKey, AtomKey, AtomKey, AtomKey] | str, val: float)

Set dihedron to val, update overlapping dihedra by same amount.

Redundant to set_angle(), retained for compatibility. Unlike set_angle() this is for dihedra only and no option to not update overlapping dihedra.

See pick_angle() for key specifications.

pick_length(ak_spec: str | tuple[AtomKey, AtomKey]) tuple[list[Hedron] | None, tuple[AtomKey, AtomKey] | None]

Get list of hedra containing specified atom pair.

Parameters:

ak_spec

  • tuple of two AtomKeys

  • string: two atom names separated by ‘:’, e.g. ‘N:CA’ with optional position specifier relative to self, e.g. ‘-1C:N’ for preceding peptide bond. Position specifiers are -1, 0, 1.

The following are equivalent:

ric = r.internal_coord
print(
    r,
    ric.get_length("0C:1N"),
)
print(
    r,
    None
    if not ric.rnext
    else ric.get_length((ric.rak("C"), ric.rnext[0].rak("N"))),
)

If atom not found on current residue then will look on rprev[0] to handle cases like Gly N:CA. For finer control please access IC_Chain.hedra directly.

Returns:

list of hedra containing specified atom pair as tuples of AtomKeys

get_length(ak_spec: str | tuple[AtomKey, AtomKey]) float | None

Get bond length for specified atom pair.

See pick_length() for ak_spec and details.

set_length(ak_spec: str | tuple[AtomKey, AtomKey], val: float) None

Set bond length for specified atom pair.

See pick_length() for ak_spec.

applyMtx(mtx: array) None

Apply matrix to atom_coords for this IC_Residue.

__annotations__ = {'_AllBonds': <class 'bool'>, 'ak_set': 'set[AtomKey]', 'akc': 'dict[Union[str, Atom], AtomKey]', 'alt_ids': 'Union[list[str], None]', 'bfactors': 'dict[str, float]', 'cic': 'IC_Chain', 'dihedra': 'dict[DKT, Dihedron]', 'gly_Cbeta': <class 'bool'>, 'hedra': 'dict[HKT, Hedron]', 'no_altloc': <class 'bool'>, 'pic_accuracy': <class 'str'>, 'rnext': 'list[IC_Residue]', 'rprev': 'list[IC_Residue]'}
class Bio.PDB.internal_coords.Edron(*args: list[AtomKey] | tuple[AtomKey, AtomKey, AtomKey] | tuple[AtomKey, AtomKey, AtomKey, AtomKey], **kwargs: str)

Bases: object

Base class for Hedron and Dihedron classes.

Supports rich comparison based on lists of AtomKeys.

Attributes:
atomkeys: tuple

3 (hedron) or 4 (dihedron) AtomKey s defining this di/hedron

id: str

‘:’-joined string of AtomKeys for this di/hedron

needs_update: bool

indicates di/hedron local atom_coords do NOT reflect current di/hedron angle and length values in hedron local coordinate space

e_class: str

sequence of atoms (no position or residue) comprising di/hedron for statistics

re_class: str

sequence of residue, atoms comprising di/hedron for statistics

cre_class: str

sequence of covalent radii classes comprising di/hedron for statistics

edron_re: compiled regex (Class Attribute)

A compiled regular expression matching string IDs for Hedron and Dihedron objects

cic: IC_Chain reference

Chain internal coords object containing this hedron

ndx: int

index into IC_Chain level numpy data arrays for di/hedra. Set in IC_Chain.init_edra()

rc: int

number of residues involved in this edron

Methods

gen_key([AtomKey, …] or AtomKey, …) (Static Method)

generate a ‘:’-joined string of AtomKey Ids

is_backbone()

Return True if all atomkeys atoms are N, Ca, C or O

edron_re = re.compile('^(?P<pdbid>\\w+)?\\s(?P<chn>[\\w|\\s])?\\s(?P<a1>[\\w\\-\\.]+):(?P<a2>[\\w\\-\\.]+):(?P<a3>[\\w\\-\\.]+)(:(?P<a4>[\\w\\-\\.]+))?\\s+(((?P<len12>\\S+)\\s+(?P<angle>\\S+)\\s+(?P<len23>\\S+)\\s*$)|((?P<)

A compiled regular expression matching string IDs for Hedron and Dihedron objects

static gen_key(lst: list[AtomKey]) str

Generate string of ‘:’-joined AtomKey strings from input.

Generate ‘2_A_C:3_P_N:3_P_CA’ from (2_A_C, 3_P_N, 3_P_CA) :param list lst: list of AtomKey objects

static gen_tuple(akstr: str) tuple

Generate AtomKey tuple for ‘:’-joined AtomKey string.

Generate (2_A_C, 3_P_N, 3_P_CA) from ‘2_A_C:3_P_N:3_P_CA’ :param str akstr: string of ‘:’-separated AtomKey strings

__init__(*args: list[AtomKey] | tuple[AtomKey, AtomKey, AtomKey] | tuple[AtomKey, AtomKey, AtomKey, AtomKey], **kwargs: str) None

Initialize Edron with sequence of AtomKeys.

Acceptable input:

[ AtomKey, … ] : list of AtomKeys AtomKey, … : sequence of AtomKeys as args {‘a1’: str, ‘a2’: str, … } : dict of AtomKeys as ‘a1’, ‘a2’ …

__deepcopy__(memo)

Deep copy implementation for Edron.

__contains__(ak: AtomKey) bool

Return True if atomkey is in this edron.

is_backbone() bool

Report True for contains only N, C, CA, O, H atoms.

__repr__() str

Tuple of AtomKeys is default repr string.

__hash__() int

Hash calculated at init from atomkeys tuple.

__eq__(other: object) bool

Test for equality.

__ne__(other: object) bool

Test for inequality.

__gt__(other: object) bool

Test greater than.

__ge__(other: object) bool

Test greater or equal.

__lt__(other: object) bool

Test less than.

__le__(other: object) bool

Test less or equal.

class Bio.PDB.internal_coords.Hedron(*args: list[AtomKey] | tuple[AtomKey, AtomKey, AtomKey], **kwargs: str)

Bases: Edron

Class to represent three joined atoms forming a plane.

Contains atom coordinates in local coordinate space: central atom at origin, one terminal atom on XZ plane, and the other on the +Z axis. Stored in two orientations, with the 3rd (forward) or first (reversed) atom on the +Z axis. See Dihedron for use of forward and reverse orientations.

Attributes:
len12: float

distance between first and second atoms

len23: float

distance between second and third atoms

angle: float

angle (degrees) formed by three atoms in hedron

xrh_class: string

only for hedron spanning 2 residues, will have ‘X’ for residue contributing only one atom

Methods

get_length()

get bond length for specified atom pair

set_length()

set bond length for specified atom pair

angle(), len12(), len23()

setters for relevant attributes (angle in degrees)

__init__(*args: list[AtomKey] | tuple[AtomKey, AtomKey, AtomKey], **kwargs: str) None

Initialize Hedron with sequence of AtomKeys, kwargs.

Acceptable input:

As for Edron, plus optional ‘len12’, ‘angle’, ‘len23’ keyworded values.

__repr__() str

Print string for Hedron object.

property angle: float

Get this hedron angle.

property len12

Get first length for Hedron.

property len23: float

Get second length for Hedron.

get_length(ak_tpl: tuple[AtomKey, AtomKey]) float | None

Get bond length for specified atom pair.

Parameters:

ak_tpl (tuple) – tuple of AtomKeys. Pair of atoms in this Hedron

set_length(ak_tpl: tuple[AtomKey, AtomKey], newLength: float)

Set bond length for specified atom pair; sets needs_update.

Parameters:

.ak_tpl (tuple) – tuple of AtomKeys Pair of atoms in this Hedron

class Bio.PDB.internal_coords.Dihedron(*args: list[AtomKey] | tuple[AtomKey, AtomKey, AtomKey, AtomKey], **kwargs: str)

Bases: Edron

Class to represent four joined atoms forming a dihedral angle.

Attributes:
angle: float

Measurement or specification of dihedral angle in degrees; prefer IC_Residue.bond_set() to set

hedron1, hedron2: Hedron object references

The two hedra which form the dihedral angle

h1key, h2key: tuples of AtomKeys

Hash keys for hedron1 and hedron2

id3,id32: tuples of AtomKeys

First 3 and second 3 atoms comprising dihedron; hxkey orders may differ

ric: IC_Residue object reference

IC_Residue object containing this dihedral

reverse: bool

Indicates order of atoms in dihedron is reversed from order of atoms in hedra

primary: bool

True if this is psi, phi, omega or a sidechain chi angle

pclass: string (primary angle class)

re_class with X for adjacent residue according to nomenclature (psi, omega, phi)

cst, rcst: numpy [4][4] arrays

transformations to (cst) and from (rcst) Dihedron coordinate space defined with atom 2 (Hedron 1 center atom) at the origin. Views on IC_Chain.dCoordSpace.

Methods

angle()

getter/setter for dihdral angle in degrees; prefer IC_Residue.bond_set()

bits()

return IC_Residue.pic_flags bitmask for dihedron psi, omega, etc

__init__(*args: list[AtomKey] | tuple[AtomKey, AtomKey, AtomKey, AtomKey], **kwargs: str) None

Init Dihedron with sequence of AtomKeys and optional dihedral angle.

Acceptable input:

As for Edron, plus optional ‘dihedral’ keyworded angle value.

__repr__() str

Print string for Dihedron object.

property angle: float

Get dihedral angle.

static angle_dif(a1: float | ndarray, a2: float | ndarray)

Get angle difference between two +/- 180 angles.

https://stackoverflow.com/a/36001014/2783487

static angle_avg(alst: list, in_rads: bool = False, out_rads: bool = False)

Get average of list of +/-180 angles.

Parameters:
  • alst (List) – list of angles to average

  • in_rads (bool) – input values are in radians

  • out_rads (bool) – report result in radians

static angle_pop_sd(alst: list, avg: float)

Get population standard deviation for list of +/-180 angles.

should be sample std dev but avoid len(alst)=1 -> div by 0

difference(other: Dihedron) float

Get angle difference between this and other +/- 180 angles.

bits() int

Get IC_Residue.pic_flags bitmasks for self is psi, omg, phi, pomg, chiX.

class Bio.PDB.internal_coords.AtomKey(*args: IC_Residue | Atom | list | dict | str, **kwargs: str)

Bases: object

Class for dict keys to reference atom coordinates.

AtomKeys capture residue and disorder information together, and provide a no-whitespace string key for .pic files.

Supports rich comparison and multiple ways to instantiate.

AtomKeys contain:

residue position (respos), insertion code (icode), 1 or 3 char residue name (resname), atom name (atm), altloc (altloc), and occupancy (occ)

Use AtomKey.fields to get the index to the component of interest by name:

Get C-alpha atoms from IC_Chain atomArray and atomArrayIndex with AtomKeys:

atmNameNdx = internal_coords.AtomKey.fields.atm
CaSelection = [
    atomArrayIndex.get(k)
    for k in atomArrayIndex.keys()
    if k.akl[atmNameNdx] == "CA"
]
AtomArrayCa = atomArray[CaSelection]

Get all phenylalanine atoms in a chain:

resNameNdx = internal_coords.AtomKey.fields.resname
PheSelection = [
    atomArrayIndex.get(k)
    for k in atomArrayIndex.keys()
    if k.akl[resNameNdx] == "F"
]
AtomArrayPhe = atomArray[PheSelection]

‘resname’ will be the uppercase 1-letter amino acid code if one of the 20 standard residues, otherwise the supplied 3-letter code. Supplied as input or read from .rbase attribute of IC_Residue.

Attributes:
akl: tuple

All six fields of AtomKey

fieldNames: tuple (Class Attribute)

Mapping of key index positions to names

fields: namedtuple (Class Attribute)

Mapping of field names to index positions.

id: str

‘_’-joined AtomKey fields, excluding ‘None’ fields

atom_re: compiled regex (Class Attribute)

A compiled regular expression matching the string form of the key

d2h: bool (Class Attribute) default False

Convert D atoms to H on input if True; must also modify IC_Residue.accept_atoms

missing: bool default False

AtomKey __init__’d from string is probably missing, set this flag to note the issue. Set by IC_Residue.rak()

ric: IC_Residue default None

If initialised with IC_Residue, this references the IC_residue

Methods

altloc_match(other)

Returns True if this AtomKey matches other AtomKey excluding altloc and occupancy fields

is_backbone()

Returns True if atom is N, CA, C, O or H

atm()

Returns atom name, e.g. N, CA, CB, etc.

cr_class()

Returns covalent radii class e.g. Csb

atom_re = re.compile('^(?P<respos>-?\\d+)(?P<icode>[A-Za-z])?_(?P<resname>[a-zA-Z]+)_(?P<atm>[A-Za-z0-9]+)(?:_(?P<altloc>\\w))?(?:_(?P<occ>-?\\d\\.\\d+?))?$')

Pre-compiled regular expression to match an AtomKey string.

fieldNames = ('respos', 'icode', 'resname', 'atm', 'altloc', 'occ')
fields = (0, 1, 2, 3, 4, 5)

Use this namedtuple to access AtomKey fields. See AtomKey

d2h = False

Set True to convert D Deuterium to H Hydrogen on input.

__init__(*args: IC_Residue | Atom | list | dict | str, **kwargs: str) None

Initialize AtomKey with residue and atom data.

Examples of acceptable input:

(<IC_Residue>, 'CA', ...)    : IC_Residue with atom info
(<IC_Residue>, <Atom>)       : IC_Residue with Biopython Atom
([52, None, 'G', 'CA', ...])  : list of ordered data fields
(52, None, 'G', 'CA', ...)    : multiple ordered arguments
({respos: 52, icode: None, atm: 'CA', ...}) : dict with fieldNames
(respos: 52, icode: None, atm: 'CA', ...) : kwargs with fieldNames
52_G_CA, 52B_G_CA, 52_G_CA_0.33, 52_G_CA_B_0.33  : id strings
__deepcopy__(memo)

Deep copy implementation for AtomKey.

__repr__() str

Repr string from id.

__hash__() int

Hash calculated at init from akl tuple.

altloc_match(other: AtomKey) bool

Test AtomKey match to other discounting occupancy and altloc.

is_backbone() bool

Return True if is N, C, CA, O, or H.

atm() str

Return atom name : N, CA, CB, O etc.

cr_class() str | None

Return covalent radii class for atom or None.

__ne__(other: object) bool

Test for inequality.

__eq__(other: object) bool

Test for equality.

__gt__(other: object) bool

Test greater than.

__ge__(other: object) bool

Test greater or equal.

__lt__(other: object) bool

Test less than.

__le__(other: object) bool

Test less or equal.

Bio.PDB.internal_coords.set_accuracy_95(num: float) float

Reduce floating point accuracy to 9.5 (xxxx.xxxxx).

Used by IC_Residue class writing PIC and SCAD files.

Parameters:

num (float) – input number

Returns:

float with specified accuracy

exception Bio.PDB.internal_coords.HedronMatchError

Bases: Exception

Cannot find hedron in residue for given key.

exception Bio.PDB.internal_coords.MissingAtomError

Bases: Exception

Missing atom coordinates for hedron or dihedron.