Bio.PDB.internal_coords module
Classes to support internal coordinates for protein structures.
Internal coordinates comprise Psi, Omega and Phi dihedral angles along the protein backbone, Chi angles along the sidechains, and all 3-atom angles and bond lengths defining a protein chain. These routines can compute internal coordinates from atom XYZ coordinates, and compute atom XYZ coordinates from internal coordinates.
Secondary benefits include the ability to align and compare residue environments in 3D structures, support for 2D atom distance plots, converting a distance plot plus chirality information to a structure, generating an OpenSCAD description of a structure for 3D printing, and reading/writing structures as internal coordinate data files.
Usage:
from Bio.PDB.PDBParser import PDBParser
from Bio.PDB.Chain import Chain
from Bio.PDB.internal_coords import *
from Bio.PDB.PICIO import write_PIC, read_PIC, read_PIC_seq
from Bio.PDB.ic_rebuild import write_PDB, IC_duplicate, structure_rebuild_test
from Bio.PDB.SCADIO import write_SCAD
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord
from Bio.PDB.PDBIO import PDBIO
import numpy as np
# load a structure as normal, get first chain
parser = PDBParser()
myProtein = parser.get_structure("7rsa", "pdb7rsa.ent")
myChain = myProtein[0]["A"]
# compute bond lengths, angles, dihedral angles
myChain.atom_to_internal_coordinates(verbose=True)
# check myChain makes sense (can get angles and rebuild same structure)
resultDict = structure_rebuild_test(myChain)
assert resultDict['pass'] == True
# get residue 1 chi2 angle
r1 = next(myChain.get_residues())
r1chi2 = r1.internal_coord.get_angle("chi2")
# rotate residue 1 chi2 angle by 120 degrees (loops w/in +/-180)
r1.internal_coord.set_angle("chi2", r1chi2 + 120.0)
# or
r1.internal_coord.bond_rotate("chi2", 120.0)
# update myChain XYZ coordinates with chi2 changed
myChain.internal_to_atom_coordinates()
# write new conformation with PDBIO
write_PDB(myProtein, "myChain.pdb")
# or just the ATOM records without headers:
io = PDBIO()
io.set_structure(myProtein)
io.save("myChain2.pdb")
# write chain as 'protein internal coordinates' (.pic) file
write_PIC(myProtein, "myChain.pic")
# read .pic file
myProtein2 = read_PIC("myChain.pic")
# create default structure for random sequence by reading as .pic file
myProtein3 = read_PIC_seq(
SeqRecord(
Seq("GAVLIMFPSTCNQYWDEHKR"),
id="1RND",
description="my random sequence",
)
)
myProtein3.internal_to_atom_coordinates()
write_PDB(myProtein3, "myRandom.pdb")
# access the all-dihedrals array for the chain, e.g. residue 1 chi2 angle:
r1chi2_obj = r1.internal_coord.pick_angle("chi2")
# or same thing: r1chi2_obj = r1.internal_coord.pick_angle("CA:CB:CG:CD")
r1chi2_key = r1chi2_obj.atomkeys
# r1chi2_key is tuple of AtomKeys (1_K_CA, 1_K_CB, 1_K_CG, 1_K_CD)
r1chi2_index = myChain.internal_coord.dihedraNdx[r1chi2_key]
# or same thing: r1chi2_index = r1chi2_obj.ndx
r1chi2_value = myChain.internal_coord.dihedraAngle[r1chi2_index]
# also true: r1chi2_obj == myChain.internal_coord.dihedra[r1chi2_index]
# access the array of all atoms for the chain, e.g. residue 1 C-beta
r1_cBeta_index = myChain.internal_coord.atomArrayIndex[AtomKey("1_K_CB")]
r1_cBeta_coords = myChain.internal_coord.atomArray[r1_cBeta_index]
# r1_cBeta_coords = [ x, y, z, 1.0 ]
# the Biopython Atom coord array is now a view into atomArray, so
assert r1_cBeta_coords[1] == r1["CB"].coord[1]
r1_cBeta_coords[1] += 1.0 # change the Y coord 1 angstrom
assert r1_cBeta_coords[1] == r1["CB"].coord[1]
# they are always the same (they share the same memory)
r1_cBeta_coords[1] -= 1.0 # restore
# create a selector to filter just the C-alpha atoms from the all atom array
atmNameNdx = AtomKey.fields.atm
atomArrayIndex = myChain.internal_coord.atomArrayIndex
CaSelect = [
atomArrayIndex.get(k) for k in atomArrayIndex.keys() if k.akl[atmNameNdx] == "CA"
]
# now the ordered array of C-alpha atom coordinates is:
CA_coords = myChain.internal_coord.atomArray[CaSelect]
# note this uses Numpy fancy indexing, so CA_coords is a new copy
# create a C-alpha distance plot
caDistances = myChain.internal_coord.distance_plot(CaSelect)
# display with e.g. MatPlotLib:
# import matplotlib.pyplot as plt
# plt.imshow(caDistances, cmap="hot", interpolation="nearest")
# plt.show()
# build structure from distance plot:
## create the all-atom distance plot
distances = myChain.internal_coord.distance_plot()
## get the sign of the dihedral angles
chirality = myChain.internal_coord.dihedral_signs()
## get new, empty data structure : copy data structure from myChain
myChain2 = IC_duplicate(myChain)[0]["A"]
cic2 = myChain2.internal_coord
## clear the new atomArray and di/hedra value arrays, just for proof
cic2.atomArray = np.zeros((cic2.AAsiz, 4), dtype=np.float64)
cic2.dihedraAngle[:] = 0.0
cic2.hedraAngle[:] = 0.0
cic2.hedraL12[:] = 0.0
cic2.hedraL23[:] = 0.0
## copy just the first N-Ca-C coords so structures will superimpose:
cic2.copy_initNCaCs(myChain.internal_coord)
## copy distances to chain arrays:
cic2.distplot_to_dh_arrays(distances, chirality)
## compute angles and dihedral angles from distances:
cic2.distance_to_internal_coordinates()
## generate XYZ coordinates from internal coordinates:
myChain2.internal_to_atom_coordinates()
## confirm result atomArray matches original structure:
assert np.allclose(cic2.atomArray, myChain.internal_coord.atomArray)
# superimpose all phe-phe pairs - quick hack just to demonstrate concept
# for analyzing pairwise residue interactions. Generates PDB ATOM records
# placing each PHE at origin and showing all other PHEs in environment
## shorthand for key variables:
cic = myChain.internal_coord
resNameNdx = AtomKey.fields.resname
aaNdx = cic.atomArrayIndex
## select just PHE atoms:
pheAtomSelect = [aaNdx.get(k) for k in aaNdx.keys() if k.akl[resNameNdx] == "F"]
aaF = cic.atomArray[ pheAtomSelect ] # numpy fancy indexing makes COPY not view
for ric in cic.ordered_aa_ic_list: # internal_coords version of get_residues()
if ric.rbase[2] == "F": # if PHE, get transform matrices for chi1 dihedral
chi1 = ric.pick_angle("N:CA:CB:CG") # chi1 space has C-alpha at origin
cst = np.transpose(chi1.cst) # transform TO chi1 space
# rcst = np.transpose(chi1.rcst) # transform FROM chi1 space
cic.atomArray[pheAtomSelect] = aaF.dot(cst) # transform just the PHEs
for res in myChain.get_residues(): # print PHEs in new coordinate space
if res.resname in ["PHE"]:
print(res.internal_coord.pdb_residue_string())
cic.atomArray[pheAtomSelect] = aaF # restore coordinate space from copy
# write OpenSCAD program of spheres and cylinders to 3d print myChain backbone
## set atom load filter to accept backbone only:
IC_Residue.accept_atoms = IC_Residue.accept_backbone
## delete existing data to force re-read of all atoms:
myChain.internal_coord = None
write_SCAD(myChain, "myChain.scad", scale=10.0)
See the ‘’Internal coordinates module’’ section of the Biopython Tutorial and Cookbook for further discussion.
Terms and key data structures:
Internal coordinates are defined on sequences of atoms which span
residues or follow accepted nomenclature along sidechains. To manage these
sequences and support Biopython’s disorder mechanisms, AtomKey
specifiers are implemented to capture residue, atom and variant identification
in a single object. A Hedron
object is specified as three sequential
AtomKeys, comprising two bond lengths and the bond angle between them. A
Dihedron
consists of four sequential AtomKeys, linking two Hedra with
a dihedral angle between them.
Algorithmic overview:
The Internal Coordinates module combines a specification of connected atoms as
hedra and dihedra in the ic_data
file with routines here to transform
XYZ coordinates of these atom sets between a local coordinate system and the
world coordinates supplied in e.g. a PDB or mmCif data file. The local
coordinate system places the center atom of a hedron at the origin (0,0,0), one
leg on the +Z axis, and the other leg on the XZ plane (see Hedron
).
Measurement and creation or manipulation of hedra and dihedra in the local
coordinate space is straightforward, and the calculated transformation matrices
enable assembling these subunits into a protein chain starting from supplied
(PDB) coordinates for the initial N-Ca-C atoms.
Psi and Phi angles are defined on atoms from adjacent residues in a protein
chain, see e.g. pick_angle()
and ic_data
for the relevant
mapping between residues and backbone dihedral angles.
Transforms to and from the dihedron local coordinate space described above are
accessible via IC_Chain.dCoordSpace
and Dihedron
attributes
.cst and .rcst, and may be applied in the alignment and comparison of residues
and their environments with code along the lines of:
chi1 = ric0.pick_angle("chi1") # chi1 space defined with CA at origin
cst = np.transpose(chi1.cst) # transform TO chi1 local space
newAtomCoords = oldAtomCoords.dot(cst)
The core algorithms were developed independently during 1993-4 for ‘’Development and Application of a Three-dimensional Description of Amino Acid Environments in Protein,’’ Miller, Douthart, and Dunker, Advances in Molecular Bioinformatics, IOS Press, 1994, ISBN 90 5199 172 x, pp. 9-30.
A Protein Internal Coordinate (.pic) file format is defined to capture
sufficient detail to reproduce a PDB file from chain starting coordinates
(first residue N, Ca, C XYZ coordinates) and remaining internal coordinates.
These files are used internally to verify that a given structure can be
regenerated from its internal coordinates. See PICIO
for reading and
writing .pic files and structure_rebuild_test()
to determine if a
specific PDB or mmCif datafile has sufficient information to interconvert
between cartesian and internal coordinates.
Internal coordinates may also be exported as OpenSCAD
data arrays for generating 3D printed protein models. OpenSCAD software is
provided as a starting point and proof-of-concept for generating such models.
See SCADIO
and this Thingiverse project
for a more advanced example.
Refer to distance_plot()
and distance_to_internal_coordinates()
for converting structure data to/from 2D distance plots.
The following classes comprise the core functionality for processing internal coordinates and are sufficiently related and coupled to place them together in this module:
IC_Chain
: Extends Biopython Chain on .internal_coord attribute.Manages connected sequence of residues and chain breaks; holds numpy arrays for all atom coordinates and bond geometries. For ‘parallel’ processing IC_Chain methods operate on these arrays with single numpy commands.
IC_Residue
: Extends Biopython Residue on .internal_coord attribute.Access for per residue views on internal coordinates and methods for serial (residue by residue) assembly.
Dihedron
: four joined atoms forming a dihedral angle.Dihedral angle, homogeneous atom coordinates in local coordinate space, references to relevant Hedra and IC_Residue. Getter methods for residue dihedral angles, bond angles and bond lengths.
Hedron
: three joined atoms forming a plane.Contains homogeneous atom coordinates in local coordinate space as well as bond lengths and angle between them.
Edron
: base class for Hedron and Dihedron classes.Tuple of AtomKeys comprising child, string ID, mainchain membership boolean and other routines common for both Hedra and Dihedra. Implements rich comparison.
AtomKey
: keys (dictionary and string) for referencing atom sequences.Capture residue and disorder/occupancy information, provides a no-whitespace key for .pic files, and implements rich comparison.
Custom exception classes: HedronMatchError
and
MissingAtomError
- class Bio.PDB.internal_coords.IC_Chain(parent, verbose: bool = False)
Bases:
object
Class to extend Biopython Chain with internal coordinate data.
- Attributes:
- chain: object reference
The Biopython
Bio.PDB.Chain.Chain
object this extends- MaxPeptideBond: float
Class attribute to detect chain breaks. Override for fully contiguous chains with some very long bonds - e.g. for 3D printing (OpenSCAD output) a structure with missing residues.
MaxPeptideBond
- ParallelAssembleResidues: bool
Class attribute affecting internal_to_atom_coords. Short (50 residue and less) chains are faster to assemble without the overhead of creating numpy arrays, and the algorithm is easier to understand and trace processing a single residue at a time. Clearing (set to False) this flag will switch to the serial algorithm
- ordered_aa_ic_list: list
IC_Residue objects internal_coords algorithms can process (e.g. no waters)
- initNCaC: List of N, Ca, C AtomKey tuples (NCaCKeys).
NCaCKeys start chain segments (first residue or after chain break). These 3 atoms define the coordinate space for a contiguous chain segment, as initially specified by PDB or mmCIF file.
- AAsiz = int
AtomArray size, number of atoms in this chain
- atomArray: numpy array
homogeneous atom coords ([x,, y, z, 1.0]) for every atom in chain
- atomArrayIndex: dict
maps AtomKeys to atomArray indexes
- hedra: dict
Hedra forming residues in this chain; indexed by 3-tuples of AtomKeys.
- hedraLen: int
length of hedra dict
- hedraNdx: dict
maps hedra AtomKeys to numeric index into hedra data arrays e.g. hedraL12 below
- a2ha_map: [hedraLen x 3]
atom indexes in hedraNdx order
- dihedra: dict
Dihedra forming residues in this chain; indexed by 4-tuples of AtomKeys.
- dihedraLen: int
length of dihedra dict
- dihedraNdx: dict
maps dihedra AtomKeys to dihedra data arrays e.g. dihedraAngle
- a2da_map[dihedraLen x 4]
AtomNdx’s in dihedraNdx order
- d2a_map[dihedraLen x [4]]
AtomNdx’s for each dihedron (reshaped a2da_map)
- Numpy arrays for vector processing of chain di/hedra:
- hedraL12: numpy array
bond length between hedron 1st and 2nd atom
- hedraAngle: numpy array
bond angle for each hedron, in degrees
- hedraL23: numpy array
bond length between hedron 2nd and 3rd atom
- id3_dh_index: dict
maps hedron key to list of dihedra starting with hedron, used by assemble and bond_rotate to find dihedra with h1 key
- id32_dh_index: dict
like id3_dh_index, find dihedra from h2 key
- hAtoms: numpy array
homogeneous atom coordinates (3x4) of hedra, central atom at origin
- hAtomsR: numpy array
hAtoms in reverse orientation
- hAtoms_needs_update: numpy array of bool
indicates whether hAtoms represent hedraL12/A/L23
- dihedraAngle: numpy array
dihedral angles (degrees) for each dihedron
- dAtoms: numpy array
homogeneous atom coordinates (4x4) of dihedra, second atom at origin
- dAtoms_needs_update: numpy array of bool
indicates whether dAtoms represent dihedraAngle
- dCoordSpace: numpy array
forward and reverse transform matrices standardising positions of first hedron. See
dCoordSpace
.- dcsValid: bool
indicates dCoordSpace up to date
- See also attributes generated by :meth:`build_edraArrays` for indexing
- di/hedra data elements.
Methods
internal_to_atom_coordinates:
Process ic data to Residue/Atom coordinates; calls assemble_residues()
assemble_residues:
Generate IC_Chain atom coords from internal coordinates (parallel)
assemble_residues_ser:
Generate IC_Residue atom coords from internal coordinates (serial)
atom_to_internal_coordinates:
Calculate dihedrals, angles, bond lengths (internal coordinates) for Atom data
write_SCAD:
Write OpenSCAD matrices for internal coordinate data comprising chain; this is a support routine, see
SCADIO.write_SCAD()
to generate OpenSCAD description of a protein chain.distance_plot:
Generate 2D plot of interatomic distances with optional filter
distance_to_internal_coordinates:
Compute internal coordinates from distance plot and array of dihedral angle signs.
make_extended:
Arbitrarily sets all psi and phi backbone angles to 123 and -104 degrees.
- MaxPeptideBond = 1.4
Larger C-N distance than this will be chain break
- ParallelAssembleResidues = True
Enable parallel internal_to_atom algorithm, is slower for short chains
- AAsiz = 0
Number of atoms in this chain (size of atomArray)
- atomArray: array = None
AAsiz x [4] of float np.float64 homogeneous atom coordinates, all atoms in chain.
- dCoordSpace = None
[2][dihedraLen][4][4] : 2 arrays of 4x4 coordinate space transforms for each dihedron. The first [0] converts TO standard space with first atom on the XZ plane, the second atom at the origin, the third on the +Z axis, and the fourth placed according to the dihedral angle. The second [1] transform returns FROM the standard space to world coordinates (PDB file input or whatever is current). Also accessible as .cst (forward transform) and .rcst (reverse transform) in
Dihedron
.
- dcsValid = None
True if dCoordSpace is up to date. Use
update_dCoordSpace()
if needed.
- __init__(parent, verbose: bool = False) None
Initialize IC_Chain object, with or without residue/Atom data.
- Parameters:
parent (Bio.PDB.Chain) – Biopython Chain object Chain object this extends
- clear_ic()
Clear residue internal_coord settings for this chain.
- build_atomArray() None
Build
IC_Chain
numpy coordinate array from biopython atoms.See also
init_edra()
for more complete initialization of IC_Chain.- Inputs:
- self.aksetset
AtomKey
s in this chain
- Generates:
- self.AAsizint
number of atoms in chain (len(akset))
- self.aktupleAAsiz x AtomKeys
sorted akset AtomKeys
- self.atomArrayIndex[AAsiz] of int
numerical index for each AtomKey in aktuple
- self.atomArrayValidAAsiz x bool
atomArray coordinates current with internal coordinates if True
- self.atomArrayAAsiz x np.float64[4]
homogeneous atom coordinates; Biopython
Atom
coordinates are view into this array after execution- rak_cachedict
lookup cache for AtomKeys for each residue
- build_edraArrays() None
Build chain level hedra and dihedra arrays.
Used by
init_edra()
and_hedraDict2chain()
. Should be private method but exposed for documentation.- Inputs:
- self.dihedraLenint
number of dihedra needed
- self.hedraLenint
number of hedra needed
- self.AAsizint
length of atomArray
- self.hedraNdxdict
maps hedron keys to range(hedraLen)
- self.dihedraNdxdict
maps dihedron keys to range(dihedraLen)
- self.hedradict
maps Hedra keys to Hedra for chain
- self.atomArrayAAsiz x np.float64[4]
homogeneous atom coordinates for chain
- self.atomArrayIndexdict
maps AtomKeys to atomArray
- self.atomArrayValidAAsiz x bool
indicates coord is up-to-date
- Generates:
- self.dCoordSpace[2][dihedraLen][4][4]
transforms to/from dihedron coordinate space
- self.dcsValiddihedraLen x bool
indicates dCoordSpace is current
- self.hAtomshedraLen x 3 x np.float64[4]
atom coordinates in hCoordSpace
- self.hAtomsRhedraLen x 3 x np.float64[4]
hAtoms in reverse order (trading space for time)
- self.hAtoms_needs_updatehedraLen x bool
indicates hAtoms, hAtoms current
- self.a2h_mapAAsiz x [int …]
maps atomArrayIndex to hedraNdx’s with that atom
- self.a2ha_map[hedraLen x 3]
AtomNdx’s in hedraNdx order
- self.h2aahedraLen x [int …]
maps hedraNdx to atomNdx’s in hedron (reshaped later)
- Hedron.ndxint
self.hedraNdx value stored inside Hedron object
- self.dRevdihedraLen x bool
dihedron reversed if true
- self.dH1ndx, dH2ndx[dihedraLen]
hedraNdx’s for 1st and 2nd hedra
- self.h1d_maphedraLen x []
hedraNdx -> [dihedra using hedron]
- Dihedron.h1key, h2key[AtomKey …]
hedron keys for dihedron, reversed as needed
- Dihedron.hedron1, hedron2Hedron
references inside dihedron to hedra
- Dihedron.ndxint
self.dihedraNdx info inside Dihedron object
- Dihedron.cst, rcstnp.float64p4][4]
dCoordSpace references inside Dihedron
- self.a2da_map[dihedraLen x 4]
AtomNdx’s in dihedraNdx order
- self.d2a_map[dihedraLen x [4]]
AtomNdx’s for each dihedron (reshaped a2da_map)
- self.dFwdbool
dihedron is not Reversed if True
- self.a2d_mapAAsiz x [[dihedraNdx]
[atom ndx 0-3 of atom in dihedron]], maps atom indexes to dihedra and atoms in them
- self.dAtoms_needs_updatedihedraLen x bool
atoms in h1, h2 are current if False
- assemble_residues(verbose: bool = False) None
Generate atom coords from internal coords (vectorised).
This is the ‘Numpy parallel’ version of
assemble_residues_ser()
.Starting with dihedra already formed by
init_atom_coords()
, transform each from dihedron local coordinate space into protein chain coordinate space. Iterate until all dependencies satisfied.Does not update
dCoordSpace
asassemble_residues_ser()
does. Callupdate_dCoordSpace()
if needed. Faster to do in single operation once all atom coordinates finished.- Parameters:
verbose (bool) – default False. Report number of iterations to compute changed dihedra
- generates:
- self.dSet: AAsiz x dihedraLen x 4
maps atoms in dihedra to atomArray
- self.dSetValid[dihedraLen][4] of bool
map of valid atoms into dihedra to detect 3 or 4 atoms valid
Output coordinates written to
atomArray
. BiopythonBio.PDB.Atom
coordinates are a view on this data.
- assemble_residues_ser(verbose: bool = False, start: int | None = None, fin: int | None = None) None
Generate IC_Residue atom coords from internal coordinates (serial).
See
assemble_residues()
for ‘numpy parallel’ version.Filter positions between start and fin if set, find appropriate start coordinates for each residue and pass to
assemble()
- Parameters:
verbose (bool) – default False. Describe runtime problems
start,fin (int) – default None. Sequence position for begin, end of subregion to generate coords for.
- init_edra(verbose: bool = False) None
Create chain and residue di/hedra structures, arrays, atomArray.
- Inputs:
self.ordered_aa_ic_list : list of IC_Residue
- Generates:
edra objects, self.di/hedra (executes
_create_edra()
)atomArray and support (executes
build_atomArray()
)self.hedraLen : number of hedra in structure
hedraL12 : numpy arrays for lengths, angles (empty)
hedraAngle ..
hedraL23 ..
self.hedraNdx : dict mapping hedrakeys to hedraL12 etc
self.dihedraLen : number of dihedra in structure
dihedraAngle ..
dihedraAngleRads : np arrays for angles (empty)
self.dihedraNdx : dict mapping dihedrakeys to dihedraAngle
- init_atom_coords() None
Set chain level di/hedra initial coords from angles and distances.
Initializes atom coordinates in local coordinate space for hedra and dihedra, will be transformed appropriately later by
dCoordSpace
matrices for assembly.
- update_dCoordSpace(workSelector: ndarray | None = None) None
Compute/update coordinate space transforms for chain dihedra.
Requires all atoms updated so calls
assemble_residues()
(returns immediately if all atoms already assembled).- Parameters:
workSelector ([bool]) – Optional mask to select dihedra for update
- propagate_changes() None
Track through di/hedra to invalidate dependent atoms.
- internal_to_atom_coordinates(verbose: bool = False, start: int | None = None, fin: int | None = None) None
Process IC data to Residue/Atom coords.
- Parameters:
verbose (bool) – default False. Describe runtime problems
start,fin (int) – Optional sequence positions for begin, end of subregion to process.
Note
Setting start or fin activates serial
assemble_residues_ser()
instead of (Numpy parallel)assemble_residues()
. Start C-alpha will be at origin.See also
- atom_to_internal_coordinates(verbose: bool = False) None
Calculate dihedrals, angles, bond lengths for Atom data.
Generates atomArray (through init_edra), value arrays for hedra and dihedra, and coordinate space transforms for dihedra.
Generates Gly C-beta if specified, see
IC_Residue.gly_Cbeta
- Parameters:
verbose (bool) – default False. describe runtime problems
- distance_plot(filter: ndarray | None = None) ndarray
Generate 2D distance plot from atomArray.
Default is to calculate distances for all atoms. To generate the classic C-alpha distance plot, pass a boolean mask array like:
atmNameNdx = internal_coords.AtomKey.fields.atm CaSelect = [ atomArrayIndex.get(k) for k in atomArrayIndex.keys() if k.akl[atmNameNdx] == "CA" ] plot = cic.distance_plot(CaSelect)
Alternatively, this will select all backbone atoms:
backboneSelect = [ atomArrayIndex.get(k) for k in atomArrayIndex.keys() if k.is_backbone() ]
- Parameters:
filter ([bool]) – restrict atoms for calculation
See also
distance_to_internal_coordinates()
, which requires the default all atom distance plot.
- dihedral_signs() ndarray
Get sign array (+1/-1) for each element of chain dihedraAngle array.
Required for
distance_to_internal_coordinates()
- distplot_to_dh_arrays(distplot: ndarray, dihedra_signs: ndarray) None
Load di/hedra distance arrays from distplot.
Fill
IC_Chain
arrays hedraL12, L23, L13 and dihedraL14 distance value arrays from input distplot, dihedra_signs array from input dihedra_signs. Distplot and di/hedra distance arrays must index according to AtomKey mappings inIC_Chain
.hedraNdx and .dihedraNdx (created inIC_Chain.init_edra()
)Call
atom_to_internal_coordinates()
(or at leastinit_edra()
) to generate a2ha_map and d2a_map before running this.Explicitly removed from
distance_to_internal_coordinates()
so user may populate these chain di/hedra arrays by other methods.
- distance_to_internal_coordinates(resetAtoms: bool | None = True) None
Compute chain di/hedra from from distance and chirality data.
Distance properties on hedra L12, L23, L13 and dihedra L14 configured by
distplot_to_dh_arrays()
or alternative loader.dihedraAngles result is multiplied by dihedra_signs at final step recover chirality information lost in distance plot (mirror image of structure has same distances but opposite sign dihedral angles).
Note that chain breaks will cause errors in rebuilt structure, use
copy_initNCaCs()
to avoid thisBased on Blue, the Hedronometer’s answer to The dihedral angles of a tetrahedron in terms of its edge lengths on math.stackexchange.com. See also: “Heron-like Hedronometric Results for Tetrahedral Volume”.
Other values from that analysis included here as comments for completeness:
oa = hedron1 L12 if reverse else hedron1 L23
ob = hedron1 L23 if reverse else hedron1 L12
ac = hedron2 L12 if reverse else hedron2 L23
ab = hedron1 L13 = law of cosines on OA, OB (hedron1 L12, L23)
oc = hedron2 L13 = law of cosines on OA, AC (hedron2 L12, L23)
bc = dihedron L14
target is OA, the dihedral angle along edge oa.
- Parameters:
resetAtoms (bool) – default True. Mark all atoms in di/hedra and atomArray for updating by
internal_to_atom_coordinates()
. Alternatively set this to False and manipulate atomArrayValid, dAtoms_needs_update and hAtoms_needs_update directly to reduce computation.
- copy_initNCaCs(other: IC_Chain) None
Copy atom coordinates for initNCaC atoms from other IC_Chain.
Copies the coordinates and sets atomArrayValid flags True for initial NCaC and after any chain breaks.
Needed for
distance_to_internal_coordinates()
if target has chain breaks (otherwise each fragment will start at origin).Also useful if copying internal coordinates from another chain.
N.B.
IC_Residue.set_angle()
andIC_Residue.set_length()
invalidate their relevant atoms, so apply them before calling this function.
- make_extended()
Set all psi and phi angles to extended conformation (123, -104).
- __annotations__ = {'atomArray': <built-in function array>, 'atomArrayIndex': "dict['AtomKey', int]", 'bpAtomArray': "list['Atom']", 'ordered_aa_ic_list': 'list[IC_Residue]'}
- class Bio.PDB.internal_coords.IC_Residue(parent: Residue)
Bases:
object
Class to extend Biopython Residue with internal coordinate data.
- Parameters:
- parent: biopython Residue object this class extends
- Attributes:
- no_altloc: bool default False
Class variable, disable processing of ALTLOC atoms if True, use only selected atoms.
- accept_atoms: tuple
Class variable
accept_atoms
, list of PDB atom names to use when generating internal coordinates. Default is:accept_atoms = accept_mainchain + accept_hydrogens
to exclude hydrogens in internal coordinates and generated PDB files, override as:
IC_Residue.accept_atoms = IC_Residue.accept_mainchain
to get only mainchain atoms plus amide proton, use:
IC_Residue.accept_atoms = IC_Residue.accept_mainchain + ('H',)
to convert D atoms to H, set
AtomKey.d2h
= True and use:IC_Residue.accept_atoms = ( accept_mainchain + accept_hydrogens + accept_deuteriums )
Note that accept_mainchain = accept_backbone + accept_sidechain. Thus to generate sequence-agnostic conformational data for e.g. structure alignment in dihedral angle space, use:
IC_Residue.accept_atoms = accept_backbone
or set gly_Cbeta = True and use:
IC_Residue.accept_atoms = accept_backbone + ('CB',)
Changing accept_atoms will cause the default structure_rebuild_test in
ic_rebuild
to fail if some atoms are filtered (obviously). Use the quick=True option to test only the coordinates of filtered atoms to avoid this.There is currently no option to output internal coordinates with D instead of H.
- accept_resnames: tuple
Class variable
accept_resnames
, list of 3-letter residue names for HETATMs to accept when generating internal coordinates from atoms. HETATM sidechain will be ignored, but normal backbone atoms (N, CA, C, O, CB) will be included. Currently only CYG, YCM and UNK; override at your own risk. To generate sidechain, add appropriate entries to ic_data_sidechains inic_data
and support inIC_Chain.atom_to_internal_coordinates()
.- gly_Cbeta: bool default False
Class variable
gly_Cbeta
, override to True to generate internal coordinates for glycine CB atoms inIC_Chain.atom_to_internal_coordinates()
IC_Residue.gly_Cbeta = True
- pic_accuracy: str default “17.13f”
Class variable
pic_accuracy
sets accuracy for numeric values (angles, lengths) in .pic files. Default set high to support mmCIF file accuracy in rebuild tests. If you find rebuild tests fail with ‘ERROR -COORDINATES-’ and verbose=True shows only small discrepancies, try raising this value (or lower it to 9.5 if only working with PDB format files).IC_Residue.pic_accuracy = "9.5f"
- residue: Biopython Residue object reference
The
Residue
object this extends- hedra: dict indexed by 3-tuples of AtomKeys
Hedra forming this residue
- dihedra: dict indexed by 4-tuples of AtomKeys
Dihedra forming (overlapping) this residue
- rprev, rnext: lists of IC_Residue objects
References to adjacent (bonded, not missing, possibly disordered) residues in chain
- atom_coords: AtomKey indexed dict of numpy [4] arrays
removed Use AtomKeys and atomArrayIndex to build if needed
- ak_set: set of AtomKeys in dihedra
AtomKeys in all dihedra overlapping this residue (see __contains__())
- alt_ids: list of char
AltLoc IDs from PDB file
- bfactors: dict
AtomKey indexed B-factors as read from PDB file
- NCaCKey: List of tuples of AtomKeys
List of tuples of N, Ca, C backbone atom AtomKeys; usually only 1 but more if backbone altlocs.
- is20AA: bool
True if residue is one of 20 standard amino acids, based on Residue resname
- isAccept: bool
True if is20AA or in accept_resnames below
- rbase: tuple
residue position, insert code or none, resname (1 letter if standard amino acid)
- cic: IC_Chain default None
parent chain
IC_Chain
object- scale: optional float
used for OpenSCAD output to generate gly_Cbeta bond length
Methods
assemble(atomCoordsIn, resetLocation, verbose)
Compute atom coordinates for this residue from internal coordinates
get_angle()
Return angle for passed key
get_length()
Return bond length for specified pair
pick_angle()
Find Hedron or Dihedron for passed key
pick_length()
Find hedra for passed AtomKey pair
set_angle()
Set angle for passed key (no position updates)
set_length()
Set bond length in all relevant hedra for specified pair
bond_rotate(delta)
adjusts related dihedra angles by delta, e.g. rotating psi (N-Ca-C-N) will adjust the adjacent N-Ca-C-O by the same amount to avoid clashes
bond_set(angle)
uses bond_rotate to set specified dihedral to angle and adjust related dihedra accordingly
rak(atom info)
cached AtomKeys for this residue
- accept_resnames = ('CYG', 'YCM', 'UNK')
Add 3-letter residue name here for non-standard residues with normal backbone. CYG included for test case 4LGY (1305 residue contiguous chain). Safe to add more names for N-CA-C-O backbones, any more complexity will need additions to
accept_atoms
, ic_data_sidechains inic_data
and support inIC_Chain.atom_to_internal_coordinates()
- no_altloc: bool = False
Set True to filter altloc atoms on input and only work with Biopython default Atoms
- gly_Cbeta: bool = False
Create beta carbons on all Gly residues.
Setting this to True will generate internal coordinates for Gly C-beta carbons in
atom_to_internal_coordinates()
.Data averaged from Sep 2019 Dunbrack cullpdb_pc20_res2.2_R1.0 restricted to structures with amide protons. Please see
PISCES: A Protein Sequence Culling Server
‘G. Wang and R. L. Dunbrack, Jr. PISCES: a protein sequence culling server. Bioinformatics, 19:1589-1591, 2003.’
Ala avg rotation of OCCACB from NCACO query:
select avg(g.rslt) as avg_rslt, stddev(g.rslt) as sd_rslt, count(*) from (select f.d1d, f.d2d, (case when f.rslt > 0 then f.rslt-360.0 else f.rslt end) as rslt from (select d1.angle as d1d, d2.angle as d2d, (d2.angle - d1.angle) as rslt from dihedron d1, dihedron d2 where d1.re_class='AOACACAACB' and d2.re_class='ANACAACAO' and d1.pdb=d2.pdb and d1.chn=d2.chn and d1.res=d2.res) as f) as g
results:
| avg_rslt | sd_rslt | count | | -122.682194862932 | 5.04403040513919 | 14098 |
- pic_accuracy: str = '17.13f'
- accept_backbone = ('N', 'CA', 'C', 'O', 'OXT')
- accept_sidechain = ('CB', 'CG', 'CG1', 'OG1', 'OG', 'SG', 'CG2', 'CD', 'CD1', 'SD', 'OD1', 'ND1', 'CD2', 'ND2', 'CE', 'CE1', 'NE', 'OE1', 'NE1', 'CE2', 'OE2', 'NE2', 'CE3', 'CZ', 'NZ', 'CZ2', 'CZ3', 'OD2', 'OH', 'CH2', 'NH1', 'NH2')
- accept_mainchain = ('N', 'CA', 'C', 'O', 'OXT', 'CB', 'CG', 'CG1', 'OG1', 'OG', 'SG', 'CG2', 'CD', 'CD1', 'SD', 'OD1', 'ND1', 'CD2', 'ND2', 'CE', 'CE1', 'NE', 'OE1', 'NE1', 'CE2', 'OE2', 'NE2', 'CE3', 'CZ', 'NZ', 'CZ2', 'CZ3', 'OD2', 'OH', 'CH2', 'NH1', 'NH2')
- accept_hydrogens = ('H', 'H1', 'H2', 'H3', 'HA', 'HA2', 'HA3', 'HB', 'HB1', 'HB2', 'HB3', 'HG2', 'HG3', 'HD2', 'HD3', 'HE2', 'HE3', 'HZ1', 'HZ2', 'HZ3', 'HG11', 'HG12', 'HG13', 'HG21', 'HG22', 'HG23', 'HZ', 'HD1', 'HE1', 'HD11', 'HD12', 'HD13', 'HG', 'HG1', 'HD21', 'HD22', 'HD23', 'NH1', 'NH2', 'HE', 'HH11', 'HH12', 'HH21', 'HH22', 'HE21', 'HE22', 'HE2', 'HH', 'HH2')
- accept_deuteriums = ('D', 'D1', 'D2', 'D3', 'DA', 'DA2', 'DA3', 'DB', 'DB1', 'DB2', 'DB3', 'DG2', 'DG3', 'DD2', 'DD3', 'DE2', 'DE3', 'DZ1', 'DZ2', 'DZ3', 'DG11', 'DG12', 'DG13', 'DG21', 'DG22', 'DG23', 'DZ', 'DD1', 'DE1', 'DD11', 'DD12', 'DD13', 'DG', 'DG1', 'DD21', 'DD22', 'DD23', 'ND1', 'ND2', 'DE', 'DH11', 'DH12', 'DH21', 'DH22', 'DE21', 'DE22', 'DE2', 'DH', 'DH2')
- accept_atoms = ('N', 'CA', 'C', 'O', 'OXT', 'CB', 'CG', 'CG1', 'OG1', 'OG', 'SG', 'CG2', 'CD', 'CD1', 'SD', 'OD1', 'ND1', 'CD2', 'ND2', 'CE', 'CE1', 'NE', 'OE1', 'NE1', 'CE2', 'OE2', 'NE2', 'CE3', 'CZ', 'NZ', 'CZ2', 'CZ3', 'OD2', 'OH', 'CH2', 'NH1', 'NH2', 'H', 'H1', 'H2', 'H3', 'HA', 'HA2', 'HA3', 'HB', 'HB1', 'HB2', 'HB3', 'HG2', 'HG3', 'HD2', 'HD3', 'HE2', 'HE3', 'HZ1', 'HZ2', 'HZ3', 'HG11', 'HG12', 'HG13', 'HG21', 'HG22', 'HG23', 'HZ', 'HD1', 'HE1', 'HD11', 'HD12', 'HD13', 'HG', 'HG1', 'HD21', 'HD22', 'HD23', 'NH1', 'NH2', 'HE', 'HH11', 'HH12', 'HH21', 'HH22', 'HE21', 'HE22', 'HE2', 'HH', 'HH2')
Change accept_atoms to restrict atoms processed. See
IC_Residue
for usage.
- __init__(parent: Residue) None
Initialize IC_Residue with parent Biopython Residue.
- Parameters:
parent (Residue) – Biopython Residue object. The Biopython Residue this object extends
- __deepcopy__(memo)
Deep copy implementation for IC_Residue.
- __repr__() str
Print string is parent Residue ID.
- pretty_str() str
Nice string for residue ID.
- set_flexible() None
For OpenSCAD, mark N-CA and CA-C bonds to be flexible joints.
- set_hbond() None
For OpenSCAD, mark H-N and C-O bonds to be hbonds (magnets).
- clear_transforms()
Invalidate dihedra coordinate space attributes before assemble().
Coordinate space attributes are Dihedron.cst and .rcst, and
IC_Chain.dCoordSpace
- assemble(resetLocation: bool = False, verbose: bool = False) dict[AtomKey, array] | dict[tuple[AtomKey, AtomKey, AtomKey], array] | None
Compute atom coordinates for this residue from internal coordinates.
This is the IC_Residue part of the
assemble_residues_ser()
serial version, seeassemble_residues()
for numpy vectorized approach which works at theIC_Chain
level.Join prepared dihedra starting from N-CA-C and N-CA-CB hedrons, computing protein space coordinates for backbone and sidechain atoms
Sets forward and reverse transforms on each Dihedron to convert from protein coordinates to dihedron space coordinates for first three atoms (see
IC_Chain.dCoordSpace
)Call
init_atom_coords()
to update any modified di/hedra before coming here, this only assembles dihedra into protein coordinate space.Algorithm
Form double-ended queue, start with c-ca-n, o-c-ca, n-ca-cb, n-ca-c.
if resetLocation=True, use initial coords from generating dihedron for n-ca-c initial positions (result in dihedron coordinate space)
- while queue not empty
get 3-atom hedron key
for each dihedron starting with hedron key (1st hedron of dihedron)
- if have coordinates for all 4 atoms already
add 2nd hedron key to back of queue
- else if have coordinates for 1st 3 atoms
compute forward and reverse transforms to take 1st 3 atoms to/from dihedron initial coordinate space
use reverse transform to get position of 4th atom in current coordinates from dihedron initial coordinates
add 2nd hedron key to back of queue
- else
ordering failed, put hedron key at back of queue and hope next time we have 1st 3 atom positions (should not happen)
loop terminates (queue drains) as hedron keys which do not start any dihedra are removed without action
- Parameters:
resetLocation (bool) – default False. - Option to ignore start location and orient so initial N-Ca-C hedron at origin.
- Returns:
Dict of AtomKey -> homogeneous atom coords for residue in protein space relative to previous residue
Also directly updates
IC_Chain.atomArray
asassemble_residues()
does.
- split_akl(lst: tuple[AtomKey, ...] | list[AtomKey], missingOK: bool = False) list[tuple[AtomKey, ...]]
Get AtomKeys for this residue (ak_set) for generic list of AtomKeys.
Changes and/or expands a list of ‘generic’ AtomKeys (e.g. ‘N, C, C’) to be specific to this Residue’s altlocs etc., e.g. ‘(N-Ca_A_0.3-C, N-Ca_B_0.7-C)’
- Given a list of AtomKeys for a Hedron or Dihedron,
- return:
list of matching atomkeys that have id3_dh in this residue (ak may change if occupancy != 1.00)
- or
multiple lists of matching atomkeys expanded for all atom altlocs
- or
empty list if any of atom_coord(ak) missing and not missingOK
- Parameters:
lst (list) – list[3] or [4] of AtomKeys. Non-altloc AtomKeys to match to specific AtomKeys for this residue
missingOK (bool) – default False, see above.
- atom_sernum = None
- atom_chain = None
- pdb_residue_string() str
Generate PDB ATOM records for this residue as string.
Convenience method for functionality not exposed in PDBIO.py. Increments
IC_Residue.atom_sernum
if not None- Parameters:
IC_Residue.atom_sernum – Class variable default None. Override and increment atom serial number if not None
IC_Residue.atom_chain – Class variable. Override atom chain id if not None
Todo
move to PDBIO
- pic_flags = (1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 496, 525, 1021, 1024, 2048, 4096, 7168, 8192, 16384)
Used by
PICIO.write_PIC()
to control classes of values to be defaulted.
- picFlagsDefault = 31744
Default is all dihedra + initial tau atoms + bFactors.
- picFlagsDict = {'all': 7168, 'bFactors': 16384, 'chi': 496, 'chi1': 16, 'chi2': 32, 'chi3': 64, 'chi4': 128, 'chi5': 256, 'classic': 1021, 'classic_b': 525, 'hedra': 1024, 'initAtoms': 8192, 'omg': 2, 'phi': 4, 'pomg': 512, 'primary': 2048, 'psi': 1, 'secondary': 4096, 'tau': 8}
Dictionary of pic_flags values to use as needed.
- pick_angle(angle_key: tuple[AtomKey, AtomKey, AtomKey] | tuple[AtomKey, AtomKey, AtomKey, AtomKey] | str) Hedron | Dihedron | None
Get Hedron or Dihedron for angle_key.
- Parameters:
angle_key –
tuple of 3 or 4 AtomKeys
string of atom names (‘CA’) separated by :’s
string of [-1, 0, 1]<atom name> separated by ‘:’s. -1 is previous residue, 0 is this residue, 1 is next residue
psi, phi, omg, omega, chi1, chi2, chi3, chi4, chi5
tau (N-CA-C angle) see Richardson1981
tuples of AtomKeys is only access for alternate disordered atoms
Observe that a residue’s phi and omega dihedrals, as well as the hedra comprising them (including the N:Ca:C tau hedron), are stored in the n-1 di/hedra sets; this overlap is handled here, but may be an issue if accessing directly.
The following print commands are equivalent (except for sidechains with non-carbon atoms for chi2):
ric = r.internal_coord print( r, ric.get_angle("psi"), ric.get_angle("phi"), ric.get_angle("omg"), ric.get_angle("tau"), ric.get_angle("chi2"), ) print( r, ric.get_angle("N:CA:C:1N"), ric.get_angle("-1C:N:CA:C"), ric.get_angle("-1CA:-1C:N:CA"), ric.get_angle("N:CA:C"), ric.get_angle("CA:CB:CG:CD"), )
See ic_data.py for detail of atoms in the enumerated sidechain angles and the backbone angles which do not span the peptide bond. Using ‘s’ for current residue (‘self’) and ‘n’ for next residue, the spanning (overlapping) angles are:
(sN, sCA, sC, nN) # psi (sCA, sC, nN, nCA) # omega i+1 (sC, nN, nCA, nC) # phi i+1 (sCA, sC, nN) (sC, nN, nCA) (nN, nCA, nC) # tau i+1
- Returns:
Matching Hedron, Dihedron, or None.
- get_angle(angle_key: tuple[AtomKey, AtomKey, AtomKey] | tuple[AtomKey, AtomKey, AtomKey, AtomKey] | str) float | None
Get dihedron or hedron angle for specified key.
See
pick_angle()
for key specifications.
- set_angle(angle_key: tuple[AtomKey, AtomKey, AtomKey] | tuple[AtomKey, AtomKey, AtomKey, AtomKey] | str, v: float, overlap=True)
Set dihedron or hedron angle for specified key.
If angle is a Dihedron and overlap is True (default), overlapping dihedra are also changed as appropriate. The overlap is a result of protein chain definitions in
ic_data
and_create_edra()
(e.g. psi overlaps N-CA-C-O).The default overlap=True is probably what you want for: set_angle(“chi1”, val)
The default is probably NOT what you want when processing all dihedrals in a chain or residue (such as copying from another structure), as the overlapping dihedra will likely be in the set as well.
N.B. setting e.g. PRO chi2 is permitted without error or warning!
See
pick_angle()
for angle_key specifications. Seebond_rotate()
to change a dihedral by a number of degrees- Parameters:
angle_key – angle identifier.
v (float) – new angle in degrees (result adjusted to +/-180).
overlap (bool) – default True. Modify overlapping dihedra as needed
- bond_rotate(angle_key: tuple[AtomKey, AtomKey, AtomKey] | tuple[AtomKey, AtomKey, AtomKey, AtomKey] | str, delta: float)
Rotate set of overlapping dihedrals by delta degrees.
Changes a dihedral angle by a given delta, i.e. new_angle = current_angle + delta Values are adjusted so new_angle will be within +/-180.
Changes overlapping dihedra as in
set_angle()
See
pick_angle()
for key specifications.
- bond_set(angle_key: tuple[AtomKey, AtomKey, AtomKey] | tuple[AtomKey, AtomKey, AtomKey, AtomKey] | str, val: float)
Set dihedron to val, update overlapping dihedra by same amount.
Redundant to
set_angle()
, retained for compatibility. Unlikeset_angle()
this is for dihedra only and no option to not update overlapping dihedra.See
pick_angle()
for key specifications.
- pick_length(ak_spec: str | tuple[AtomKey, AtomKey]) tuple[list[Hedron] | None, tuple[AtomKey, AtomKey] | None]
Get list of hedra containing specified atom pair.
- Parameters:
ak_spec –
tuple of two AtomKeys
string: two atom names separated by ‘:’, e.g. ‘N:CA’ with optional position specifier relative to self, e.g. ‘-1C:N’ for preceding peptide bond. Position specifiers are -1, 0, 1.
The following are equivalent:
ric = r.internal_coord print( r, ric.get_length("0C:1N"), ) print( r, None if not ric.rnext else ric.get_length((ric.rak("C"), ric.rnext[0].rak("N"))), )
If atom not found on current residue then will look on rprev[0] to handle cases like Gly N:CA. For finer control please access IC_Chain.hedra directly.
- Returns:
list of hedra containing specified atom pair as tuples of AtomKeys
- get_length(ak_spec: str | tuple[AtomKey, AtomKey]) float | None
Get bond length for specified atom pair.
See
pick_length()
for ak_spec and details.
- set_length(ak_spec: str | tuple[AtomKey, AtomKey], val: float) None
Set bond length for specified atom pair.
See
pick_length()
for ak_spec.
- applyMtx(mtx: array) None
Apply matrix to atom_coords for this IC_Residue.
- __annotations__ = {'_AllBonds': <class 'bool'>, 'ak_set': 'set[AtomKey]', 'akc': 'dict[Union[str, Atom], AtomKey]', 'alt_ids': 'Union[list[str], None]', 'bfactors': 'dict[str, float]', 'cic': 'IC_Chain', 'dihedra': 'dict[DKT, Dihedron]', 'gly_Cbeta': <class 'bool'>, 'hedra': 'dict[HKT, Hedron]', 'no_altloc': <class 'bool'>, 'pic_accuracy': <class 'str'>, 'rnext': 'list[IC_Residue]', 'rprev': 'list[IC_Residue]'}
- class Bio.PDB.internal_coords.Edron(*args: list[AtomKey] | tuple[AtomKey, AtomKey, AtomKey] | tuple[AtomKey, AtomKey, AtomKey, AtomKey], **kwargs: str)
Bases:
object
Base class for Hedron and Dihedron classes.
Supports rich comparison based on lists of AtomKeys.
- Attributes:
- atomkeys: tuple
3 (hedron) or 4 (dihedron)
AtomKey
s defining this di/hedron- id: str
‘:’-joined string of AtomKeys for this di/hedron
- needs_update: bool
indicates di/hedron local atom_coords do NOT reflect current di/hedron angle and length values in hedron local coordinate space
- e_class: str
sequence of atoms (no position or residue) comprising di/hedron for statistics
- re_class: str
sequence of residue, atoms comprising di/hedron for statistics
- cre_class: str
sequence of covalent radii classes comprising di/hedron for statistics
- edron_re: compiled regex (Class Attribute)
A compiled regular expression matching string IDs for Hedron and Dihedron objects
- cic: IC_Chain reference
Chain internal coords object containing this hedron
- ndx: int
index into IC_Chain level numpy data arrays for di/hedra. Set in
IC_Chain.init_edra()
- rc: int
number of residues involved in this edron
Methods
gen_key([AtomKey, …] or AtomKey, …) (Static Method)
generate a ‘:’-joined string of AtomKey Ids
is_backbone()
Return True if all atomkeys atoms are N, Ca, C or O
- edron_re = re.compile('^(?P<pdbid>\\w+)?\\s(?P<chn>[\\w|\\s])?\\s(?P<a1>[\\w\\-\\.]+):(?P<a2>[\\w\\-\\.]+):(?P<a3>[\\w\\-\\.]+)(:(?P<a4>[\\w\\-\\.]+))?\\s+(((?P<len12>\\S+)\\s+(?P<angle>\\S+)\\s+(?P<len23>\\S+)\\s*$)|((?P<)
A compiled regular expression matching string IDs for Hedron and Dihedron objects
- static gen_key(lst: list[AtomKey]) str
Generate string of ‘:’-joined AtomKey strings from input.
Generate ‘2_A_C:3_P_N:3_P_CA’ from (2_A_C, 3_P_N, 3_P_CA) :param list lst: list of AtomKey objects
- static gen_tuple(akstr: str) tuple
Generate AtomKey tuple for ‘:’-joined AtomKey string.
Generate (2_A_C, 3_P_N, 3_P_CA) from ‘2_A_C:3_P_N:3_P_CA’ :param str akstr: string of ‘:’-separated AtomKey strings
- __init__(*args: list[AtomKey] | tuple[AtomKey, AtomKey, AtomKey] | tuple[AtomKey, AtomKey, AtomKey, AtomKey], **kwargs: str) None
Initialize Edron with sequence of AtomKeys.
Acceptable input:
[ AtomKey, … ] : list of AtomKeys AtomKey, … : sequence of AtomKeys as args {‘a1’: str, ‘a2’: str, … } : dict of AtomKeys as ‘a1’, ‘a2’ …
- __deepcopy__(memo)
Deep copy implementation for Edron.
- is_backbone() bool
Report True for contains only N, C, CA, O, H atoms.
- __repr__() str
Tuple of AtomKeys is default repr string.
- __hash__() int
Hash calculated at init from atomkeys tuple.
- __eq__(other: object) bool
Test for equality.
- __ne__(other: object) bool
Test for inequality.
- __gt__(other: object) bool
Test greater than.
- __ge__(other: object) bool
Test greater or equal.
- __lt__(other: object) bool
Test less than.
- __le__(other: object) bool
Test less or equal.
- class Bio.PDB.internal_coords.Hedron(*args: list[AtomKey] | tuple[AtomKey, AtomKey, AtomKey], **kwargs: str)
Bases:
Edron
Class to represent three joined atoms forming a plane.
Contains atom coordinates in local coordinate space: central atom at origin, one terminal atom on XZ plane, and the other on the +Z axis. Stored in two orientations, with the 3rd (forward) or first (reversed) atom on the +Z axis. See
Dihedron
for use of forward and reverse orientations.- Attributes:
- len12: float
distance between first and second atoms
- len23: float
distance between second and third atoms
- angle: float
angle (degrees) formed by three atoms in hedron
- xrh_class: string
only for hedron spanning 2 residues, will have ‘X’ for residue contributing only one atom
Methods
get_length()
get bond length for specified atom pair
set_length()
set bond length for specified atom pair
angle(), len12(), len23()
setters for relevant attributes (angle in degrees)
- __init__(*args: list[AtomKey] | tuple[AtomKey, AtomKey, AtomKey], **kwargs: str) None
Initialize Hedron with sequence of AtomKeys, kwargs.
- Acceptable input:
As for Edron, plus optional ‘len12’, ‘angle’, ‘len23’ keyworded values.
- __repr__() str
Print string for Hedron object.
- property angle: float
Get this hedron angle.
- property len12
Get first length for Hedron.
- property len23: float
Get second length for Hedron.
- class Bio.PDB.internal_coords.Dihedron(*args: list[AtomKey] | tuple[AtomKey, AtomKey, AtomKey, AtomKey], **kwargs: str)
Bases:
Edron
Class to represent four joined atoms forming a dihedral angle.
- Attributes:
- angle: float
Measurement or specification of dihedral angle in degrees; prefer
IC_Residue.bond_set()
to set- hedron1, hedron2: Hedron object references
The two hedra which form the dihedral angle
- h1key, h2key: tuples of AtomKeys
Hash keys for hedron1 and hedron2
- id3,id32: tuples of AtomKeys
First 3 and second 3 atoms comprising dihedron; hxkey orders may differ
- ric: IC_Residue object reference
IC_Residue
object containing this dihedral- reverse: bool
Indicates order of atoms in dihedron is reversed from order of atoms in hedra
- primary: bool
True if this is psi, phi, omega or a sidechain chi angle
- pclass: string (primary angle class)
re_class with X for adjacent residue according to nomenclature (psi, omega, phi)
- cst, rcst: numpy [4][4] arrays
transformations to (cst) and from (rcst) Dihedron coordinate space defined with atom 2 (Hedron 1 center atom) at the origin. Views on
IC_Chain.dCoordSpace
.
Methods
angle()
getter/setter for dihdral angle in degrees; prefer
IC_Residue.bond_set()
bits()
return
IC_Residue.pic_flags
bitmask for dihedron psi, omega, etc- __init__(*args: list[AtomKey] | tuple[AtomKey, AtomKey, AtomKey, AtomKey], **kwargs: str) None
Init Dihedron with sequence of AtomKeys and optional dihedral angle.
- Acceptable input:
As for Edron, plus optional ‘dihedral’ keyworded angle value.
- __repr__() str
Print string for Dihedron object.
- property angle: float
Get dihedral angle.
- static angle_dif(a1: float | ndarray, a2: float | ndarray)
Get angle difference between two +/- 180 angles.
- static angle_avg(alst: list, in_rads: bool = False, out_rads: bool = False)
Get average of list of +/-180 angles.
- Parameters:
alst (List) – list of angles to average
in_rads (bool) – input values are in radians
out_rads (bool) – report result in radians
- static angle_pop_sd(alst: list, avg: float)
Get population standard deviation for list of +/-180 angles.
should be sample std dev but avoid len(alst)=1 -> div by 0
- bits() int
Get
IC_Residue.pic_flags
bitmasks for self is psi, omg, phi, pomg, chiX.
- class Bio.PDB.internal_coords.AtomKey(*args: IC_Residue | Atom | list | dict | str, **kwargs: str)
Bases:
object
Class for dict keys to reference atom coordinates.
AtomKeys capture residue and disorder information together, and provide a no-whitespace string key for .pic files.
Supports rich comparison and multiple ways to instantiate.
- AtomKeys contain:
residue position (respos), insertion code (icode), 1 or 3 char residue name (resname), atom name (atm), altloc (altloc), and occupancy (occ)
Use
AtomKey.fields
to get the index to the component of interest by name:Get C-alpha atoms from IC_Chain atomArray and atomArrayIndex with AtomKeys:
atmNameNdx = internal_coords.AtomKey.fields.atm CaSelection = [ atomArrayIndex.get(k) for k in atomArrayIndex.keys() if k.akl[atmNameNdx] == "CA" ] AtomArrayCa = atomArray[CaSelection]
Get all phenylalanine atoms in a chain:
resNameNdx = internal_coords.AtomKey.fields.resname PheSelection = [ atomArrayIndex.get(k) for k in atomArrayIndex.keys() if k.akl[resNameNdx] == "F" ] AtomArrayPhe = atomArray[PheSelection]
‘resname’ will be the uppercase 1-letter amino acid code if one of the 20 standard residues, otherwise the supplied 3-letter code. Supplied as input or read from .rbase attribute of
IC_Residue
.- Attributes:
- akl: tuple
All six fields of AtomKey
- fieldNames: tuple (Class Attribute)
Mapping of key index positions to names
- fields: namedtuple (Class Attribute)
Mapping of field names to index positions.
- id: str
‘_’-joined AtomKey fields, excluding ‘None’ fields
- atom_re: compiled regex (Class Attribute)
A compiled regular expression matching the string form of the key
- d2h: bool (Class Attribute) default False
Convert D atoms to H on input if True; must also modify
IC_Residue.accept_atoms
- missing: bool default False
AtomKey __init__’d from string is probably missing, set this flag to note the issue. Set by
IC_Residue.rak()
- ric: IC_Residue default None
If initialised with IC_Residue, this references the IC_residue
Methods
altloc_match(other)
Returns True if this AtomKey matches other AtomKey excluding altloc and occupancy fields
is_backbone()
Returns True if atom is N, CA, C, O or H
atm()
Returns atom name, e.g. N, CA, CB, etc.
cr_class()
Returns covalent radii class e.g. Csb
- atom_re = re.compile('^(?P<respos>-?\\d+)(?P<icode>[A-Za-z])?_(?P<resname>[a-zA-Z]+)_(?P<atm>[A-Za-z0-9]+)(?:_(?P<altloc>\\w))?(?:_(?P<occ>-?\\d\\.\\d+?))?$')
Pre-compiled regular expression to match an AtomKey string.
- fieldNames = ('respos', 'icode', 'resname', 'atm', 'altloc', 'occ')
- d2h = False
Set True to convert D Deuterium to H Hydrogen on input.
- __init__(*args: IC_Residue | Atom | list | dict | str, **kwargs: str) None
Initialize AtomKey with residue and atom data.
Examples of acceptable input:
(<IC_Residue>, 'CA', ...) : IC_Residue with atom info (<IC_Residue>, <Atom>) : IC_Residue with Biopython Atom ([52, None, 'G', 'CA', ...]) : list of ordered data fields (52, None, 'G', 'CA', ...) : multiple ordered arguments ({respos: 52, icode: None, atm: 'CA', ...}) : dict with fieldNames (respos: 52, icode: None, atm: 'CA', ...) : kwargs with fieldNames 52_G_CA, 52B_G_CA, 52_G_CA_0.33, 52_G_CA_B_0.33 : id strings
- __deepcopy__(memo)
Deep copy implementation for AtomKey.
- __repr__() str
Repr string from id.
- __hash__() int
Hash calculated at init from akl tuple.
- is_backbone() bool
Return True if is N, C, CA, O, or H.
- atm() str
Return atom name : N, CA, CB, O etc.
- cr_class() str | None
Return covalent radii class for atom or None.
- __ne__(other: object) bool
Test for inequality.
- __eq__(other: object) bool
Test for equality.
- __gt__(other: object) bool
Test greater than.
- __ge__(other: object) bool
Test greater or equal.
- __lt__(other: object) bool
Test less than.
- __le__(other: object) bool
Test less or equal.
- Bio.PDB.internal_coords.set_accuracy_95(num: float) float
Reduce floating point accuracy to 9.5 (xxxx.xxxxx).
Used by
IC_Residue
class writing PIC and SCAD files.- Parameters:
num (float) – input number
- Returns:
float with specified accuracy
- exception Bio.PDB.internal_coords.HedronMatchError
Bases:
Exception
Cannot find hedron in residue for given key.
- exception Bio.PDB.internal_coords.MissingAtomError
Bases:
Exception
Missing atom coordinates for hedron or dihedron.