Bio.PDB.cealign module

Protein Structural Alignment using Combinatorial Extension.

Python code written by Joao Rodrigues. C++ code and Python/C++ interface adapted from open-source PyMOL and originally written by Jason Vertrees. The original license and notices are available in the cealign folder.

Reference

Shindyalov, I.N., Bourne, P.E. (1998). “Protein structure alignment by incremental combinatorial extension (CE) of the optimal path”. Protein Engineering. 11 (9): 739–747. PMID 9796821.

class Bio.PDB.cealign.CEAligner(window_size=8, max_gap=30)

Bases: object

Protein Structure Alignment by Combinatorial Extension.

__init__(window_size=8, max_gap=30)

Superimpose one set of atoms onto another using structural data.

Structures are superimposed using guide atoms, CA and C4’, for protein and nucleic acid molecules respectively.

Parameters:

window_size: float, optional: CE algorithm parameter. Used to define paths when building the CE similarity matrix. Default is 8.
max_gap: float, optional: CE algorithm parameter. Maximum gap size. Default is 30.

get_guide_coord_from_structure(structure)

Return the coordinates of guide atoms in the structure.

Only guide atoms (C-alpha for proteins, C4’ for nucleic acids) are used, because this is much faster than using all atoms in the calculation and does not significantly reduce accuracy.
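The guide-atom selection described above can be illustrated on plain data. This is a minimal sketch, not the library's implementation: the function name `pick_guide_coords` and the input format (one dict per residue, mapping atom names to coordinate tuples) are assumptions made for the example.

```python
def pick_guide_coords(residues):
    """Return one guide-atom coordinate per residue.

    `residues` is a hypothetical stand-in for a parsed structure: an
    iterable of dicts mapping atom names to (x, y, z) tuples.
    """
    coords = []
    for atoms in residues:
        if "CA" in atoms:        # protein residue: use the C-alpha atom
            coords.append(list(atoms["CA"]))
        elif "C4'" in atoms:     # nucleotide: use the C4' atom
            coords.append(list(atoms["C4'"]))
        # residues with neither guide atom (e.g. waters) are skipped
    if not coords:
        raise ValueError("structure has no guide atoms")
    return coords


example = [
    {"N": (0.0, 0.0, 0.0), "CA": (1.0, 2.0, 3.0)},   # amino acid
    {"P": (9.0, 9.0, 9.0), "C4'": (4.0, 5.0, 6.0)},  # nucleotide
    {"O": (7.0, 7.0, 7.0)},                          # water, ignored
]
guide = pick_guide_coords(example)
```

Reducing each residue to a single point keeps the number of coordinates equal to the number of residues, which is what makes the calculation cheaper than an all-atom comparison.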

set_reference(structure)

Define a reference structure onto which all others will be aligned.

align(structure, transform=True, *, final_optimization=True)

Align the input structure onto the reference structure.

Parameters:

transform: bool, optional: If True (default), apply the rotation/translation that minimizes the RMSD between the two structures to the input structure. If False, the structure is not modified but the optimal RMSD will still be calculated.
final_optimization: bool, optional: If True (default), apply additional optimization to statistically significant alignments.
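A typical workflow combines the methods above: set a reference, then align one or more structures onto it. The sketch below is illustrative only. The helper names and file paths are hypothetical, and it assumes that the aligner exposes the computed RMSD as an `rmsd` attribute after `align()`, which is not stated in the documentation above.

```python
def align_onto_reference(aligner, reference, mobile, transform=True):
    """Align `mobile` onto `reference` and return the resulting RMSD."""
    aligner.set_reference(reference)
    aligner.align(mobile, transform=transform)
    # Assumption: align() stores its result on the aligner as `rmsd`.
    return aligner.rmsd


def align_files(reference_path, mobile_path):
    """Parse two PDB files (hypothetical paths) and report their CE RMSD."""
    # Imported here so the helper above can be loaded without Biopython.
    from Bio.PDB import PDBParser
    from Bio.PDB.cealign import CEAligner

    parser = PDBParser(QUIET=True)
    reference = parser.get_structure("reference", reference_path)
    mobile = parser.get_structure("mobile", mobile_path)

    aligner = CEAligner(window_size=8, max_gap=30)  # documented defaults
    # transform=False leaves the coordinates of `mobile` untouched.
    return align_onto_reference(aligner, reference, mobile, transform=False)
```

Calling `align_files("reference.pdb", "mobile.pdb")` with two real files would return the RMSD without modifying the second structure. Pass `transform=True` to `align_onto_reference` instead when the superimposed coordinates are needed, for example to write them back to disk.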