Bio.PDB.cealign module

Protein Structural Alignment using Combinatorial Extension.

Python code written by Joao Rodrigues. The C++ code and the Python/C++ interface were adapted from open-source PyMOL and were originally written by Jason Vertrees. The original license and notices are available in the cealign folder.

Reference

Shindyalov, I.N., Bourne, P.E. (1998). “Protein structure alignment by incremental combinatorial extension (CE) of the optimal path”. Protein Engineering. 11 (9): 739–747. PMID 9796821.

class Bio.PDB.cealign.CEAligner(window_size=8, max_gap=30)

Bases: object

Protein Structure Alignment by Combinatorial Extension.

__init__(window_size=8, max_gap=30)

Superimpose one set of atoms onto another using structural data.

Structures are superimposed using guide atoms: CA for protein molecules and C4’ for nucleic acid molecules.

Parameters:
window_size: float, optional

CE algorithm parameter. Used to define paths when building the CE similarity matrix. Default is 8.

max_gap: float, optional

CE algorithm parameter. Maximum gap size. Default is 30.
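
The sketch below shows one way the constructor parameters might be used in a complete alignment. It is a minimal example under stated assumptions, not a definitive recipe: the helper name ce_superimpose and its path arguments are hypothetical, PDBParser comes from Bio.PDB, and reading the RMSD from an rms attribute on the aligner is an assumption that this section does not describe.

```python
def ce_superimpose(reference_path, mobile_path, window_size=8, max_gap=30):
    """Align the structure in mobile_path onto the one in reference_path.

    Hypothetical helper: returns the RMSD and the (transformed) mobile structure.
    """
    # Imported inside the function so the helper can be defined
    # even on a machine where Biopython is not installed.
    from Bio.PDB import PDBParser
    from Bio.PDB.cealign import CEAligner

    parser = PDBParser(QUIET=True)
    reference = parser.get_structure("reference", reference_path)
    mobile = parser.get_structure("mobile", mobile_path)

    # Same defaults as documented above: window_size=8, max_gap=30.
    aligner = CEAligner(window_size=window_size, max_gap=max_gap)
    aligner.set_reference(reference)

    # transform=True moves the atoms of `mobile` onto the reference frame.
    aligner.align(mobile, transform=True)

    # Assumption: the aligner keeps the RMSD of the last alignment in `rms`.
    return aligner.rms, mobile
```

A call such as ce_superimpose("reference.pdb", "mobile.pdb") would then return the RMSD together with the superimposed structure; both file names are placeholders.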

get_guide_coord_from_structure(structure)

Return the coordinates of guide atoms in the structure.

Guide atoms (C-alpha and C4’) are used because this is much faster than using all atoms in the calculation, without a significant loss in accuracy.
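
The idea of keeping one guide atom per residue can be illustrated without Biopython. In the sketch below, plain dictionaries mapping atom names to coordinates stand in for residues; the function name pick_guide_coords and the sample data are hypothetical, and the code is an illustration of the selection rule rather than the library's implementation.

```python
GUIDE_ATOM_NAMES = ("CA", "C4'")  # protein and nucleic acid guide atoms


def pick_guide_coords(residues):
    """Return the coordinates of at most one guide atom per residue."""
    coords = []
    for atoms in residues:
        for name in GUIDE_ATOM_NAMES:
            if name in atoms:
                coords.append(list(atoms[name]))
                break  # one guide atom per residue is enough
    return coords


# Hypothetical residues: an amino acid, a nucleotide, and a water molecule.
residues = [
    {"N": (0.0, 0.0, 0.0), "CA": (1.5, 0.0, 0.0), "C": (2.0, 1.4, 0.0)},
    {"P": (5.0, 5.0, 5.0), "C4'": (6.0, 5.5, 5.0)},
    {"O": (9.0, 9.0, 9.0)},
]

guide = pick_guide_coords(residues)
# Eight atoms shrink to two guide coordinates; the water contributes none.
```

Because only one coordinate per residue survives, the number of points entering the alignment scales with the number of residues instead of the number of atoms.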

set_reference(structure)

Define a reference structure onto which all others will be aligned.

align(structure, transform=True, *, final_optimization=True)

Align the input structure onto the reference structure.

Parameters:
transform: bool, optional

If True (default), apply the rotation/translation that minimizes the RMSD between the two structures to the input structure. If False, the structure is not modified but the optimal RMSD will still be calculated.

final_optimization: bool, optional

If True (default), apply additional optimization to statistically significant alignments.
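
As a sketch of how set_reference and align(..., transform=False) could be combined, the helper below compares several structures against a single reference without moving any of them. The function name rmsd_against_reference is hypothetical, and it assumes that the aligner exposes the RMSD of the most recent alignment as an rms attribute, which this section does not document.

```python
def rmsd_against_reference(aligner, reference, candidates):
    """Map each candidate name to its RMSD against the reference.

    `aligner` is expected to behave like a CEAligner instance and
    `candidates` is a dict of name -> structure. Results are ordered
    from lowest to highest RMSD.
    """
    # The reference is set once and reused for every candidate.
    aligner.set_reference(reference)

    rmsds = {}
    for name, structure in candidates.items():
        # transform=False: compute the optimal RMSD but leave the
        # candidate's coordinates unmodified.
        aligner.align(structure, transform=False)
        # Assumption: `rms` holds the RMSD of the last alignment.
        rmsds[name] = aligner.rms

    return dict(sorted(rmsds.items(), key=lambda item: item[1]))
```

With Biopython installed, the aligner argument would typically be a CEAligner() instance, and the reference and candidates would be structures loaded with a parser such as Bio.PDB.PDBParser.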