Bio.PDB.Polypeptide module

Polypeptide-related classes (construction and representation).

Simple example with multiple chains:

>>> from Bio.PDB.PDBParser import PDBParser
>>> from Bio.PDB.Polypeptide import PPBuilder
>>> structure = PDBParser().get_structure('2BEG', 'PDB/2BEG.pdb')
>>> ppb = PPBuilder()
>>> for pp in ppb.build_peptides(structure):
...     print(pp.get_sequence())
LVFFAEDVGSNKGAIIGLMVGGVVIA
LVFFAEDVGSNKGAIIGLMVGGVVIA
LVFFAEDVGSNKGAIIGLMVGGVVIA
LVFFAEDVGSNKGAIIGLMVGGVVIA
LVFFAEDVGSNKGAIIGLMVGGVVIA

Example with non-standard amino acids using HETATM lines in the PDB file, in this case selenomethionine (MSE):

>>> from Bio.PDB.PDBParser import PDBParser
>>> from Bio.PDB.Polypeptide import PPBuilder
>>> structure = PDBParser().get_structure('1A8O', 'PDB/1A8O.pdb')
>>> ppb = PPBuilder()
>>> for pp in ppb.build_peptides(structure):
...     print(pp.get_sequence())
DIRQGPKEPFRDYVDRFYKTLRAEQASQEVKNW
TETLLVQNANPDCKTILKALGPGATLEE
TACQG

To include non-standard amino acids in the peptides, pass aa_only=False:

>>> for pp in ppb.build_peptides(structure, aa_only=False):
...     print(pp.get_sequence())
...     print("%s %s" % (pp.get_sequence()[0], pp[0].get_resname()))
...     print("%s %s" % (pp.get_sequence()[-7], pp[-7].get_resname()))
...     print("%s %s" % (pp.get_sequence()[-6], pp[-6].get_resname()))
MDIRQGPKEPFRDYVDRFYKTLRAEQASQEVKNWMTETLLVQNANPDCKTILKALGPGATLEEMMTACQG
M MSE
M MSE
M MSE

In this case the get_sequence method reports each selenomethionine (the first residue, and the seventh and sixth from last) as M (methionine).
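The effect of aa_only on the two outputs above can be pictured with a toy model. This is a minimal sketch, not Biopython's implementation: the helper build_sequences, the two lookup tables, and the six-residue chain are all hypothetical, and the real builders also check inter-residue distances.

```python
# Toy model of the aa_only switch; not Biopython's implementation.
STANDARD = {"ASP": "D", "ILE": "I", "THR": "T", "GLU": "E"}  # tiny subset for the demo
MODIFIED = {"MSE": "M"}  # selenomethionine reported as its parent, methionine


def build_sequences(resnames, aa_only=True):
    """Split a run of residue names into peptide sequences (hypothetical helper)."""
    pieces, current = [], []
    for name in resnames:
        if name in STANDARD:
            current.append(STANDARD[name])
        elif not aa_only and name in MODIFIED:
            current.append(MODIFIED[name])
        else:
            # Rejected residue: close the current peptide, if any.
            if current:
                pieces.append("".join(current))
            current = []
    if current:
        pieces.append("".join(current))
    return pieces


chain = ["MSE", "ASP", "ILE", "MSE", "THR", "GLU"]
print(build_sequences(chain))                 # ['DI', 'TE']
print(build_sequences(chain, aa_only=False))  # ['MDIMTE']
```

With aa_only left at its default, each MSE ends the current peptide, which mirrors how the 1A8O chain appears as three separate sequences in the first listing and as one sequence in the second.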

Bio.PDB.Polypeptide.index_to_one(index)

Convert an index to the corresponding one-letter amino acid code.

>>> index_to_one(0)
'A'
>>> index_to_one(19)
'Y'

Bio.PDB.Polypeptide.one_to_index(s)

Convert a one-letter amino acid code to its index.

>>> one_to_index('A')
0
>>> one_to_index('Y')
19

Bio.PDB.Polypeptide.index_to_three(i)

Convert an index to the corresponding three-letter amino acid code.

>>> index_to_three(0)
'ALA'
>>> index_to_three(19)
'TYR'

Bio.PDB.Polypeptide.three_to_index(s)

Convert a three-letter amino acid code to its index.

>>> three_to_index('ALA')
0
>>> three_to_index('TYR')
19

Bio.PDB.Polypeptide.three_to_one(s)

Convert a three-letter amino acid code to its one-letter code.

>>> three_to_one('ALA')
'A'
>>> three_to_one('TYR')
'Y'

For non-standard amino acids, you get a KeyError:

>>> three_to_one('MSE')
Traceback (most recent call last):
   ...
KeyError: 'MSE'

Bio.PDB.Polypeptide.one_to_three(s)

Convert a one-letter amino acid code to its three-letter code.

>>> one_to_three('A')
'ALA'
>>> one_to_three('Y')
'TYR'
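
The six conversion functions above behave like lookups into two parallel tables. The sketch below is an illustration only: it assumes the 20 standard residues are ordered alphabetically by one-letter code, which agrees with the documented endpoints (0 for A/ALA, 19 for Y/TYR), and the names ending in _sketch are hypothetical stand-ins rather than the module's own code.

```python
# Assumed ordering: alphabetical by one-letter code (matches 0 <-> A/ALA, 19 <-> Y/TYR).
AA1 = "ACDEFGHIKLMNPQRSTVWY"
AA3 = ["ALA", "CYS", "ASP", "GLU", "PHE", "GLY", "HIS", "ILE", "LYS", "LEU",
       "MET", "ASN", "PRO", "GLN", "ARG", "SER", "THR", "VAL", "TRP", "TYR"]

# Reverse lookups built once from the two parallel tables.
ONE_TO_INDEX = {letter: i for i, letter in enumerate(AA1)}
THREE_TO_INDEX = {name: i for i, name in enumerate(AA3)}


def three_to_one_sketch(name):
    """Hypothetical stand-in: raises KeyError for codes outside the table."""
    return AA1[THREE_TO_INDEX[name]]


def one_to_three_sketch(letter):
    """Hypothetical stand-in for the reverse conversion."""
    return AA3[ONE_TO_INDEX[letter]]


print(three_to_one_sketch("TYR"))  # Y
print(one_to_three_sketch("A"))    # ALA
try:
    three_to_one_sketch("MSE")
except KeyError as err:
    print("not a standard residue:", err)
```

Because the reverse lookups are plain dictionaries, a code outside the 20 standard residues fails with KeyError, the same behaviour documented for three_to_one('MSE').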

Bio.PDB.Polypeptide.is_aa(residue, standard=False)

Return True if residue object/string is an amino acid.

Parameters
  • residue (Residue or string) – a Residue object or a three-letter amino acid code

  • standard (boolean) – if True, accept only the 20 standard amino acids (default: False)

>>> is_aa('ALA')
True

Known three-letter codes for modified amino acids are supported:

>>> is_aa('FME')
True
>>> is_aa('FME', standard=True)
False
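
The interplay between the residue argument and the standard flag can be sketched as two set-membership tests. This is a minimal sketch under stated assumptions: is_aa_sketch, ToyResidue, and the two-entry MODIFIED_AA set are hypothetical, and the real function knows many more modified codes.

```python
STANDARD_AA = {
    "ALA", "CYS", "ASP", "GLU", "PHE", "GLY", "HIS", "ILE", "LYS", "LEU",
    "MET", "ASN", "PRO", "GLN", "ARG", "SER", "THR", "VAL", "TRP", "TYR",
}
MODIFIED_AA = {"MSE", "FME"}  # illustrative subset only


class ToyResidue:
    """Hypothetical stand-in exposing only get_resname()."""

    def __init__(self, resname):
        self._resname = resname

    def get_resname(self):
        return self._resname


def is_aa_sketch(residue, standard=False):
    """Accept either a residue-like object or a three-letter code."""
    if not isinstance(residue, str):
        residue = residue.get_resname()
    residue = residue.upper()
    if standard:
        return residue in STANDARD_AA
    return residue in STANDARD_AA or residue in MODIFIED_AA


print(is_aa_sketch("ALA"))                        # True
print(is_aa_sketch(ToyResidue("FME")))            # True
print(is_aa_sketch("FME", standard=True))         # False
```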

Bio.PDB.Polypeptide.is_nucleic(residue, standard=False)

Return True if residue object/string is a nucleic acid.

Parameters
  • residue (Residue or string) – a Residue object or a three-letter code

  • standard (boolean) – if True, accept only the 8 canonical (DNA + RNA) bases (default: False)

>>> is_nucleic('DA ')
True
>>> is_nucleic('A  ')
True

Known three-letter codes for modified nucleotides are supported:

>>> is_nucleic('A2L')
True
>>> is_nucleic('A2L', standard=True)
False

class Bio.PDB.Polypeptide.Polypeptide(iterable=(), /)

Bases: list

A polypeptide is simply a list of Residue objects.

get_ca_list()

Get list of C-alpha atoms in the polypeptide.

Returns

the list of C-alpha atoms

Return type

[Atom, Atom, …]

get_phi_psi_list()

Return the list of phi/psi dihedral angles.

get_tau_list()

Return the list of tau torsion angles, one for each run of 4 consecutive C-alpha atoms.
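
A torsion over four consecutive C-alpha positions can be computed with the standard atan2 formulation of a dihedral angle. The sketch below works on plain coordinate tuples and returns radians; dihedral and tau_list_sketch are hypothetical names, the coordinates are made up, and this is not the module's own code.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in radians around the p1-p2 axis."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    y = _dot(b2, _cross(n1, n2))
    x = math.sqrt(_dot(b2, b2)) * _dot(n1, n2)
    return math.atan2(y, x)


def tau_list_sketch(ca_coords):
    """One torsion per window of 4 consecutive points (n - 3 values)."""
    return [dihedral(*ca_coords[i:i + 4]) for i in range(len(ca_coords) - 3)]


# Made-up coordinates chosen so the single torsion is a right angle.
ca = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
print([round(math.degrees(t), 1) for t in tau_list_sketch(ca)])  # [90.0]
```

A chain of n C-alpha atoms yields n - 3 values under this windowing, so fewer than four atoms give an empty list.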

get_theta_list()

Return the list of theta angles, one for each run of 3 consecutive C-alpha atoms.
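
The angle over three consecutive C-alpha positions is the angle at the middle atom between the directions to its two neighbours. A minimal sketch on plain coordinate tuples follows, returning radians; angle and theta_list_sketch are hypothetical names and the coordinates are made up.

```python
import math


def angle(p0, p1, p2):
    """Angle in radians at p1 between the directions to p0 and p2."""
    u = tuple(a - b for a, b in zip(p0, p1))
    v = tuple(a - b for a, b in zip(p2, p1))
    cos_theta = sum(a * b for a, b in zip(u, v)) / (math.hypot(*u) * math.hypot(*v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_theta)))


def theta_list_sketch(ca_coords):
    """One angle per window of 3 consecutive points (n - 2 values)."""
    return [angle(*ca_coords[i:i + 3]) for i in range(len(ca_coords) - 2)]


# Made-up coordinates: a right-angle turn followed by a straight segment.
ca = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 2.0, 0.0)]
print([round(math.degrees(t), 1) for t in theta_list_sketch(ca)])  # [90.0, 180.0]
```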

get_sequence()

Return the AA sequence as a Seq object.

Returns

polypeptide sequence

Return type

Seq

__repr__()

Return string representation of the polypeptide.

The string has the form <Polypeptide start=START end=END>, where START and END are the sequence identifiers of the first and last residues.
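
The representation format can be reproduced on a toy list subclass. This sketch assumes the sequence identifier is the second field of a residue id tuple of the form (hetero flag, sequence number, insertion code); ToyResidue and ToyPolypeptide are hypothetical stand-ins, not the library classes.

```python
class ToyResidue:
    """Hypothetical residue carrying only an id tuple."""

    def __init__(self, seq_number):
        self._id = (" ", seq_number, " ")  # (hetero flag, sequence number, insertion code)

    def get_id(self):
        return self._id


class ToyPolypeptide(list):
    """A list of residues with a compact start/end representation."""

    def __repr__(self):
        start = self[0].get_id()[1]
        end = self[-1].get_id()[1]
        return "<Polypeptide start=%s end=%s>" % (start, end)


pp = ToyPolypeptide(ToyResidue(n) for n in range(152, 185))
print(repr(pp))  # <Polypeptide start=152 end=184>
```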

class Bio.PDB.Polypeptide.CaPPBuilder(radius=4.3)

Bases: Bio.PDB.Polypeptide._PPBuilder

Use CA–CA distance to find polypeptides.

__init__(radius=4.3)

Initialize the class.

class Bio.PDB.Polypeptide.PPBuilder(radius=1.8)

Bases: Bio.PDB.Polypeptide._PPBuilder

Use C–N distance to find polypeptides.

__init__(radius=1.8)

Initialize the class.
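
Both builders apply the same idea with a different atom pair and cutoff: consecutive residues stay in one polypeptide while the chosen distance is within radius. The sketch below is a simplified model, not the library's algorithm: build_segments and make_residue are hypothetical, the geometry is made up, the radius is assumed to be in the same units as the coordinates (ångströms in PDB files), and single-residue segments are kept for simplicity.

```python
import math


def make_residue(x):
    """Toy residue with backbone atoms laid out along the x axis (made-up geometry)."""
    return {"N": (x, 0.0, 0.0), "CA": (x + 1.5, 0.0, 0.0), "C": (x + 2.5, 0.0, 0.0)}


def build_segments(residues, radius, prev_atom, next_atom):
    """Break the chain wherever prev_atom -> next_atom distance exceeds radius."""
    segments, current = [], []
    for residue in residues:
        if current:
            gap = math.dist(current[-1][prev_atom], residue[next_atom])
            if gap > radius:
                segments.append(current)
                current = []
        current.append(residue)
    if current:
        segments.append(current)
    return segments


# Five residues with a deliberate break between the third and fourth.
chain = [make_residue(x) for x in (0.0, 3.8, 7.6, 17.6, 21.4)]

cn = build_segments(chain, 1.8, "C", "N")       # C-N criterion, as in PPBuilder
caca = build_segments(chain, 4.3, "CA", "CA")   # CA-CA criterion, as in CaPPBuilder
print([len(s) for s in cn])    # [3, 2]
print([len(s) for s in caca])  # [3, 2]
```

The CA–CA criterion needs only C-alpha coordinates, so it remains usable on structures where other backbone atoms are missing, whereas the C–N criterion tests the peptide bond directly.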