Bio.PDB.Polypeptide module
Polypeptide-related classes (construction and representation).
Simple example with multiple chains,
>>> from Bio.PDB.PDBParser import PDBParser
>>> from Bio.PDB.Polypeptide import PPBuilder
>>> structure = PDBParser().get_structure('2BEG', 'PDB/2BEG.pdb')
>>> ppb=PPBuilder()
>>> for pp in ppb.build_peptides(structure):
... print(pp.get_sequence())
LVFFAEDVGSNKGAIIGLMVGGVVIA
LVFFAEDVGSNKGAIIGLMVGGVVIA
LVFFAEDVGSNKGAIIGLMVGGVVIA
LVFFAEDVGSNKGAIIGLMVGGVVIA
LVFFAEDVGSNKGAIIGLMVGGVVIA
Example with non-standard amino acids using HETATM lines in the PDB file, in this case selenomethionine (MSE):
>>> from Bio.PDB.PDBParser import PDBParser
>>> from Bio.PDB.Polypeptide import PPBuilder
>>> structure = PDBParser().get_structure('1A8O', 'PDB/1A8O.pdb')
>>> ppb=PPBuilder()
>>> for pp in ppb.build_peptides(structure):
... print(pp.get_sequence())
DIRQGPKEPFRDYVDRFYKTLRAEQASQEVKNW
TETLLVQNANPDCKTILKALGPGATLEE
TACQG
If you want to, you can include non-standard amino acids in the peptides:
>>> for pp in ppb.build_peptides(structure, aa_only=False):
... print(pp.get_sequence())
... print("%s %s" % (pp.get_sequence()[0], pp[0].get_resname()))
... print("%s %s" % (pp.get_sequence()[-7], pp[-7].get_resname()))
... print("%s %s" % (pp.get_sequence()[-6], pp[-6].get_resname()))
MDIRQGPKEPFRDYVDRFYKTLRAEQASQEVKNWMTETLLVQNANPDCKTILKALGPGATLEEMMTACQG
M MSE
M MSE
M MSE
In this case the selenomethionines (the first and also seventh and sixth from last residues) have been shown as M (methionine) by the get_sequence method.
- Bio.PDB.Polypeptide.index_to_one(index)
Index to corresponding one letter amino acid name.
>>> index_to_one(0) 'A' >>> index_to_one(19) 'Y'
- Bio.PDB.Polypeptide.one_to_index(s)
One letter code to index.
>>> one_to_index('A') 0 >>> one_to_index('Y') 19
- Bio.PDB.Polypeptide.index_to_three(i)
Index to corresponding three letter amino acid name.
>>> index_to_three(0) 'ALA' >>> index_to_three(19) 'TYR'
- Bio.PDB.Polypeptide.three_to_index(s)
Three letter code to index.
>>> three_to_index('ALA') 0 >>> three_to_index('TYR') 19
- Bio.PDB.Polypeptide.three_to_one(s)
Three letter code to one letter code.
>>> three_to_one('ALA') 'A' >>> three_to_one('TYR') 'Y'
For non-standard amino acids, you get a KeyError:
>>> three_to_one('MSE') Traceback (most recent call last): ... KeyError: 'MSE'
- Bio.PDB.Polypeptide.one_to_three(s)
One letter code to three letter code.
>>> one_to_three('A') 'ALA' >>> one_to_three('Y') 'TYR'
- Bio.PDB.Polypeptide.is_aa(residue, standard=False)
Return True if residue object/string is an amino acid.
- Parameters
residue (L{Residue} or string) – a L{Residue} object OR a three letter amino acid code
standard (boolean) – flag to check for the 20 AA (default false)
>>> is_aa('ALA') True
Known three letter codes for modified amino acids are supported,
>>> is_aa('FME') True >>> is_aa('FME', standard=True) False
- Bio.PDB.Polypeptide.is_nucleic(residue, standard=False)
Return True if residue object/string is a nucleic acid.
- Parameters
residue (L{Residue} or string) – a L{Residue} object OR a three letter code
standard (boolean) – flag to check for the 8 (DNA + RNA) canonical bases. Default is False.
>>> is_nucleic('DA ') True
>>> is_nucleic('A ') True
Known three letter codes for modified nucleotides are supported,
>>> is_nucleic('A2L') True >>> is_nucleic('A2L', standard=True) False
- class Bio.PDB.Polypeptide.Polypeptide(iterable=(), /)
Bases:
list
A polypeptide is simply a list of L{Residue} objects.
- get_ca_list()
Get list of C-alpha atoms in the polypeptide.
- Returns
the list of C-alpha atoms
- Return type
[L{Atom}, L{Atom}, …]
- get_phi_psi_list()
Return the list of phi/psi dihedral angles.
- get_tau_list()
List of tau torsions angles for all 4 consecutive Calpha atoms.
- get_theta_list()
List of theta angles for all 3 consecutive Calpha atoms.
- get_sequence()
Return the AA sequence as a Seq object.
- Returns
polypeptide sequence
- Return type
L{Seq}
- __repr__()
Return string representation of the polypeptide.
Return <Polypeptide start=START end=END>, where START and END are sequence identifiers of the outer residues.