Remove PDB disordered atoms

From Biopython

Latest revision as of 06:55, 6 July 2010

Contributed by Ramon Crehuet

Problem

You have a PDB with disordered atoms, i.e. different atomic positions with occupancies that add up to 100%. From this PDB you want to create a new one having only one set of the disordered atoms. This can be necessary if you want to perform RMSD calculations or Molecular Dynamics simulations.

Solution

Bio.PDB handles disordered atoms well. Each disordered atom carries an alternate location identifier, atom.altloc, that labels its alternative positions; usually there are only two, labelled 'A' and 'B'. The key is to save the PDB with the optional "select" argument of PDBIO.save(). This argument takes an object (a subclass of Select) whose accept_atom method returns True for every atom that should be written. The following example saves all non-disordered atoms plus the 'A' positions of the disordered ones.

from Bio.PDB import PDBIO, PDBParser, Select

parser = PDBParser()
s = parser.get_structure('my_pdb', 'my_pdb.pdb')

class NotDisordered(Select):
    def accept_atom(self, atom):
        # Keep ordered atoms and the 'A' position of disordered ones.
        return not atom.is_disordered() or atom.get_altloc() == 'A'

io = PDBIO()
io.set_structure(s)
io.save("ordered.pdb", select=NotDisordered())

Note that the code above does not eliminate the alternate location identifier ('A' in the example above), so the saved atoms still carry it. It is the programmer's responsibility to eliminate the identifier when necessary, for example by blanking it inside accept_atom:

keepAltID = ...  # alternate location label to keep (left unspecified here)
class NMROutputSelector2(Select):  # Inherit methods from Select class
    def accept_atom(self, atom):
        if (not atom.is_disordered()) or atom.get_altloc() == keepAltID:
            atom.set_altloc(' ')  # Eliminate alt location ID before output.
            return True
        else:  # Alt location was not one to be output.
            return False
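The selectors above fix the kept label either in the code or in a module-level variable. As a minimal sketch, the label could instead be passed to the selector's constructor. The names AltlocSelect, keep_altloc and FakeAtom are hypothetical and not part of Biopython, and the fallback Select class is an assumption that exists only so the snippet can run where Biopython is not installed.

```python
try:
    from Bio.PDB import Select  # Biopython's base selector class
except ImportError:
    class Select:
        """Stand-in used only when Biopython is not installed (assumption)."""
        def accept_model(self, model):
            return True
        def accept_chain(self, chain):
            return True
        def accept_residue(self, residue):
            return True
        def accept_atom(self, atom):
            return True


class AltlocSelect(Select):
    """Hypothetical selector: keep ordered atoms plus one alternate location."""

    def __init__(self, keep_altloc="A"):
        self.keep_altloc = keep_altloc

    def accept_atom(self, atom):
        # Ordered atoms always pass; disordered ones only with the chosen label.
        return (not atom.is_disordered()) or atom.get_altloc() == self.keep_altloc


class FakeAtom:
    """Hypothetical stand-in exposing the two Atom methods the selector calls."""

    def __init__(self, disordered, altloc):
        self._disordered = disordered
        self._altloc = altloc

    def is_disordered(self):
        return self._disordered

    def get_altloc(self):
        return self._altloc


# Exercise the selector on stand-in atoms: ordered, altloc 'A', altloc 'B'.
selector = AltlocSelect(keep_altloc="B")
atoms = [FakeAtom(False, " "), FakeAtom(True, "A"), FakeAtom(True, "B")]
results = [selector.accept_atom(a) for a in atoms]
```

With a real structure loaded, the same object would presumably be passed as io.save("ordered.pdb", select=AltlocSelect("B")), in the same way as NotDisordered() above.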

Discussion

It is trivial to change the selector to save the 'B' altloc positions instead. More complicated selections based on other atom properties are also possible; the key is to write a class whose accept_atom method returns True or False for a given atom. One could also think of deleting the atoms with 'B' in atom.altloc:

for atom in all_atoms:  # all_atoms is a list containing all atoms
    if atom.altloc == 'B':
        del atom

but that does not work, because del only removes the local name atom; the atom object itself stays in the list and in the PDB structure.
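The behaviour of del can be checked without Biopython. The sketch below uses a hypothetical stand-in Atom class (not Bio.PDB's Atom) with only a name and an altloc attribute, and shows that the list is unchanged after the loop, whereas building a filtered list does drop the 'B' entries.

```python
class Atom:
    """Hypothetical stand-in for an atom; only the altloc attribute matters."""

    def __init__(self, name, altloc):
        self.name = name
        self.altloc = altloc


all_atoms = [Atom("CA", " "), Atom("CB", "A"), Atom("CB", "B")]

for atom in all_atoms:
    if atom.altloc == "B":
        del atom  # unbinds the loop variable only; the list still holds the object

count_after_del = len(all_atoms)  # the list has not shrunk

# Filtering into a new list is what actually leaves the 'B' atoms out.
kept = [a for a in all_atoms if a.altloc != "B"]
kept_labels = [a.altloc for a in kept]
```

Filtering a plain list still does not modify a Bio.PDB structure, which is why the Select-based approach in the Solution is the one to use when writing the file.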
