Package motifs

Tools for sequence motif analysis.

Bio.motifs contains the core Motif class, which provides various I/O methods
as well as methods for comparing motifs and searching for motifs in sequences.
It also includes functionality for parsing output from the AlignACE, MEME,
and MAST programs, as well as files in the TRANSFAC format.

Bio.motifs replaces the older, now obsolete Bio.Motif module.

Classes
  Instances
A class representing instances of sequence motifs.
  Motif
A class representing sequence motifs.
Functions
 
create(instances, alphabet=None)
 
parse(handle, format)
Parses an output file of motif finding programs.
 
read(handle, format)
Reads a motif from a handle using a specified file-format.
 
write(motifs, format)
Returns a string representation of motifs in a given format.
Variables
  __package__ = 'Bio.motifs'
Function Details

parse(handle, format)

Parses an output file of motif finding programs.

Currently supported formats (case is ignored):
 - AlignAce:      AlignACE output file format
 - MEME:          MEME output file format
 - MAST:          MAST output file format
 - TRANSFAC:      TRANSFAC database file format
 - pfm:           JASPAR-style position-frequency matrix
 - jaspar:        JASPAR-style multiple PFM format
 - sites:         JASPAR-style sites file
As files in the pfm and sites formats contain only a single motif,
it is easier to use Bio.motifs.read() instead of Bio.motifs.parse()
for those.

For example:

>>> from Bio import motifs
>>> for m in motifs.parse(open("motifs/alignace.out"), "AlignAce"):
...     print(m.consensus)
TCTACGATTGAG
CTGCAGCTAGCTACGAGTGAG
GTGCTCTAAGCATAGTAGGCG
GCCACTAGCAGAGCAGGGGGC
CGACTCAGAGGTT
CCACGCTAAGAGAGGTGCCGGAG
GCGCGTCGCTGAGCA
GTCCATCGCAAAGCGTGGGGC
GGGATCAGAGGGCCG
TGGAGGCGGGG
GACCAGAGCTTCGCATGGGGG
GGCGTGCGTG
GCTGGTTGCTGTTCATTAGG
GCCGGCGGCAGCTAAAAGGG
GAGGCCGGGGAT
CGACTCGTGCTTAGAAGG
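
The consensus strings printed above are derived from each motif's count
matrix. As a rough illustration of the idea (not Biopython's implementation),
the following standard-library sketch parses a made-up four-row matrix laid
out like a JASPAR-style pfm file, assuming the rows are in A, C, G, T order,
and takes the most frequent letter at each position. The names PFM_TEXT,
parse_pfm and consensus_of are hypothetical, the counts are toy data, and
breaking ties in alphabet order is an assumption of this sketch.

```python
# Illustrative sketch only: plain Python, not Biopython's own code.
# Toy matrix: one row per letter (A, C, G, T), one column per position.
PFM_TEXT = """\
2 9 0 1
1 0 0 8
0 1 10 0
7 0 0 1
"""


def parse_pfm(text, letters="ACGT"):
    """Map each letter to its list of per-position counts."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    if len(rows) != len(letters):
        raise ValueError("expected one row of counts per letter")
    counts = {
        letter: [float(value) for value in row]
        for letter, row in zip(letters, rows)
    }
    if len({len(row) for row in counts.values()}) != 1:
        raise ValueError("rows differ in length")
    return counts


def consensus_of(counts, letters="ACGT"):
    """Pick the most frequent letter per position (ties: alphabet order)."""
    length = len(counts[letters[0]])
    return "".join(
        max(letters, key=lambda letter: counts[letter][i])
        for i in range(length)
    )


print(consensus_of(parse_pfm(PFM_TEXT)))
```

In real code, Bio.motifs.read() and the consensus attribute shown in this
section do this work; the sketch only shows where the string comes from.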

read(handle, format)

Reads a motif from a handle using a specified file-format.

This supports the same formats as Bio.motifs.parse(), but
only for files containing exactly one motif.  For example,
reading a JASPAR-style pfm file:

>>> from Bio import motifs
>>> m = motifs.read(open("motifs/SRF.pfm"), "pfm")
>>> m.consensus
Seq('GCCCATATATGG', IUPACUnambiguousDNA())

Or a single-motif MEME file:

>>> from Bio import motifs
>>> m = motifs.read(open("motifs/meme.out"), "meme")
>>> m.consensus
Seq('CTCAATCGTA', IUPACUnambiguousDNA())

If the handle contains no records, or more than one record,
an exception is raised:

>>> from Bio import motifs
>>> motif = motifs.read(open("motifs/alignace.out"), "AlignAce")
Traceback (most recent call last):
    ...
ValueError: More than one motif found in handle

To get only the first motif from a file containing multiple
motifs, read() cannot be used, since it raises an exception (as
shown in the example above).  Instead use:

>>> from Bio import motifs
>>> record = motifs.parse(open("motifs/alignace.out"), "alignace")
>>> motif = record[0]
>>> motif.consensus
Seq('TCTACGATTGAG', IUPACUnambiguousDNA())

Use the Bio.motifs.parse(handle, format) function if you want
to read multiple records from the handle.
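
The "exactly one motif" contract described above can be sketched in plain
Python. This is a minimal illustration of the behaviour, not Biopython's
source: read_single is a hypothetical helper that accepts any iterable of
records, and the wording of the "no motifs" message is an assumption (only
the "More than one motif" message appears in the example above).

```python
def read_single(records):
    """Return the only item in `records`; raise ValueError otherwise."""
    iterator = iter(records)
    try:
        first = next(iterator)
    except StopIteration:
        # Assumed wording for the empty case.
        raise ValueError("No motifs found in handle") from None
    try:
        next(iterator)
    except StopIteration:
        # Exactly one record was present.
        return first
    raise ValueError("More than one motif found in handle")


print(read_single(["TCTACGATTGAG"]))
```

Because the helper pulls at most two items from the iterator, it never has
to load a large multi-motif file completely before rejecting it.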

write(motifs, format)

Returns a string representation of motifs in a given format.

Currently supported formats (case is ignored):
 - pfm : JASPAR simple single Position Frequency Matrix
 - jaspar : JASPAR multiple PFM format
 - transfac : TRANSFAC-like files
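
As a hedged sketch of how a case-insensitive format name might be dispatched
to a formatter, the plain-Python example below registers a single "pfm"
writer that emits one row of counts per letter. It is not Biopython's
implementation: WRITERS, format_pfm and write_counts are hypothetical names,
and the column width, the A, C, G, T row order and the error message are
assumptions of this sketch.

```python
def format_pfm(counts, letters="ACGT"):
    """Render counts as four whitespace-separated rows (assumed A, C, G, T)."""
    lines = [
        " ".join(f"{value:6.2f}" for value in counts[letter])
        for letter in letters
    ]
    return "\n".join(lines) + "\n"


# Hypothetical registry mapping lower-case format names to writer functions.
WRITERS = {"pfm": format_pfm}


def write_counts(counts, fmt):
    """Look up the writer for `fmt`, ignoring case, and return its output."""
    try:
        writer = WRITERS[fmt.lower()]
    except KeyError:
        raise ValueError(f"Unknown format {fmt!r}") from None
    return writer(counts)


toy_counts = {"A": [2, 9], "C": [1, 0], "G": [0, 1], "T": [7, 0]}
print(write_counts(toy_counts, "PFM"), end="")
```

Lower-casing the name once at the lookup keeps the registry keys simple
while still accepting "pfm", "PFM" or "Pfm" from callers.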