Module MafIO

Bio.AlignIO support for the "maf" multiple alignment format.

The Multiple Alignment Format, described by UCSC, stores a series of multiple alignments in a single file. It is suitable for whole-genome to whole-genome alignments, and it can store metadata such as source chromosome, start position, size, and strand.

See http://genome.ucsc.edu/FAQ/FAQformat.html#format5
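
As an illustration of where this metadata lives, the sketch below splits a single MAF sequence ("s") line into its fields, following the field order given in the UCSC description (source, start, size, strand, source size, aligned text). The names MafSequenceLine and parse_s_line are hypothetical helpers written for this example and are not part of Bio.AlignIO.MafIO; the sample line is made up.

```python
from typing import NamedTuple


class MafSequenceLine(NamedTuple):
    """Fields of one MAF "s" line (hypothetical container, not a Biopython class)."""
    src: str        # source name, conventionally "species.chromosome"
    start: int      # zero-based start in the source sequence
    size: int       # number of non-gap characters in the aligned text
    strand: str     # "+" or "-"
    src_size: int   # total length of the source sequence
    text: str       # aligned sequence, gaps written as "-"


def parse_s_line(line):
    """Split a whitespace-delimited MAF "s" line into typed fields."""
    fields = line.split()
    if len(fields) != 7 or fields[0] != "s":
        raise ValueError("not a MAF sequence line: %r" % line)
    _, src, start, size, strand, src_size, text = fields
    return MafSequenceLine(src, int(start), int(size), strand, int(src_size), text)


# Made-up example line, for illustration only.
record = parse_s_line("s demo.chr1 10 6 + 100 ACG--TTA")
ungapped_length = len(record.text.replace("-", ""))
```

Here ungapped_length equals record.size, because the size field counts only the non-gap characters. In normal use the Bio.AlignIO functions do this parsing for you.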

You are expected to use this module via the Bio.AlignIO functions (or the Bio.SeqIO functions if you want to work directly with the gapped sequences).

Coordinates in the MAF format are defined in terms of zero-based start positions (like Python) and aligning region sizes.

A minimal aligned region of length one, starting at the first position in the source sequence, would have start == 0 and size == 1.

As this example shows, start + size gives one more than the zero-based end position. We can therefore use start and start + size as Python list slice boundaries.

For an inclusive end coordinate, we need to use end = start + size - 1. A 1-column wide alignment would have start == end.
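
The coordinate arithmetic above can be sketched in a few lines of plain Python. The helper names maf_to_slice and maf_inclusive_end, and the toy source sequence, are assumptions made for this example and do not exist in Bio.AlignIO.MafIO.

```python
def maf_to_slice(start, size):
    """Return (start, stop) bounds usable directly in a Python slice."""
    return start, start + size


def maf_inclusive_end(start, size):
    """Return the zero-based position of the last aligned base."""
    return start + size - 1


# Toy ungapped source sequence (made up for illustration).
source = "ACGTACGT"

# A region with start == 2 and size == 3 covers positions 2, 3 and 4.
begin, stop = maf_to_slice(2, 3)
region = source[begin:stop]

# The minimal region from the text: start == 0, size == 1.
first_base = source[slice(*maf_to_slice(0, 1))]
first_end = maf_inclusive_end(0, 1)
```

With these values, region is "GTA", first_base is "A", and first_end is 0, which equals the start as expected for a one-column alignment.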

Classes
  MafWriter
Accepts a MultipleSeqAlignment object, writes a MAF file.
  MafIndex
Index for a MAF file.
Functions
MafIterator(handle, seq_count=None, alphabet=SingleLetterAlphabet())
Iterate over a MAF file handle as MultipleSeqAlignment objects.
Variables
  MAFINDEX_VERSION = 1
  __package__ = 'Bio.AlignIO'
Function Details

MafIterator(handle, seq_count=None, alphabet=SingleLetterAlphabet())

Iterate over a MAF file handle as MultipleSeqAlignment objects.

Iterates over lines in a MAF file-like object (handle), yielding MultipleSeqAlignment objects. SeqRecord IDs generally correspond to species names.