Package Bio :: Package AlignIO :: Module MafIO :: Class MafIndex

Class MafIndex

object --+
         |
        MafIndex

Index for a MAF file.

The index is a sqlite3 database. It is built when the object is created, if it does not already exist, and it is queried whenever the search or get_spliced methods are called.

Instance Methods
 
__init__(self, sqlite_file, maf_file, target_seqname)
Indexes or loads the index of a MAF file.
 
__check_existing_db(self)
Perform basic sanity checks upon loading an existing index (PRIVATE).
 
__make_new_index(self)
Read MAF file and generate SQLite index (PRIVATE).
 
__maf_indexer(self)
Return index information for each bundle (PRIVATE).
 
_get_record(self, offset)
Retrieve a single MAF record located at the offset provided (PRIVATE).
 
search(self, starts, ends)
Search index database for MAF records overlapping ranges provided.
 
get_spliced(self, starts, ends, strand=1)
Return a multiple alignment of the exact sequence range provided.
 
__repr__(self)
Return a string representation of the index.
 
__len__(self)
Return the number of records in the index.

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __setattr__, __sizeof__, __str__, __subclasshook__

Static Methods
 
_region2bin(start, end)
Find bins that a region may belong to (PRIVATE).
 
_ucscbin(start, end)
Return the smallest bin a given region will fit into (PRIVATE).
Properties

Inherited from object: __class__

Method Details

__init__(self, sqlite_file, maf_file, target_seqname)
(Constructor)

Indexes or loads the index of a MAF file.
Overrides: object.__init__

__maf_indexer(self)

Return index information for each bundle (PRIVATE).

Yields index information for each bundle in the form of (bin, start, end, offset) tuples where start and end are 0-based inclusive coordinates.

_region2bin(start, end)
Static Method

Find bins that a region may belong to (PRIVATE).

Converts a region to a list of bins that it may belong to, including largest and smallest bins.
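
As a rough illustration of the idea, the sketch below lists candidate bins under the standard five-level UCSC binning scheme (bin sizes of 128 kb, 1 Mb, 8 Mb, 64 Mb and 512 Mb). The name candidate_bins and the UCSC_LEVELS table are assumptions made for this example and are not taken from the module; start and end are treated as a 0-based half-open range, and coordinates are assumed to stay below 2**29.

```python
# Hypothetical sketch, not the MafIO implementation.
# Each level is (number of the first bin at that level, bit shift giving the bin size).
UCSC_LEVELS = [(585, 17), (73, 20), (9, 23), (1, 26), (0, 29)]


def candidate_bins(start, end):
    """Return every bin that could hold a record overlapping [start, end)."""
    bins = set()
    for offset, shift in UCSC_LEVELS:
        first = offset + (start >> shift)
        last = offset + ((end - 1) >> shift)  # end is exclusive, so use end - 1
        bins.update(range(first, last + 1))
    return bins


print(sorted(candidate_bins(0, 100)))
```

A query only needs to inspect records stored in one of these bins, which is what makes the largest and smallest bins both relevant: a long record may sit in a coarse bin while still overlapping a short query.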

_ucscbin(start, end)
Static Method

Return the smallest bin a given region will fit into (PRIVATE).

Adapted from http://genomewiki.ucsc.edu/index.php/Bin_indexing_system
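
For orientation, a minimal sketch of the standard UCSC bin calculation follows. It assumes the usual five-level scheme with a first shift of 17 bits and a further 3 bits per level; the name smallest_bin and the constants are chosen for this example and may differ from the module's own code.

```python
# Hypothetical sketch of the standard UCSC scheme, not the MafIO implementation.
BIN_OFFSETS = [585, 73, 9, 1, 0]  # first bin number at each level, smallest bins first
FIRST_SHIFT = 17  # the smallest bins span 2**17 = 131072 bases
NEXT_SHIFT = 3    # each level up is 8 times larger


def smallest_bin(start, end):
    """Return the smallest bin that fully contains the half-open range [start, end)."""
    start_bin = start >> FIRST_SHIFT
    end_bin = (end - 1) >> FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            # Both ends fall in the same bin at this level.
            return offset + start_bin
        start_bin >>= NEXT_SHIFT
        end_bin >>= NEXT_SHIFT
    raise ValueError("range does not fit in the standard binning scheme")


print(smallest_bin(0, 100))
```

A region that crosses a 128 kb boundary no longer fits in a smallest-level bin, so it is assigned to the next level up that contains it entirely.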

search(self, starts, ends)

Search index database for MAF records overlapping ranges provided.

Returns MultipleSeqAlignment results ordered by start, then by end, then by the internal offset field.

starts should be a list of 0-based start coordinates of segments in the reference. ends should be the list of the corresponding segment ends (in the half-open UCSC convention: http://genome.ucsc.edu/blog/the-ucsc-genome-browser-coordinate-counting-systems/).
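
The overlap test implied by these conventions can be sketched with an in-memory sqlite3 table. The table name records, its column names and the helper find_overlapping are hypothetical and chosen for this example only; the sketch assumes stored coordinates are 0-based inclusive, while the query range is half-open.

```python
import sqlite3


def find_overlapping(conn, start, end):
    """Return (rec_start, rec_end, file_offset) rows overlapping [start, end)."""
    # A stored inclusive range [s, e] overlaps the half-open query [start, end)
    # exactly when s < end and e >= start.
    cursor = conn.execute(
        "SELECT rec_start, rec_end, file_offset FROM records "
        "WHERE rec_start < ? AND rec_end >= ? "
        "ORDER BY rec_start, rec_end, file_offset",
        (end, start),
    )
    return cursor.fetchall()


# Hypothetical index contents: three consecutive records on the reference.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records (rec_start INTEGER, rec_end INTEGER, file_offset INTEGER)"
)
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?)",
    [(0, 49, 16), (50, 149, 300), (150, 199, 720)],
)

hits = find_overlapping(conn, 0, 100)
print(hits)
```

Because the end of the query is exclusive, a query for [0, 50) would not match the record that starts at position 50.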

get_spliced(self, starts, ends, strand=1)

Return a multiple alignment of the exact sequence range provided.

Accepts two lists of start and end positions on target_seqname, representing exons to be spliced in silico. Returns a MultipleSeqAlignment of the desired sequences spliced together.

starts should be a list of 0-based start coordinates of segments in the reference. ends should be the list of the corresponding segment ends (in the half-open UCSC convention: http://genome.ucsc.edu/blog/the-ucsc-genome-browser-coordinate-counting-systems/).

To ask for the alignment portion corresponding to the first 100 nucleotides of the reference sequence, you would use get_spliced([0], [100]).
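
To make the coordinate convention concrete, the sketch below splices half-open segments out of a plain string. It only illustrates how starts and ends pair up; it does not model alignment gaps, multiple records or the strand argument, and the helper name splice_ranges is hypothetical.

```python
def splice_ranges(sequence, starts, ends):
    """Join the half-open segments [start, end) of sequence, in the order given."""
    if len(starts) != len(ends):
        raise ValueError("starts and ends must have the same length")
    # Python slices are already 0-based and half-open, matching the UCSC convention.
    return "".join(sequence[start:end] for start, end in zip(starts, ends))


reference = "ACGTACGTAC"
spliced = splice_ranges(reference, [0, 6], [3, 9])
print(spliced)
```

Each segment contributes end - start positions, so the two "exons" above yield a spliced sequence of length 6.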

__repr__(self)
(Representation operator)

Return a string representation of the index.
Overrides: object.__repr__