Bio.Align.psl module
Bio.Align support for the “psl” pairwise alignment format.
The Pattern Space Layout (PSL) format, described by UCSC, stores a series of pairwise alignments in a single file. PSL files are typically used for transcript-to-genome alignments. They store the alignment positions and alignment scores, but not the aligned sequences.
See http://genome.ucsc.edu/FAQ/FAQformat.html#format2
You are expected to use this module via the Bio.Align functions.
Coordinates in the PSL format are defined in terms of zero-based start positions (like Python) and aligning region sizes.
A minimal aligned region of length one, starting at the first position in the source sequence, would have start == 0 and size == 1.
As this example shows, start + size gives one more than the zero-based end position. We can therefore use start and start + size as Python list slice boundaries.
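The slice arithmetic described above can be sketched in plain Python. The sequence and the helper name psl_slice are hypothetical, chosen only for illustration; they are not part of Bio.Align.psl.

```python
# Illustrative only: the sequence and the helper name are assumptions,
# not part of Bio.Align.psl.
def psl_slice(start, size):
    """Convert a PSL-style (start, size) pair into a Python slice."""
    return slice(start, start + size)


sequence = "ACGTACGTAC"

# Minimal aligned region: length one, at the first position.
first = sequence[psl_slice(0, 1)]

# A three-letter region starting at zero-based position 4;
# start + size == 7 is one past the last aligned position (6).
region = sequence[psl_slice(4, 3)]

print(first, region)
```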
- class Bio.Align.psl.AlignmentWriter(target, header=True, mask=None, wildcard='N')
Bases: Bio.Align.interfaces.AlignmentWriter
Alignment file writer for the Pattern Space Layout (PSL) file format.
- __init__(target, header=True, mask=None, wildcard='N')
Create an AlignmentWriter object.
Arguments:
- target - output stream or file name
- header - If True (default), write the PSL header, which consists of five lines containing the PSL format version and a header for each column. If False, suppress the PSL header, resulting in a simple tab-delimited file.
- mask - Specify whether repeat regions in the target sequence are masked and should be reported in the repMatches field of the PSL file instead of in the matches field. Acceptable values are None (no masking; default), "lower" (masking by lower-case characters), and "upper" (masking by upper-case characters).
- wildcard - Report alignments to the wildcard character in the target or query sequence in the nCount field of the PSL file instead of in the matches, misMatches, or repMatches fields. Default value is 'N'.
- write_header(alignments)
Write the PSL header.
- format_alignment(alignment)
Return a string with a single alignment formatted as one PSL line.
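As a rough sketch of how the writer is typically reached through the Bio.Align functions, the example below parses one alignment and writes it back without the header. The PSL record is made up for illustration, and the example assumes that Bio.Align.write forwards extra keyword arguments such as header to AlignmentWriter.

```python
from io import StringIO

from Bio import Align

# Hypothetical single-block alignment in headerless PSL form
# (21 tab-separated columns); the values are invented for illustration.
fields = [
    "10", "0", "0", "0",      # matches, misMatches, repMatches, nCount
    "0", "0", "0", "0",       # qNumInsert, qBaseInsert, tNumInsert, tBaseInsert
    "+",                      # strand
    "query1", "10", "0", "10",  # qName, qSize, qStart, qEnd
    "chr1", "100", "20", "30",  # tName, tSize, tStart, tEnd
    "1", "10,", "0,", "20,",  # blockCount, blockSizes, qStarts, tStarts
]
psl_line = "\t".join(fields) + "\n"

alignments = Align.parse(StringIO(psl_line), "psl")

output = StringIO()
# Assumption: keyword arguments are passed on to AlignmentWriter.
Align.write(alignments, output, "psl", header=False)

text = output.getvalue()
print(text, end="")
```

With header=False the output should contain only the tab-delimited alignment lines; with the default header=True, the five header lines would precede them.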
- class Bio.Align.psl.AlignmentIterator(source)
Bases: Bio.Align.interfaces.AlignmentIterator
Alignment iterator for Pattern Space Layout (PSL) files.
Each line in the file contains one pairwise alignment. Alignments are loaded and returned incrementally. Alignment score information, such as the number of matches and mismatches, is stored as attributes of each alignment.
- __init__(source)
Create an AlignmentIterator object.
Arguments:
- source - input data or file name
- __abstractmethods__ = frozenset({})
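A minimal sketch of reading PSL data through Bio.Align.parse follows. The record is a made-up, headerless single-block alignment held in memory; in practice the source would usually be a file name. The attribute names matches and misMatches are assumed to mirror the PSL column names.

```python
from io import StringIO

from Bio import Align

# Hypothetical headerless PSL record (21 tab-separated columns);
# the values are invented for illustration.
fields = [
    "10", "0", "0", "0",      # matches, misMatches, repMatches, nCount
    "0", "0", "0", "0",       # qNumInsert, qBaseInsert, tNumInsert, tBaseInsert
    "+",                      # strand
    "query1", "10", "0", "10",  # qName, qSize, qStart, qEnd
    "chr1", "100", "20", "30",  # tName, tSize, tStart, tEnd
    "1", "10,", "0,", "20,",  # blockCount, blockSizes, qStarts, tStarts
]
psl_line = "\t".join(fields) + "\n"

alignments = list(Align.parse(StringIO(psl_line), "psl"))
alignment = alignments[0]

# Score information is stored as attributes of the alignment.
print(alignment.target.id, alignment.query.id)
print(alignment.matches, alignment.misMatches)

# Row 0 holds target positions, row 1 holds query positions.
print(alignment.coordinates)
```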