Bio.Align.psl module

Bio.Align support for the “psl” pairwise alignment format.

The Pattern Space Layout (PSL) format, described by UCSC, stores a series of pairwise alignments in a single file. PSL files are typically used for transcript-to-genome alignments. They store the alignment positions and alignment scores, but not the aligned sequences.

See http://genome.ucsc.edu/FAQ/FAQformat.html#format2

You are expected to use this module via the Bio.Align functions.
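As a minimal sketch of that usage, the example below parses a single PSL data line from an in-memory stream with Bio.Align.parse. The alignment line, the sequence names query1 and chr1, and all the numbers in it are made up for illustration, and the sketch assumes a Biopython release that provides Bio.Align.parse with the "psl" format.

```python
from io import StringIO

from Bio import Align

# One hypothetical PSL data line (21 tab-separated fields, no header):
# 10 matching bases of query1[0:10] aligned to chr1[20:30] on the + strand.
psl_line = "10\t0\t0\t0\t0\t0\t0\t0\t+\tquery1\t10\t0\t10\tchr1\t100\t20\t30\t1\t10,\t0,\t20,\n"

# Align.parse accepts a file name or an open stream.
alignments = Align.parse(StringIO(psl_line), "psl")
alignment = next(alignments)

print(alignment.target.id, alignment.query.id)
print(alignment.matches, alignment.misMatches)
print(alignment.coordinates)
```

Because PSL files do not store the aligned sequences, only the names, lengths, and coordinates of the target and query are available after parsing.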

Coordinates in the PSL format are defined in terms of zero-based start positions (like Python) and aligned region sizes.

A minimal aligned region of length one, starting at the first position in the source sequence, would have start == 0 and size == 1.

In this example, start + size is one more than the zero-based end position. We can therefore use start and start + size directly as Python slice boundaries.
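The coordinate convention can be illustrated with plain Python slicing. The sequence and the helper name aligned_region below are hypothetical and are not part of Bio.Align.psl.

```python
# Hypothetical source sequence, used only for illustration.
sequence = "ACGTACGTAC"


def aligned_region(seq, start, size):
    """Return the region described by a zero-based start and a size."""
    # start + size is the exclusive end, exactly as in a Python slice.
    return seq[start : start + size]


# The minimal region: first position, length one.
print(aligned_region(sequence, 0, 1))
# A region of four letters starting at zero-based position 2.
print(aligned_region(sequence, 2, 4))
```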

class Bio.Align.psl.AlignmentWriter(target, header=True, mask=None, wildcard='N')

Bases: Bio.Align.interfaces.AlignmentWriter

Alignment file writer for the Pattern Space Layout (PSL) file format.

fmt: Optional[str] = 'PSL'
__init__(target, header=True, mask=None, wildcard='N')

Create an AlignmentWriter object.

Arguments:
  • target - output stream or file name

  • header - If True (default), write the PSL header consisting of five lines containing the PSL format version and a header for each column. If False, suppress the PSL header, resulting in a simple tab-delimited file.

  • mask - Specify if repeat regions in the target sequence are masked and should be reported in the repMatches field of the PSL file instead of in the matches field. Acceptable values are None: no masking (default); "lower": masking by lower-case characters; "upper": masking by upper-case characters.

  • wildcard - Report alignments to the wildcard character in the target or query sequence in the nCount field of the PSL file instead of in the matches, misMatches, or repMatches fields. Default value is 'N'.
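The effect of mask and wildcard on the four count fields can be sketched in plain Python. This is an illustration of the behaviour described above for ungapped sequences of equal length, not the writer's actual implementation. The function name count_columns is hypothetical, and the case-insensitive comparison of letters is an assumption.

```python
def count_columns(target, query, mask=None, wildcard="N"):
    """Tally ungapped aligned columns into PSL-style count fields."""
    counts = {"matches": 0, "misMatches": 0, "repMatches": 0, "nCount": 0}
    for t, q in zip(target, query):
        # Assumption: letters are compared case-insensitively.
        tu, qu = t.upper(), q.upper()
        if wildcard is not None and wildcard in (tu, qu):
            # A wildcard in either sequence is counted in nCount only.
            counts["nCount"] += 1
        elif tu != qu:
            counts["misMatches"] += 1
        elif (mask == "lower" and t.islower()) or (mask == "upper" and t.isupper()):
            # Matching letter inside a masked (repeat) region of the target.
            counts["repMatches"] += 1
        else:
            counts["matches"] += 1
    return counts


# Hypothetical sequences: lower case marks a repeat region in the target.
target = "ACGTacgtNA"
query = "ACGAACGTAA"
print(count_columns(target, query, mask="lower"))
print(count_columns(target, query, mask=None))
```

With mask="lower", the four matching lower-case target letters move from matches to repMatches. With mask=None they are counted as ordinary matches.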

write_header(stream, alignments)

Write the PSL header.

format_alignment(alignment)

Return a string with a single alignment formatted as one PSL line.

__abstractmethods__ = frozenset({})
__annotations__ = {'fmt': 'Optional[str]'}

class Bio.Align.psl.AlignmentIterator(source)

Bases: Bio.Align.interfaces.AlignmentIterator

Alignment iterator for Pattern Space Layout (PSL) files.

Each line in the file contains one pairwise alignment. Alignments are loaded and returned incrementally. Alignment score information, such as the number of matches and mismatches, is stored as attributes of each alignment.
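To show what one such line holds, here is a rough plain-Python sketch that splits a PSL data line into its 21 columns and derives the slice boundaries of each aligned block. It is not how AlignmentIterator is implemented. The column names follow the UCSC description of the format, the helper names parse_psl_line and blocks are hypothetical, and the example line is made up.

```python
# Column names as given in the UCSC description of the PSL format.
PSL_FIELDS = [
    "matches", "misMatches", "repMatches", "nCount",
    "qNumInsert", "qBaseInsert", "tNumInsert", "tBaseInsert",
    "strand", "qName", "qSize", "qStart", "qEnd",
    "tName", "tSize", "tStart", "tEnd",
    "blockCount", "blockSizes", "qStarts", "tStarts",
]
TEXT_FIELDS = ("strand", "qName", "tName")
LIST_FIELDS = ("blockSizes", "qStarts", "tStarts")


def parse_psl_line(line):
    """Split one tab-delimited PSL data line into a dict of typed fields."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(PSL_FIELDS):
        raise ValueError(f"expected {len(PSL_FIELDS)} fields, found {len(values)}")
    record = dict(zip(PSL_FIELDS, values))
    for name in PSL_FIELDS:
        if name in LIST_FIELDS:
            # List columns are comma-separated with a trailing comma.
            record[name] = [int(v) for v in record[name].rstrip(",").split(",")]
        elif name not in TEXT_FIELDS:
            record[name] = int(record[name])
    return record


def blocks(record):
    """Return ((q_start, q_end), (t_start, t_end)) for each aligned block.

    This sketch assumes a "+" strand alignment.
    """
    return [
        ((q, q + size), (t, t + size))
        for size, q, t in zip(record["blockSizes"], record["qStarts"], record["tStarts"])
    ]


# Hypothetical line: two blocks separated by a 5-base gap in the target.
line = "15\t0\t0\t0\t0\t0\t1\t5\t+\tread1\t15\t0\t15\tchr1\t100\t20\t40\t2\t10,5,\t0,10,\t20,35,\n"
record = parse_psl_line(line)
print(record["matches"], record["misMatches"])
print(blocks(record))
```

Each pair of boundaries is a start and a start + size, so it can be used directly as a Python slice on the corresponding sequence.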

fmt: Optional[str] = 'PSL'
__abstractmethods__ = frozenset({})
__annotations__ = {'fmt': 'Optional[str]'}