Bio.Align.bigpsl module

Bio.Align support for alignment files in the bigPsl format.

A bigPsl file is a bigBed file with a BED12+13 format consisting of the 12 predefined BED fields and 13 custom fields defined in the autoSql file bigPsl.as. This module uses the Bio.Align.bigbed module to parse the file, but stores the data in a PSL-consistent manner as defined in bigPsl.as. As the bigPsl format is a special case of the bigBed format, bigPsl files are binary and are indexed as bigBed files.

See http://genome.ucsc.edu/goldenPath/help/bigPsl.html for more information.

You are expected to use this module via the Bio.Align functions.
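As a rough illustration of using the module through the Bio.Align functions, the sketch below reads a bigPsl file with Bio.Align.parse and copies it with Bio.Align.write, passing "bigpsl" as the format name. The helper names list_alignments and copy_bigpsl and their path arguments are hypothetical. The sketch also assumes that extra keyword arguments to Bio.Align.write are forwarded to the AlignmentWriter documented below, and that a recent Biopython release with bigPsl support is installed.

```python
def list_alignments(path):
    """Return (target id, query id) pairs for each alignment in a bigPsl file."""
    # Imported inside the function so that Biopython is only needed when called.
    from Bio import Align

    # Alignments are loaded incrementally while iterating.
    return [(a.target.id, a.query.id) for a in Align.parse(path, "bigpsl")]


def copy_bigpsl(source_path, target_path, compress=True):
    """Read all alignments from one bigPsl file and write them to another."""
    from Bio import Align

    alignments = Align.parse(source_path, "bigpsl")
    # Assumption: 'compress' is forwarded to the bigPsl AlignmentWriter.
    # No 'targets' argument is given, so the writer relies on the
    # .targets attribute of the parsed alignments.
    return Align.write(alignments, target_path, "bigpsl", compress=compress)
```

A call such as copy_bigpsl("input.bb", "output.bb") would then be expected to reproduce the alignments in a new file; both file names are placeholders.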

class Bio.Align.bigpsl.AlignmentWriter(target, targets=None, compress=True, extraIndex=(), cds=False, fa=False, mask=None, wildcard='N')

Bases: Bio.Align.bigbed.AlignmentWriter

Alignment file writer for the bigPsl file format.

fmt: Optional[str] = 'bigPsl'
__init__(target, targets=None, compress=True, extraIndex=(), cds=False, fa=False, mask=None, wildcard='N')

Create an AlignmentWriter object.

Arguments:
  • target - output stream or file name.

  • targets - A list of SeqRecord objects with the chromosomes in the order in which they appear in the alignments. The sequence contents in each SeqRecord may be undefined, but the sequence length must be defined, as in this example:

    SeqRecord(Seq(None, length=248956422), id="chr1")

    If targets is None (the default value), the alignments must have an attribute .targets providing the list of SeqRecord objects.

  • compress - If True (default), compress data using zlib. If False, do not compress data.

  • extraIndex - Sequence of strings with the names of extra columns to be indexed. The default is empty, so no extra columns are indexed.

  • cds - If True, look for a query feature of type CDS and write it in NCBI style in the bigPsl file (default: False).

  • fa - If True, include the query sequence in the bigPsl file (default: False).

  • mask - Specifies whether repeat regions in the target sequence are masked and should be reported in the repMatches field instead of the matches field. Acceptable values are None: no masking (default); "lower": masking by lower-case characters; "upper": masking by upper-case characters.

  • wildcard - Report alignments to the wildcard character in the target or query sequence in the nCount field instead of in the matches, misMatches, or repMatches fields. Default value is 'N'.
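The mask and wildcard arguments decide which count field each aligned column contributes to. The following pure-Python sketch illustrates that bookkeeping for two gap-free aligned segments of equal length. It is not the Biopython implementation: the function name count_columns is hypothetical, and the case-insensitive comparison of letters is an assumption made for this illustration.

```python
def count_columns(target, query, mask=None, wildcard="N"):
    """Tally matches, misMatches, repMatches, and nCount column by column."""
    if len(target) != len(query):
        raise ValueError("aligned segments must have the same length")
    if mask not in (None, "lower", "upper"):
        raise ValueError("mask must be None, 'lower', or 'upper'")
    counts = {"matches": 0, "misMatches": 0, "repMatches": 0, "nCount": 0}
    wild = wildcard.upper()
    for t, q in zip(target, query):
        if t.upper() == wild or q.upper() == wild:
            # Columns involving the wildcard go to nCount only.
            counts["nCount"] += 1
        elif t.upper() != q.upper():
            counts["misMatches"] += 1
        elif (mask == "lower" and t.islower()) or (mask == "upper" and t.isupper()):
            # A match inside a masked (repeat) region of the target.
            counts["repMatches"] += 1
        else:
            counts["matches"] += 1
    return counts


# Target with a lower-case (masked) region "gt" and one wildcard.
print(count_columns("ACgtNA", "ACGTAT", mask="lower"))
# {'matches': 2, 'misMatches': 1, 'repMatches': 2, 'nCount': 1}
```

With mask=None the two lower-case matches are counted in matches instead, giving four matches and no repMatches for the same input.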

write_file(stream, alignments)

Write the file.

__abstractmethods__ = frozenset({})
__annotations__ = {'fmt': 'Optional[str]'}
class Bio.Align.bigpsl.AlignmentIterator(source)

Bases: Bio.Align.bigbed.AlignmentIterator

Alignment iterator for bigPsl files.

The pairwise alignments stored in the bigPsl file are loaded and returned incrementally. Additional alignment information is stored as attributes of each alignment.

fmt: Optional[str] = 'bigPsl'
__abstractmethods__ = frozenset({})
__annotations__ = {'fmt': 'Optional[str]'}