Bio.SeqIO.NibIO module

Bio.SeqIO support for the UCSC nib file format.

Nib stands for the nibble (4-bit) representation of nucleotide sequences. Each byte holds two nibbles, and each nibble stores one nucleotide, represented numerically as follows:

  • 0 - T

  • 1 - C

  • 2 - A

  • 3 - G

  • 4 - N (unknown)

The highest bit of a nibble (value 8) is set if the nucleotide is soft-masked, which adds the following codes (written in hexadecimal):

  • 8 - t

  • 9 - c

  • a - a

  • b - g

  • c - n (unknown)
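The two code tables above can be turned into a small lookup string. The following is a minimal sketch, not Biopython's implementation: the names NIBBLE_TO_BASE, decode_byte and decode are hypothetical, and it assumes (following the UCSC format description) that the first nucleotide of each pair sits in the high nibble of the byte.

```python
# Nibble value -> nucleotide. Indices 0-4 are the unmasked codes and
# indices 8-12 the soft-masked ones; "?" marks values the tables above
# do not define.
NIBBLE_TO_BASE = "TCAGN???tcagn???"


def decode_byte(byte):
    """Return the two nucleotides packed in one byte.

    Assumption: the high nibble holds the first nucleotide.
    """
    high = byte >> 4      # upper four bits
    low = byte & 0x0F     # lower four bits
    return NIBBLE_TO_BASE[high] + NIBBLE_TO_BASE[low]


def decode(data, length):
    """Decode packed bytes and trim to the stated sequence length.

    For an odd length, the last low nibble is padding and is dropped.
    """
    sequence = "".join(decode_byte(byte) for byte in data)
    return sequence[:length]
```

For example, the byte 0x2B splits into the nibbles 2 (A) and b (g), so decode_byte(0x2B) gives "Ag".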

A nib file contains only one sequence record. You are expected to use this module via the Bio.SeqIO functions under the format name “nib”:

>>> from Bio import SeqIO
>>> record = SeqIO.read("Nib/test_even_bigendian.nib", "nib")
>>> print("%i %s..." % (len(record), record.seq[:20]))
50 nAGAAGagccgcNGgCActt...

For detailed information on the file format, please see the UCSC description at https://genome.ucsc.edu/FAQ/FAQformat.html.

class Bio.SeqIO.NibIO.NibIterator(source, alphabet=None)

Bases: Bio.SeqIO.Interfaces.SequenceIterator

Parser for nib files.

__init__(self, source, alphabet=None)

Iterate over a nib file and yield a SeqRecord.

  • source - a file-like object or a path to a file in the nib file format as defined by UCSC; the file must be opened in binary mode.

  • alphabet - always ignored.

Note that a nib file always contains only one sequence record. The sequence of the resulting SeqRecord object should match the sequence generated by Jim Kent’s nibFrag utility run with the -masked option.

This class is used internally by the Bio.SeqIO functions:

>>> from Bio import SeqIO
>>> record = SeqIO.read("Nib/test_even_bigendian.nib", "nib")
>>> print("%s %i" % (record.seq, len(record)))
nAGAAGagccgcNGgCActtGAnTAtCGTCgcCacCaGncGncTtGNtGG 50

You can also call it directly:

>>> with open("Nib/test_even_bigendian.nib", "rb") as handle:
...     for record in NibIterator(handle):
...         print("%s %i" % (record.seq, len(record)))
...
nAGAAGagccgcNGgCActtGAnTAtCGTCgcCacCaGncGncTtGNtGG 50

parse(self, handle)

Start parsing the file, and return a SeqRecord generator.

iterate(self, handle, byteorder)

Iterate over the records in the nib file.
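The byteorder argument reflects that a nib file may have been written on either a big-endian or a little-endian machine. As a rough sketch of how the byte order could be detected, assuming the header layout given in the UCSC description (a 4-byte signature 0x6BE93D3A followed by a 4-byte nucleotide count), one can try both byte orders against the signature. The function name read_nib_header is hypothetical and this is not necessarily how NibIterator does it internally.

```python
import struct

# Signature value taken from the UCSC format description (an assumption
# here; it is not stated on this page).
NIB_MAGIC = 0x6BE93D3A


def read_nib_header(data):
    """Return (byteorder, length) from the first 8 bytes of a nib file."""
    if len(data) < 8:
        raise ValueError("a nib header needs at least 8 bytes")
    # Try both byte orders; the one that reproduces the signature wins.
    for byteorder, prefix in (("little", "<"), ("big", ">")):
        magic, length = struct.unpack(prefix + "II", data[:8])
        if magic == NIB_MAGIC:
            return byteorder, length
    raise ValueError("unexpected signature in nib header")
```

Called on the first eight bytes of a file opened in binary mode, this returns the byte order as "little" or "big" together with the number of nucleotides stored in the file.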

class Bio.SeqIO.NibIO.NibWriter(target)

Bases: Bio.SeqIO.Interfaces.SequenceWriter

Nib file writer.

__init__(self, target)

Initialize a Nib writer object.

Arguments:
  • target - output stream opened in binary mode, or a path to a file

write_header(self)

Write the file header.

write_record(self, record)

Write a single record to the output file.
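Writing a record means packing two nucleotides into each byte. The sketch below illustrates that packing step under stated assumptions: the names BASE_TO_NIBBLE and pack_sequence are hypothetical, the first nucleotide of each pair is placed in the high nibble (as in the UCSC description), and an odd-length sequence is padded with a trailing "T" (nibble value 0). It is not a copy of what write_record does.

```python
# Nucleotide -> nibble value: 0-4 for unmasked, 8-12 for soft-masked.
BASE_TO_NIBBLE = {base: value for value, base in enumerate("TCAGN")}
BASE_TO_NIBBLE.update({base: value + 8 for value, base in enumerate("tcagn")})


def pack_sequence(sequence):
    """Pack a nucleotide string into bytes, two nucleotides per byte."""
    if len(sequence) % 2 == 1:
        sequence += "T"  # assumed padding for the unused final nibble
    packed = bytearray()
    for i in range(0, len(sequence), 2):
        try:
            high = BASE_TO_NIBBLE[sequence[i]]
            low = BASE_TO_NIBBLE[sequence[i + 1]]
        except KeyError as err:
            raise ValueError("unsupported nucleotide %r" % err.args[0]) from None
        # First nucleotide in the upper four bits, second in the lower four.
        packed.append((high << 4) | low)
    return bytes(packed)
```

Only the ten letters listed in the tables at the top of this page can be encoded this way; any other character raises a ValueError in this sketch.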

write_file(self, records)

Write the complete file with the records, and return the number of records.