Bio.SeqIO.PhdIO module

Bio.SeqIO support for the “phd” file format.

PHD files are output by PHRED and used by PHRAP and CONSED.

You are expected to use this module via the Bio.SeqIO functions, under the format name “phd”. See also the underlying Bio.Sequencing.Phd module.

For example, using Bio.SeqIO we can read in one of the example PHRED files from the Biopython unit tests:

>>> from Bio import SeqIO
>>> for record in SeqIO.parse("Phd/phd1", "phd"):
...     print(record.id)
...     print("%s..." % record.seq[:10])
...     print("%s..." % record.letter_annotations["phred_quality"][:10])
34_222_(80-A03-19).b.ab1
ctccgtcgga...
[9, 9, 10, 19, 22, 37, 28, 28, 24, 22]...
425_103_(81-A03-19).g.ab1
cgggatccca...
[14, 17, 22, 10, 10, 10, 15, 8, 8, 9]...
425_7_(71-A03-19).b.ab1
acataaatca...
[10, 10, 10, 10, 8, 8, 6, 6, 6, 6]...

Since PHRED files contain quality scores, you can save them as FASTQ or as QUAL files, for example using Bio.SeqIO.write(…), or simply with the format method of the SeqRecord object:

>>> print(record[:50].format("fastq"))
@425_7_(71-A03-19).b.ab1
acataaatcaaattactnaccaacacacaaaccngtctcgcgtagtggag
+
++++))'''')(''')$!$''')''''(+.''$!$))))+)))'''''''

Or,

>>> print(record[:50].format("qual"))
>425_7_(71-A03-19).b.ab1
10 10 10 10 8 8 6 6 6 6 8 7 6 6 6 8 3 0 3 6 6 6 8 6 6 6 6 7
10 13 6 6 3 0 3 8 8 8 8 10 8 8 8 6 6 6 6 6 6 6

Note these examples only show the first 50 bases to keep the output short.

Bio.SeqIO.PhdIO.PhdIterator(source)

Return SeqRecord objects from a PHD file.

Arguments:
  • source - input stream opened in text mode, or a path to a file

This uses the Bio.Sequencing.Phd module to do the hard work.

class Bio.SeqIO.PhdIO.PhdWriter(handle)

Bases: Bio.SeqIO.Interfaces.SequenceWriter

Class to write Phd format files.

__init__(self, handle)

Initialize the class.

write_record(self, record)

Write a single Phd record to the file.