Module Ace

Parser for ACE files output by PHRAP.

Written by Frank Kauff (fkauff@duke.edu) and Cymon J. Cox (cymon@duke.edu)

Usage:

There are two ways of reading an ace file:

  1. The function 'read' reads the whole file at once;
  2. The function 'parse' reads the file contig after contig.

First option, parse whole ace file at once:

from Bio.Sequencing import Ace
acefilerecord = Ace.read(open('my_ace_file.ace'))

This gives you an ACEFileRecord instance whose contigs attribute is a list of Contig objects.

The Contig class holds the info of the CO tag, the CT and WA tags, and all the reads used for this contig in a list of instances of the Reads class, e.g.:

contig3 = acefilerecord.contigs[2]
read4 = contig3.reads[3]
RD_of_read4 = read4.rd
DS_of_read4 = read4.ds

The CT, WA and RT tags, which usually come at the end of the file but can appear anywhere, are automatically sorted into the right place.

See _RecordConsumer for details.
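
Walking the nested record structure described above can be sketched as follows. This is a minimal illustration, not part of the module: it uses stand-in objects built with SimpleNamespace so that it runs without an ACE file, the helper read_names_by_contig is hypothetical, and the name attribute on the rd object is an assumption based on the description "store a read with its name".

```python
from types import SimpleNamespace


def read_names_by_contig(record):
    """Map each contig name to the names of the reads it contains.

    Assumes record.contigs is a list of contig-like objects, each with a
    .name and a .reads list whose items carry an .rd object with a .name
    (the last attribute is an assumption, see the lead-in).
    """
    return {
        contig.name: [read.rd.name for read in contig.reads]
        for contig in record.contigs
    }


def _fake_read(name):
    # Stand-in for a read; only the attributes used above are filled in.
    return SimpleNamespace(rd=SimpleNamespace(name=name), ds=None)


# Stand-in for the object returned by Ace.read(); hypothetical data.
fake_record = SimpleNamespace(
    contigs=[
        SimpleNamespace(name="Contig1", reads=[_fake_read("r1"), _fake_read("r2")]),
        SimpleNamespace(name="Contig2", reads=[_fake_read("r3")]),
    ]
)

for contig_name, read_names in read_names_by_contig(fake_record).items():
    print(contig_name, read_names)
```

With a real file, the same helper would presumably be called on the result of Ace.read instead of fake_record.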

The second option is to iterate over the contigs of an ace file one by one in the usual way:

from Bio.Sequencing import Ace
contigs = Ace.parse(open('my_ace_file.ace'))
for contig in contigs:
    print(contig.name)
    ...

Please note that for memory efficiency, when using the iterator approach, only one contig is kept in memory at once. However, there can be a footer to the ACE file containing WA, CT, RT or WR tags which contain additional meta-data on the contigs. Because the parser doesn't see this data until the final record, it cannot be added to the appropriate records. Instead these tags will be returned with the last contig record. Thus an ace file does not entirely suit the concept of iterating. If the WA, CT, RT or WR tags are needed, the 'read' function rather than the 'parse' function might be more appropriate.
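
Why the footer tags end up on the last record can be illustrated with a toy streaming parser. This is only a sketch of the general idea, not the module's implementation: the one-line "CO name" and "TAG text" format below is a deliberately simplified, hypothetical stand-in for real ACE syntax, and iter_toy_contigs is a made-up name.

```python
from typing import Dict, Iterable, Iterator

# Tag names taken from the text above.
FOOTER_TAGS = {"WA", "CT", "RT", "WR"}


def iter_toy_contigs(lines: Iterable[str]) -> Iterator[Dict]:
    """Yield one dict per 'CO' line, keeping only one contig in memory."""
    current = None
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        tag = fields[0]
        if tag == "CO":
            if current is not None:
                # The previous contig is handed out before any footer is seen.
                yield current
            current = {"name": fields[1], "tags": []}
        elif tag in FOOTER_TAGS and current is not None:
            # Footer tags can only be attached to the contig still in memory.
            current["tags"].append(line.strip())
    if current is not None:
        yield current


toy_file = ["CO Contig1", "CO Contig2", "CT Contig1 note", "WA phrap"]
for contig in iter_toy_contigs(toy_file):
    print(contig["name"], contig["tags"])
```

In this toy run the CT line that refers to Contig1 is delivered with Contig2, because Contig1 has already been yielded by the time the footer is reached. Reading the whole file first avoids this at the cost of holding every contig in memory.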

Classes
  rd
RD (reads), store a read with its name, sequence etc.
  qa
QA (read quality), including which part if any was used as the consensus.
  ds
DS lines, include file name of a read's chromatogram file.
  af
AF lines, define the location of the read within the contig.
  bs
BS (base segment), which read was chosen as the consensus at each position.
  rt
RT (transient read tags), generated by crossmatch and phrap.
  ct
CT (consensus tags).
  wa
WA (whole assembly tag), holds the assembly program name, version, etc.
  wr
WR lines.
  Reads
Holds information about a read supporting an ACE contig.
  Contig
Holds information about a contig from an ACE record.
  ACEFileRecord
Holds data of an ACE file.
Functions
  parse(handle)
Iterate over an ACE file contig by contig.
  read(handle)
Parse the full ACE file into a list of contigs.
Variables
  __package__ = 'Bio.Sequencing'
Function Details

parse(handle)

Iterate over an ACE file contig by contig.

Argument handle is a file-like object.

This function returns an iterator that allows you to iterate over the ACE file record by record:

records = parse(handle)
for record in records:
    # do something with the record
    print(record.name)

where each record is a Contig object.