Reading from Unix pipes

From Biopython
Revision as of 12:01, 5 June 2009 by Giles.weaver (Talk | contribs)

Problem

Reading data from a Unix pipe is often preferable to reading it from a file. For example, sequences stored in a compressed file can be read through a pipe as they are decompressed, which avoids writing an uncompressed copy to disk first.

Solution

This example script reads a Solexa/Illumina FASTQ file from stdin, converts the data to Sanger FASTQ, and writes it to stdout.

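import sys
from Bio import SeqIO

# Parse Solexa/Illumina FASTQ records lazily from standard input.
recs = SeqIO.parse(sys.stdin, "fastq-solexa")
# Write the records to standard output as Sanger FASTQ.
SeqIO.write(recs, sys.stdout, "fastq")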
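To make the conversion step concrete, here is a rough standard-library sketch of what turning a Solexa quality string into a Sanger one involves. It is not Biopython's implementation. The function names are hypothetical, and it assumes the usual conventions: Solexa scores encoded with an ASCII offset of 64, Sanger (PHRED) scores with an offset of 33, and converted scores rounded to the nearest integer.

```python
import math

SOLEXA_OFFSET = 64  # ASCII offset assumed for Solexa quality characters
SANGER_OFFSET = 33  # ASCII offset assumed for Sanger quality characters


def solexa_to_phred(q_solexa):
    """Convert one Solexa quality score to the equivalent PHRED score."""
    return 10 * math.log10(10 ** (q_solexa / 10) + 1)


def solexa_to_sanger_quality(quality_string):
    """Re-encode a Solexa quality string as a Sanger quality string."""
    converted = []
    for char in quality_string:
        q_solexa = ord(char) - SOLEXA_OFFSET
        q_phred = round(solexa_to_phred(q_solexa))
        converted.append(chr(q_phred + SANGER_OFFSET))
    return "".join(converted)
```

The two scales differ mainly at low quality values, so high scores come through almost unchanged while low ones shift noticeably.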

The following bash command decompresses the sequence file and pipes it to the script (solexa2sanger_fq.py).

gunzip -c some_solexa.fastq.gz | python solexa2sanger_fq.py

This writes the sequences in Sanger FASTQ format to stdout, which in this case is the screen.
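The same kind of pipe can also be opened from inside Python rather than from the shell. The sketch below is one possible approach using the standard subprocess module. The helper name pipe_from is hypothetical, and the demo uses a stand-in child process so that it runs without gunzip or a data file.

```python
import subprocess
import sys
from contextlib import contextmanager


@contextmanager
def pipe_from(command):
    """Run command and yield its standard output as a text-mode handle."""
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, text=True)
    try:
        yield proc.stdout
    finally:
        proc.stdout.close()
        returncode = proc.wait()
    # Only reached if the caller's block finished without an exception.
    if returncode != 0:
        raise subprocess.CalledProcessError(returncode, command)


# Stand-in command for the demo. For the example above the command would
# be ["gunzip", "-c", "some_solexa.fastq.gz"], and the handle could be
# passed to SeqIO.parse(handle, "fastq-solexa") in place of sys.stdin.
with pipe_from([sys.executable, "-c", "print('@read1')"]) as handle:
    for line in handle:
        print(line.rstrip())
```

The handle is read as a stream, so records can be processed as the child process produces them. If the reader stops before the child has finished writing, the child may exit with an error status, which this sketch reports as a CalledProcessError.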
