Module NCBIStandalone

Code for calling standalone BLAST and parsing plain text output (DEPRECATED).

Rather than parsing the human readable plain text BLAST output (which seems to
change with every update to BLAST), we and the NCBI recommend you parse the
XML output instead. The plain text parser in this module still works at the
time of writing, but is considered obsolete and updating it to cope with the
latest versions of BLAST is not a priority for us.
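
As an illustration of the XML route, the following minimal sketch reads hit
identifiers and E-values from BLAST XML using only the Python standard
library. The sample data is made up, the helper name iter_hits is
hypothetical, and the element names (Hit, Hit_id, Hit_def, Hsp_evalue) are
assumed to follow the BLAST XML output format. In practice Biopython's
NCBIXML parser would normally be used instead.

```python
import io
import xml.etree.ElementTree as ET

# Made-up fragment for illustration only; element names are assumed to
# follow the BLAST XML output format.
SAMPLE_XML = """<BlastOutput>
  <BlastOutput_program>blastp</BlastOutput_program>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_id>example|0001</Hit_id>
          <Hit_def>hypothetical protein A</Hit_def>
          <Hit_hsps>
            <Hsp><Hsp_evalue>1e-30</Hsp_evalue></Hsp>
          </Hit_hsps>
        </Hit>
        <Hit>
          <Hit_id>example|0002</Hit_id>
          <Hit_def>hypothetical protein B</Hit_def>
          <Hit_hsps>
            <Hsp><Hsp_evalue>0.5</Hsp_evalue></Hsp>
            <Hsp><Hsp_evalue>0.002</Hsp_evalue></Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>
"""


def iter_hits(handle):
    """Yield (hit id, hit definition, best E-value) for each Hit element."""
    tree = ET.parse(handle)
    for hit in tree.iter("Hit"):
        evalues = [float(e.text) for e in hit.iter("Hsp_evalue")]
        best = min(evalues) if evalues else None
        yield hit.findtext("Hit_id"), hit.findtext("Hit_def"), best


# Any file-like object works here, including the stdout handle of a BLAST run.
hits = list(iter_hits(io.StringIO(SAMPLE_XML)))
for hit_id, hit_def, evalue in hits:
    print(hit_id, hit_def, evalue)
```

Because the XML tags are stable across BLAST releases in a way the plain text
layout is not, a reader like this does not depend on column positions or
banner lines.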

This module also provides code to work with the "legacy" standalone version of
NCBI BLAST (the tools blastall, rpsblast and blastpgp) via three helper
functions of the same names. These functions return the output as handles and
are poorly suited to working with output files, for which the wrappers in
Bio.Blast.Applications are preferred. Furthermore, the NCBI itself regards
these command line tools as "legacy" and encourages using the new BLAST+ tools
instead. Biopython has wrappers for these under Bio.Blast.Applications (see
the tutorial).

Classes:
LowQualityBlastError     Exception that indicates low quality query sequences.
BlastParser              Parses output from blast.
BlastErrorParser         Parses output and tries to diagnose possible errors.
PSIBlastParser           Parses output from psi-blast.
Iterator                 Iterates over a file of blast results.

_Scanner                 Scans output from standalone BLAST.
_BlastConsumer           Consumes output from blast.
_PSIBlastConsumer        Consumes output from psi-blast.
_HeaderConsumer          Consumes header information.
_DescriptionConsumer     Consumes description information.
_AlignmentConsumer       Consumes alignment information.
_HSPConsumer             Consumes hsp information.
_DatabaseReportConsumer  Consumes database report information.
_ParametersConsumer      Consumes parameters information.

Functions:
blastall        Execute blastall.
blastpgp        Execute blastpgp.
rpsblast        Execute rpsblast.

For calling the BLAST command line tools, we encourage you to use the
command line wrappers in Bio.Blast.Applications - the three functions
blastall, blastpgp and rpsblast are considered to be obsolete now, and
are likely to be deprecated and then removed in future releases.

Classes
  LowQualityBlastError
Error caused by running a low quality sequence through BLAST.
  ShortQueryBlastError
Error caused by running a short query sequence through BLAST.
  _Scanner
Scan BLAST output from blastall or blastpgp.
  BlastParser
Parses BLAST data into a Record.Blast object.
  PSIBlastParser
Parses BLAST data into a Record.PSIBlast object.
  _HeaderConsumer
  _DescriptionConsumer
  _AlignmentConsumer
  _HSPConsumer
  _DatabaseReportConsumer
  _ParametersConsumer
  _BlastConsumer
  _PSIBlastConsumer
  Iterator
Iterates over a file of multiple BLAST results.
  _BlastErrorConsumer
  BlastErrorParser
Attempt to catch and diagnose BLAST errors while parsing.
Functions
 
blastall(blastcmd, program, database, infile, align_view='7', **keywds)
Execute and retrieve data from standalone BLASTALL as handles (DEPRECATED).
 
blastpgp(blastcmd, database, infile, align_view='7', **keywds)
Execute and retrieve data from standalone BLASTPGP as handles (DEPRECATED).
 
rpsblast(blastcmd, database, infile, align_view='7', **keywds)
Execute and retrieve data from standalone RPS-BLAST as handles (DEPRECATED).
 
_re_search(regex, line, error_msg)
 
_get_cols(line, cols_to_get, ncols=None, expected={})
 
_safe_int(str)
 
_safe_float(str)
 
_invoke_blast(cline)
Start BLAST and return handles for stdout and stderr (PRIVATE).
 
_security_check_parameters(param_dict)
Look for any attempt to insert a command into a parameter.
Variables
  __package__ = 'Bio.Blast'
  __warningregistry__ = {('This module has been deprecated. Cons...
Function Details

blastall(blastcmd, program, database, infile, align_view='7', **keywds)

Execute and retrieve data from standalone BLASTALL as handles (DEPRECATED).

NOTE - This function is deprecated; you are encouraged to use the command
line wrapper Bio.Blast.Applications.BlastallCommandline instead, or
better, the BLAST+ command line wrappers in Bio.Blast.Applications.

Execute and retrieve data from blastall.  blastcmd is the command
used to launch the 'blastall' executable.  program is the blast program
to use, e.g. 'blastp', 'blastn', etc.  database is the path to the database
to search against.  infile is the path to the file containing
the sequence to search with.

The return values are two handles, for standard output and standard error.

You may pass more parameters to **keywds to change the behavior of
the search.  Otherwise, optional values will be chosen by blastall.
The Blast output is by default in XML format. Use the align_view keyword
for output in a different format.

    Scoring
matrix              Matrix to use.
gap_open            Gap open penalty.
gap_extend          Gap extension penalty.
nuc_match           Nucleotide match reward.  (BLASTN)
nuc_mismatch        Nucleotide mismatch penalty.  (BLASTN)
query_genetic_code  Genetic code for Query.
db_genetic_code     Genetic code for database.  (TBLAST[NX])

    Algorithm
gapped              Whether to do a gapped alignment. T/F (not for TBLASTX)
expectation         Expectation value cutoff.
wordsize            Word size.
strands             Query strands to search against database.  ([T]BLAST[NX])
keep_hits           Number of best hits from a region to keep.
xdrop               Dropoff value (bits) for gapped alignments.
hit_extend          Threshold for extending hits.
region_length       Length of region used to judge hits.
db_length           Effective database length.
search_length       Effective length of search space.

    Processing
filter              Filter query sequence for low complexity (with SEG)?  T/F
believe_query       Believe the query defline.  T/F
restrict_gi         Restrict search to these GI's.
nprocessors         Number of processors to use.
oldengine           Force use of old engine T/F

    Formatting
html                Produce HTML output?  T/F
descriptions        Number of one-line descriptions.
alignments          Number of alignments.
align_view          Alignment view.  Integer 0-11,
                    passed as a string or integer.
show_gi             Show GI's in deflines?  T/F
seqalign_file       seqalign file to output.
outfile             Output file for report.  Filename to write to, if
                    omitted standard output is used (which you can access
                    from the returned handles).
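
To make the relationship between these keywords and the underlying command
concrete, here is a minimal sketch of how such arguments could be turned into
an argument list. It is not the module's own code. The helper name
build_blastall_args is hypothetical, and the flag letters (-p, -d, -i, -m,
-e, -F) are assumptions about the legacy blastall tool; only a small subset of
the keywords above is handled.

```python
# Assumed mapping from keyword names to legacy blastall flags (subset only).
_ASSUMED_FLAGS = {"expectation": "-e", "filter": "-F"}


def build_blastall_args(blastcmd, program, database, infile,
                        align_view="7", **keywds):
    """Return an argument list for blastall (hypothetical helper)."""
    # align_view may be given as a string or an integer in the range 0-11.
    view = int(align_view)
    if not 0 <= view <= 11:
        raise ValueError("align_view must be an integer from 0 to 11")
    args = [blastcmd, "-p", program, "-d", database, "-i", infile,
            "-m", str(view)]
    # Sort the keywords so the resulting command line is reproducible.
    for name, value in sorted(keywds.items()):
        if name not in _ASSUMED_FLAGS:
            raise ValueError("option not covered by this sketch: %s" % name)
        # T/F options are written as the letters T or F.
        if isinstance(value, bool):
            value = "T" if value else "F"
        args.extend([_ASSUMED_FLAGS[name], str(value)])
    return args


example_args = build_blastall_args("blastall", "blastp", "nr", "query.fasta",
                                   expectation=0.001, filter=False)
print(" ".join(example_args))
```

Building a list of arguments rather than a single shell string also avoids
most quoting problems when paths contain spaces.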

blastpgp(blastcmd, database, infile, align_view='7', **keywds)

Execute and retrieve data from standalone BLASTPGP as handles (DEPRECATED).

NOTE - This function is deprecated; you are encouraged to use the command
line wrapper Bio.Blast.Applications.BlastpgpCommandline instead, or
better, the BLAST+ tool psiblast via the NcbipsiblastCommandline wrapper.

Execute and retrieve data from blastpgp.  blastcmd is the command
used to launch the 'blastpgp' executable.  database is the path to the
database to search against.  infile is the path to the file containing
the sequence to search with.

The return values are two handles, for standard output and standard error.

You may pass more parameters to **keywds to change the behavior of
the search.  Otherwise, optional values will be chosen by blastpgp.
The Blast output is by default in XML format. Use the align_view keyword
for output in a different format.

    Scoring
matrix              Matrix to use.
gap_open            Gap open penalty.
gap_extend          Gap extension penalty.
window_size         Multiple hits window size.
npasses             Number of passes.
passes              Hits/passes.  Integer 0-2.

    Algorithm
gapped              Whether to do a gapped alignment.  T/F
expectation         Expectation value cutoff.
wordsize            Word size.
keep_hits           Number of best hits from a region to keep.
xdrop               Dropoff value (bits) for gapped alignments.
hit_extend          Threshold for extending hits.
region_length       Length of region used to judge hits.
db_length           Effective database length.
search_length       Effective length of search space.
nbits_gapping       Number of bits to trigger gapping.
pseudocounts        Pseudocounts constants for multiple passes.
xdrop_final         X dropoff for final gapped alignment.
xdrop_extension     Dropoff for blast extensions.
model_threshold     E-value threshold to include in multipass model.
required_start      Start of required region in query.
required_end        End of required region in query.

    Processing
XXX should document default values
program             The blast program to use. (PHI-BLAST)
filter              Filter query sequence for low complexity (with SEG)?  T/F
believe_query       Believe the query defline?  T/F
nprocessors         Number of processors to use.

    Formatting
html                Produce HTML output?  T/F
descriptions        Number of one-line descriptions.
alignments          Number of alignments.
align_view          Alignment view.  Integer 0-11,
                    passed as a string or integer.
show_gi             Show GI's in deflines?  T/F
seqalign_file       seqalign file to output.
checkpoint_outfile  Output file for PSI-BLAST checkpointing.
restart_infile      Input file for PSI-BLAST restart.
hit_infile          Hit file for PHI-BLAST.
matrix_outfile      Output file for PSI-BLAST matrix in ASCII.
align_outfile       Output file for alignment.  Filename to write to, if
                    omitted standard output is used (which you can access
                    from the returned handles).

align_infile        Input alignment file for PSI-BLAST restart.

rpsblast(blastcmd, database, infile, align_view='7', **keywds)

Execute and retrieve data from standalone RPS-BLAST as handles (DEPRECATED).

NOTE - This function is deprecated; you are encouraged to use the command
line wrapper Bio.Blast.Applications.RpsBlastCommandline instead, or
better, the BLAST+ rpsblast wrapper NcbirpsblastCommandline.

Execute and retrieve data from standalone RPS-BLAST.  blastcmd is the
command used to launch the 'rpsblast' executable.  database is the path
to the database to search against.  infile is the path to the file
containing the sequence to search with.

The return values are two handles, for standard output and standard error.

You may pass more parameters to **keywds to change the behavior of
the search.  Otherwise, optional values will be chosen by rpsblast.

Please note that this function will give XML output by default, by
setting align_view to seven (i.e. command line option -m 7).
You should use the NCBIXML.parse() function to read the resulting output.
This is because NCBIStandalone.BlastParser() does not understand the
plain text output format from rpsblast.

WARNING - The following text and associated parameter handling has not
received extensive testing.  Please report any errors we might have made...

    Algorithm/Scoring
gapped              Whether to do a gapped alignment.  T/F
multihit            0 for multiple hit (default), 1 for single hit
expectation         Expectation value cutoff.
range_restriction   Range restriction on query sequence (Format: start,stop) blastp only
                    0 in 'start' refers to the beginning of the sequence
                    0 in 'stop' refers to the end of the sequence
                    Default = 0,0
xdrop               Dropoff value (bits) for gapped alignments.
xdrop_final         X dropoff for final gapped alignment (in bits).
xdrop_extension     Dropoff for blast extensions (in bits).
search_length       Effective length of search space.
nbits_gapping       Number of bits to trigger gapping.
protein             Query sequence is protein.  T/F
db_length           Effective database length.

    Processing
filter              Filter query sequence for low complexity?  T/F
case_filter         Use lower case filtering of FASTA sequence T/F, default F
believe_query       Believe the query defline.  T/F
nprocessors         Number of processors to use.
logfile             Name of log file to use, default rpsblast.log

    Formatting
html                Produce HTML output?  T/F
descriptions        Number of one-line descriptions.
alignments          Number of alignments.
align_view          Alignment view.  Integer 0-11,
                    passed as a string or integer.
show_gi             Show GI's in deflines?  T/F
seqalign_file       seqalign file to output.
align_outfile       Output file for alignment.  Filename to write to, if
                    omitted standard output is used (which you can access
                    from the returned handles).

_invoke_blast(cline)

Start BLAST and return handles for stdout and stderr (PRIVATE).

Expects a command line wrapper object from Bio.Blast.Applications.
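
The general pattern of starting a child process and handing back its output
streams can be sketched with the standard subprocess module. This is an
illustration under stated assumptions, not the module's implementation: the
helper name invoke is hypothetical, it takes a plain argument list rather
than a wrapper object, and the Python interpreter stands in for a BLAST
executable so the example runs anywhere.

```python
import subprocess
import sys


def invoke(args):
    """Start a child process and return (stdout handle, stderr handle)."""
    process = subprocess.Popen(args,
                               stdin=subprocess.PIPE,
                               stdout=subprocess.PIPE,
                               stderr=subprocess.PIPE,
                               universal_newlines=True)
    # No input is sent to the child, so close its standard input at once.
    process.stdin.close()
    return process.stdout, process.stderr


# Stand-in command: prints one line to stdout and a short note to stderr.
child_code = "import sys; print('result'); sys.stderr.write('warning')"
out_handle, err_handle = invoke([sys.executable, "-c", child_code])
stdout_text = out_handle.read()
stderr_text = err_handle.read()
out_handle.close()
err_handle.close()
print(stdout_text.strip(), "|", stderr_text)
```

Reading one handle to the end before touching the other is fine for small
outputs like this one, but can block if the child writes a large amount to
the stream that is not being read.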

_security_check_parameters(param_dict)

Look for any attempt to insert a command into a parameter.

e.g. blastall(..., matrix='IDENTITY -F 0; rm -rf /etc/passwd')

Looks for ";" or "&&" in the parameter values (Unix and Windows syntax
for appending a command), or ">", "<" or "|" (redirection), and raises
an exception if any are found.
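
A check of this kind can be sketched in a few lines. The version below is a
hedged re-creation from the description above rather than the module's actual
source; the function name check_parameters and the choice of ValueError are
assumptions made for the example.

```python
# Substrings that could append a command or redirect input and output.
SUSPICIOUS = (";", "&&", ">", "<", "|")


def check_parameters(param_dict):
    """Raise ValueError if any parameter value contains a suspicious token."""
    for key, value in param_dict.items():
        # Values may be numbers, so compare against their string form.
        text = str(value)
        for token in SUSPICIOUS:
            if token in text:
                raise ValueError("Rejecting suspicious argument for %s" % key)


# Ordinary values pass silently.
check_parameters({"matrix": "BLOSUM62", "expectation": 10})

# A value that tries to append a second command is rejected.
try:
    check_parameters({"matrix": "IDENTITY -F 0; echo injected"})
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```

A substring test like this is deliberately blunt: it also rejects legitimate
values that happen to contain one of these characters, which is usually an
acceptable trade for a function that builds a shell command.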


Variables Details

__warningregistry__

Value:
{('This module has been deprecated. Consider Bio.SearchIO for parsing \
BLAST output instead.',
  <class 'Bio.BiopythonDeprecationWarning'>,
  57): True}