Package Bio :: Package SearchIO :: Package _model :: Module query :: Class QueryResult

Class QueryResult

             object --+    
                      |    
_base._BaseSearchObject --+
                          |
                         QueryResult

Class representing search results from a single query.

QueryResult is the container object that stores all search hits from a single search query. It is the top-level object returned by SearchIO's two main functions, read and parse. Depending on the search results and search output format, a QueryResult object will contain zero or more Hit objects (see Hit).

You can take a quick look at a QueryResult's contents and attributes by invoking print on it:

>>> from Bio import SearchIO
>>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml'))
>>> print(qresult)
Program: blastn (2.2.27+)
  Query: 33211 (61)
         mir_1
 Target: refseq_rna
   Hits: ----  -----  ----------------------------------------------------------
            #  # HSP  ID + description
         ----  -----  ----------------------------------------------------------
            0      1  gi|262205317|ref|NR_030195.1|  Homo sapiens microRNA 52...
            1      1  gi|301171311|ref|NR_035856.1|  Pan troglodytes microRNA...
            2      1  gi|270133242|ref|NR_032573.1|  Macaca mulatta microRNA ...
            3      2  gi|301171322|ref|NR_035857.1|  Pan troglodytes microRNA...
            4      1  gi|301171267|ref|NR_035851.1|  Pan troglodytes microRNA...
            5      2  gi|262205330|ref|NR_030198.1|  Homo sapiens microRNA 52...
            6      1  gi|262205302|ref|NR_030191.1|  Homo sapiens microRNA 51...
            7      1  gi|301171259|ref|NR_035850.1|  Pan troglodytes microRNA...
            8      1  gi|262205451|ref|NR_030222.1|  Homo sapiens microRNA 51...
            9      2  gi|301171447|ref|NR_035871.1|  Pan troglodytes microRNA...
           10      1  gi|301171276|ref|NR_035852.1|  Pan troglodytes microRNA...
           11      1  gi|262205290|ref|NR_030188.1|  Homo sapiens microRNA 51...
...

If you just want to know how many hits a QueryResult has, you can invoke len on it. Alternatively, you can simply type its name in the interpreter:

>>> len(qresult)
100
>>> qresult
QueryResult(id='33211', 100 hits)

QueryResult behaves like a hybrid of Python's built-in list and dictionary. You can retrieve its items (Hit objects) using the integer index of the item, just like regular Python lists:

>>> first_hit = qresult[0]
>>> first_hit
Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps)

You can slice QueryResult objects as well. Slicing will return a new QueryResult object containing only the sliced hits:

>>> sliced_qresult = qresult[:3]    # slice the first three hits
>>> len(qresult)
100
>>> len(sliced_qresult)
3
>>> print(sliced_qresult)
Program: blastn (2.2.27+)
  Query: 33211 (61)
         mir_1
 Target: refseq_rna
   Hits: ----  -----  ----------------------------------------------------------
            #  # HSP  ID + description
         ----  -----  ----------------------------------------------------------
            0      1  gi|262205317|ref|NR_030195.1|  Homo sapiens microRNA 52...
            1      1  gi|301171311|ref|NR_035856.1|  Pan troglodytes microRNA...
            2      1  gi|270133242|ref|NR_032573.1|  Macaca mulatta microRNA ...

Like Python dictionaries, you can also retrieve hits using the hit's ID. This is useful for retrieving hits that you know should exist in a given search:

>>> hit = qresult['gi|262205317|ref|NR_030195.1|']
>>> hit
Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps)

You can also replace a Hit in QueryResult with another Hit using either the integer index or hit key string. Note that the replacing object must be a Hit that has the same query_id property as the QueryResult object.

If you're not sure whether a QueryResult contains a particular hit, you can use the hit ID to check for membership first:

>>> 'gi|262205317|ref|NR_030195.1|' in qresult
True
>>> 'gi|262380031|ref|NR_023426.1|' in qresult
False

Or, if you just want to know the rank / position of a given hit, you can use the hit ID as an argument for the index method. Note that the values returned will be zero-based. So zero (0) means the hit is the first in the QueryResult, three (3) means the hit is the fourth item, and so on. If the hit does not exist in the QueryResult, a ValueError will be raised.

>>> qresult.index('gi|262205317|ref|NR_030195.1|')
0
>>> qresult.index('gi|262205330|ref|NR_030198.1|')
5
>>> qresult.index('gi|262380031|ref|NR_023426.1|')
Traceback (most recent call last):
...
ValueError: ...
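
The list-like and dictionary-like access patterns described above can be pictured with a small stand-in class. This is only a sketch under stated assumptions: MiniResult and its dictionary-based "hits" are hypothetical and are not Biopython code. It only illustrates how one container can answer integer, slice, and ID lookups, membership tests, and index.

```python
from collections import OrderedDict


class MiniResult:
    """Toy list/dict hybrid container (hypothetical, not Biopython code)."""

    def __init__(self, hits=()):
        # Keys keep insertion order, so positions double as ranks.
        self._items = OrderedDict((hit["id"], hit) for hit in hits)

    def __len__(self):
        return len(self._items)

    def __contains__(self, hit_key):
        return hit_key in self._items

    def __getitem__(self, hit_key):
        if isinstance(hit_key, slice):
            # Slicing returns a new container holding only the selected hits.
            return MiniResult(list(self._items.values())[hit_key])
        if isinstance(hit_key, int):
            return list(self._items.values())[hit_key]
        return self._items[hit_key]

    def index(self, hit_key):
        # Zero-based position; list.index raises ValueError for a missing key.
        return list(self._items).index(hit_key)


hits = [{"id": "NR_030195.1"}, {"id": "NR_035856.1"}, {"id": "NR_032573.1"}]
result = MiniResult(hits)
print(result[0]["id"])              # lookup by position
print(result["NR_035856.1"]["id"])  # lookup by ID
print(len(result[:2]))              # a slice is a new, shorter container
print("NR_032573.1" in result)      # membership test by ID
print(result.index("NR_032573.1"))  # zero-based rank
```

Keeping the hits in an ordered mapping is one simple way to get both behaviours at once: the keys give dictionary-style lookup, and their order gives list-style positions.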

To ease working with a large number of hits, QueryResult has several filter and map methods, analogous to Python's built-in functions with the same names. Filter and map methods are available for operations over both Hit objects and HSP objects. As an example, here we use the hit_map method to rename all hit IDs within a QueryResult:

>>> def renamer(hit):
...     hit.id = hit.id.split('|')[3]
...     return hit
>>> mapped_qresult = qresult.hit_map(renamer)
>>> print(mapped_qresult)
Program: blastn (2.2.27+)
  Query: 33211 (61)
         mir_1
 Target: refseq_rna
   Hits: ----  -----  ----------------------------------------------------------
            #  # HSP  ID + description
         ----  -----  ----------------------------------------------------------
            0      1  NR_030195.1  Homo sapiens microRNA 520b (MIR520B), micr...
            1      1  NR_035856.1  Pan troglodytes microRNA mir-520b (MIR520B...
            2      1  NR_032573.1  Macaca mulatta microRNA mir-519a (MIR519A)...
...

The principle behind the other map and filter methods is similar: each accepts a function, applies it, and returns a new QueryResult object.
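
The accept-a-function, return-a-new-object pattern can be sketched with plain lists. Everything below is an illustrative assumption rather than the library's implementation: hits are modelled as dictionaries, the descriptions are shortened placeholders, and copying the hits before mapping is a choice made for this sketch so that the input stays untouched.

```python
from copy import deepcopy


def hit_filter(hits, func=None):
    """Return a new list holding only the hits for which func is true."""
    return list(filter(func, hits))


def hit_map(hits, func=None):
    """Return a new list with func applied to a copy of every hit."""
    copies = [deepcopy(hit) for hit in hits]  # copying is an assumption of this sketch
    if func is None:
        return copies
    return [func(hit) for hit in copies]


def renamer(hit):
    # Keep only the accession part of a 'gi|...|ref|...|' style ID.
    hit["id"] = hit["id"].split("|")[3]
    return hit


hits = [
    {"id": "gi|262205317|ref|NR_030195.1|", "description": "Homo sapiens microRNA"},
    {"id": "gi|301171311|ref|NR_035856.1|", "description": "Pan troglodytes microRNA"},
]

human_only = hit_filter(hits, lambda hit: hit["description"].startswith("Homo sapiens"))
renamed = hit_map(hits, renamer)
print([hit["id"] for hit in human_only])
print([hit["id"] for hit in renamed])
print(hits[0]["id"])  # the input list still holds the original ID
```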

There are also other methods useful for working with list-like objects: append, pop, and sort. More details and examples are available in their respective documentation.

Finally, just like Python lists and dictionaries, QueryResult objects are iterable. Iteration over QueryResults will yield Hit objects:

>>> for hit in qresult[:4]:     # iterate over the first four items
...     hit
...
Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps)
Hit(id='gi|301171311|ref|NR_035856.1|', query_id='33211', 1 hsps)
Hit(id='gi|270133242|ref|NR_032573.1|', query_id='33211', 1 hsps)
Hit(id='gi|301171322|ref|NR_035857.1|', query_id='33211', 2 hsps)

If you need access to all the hits in a QueryResult object, you can get them in a list using the hits property. Similarly, access to all hit IDs is available through the hit_keys property.

>>> qresult.hits
[Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps), ...]
>>> qresult.hit_keys
['gi|262205317|ref|NR_030195.1|', 'gi|301171311|ref|NR_035856.1|', ...]

Instance Methods

__init__(self, hits=[], id=None, hit_key_function=<function <lambda> at 0x479ecf8>)
Initializes a QueryResult object.

__iter__(self)

iterhits(self)
Returns an iterator over the Hit objects.

iterhit_keys(self)
Returns an iterator over the IDs of the Hit objects.

iteritems(self)
Returns an iterator yielding tuples of Hit ID and Hit objects.

__contains__(self, hit_key)

__len__(self)

__bool__(self)

__nonzero__(self)

__repr__(self)
repr(x)

__str__(self)
str(x)

__getitem__(self, hit_key)

__setitem__(self, hit_key, hit)

__delitem__(self, hit_key)

absorb(self, hit)
Adds a Hit object to the end of QueryResult.

append(self, hit)
Adds a Hit object to the end of QueryResult.

hit_filter(self, func=None)
Creates a new QueryResult object whose Hit objects pass the filter function.

hit_map(self, func=None)
Creates a new QueryResult object, mapping the given function to its Hits.

hsp_filter(self, func=None)
Creates a new QueryResult object whose HSP objects pass the filter function.

hsp_map(self, func=None)
Creates a new QueryResult object, mapping the given function to its HSPs.

pop(self, hit_key=-1, default=object())
Removes the specified hit key and returns the Hit object.

index(self, hit_key)
Returns the index of a given hit key, zero-based.

sort(self, key=None, reverse=False, in_place=True)
Sorts the Hit objects.

Inherited from _base._BaseSearchObject (private): _transfer_attrs

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __setattr__, __sizeof__, __subclasshook__

Class Variables
  _NON_STICKY_ATTRS = ('_items', '__alt_hit_ids')
  __marker = object()

Properties
  hits
Hit objects contained in the QueryResult.
  hit_keys
Hit IDs of the Hit objects contained in the QueryResult.
  items
List of tuples of Hit IDs and Hit objects.
  id
QueryResult ID string
  description
QueryResult description
  hsps
HSP objects contained in the QueryResult.
  fragments
HSPFragment objects contained in the QueryResult.

Inherited from object: __class__

Method Details

__init__(self, hits=[], id=None, hit_key_function=<function <lambda> at 0x479ecf8>)
(Constructor)

Initializes a QueryResult object.
Parameters:
  • id (string) - query sequence ID
  • hits (iterable) - iterator yielding Hit objects
  • hit_key_function (callable, accepts Hit objects, returns string) - function to define hit keys
Overrides: object.__init__
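
To see what a hit key function is for, here is a minimal sketch that keys a collection of hits with a callable. The key_hits helper and the dictionary-based hits are hypothetical, and the default shown here (return the hit's ID unchanged) is an assumption, since the signature above only shows an anonymous lambda.

```python
def key_hits(hits, hit_key_function=lambda hit: hit["id"]):
    """Map each hit to the key produced by hit_key_function, keeping input order."""
    return {hit_key_function(hit): hit for hit in hits}


hits = [
    {"id": "gi|262205317|ref|NR_030195.1|"},
    {"id": "gi|301171311|ref|NR_035856.1|"},
]

by_full_id = key_hits(hits)
# A custom key function: use only the accession part of the ID as the key.
by_accession = key_hits(hits, lambda hit: hit["id"].split("|")[3])
print(list(by_full_id))
print(list(by_accession))
```

The hits themselves are the same in both mappings; only the strings used to look them up differ.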

__repr__(self)
(Representation operator)

repr(x)

Overrides: object.__repr__
(inherited documentation)

__str__(self)
(Informal representation operator)

str(x)

Overrides: object.__str__
(inherited documentation)

absorb(self, hit)

Adds a Hit object to the end of QueryResult. If the QueryResult already has a Hit with the same ID, the new Hit's HSPs are appended to the existing Hit instead.

This method is used for file formats that may output the same Hit in separate places, such as BLAT or Exonerate. In both formats, Hits on different strands are reported in different places. SearchIO nevertheless treats them as the same Hit, because a Hit object should represent all database entries with the same ID, regardless of strand orientation.

Parameters:
  • hit (Hit) - object to absorb
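
The merge-or-add behaviour described for absorb can be sketched as follows. This is a toy model under stated assumptions, not the library's code: the container is an ordered mapping from hit ID to a list of HSPs, and the HSPs are plain placeholder strings.

```python
from collections import OrderedDict


def absorb(items, hit_id, hsps):
    """Merge hsps into an existing entry, or add a new entry at the end."""
    if hit_id in items:
        # Same ID seen before: keep one entry and extend its HSP list.
        items[hit_id].extend(hsps)
    else:
        items[hit_id] = list(hsps)


items = OrderedDict()
absorb(items, "chr1", ["hsp_plus_1"])
absorb(items, "chr2", ["hsp_plus_2"])
absorb(items, "chr1", ["hsp_minus_1"])  # same ID again, e.g. from the other strand
print(list(items.items()))
```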

append(self, hit)

Adds a Hit object to the end of QueryResult.

Any Hit object appended must have the same query_id property as the QueryResult's id property. If the hit key already exists, a ValueError will be raised.

Parameters:
  • hit (Hit) - object to append
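
The two checks described for append can be sketched like this. The append_hit helper and the dictionary-based hits are hypothetical. The source only states that a duplicate hit key raises ValueError; raising ValueError for a mismatched query_id as well is an assumption of this sketch.

```python
def append_hit(items, query_id, hit):
    """Add hit to the end of items (a dict keyed by hit ID) after two checks."""
    if hit["query_id"] != query_id:
        # Assumption: a query mismatch is reported with ValueError too.
        raise ValueError("hit belongs to query %r, expected %r" % (hit["query_id"], query_id))
    if hit["id"] in items:
        raise ValueError("hit key %r already present" % hit["id"])
    items[hit["id"]] = hit


items = {}
append_hit(items, "33211", {"id": "hit_a", "query_id": "33211"})
try:
    # Appending the same key a second time is rejected.
    append_hit(items, "33211", {"id": "hit_a", "query_id": "33211"})
except ValueError as err:
    print("rejected:", err)
print(list(items))
```

Contrast this with absorb, which merges HSPs into the existing entry instead of raising an error when the key already exists.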

hit_filter(self, func=None)

Creates a new QueryResult object whose Hit objects pass the filter function.

Here is an example of using hit_filter to select Hits whose description begins with the string 'Homo sapiens', case sensitive:

>>> from Bio import SearchIO
>>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml'))
>>> def desc_filter(hit):
...     return hit.description.startswith('Homo sapiens')
...
>>> len(qresult)
100
>>> filtered = qresult.hit_filter(desc_filter)
>>> len(filtered)
39
>>> print(filtered[:4])
Program: blastn (2.2.27+)
  Query: 33211 (61)
         mir_1
 Target: refseq_rna
   Hits: ----  -----  ----------------------------------------------------------
            #  # HSP  ID + description
         ----  -----  ----------------------------------------------------------
            0      1  gi|262205317|ref|NR_030195.1|  Homo sapiens microRNA 52...
            1      2  gi|262205330|ref|NR_030198.1|  Homo sapiens microRNA 52...
            2      1  gi|262205302|ref|NR_030191.1|  Homo sapiens microRNA 51...
            3      1  gi|262205451|ref|NR_030222.1|  Homo sapiens microRNA 51...

Note that instance attributes (other than the hits) from the unfiltered QueryResult are retained in the filtered object.

>>> qresult.program == filtered.program
True
>>> qresult.target == filtered.target
True

Parameters:
  • func (callable, accepts Hit, returns bool) - filter function

hit_map(self, func=None)

Creates a new QueryResult object, mapping the given function to its Hits.

Here is an example of using hit_map with a function that discards all HSPs in a Hit except for the first one:

>>> from Bio import SearchIO
>>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml'))
>>> print(qresult[:8])
Program: blastn (2.2.27+)
  Query: 33211 (61)
         mir_1
 Target: refseq_rna
   Hits: ----  -----  ----------------------------------------------------------
            #  # HSP  ID + description
         ----  -----  ----------------------------------------------------------
            0      1  gi|262205317|ref|NR_030195.1|  Homo sapiens microRNA 52...
            1      1  gi|301171311|ref|NR_035856.1|  Pan troglodytes microRNA...
            2      1  gi|270133242|ref|NR_032573.1|  Macaca mulatta microRNA ...
            3      2  gi|301171322|ref|NR_035857.1|  Pan troglodytes microRNA...
            4      1  gi|301171267|ref|NR_035851.1|  Pan troglodytes microRNA...
            5      2  gi|262205330|ref|NR_030198.1|  Homo sapiens microRNA 52...
            6      1  gi|262205302|ref|NR_030191.1|  Homo sapiens microRNA 51...
            7      1  gi|301171259|ref|NR_035850.1|  Pan troglodytes microRNA...

>>> top_hsp = lambda hit: hit[:1]
>>> mapped_qresult = qresult.hit_map(top_hsp)
>>> print(mapped_qresult[:8])
Program: blastn (2.2.27+)
  Query: 33211 (61)
         mir_1
 Target: refseq_rna
   Hits: ----  -----  ----------------------------------------------------------
            #  # HSP  ID + description
         ----  -----  ----------------------------------------------------------
            0      1  gi|262205317|ref|NR_030195.1|  Homo sapiens microRNA 52...
            1      1  gi|301171311|ref|NR_035856.1|  Pan troglodytes microRNA...
            2      1  gi|270133242|ref|NR_032573.1|  Macaca mulatta microRNA ...
            3      1  gi|301171322|ref|NR_035857.1|  Pan troglodytes microRNA...
            4      1  gi|301171267|ref|NR_035851.1|  Pan troglodytes microRNA...
            5      1  gi|262205330|ref|NR_030198.1|  Homo sapiens microRNA 52...
            6      1  gi|262205302|ref|NR_030191.1|  Homo sapiens microRNA 51...
            7      1  gi|301171259|ref|NR_035850.1|  Pan troglodytes microRNA...

Parameters:
  • func (callable, accepts Hit, returns Hit) - map function

hsp_filter(self, func=None)

Creates a new QueryResult object whose HSP objects pass the filter function.

hsp_filter is the same as hit_filter, except that it filters directly on each HSP object in every Hit. If the filtering removes all HSP objects in a given Hit, the entire Hit is discarded, so the QueryResult may contain fewer Hit objects after filtering.
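
The rule that a Hit disappears once all of its HSPs are filtered out can be sketched with plain dictionaries. This is an illustrative model only: each hit is a list of HSPs, and each HSP is reduced to a bare e-value, which is an assumption made to keep the example short.

```python
def hsp_filter(hits, func=None):
    """Filter the HSPs of every hit; drop hits that end up with no HSPs."""
    filtered = {}
    for hit_id, hsps in hits.items():
        kept = list(filter(func, hsps))
        if kept:  # a hit with no surviving HSPs is dropped entirely
            filtered[hit_id] = kept
    return filtered


# Each HSP is modelled as its e-value (hypothetical numbers).
hits = {"hit1": [1e-30, 1e-3], "hit2": [0.5], "hit3": [1e-10]}
significant = hsp_filter(hits, lambda evalue: evalue < 1e-5)
print(significant)
print(len(hits), len(significant))
```

Here hit1 keeps one of its two HSPs, while hit2 loses its only HSP and is therefore absent from the result.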

hsp_map(self, func=None)

Creates a new QueryResult object, mapping the given function to its HSPs.

hsp_map is the same as hit_map, except that it applies the given function to all HSP objects in every Hit, instead of the Hit objects.

pop(self, hit_key=-1, default=object())

Removes the specified hit key and returns the Hit object.

By default, pop removes and returns the last Hit object in the QueryResult object. To remove a specific Hit object, you can use its integer index or hit key.

>>> from Bio import SearchIO
>>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml'))
>>> len(qresult)
100
>>> for hit in qresult[:5]:
...     print(hit.id)
...
gi|262205317|ref|NR_030195.1|
gi|301171311|ref|NR_035856.1|
gi|270133242|ref|NR_032573.1|
gi|301171322|ref|NR_035857.1|
gi|301171267|ref|NR_035851.1|

# remove the last hit
>>> qresult.pop()
Hit(id='gi|397513516|ref|XM_003827011.1|', query_id='33211', 1 hsps)

# remove the first hit
>>> qresult.pop(0)
Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps)

# remove hit with the given ID
>>> qresult.pop('gi|301171322|ref|NR_035857.1|')
Hit(id='gi|301171322|ref|NR_035857.1|', query_id='33211', 2 hsps)

Parameters:
  • hit_key (int or string) - key of the Hit object to return
  • default (object) - return value if no Hit exists with the given key
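
The default=object() in the signature suggests a sentinel value, which lets a caller pass None (or anything else) as a genuine default. The sketch below shows that general pattern on a plain dictionary. The pop_hit helper is hypothetical, and raising KeyError when no default is given is an assumption of this sketch, not a statement about the library.

```python
_MARKER = object()  # unique sentinel: distinguishes "no default" from default=None


def pop_hit(items, hit_key=-1, default=_MARKER):
    """Remove and return an entry of items by position or by key."""
    if isinstance(hit_key, int):
        # Translate a position into the key stored at that position.
        hit_key = list(items)[hit_key]
    if hit_key in items:
        return items.pop(hit_key)
    if default is _MARKER:
        raise KeyError(hit_key)  # assumption: missing key without default is an error
    return default


items = {"a": "hit A", "b": "hit B", "c": "hit C"}
last = pop_hit(items)                       # no argument: remove the last entry
first = pop_hit(items, 0)                   # integer: remove by position
missing = pop_hit(items, "zzz", default=None)  # unknown key: the default comes back
print(last, first, missing, list(items))
```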

index(self, hit_key)

Returns the index of a given hit key, zero-based.

This method is useful for finding out the integer index (usually correlated with search rank) of a given hit key.

>>> from Bio import SearchIO
>>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml'))
>>> qresult.index('gi|301171259|ref|NR_035850.1|')
7

Parameters:
  • hit_key (string) - hit ID

sort(self, key=None, reverse=False, in_place=True)

Sorts the Hit objects.

sort defaults to sorting in place, to mimic Python's list.sort method. If you set the in_place argument to False, it will return a new, sorted QueryResult object and keep the initial one unsorted.

Parameters:
  • key (callable, accepts Hit, returns key for sorting) - sorting function
  • reverse (bool) - whether to reverse the sorting results or not
  • in_place (bool) - whether to sort in place or not
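
The difference between the two in_place modes can be sketched on a plain list. The sort_hits helper and the tuple-based hits are hypothetical. Returning None after an in-place sort follows list.sort and is an assumption here, as is the natural-order sorting this sketch performs when key is None.

```python
def sort_hits(hits, key=None, reverse=False, in_place=True):
    """Sort hits in place (returning None) or return a new sorted list."""
    if in_place:
        hits.sort(key=key, reverse=reverse)
        return None
    return sorted(hits, key=key, reverse=reverse)


# Each hit is modelled as an (id, number of HSPs) tuple.
hits = [("b", 2), ("a", 1), ("c", 3)]

by_count = sort_hits(hits, key=lambda hit: hit[1], reverse=True, in_place=False)
unchanged = list(hits)  # snapshot: the input order is untouched so far

returned = sort_hits(hits, key=lambda hit: hit[0])  # in place: hits itself is reordered
print(by_count)
print(unchanged)
print(returned, hits)
```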

Property Details

hits

Hit objects contained in the QueryResult.
Get Method:
hits(self) - Hit objects contained in the QueryResult.

hit_keys

Hit IDs of the Hit objects contained in the QueryResult.
Get Method:
hit_keys(self) - Hit IDs of the Hit objects contained in the QueryResult.

items

List of tuples of Hit IDs and Hit objects.
Get Method:
items(self) - List of tuples of Hit IDs and Hit objects.

id

QueryResult ID string
Get Method:
getter(self)
Set Method:
setter(self, value)

description

QueryResult description
Get Method:
getter(self)
Set Method:
setter(self, value)

hsps

HSP objects contained in the QueryResult.
Get Method:
hsps(self) - HSP objects contained in the QueryResult.

fragments

HSPFragment objects contained in the QueryResult.
Get Method:
fragments(self) - HSPFragment objects contained in the QueryResult.