Class QueryResult
object --+
         |
         _base._BaseSearchObject --+
                                   |
                                   QueryResult
Class representing search results from a single query.
QueryResult is the container object that stores all search hits from a
single search query. It is the top-level object returned by SearchIO's two
main functions, `read` and `parse`. Depending on the search results and
search output format, a QueryResult object will contain zero or more Hit
objects (see Hit).
You can take a quick look at a QueryResult's contents and attributes by
invoking `print` on it:
>>> from Bio import SearchIO
>>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml'))
>>> print(qresult)
Program: blastn (2.2.27+)
  Query: 33211 (61)
         mir_1
 Target: refseq_rna
   Hits: ----  -----  ----------------------------------------------------------
            #  # HSP  ID + description
         ----  -----  ----------------------------------------------------------
            0      1  gi|262205317|ref|NR_030195.1|  Homo sapiens microRNA 52...
            1      1  gi|301171311|ref|NR_035856.1|  Pan troglodytes microRNA...
            2      1  gi|270133242|ref|NR_032573.1|  Macaca mulatta microRNA ...
            3      2  gi|301171322|ref|NR_035857.1|  Pan troglodytes microRNA...
            4      1  gi|301171267|ref|NR_035851.1|  Pan troglodytes microRNA...
            5      2  gi|262205330|ref|NR_030198.1|  Homo sapiens microRNA 52...
            6      1  gi|262205302|ref|NR_030191.1|  Homo sapiens microRNA 51...
            7      1  gi|301171259|ref|NR_035850.1|  Pan troglodytes microRNA...
            8      1  gi|262205451|ref|NR_030222.1|  Homo sapiens microRNA 51...
            9      2  gi|301171447|ref|NR_035871.1|  Pan troglodytes microRNA...
           10      1  gi|301171276|ref|NR_035852.1|  Pan troglodytes microRNA...
           11      1  gi|262205290|ref|NR_030188.1|  Homo sapiens microRNA 51...
           ...
If you just want to know how many hits a QueryResult has, you can invoke
`len` on it. Alternatively, you can simply type its name in the interpreter:
>>> len(qresult)
100
>>> qresult
QueryResult(id='33211', 100 hits)
QueryResult behaves like a hybrid of Python's built-in list and dictionary.
You can retrieve its items (Hit objects) using the integer index of the
item, just like regular Python lists:
>>> first_hit = qresult[0]
>>> first_hit
Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps)
You can slice QueryResult objects as well. Slicing will return a new
QueryResult object containing only the sliced hits:
>>> sliced_qresult = qresult[:3] # slice the first three hits
>>> len(qresult)
100
>>> len(sliced_qresult)
3
>>> print(sliced_qresult)
Program: blastn (2.2.27+)
  Query: 33211 (61)
         mir_1
 Target: refseq_rna
   Hits: ----  -----  ----------------------------------------------------------
            #  # HSP  ID + description
         ----  -----  ----------------------------------------------------------
            0      1  gi|262205317|ref|NR_030195.1|  Homo sapiens microRNA 52...
            1      1  gi|301171311|ref|NR_035856.1|  Pan troglodytes microRNA...
            2      1  gi|270133242|ref|NR_032573.1|  Macaca mulatta microRNA ...
As with Python dictionaries, you can also retrieve hits using the hit's ID.
This is useful for retrieving hits that you know should exist in a given
search:
>>> hit = qresult['gi|262205317|ref|NR_030195.1|']
>>> hit
Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps)
You can also replace a Hit in a QueryResult with another Hit, using either
the integer index or the hit key string. Note that the replacement must be a
Hit whose `query_id` property matches the `id` property of the QueryResult.
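The replacement rule can be illustrated without Biopython. The following is a minimal sketch, not SearchIO's implementation: `ToyHit` and `ToyResult` are hypothetical stand-ins that model only the `id` and `query_id` attributes, and raising `ValueError` on a mismatched `query_id` is an assumption of the sketch.

```python
from collections import namedtuple

# Hypothetical stand-in for a Hit; only the fields needed for the check.
ToyHit = namedtuple("ToyHit", ["id", "query_id"])


class ToyResult(object):
    """Toy container whose items are reachable by position or by hit ID."""

    def __init__(self, id, hits=()):
        self.id = id
        self._hits = list(hits)

    def _position(self, key):
        # Integers are positions; anything else is treated as a hit ID.
        if isinstance(key, int):
            return key
        for position, hit in enumerate(self._hits):
            if hit.id == key:
                return position
        raise KeyError(key)

    def __getitem__(self, key):
        return self._hits[self._position(key)]

    def __setitem__(self, key, hit):
        # Assumed behaviour: reject hits that belong to a different query.
        if hit.query_id != self.id:
            raise ValueError("query_id %r does not match %r"
                             % (hit.query_id, self.id))
        self._hits[self._position(key)] = hit


result = ToyResult("33211", [ToyHit("hit_a", "33211"), ToyHit("hit_b", "33211")])
result[0] = ToyHit("hit_c", "33211")        # replace by integer index
result["hit_b"] = ToyHit("hit_d", "33211")  # replace by hit key

try:
    result[0] = ToyHit("hit_e", "another_query")
    rejected = False
except ValueError:
    rejected = True

print([hit.id for hit in result._hits], rejected)
```

Both assignments keep the replaced item's position, while the last one is refused because its `query_id` differs from the container's `id`.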
If you're not sure whether a QueryResult contains a particular hit, you can
use the hit ID to check for membership first:
>>> 'gi|262205317|ref|NR_030195.1|' in qresult
True
>>> 'gi|262380031|ref|NR_023426.1|' in qresult
False
Or, if you just want to know the rank / position of a given hit, you can
use the hit ID as an argument for the `index` method. Note that the values
returned will be zero-based. So zero (0) means the hit is the first in the
QueryResult, three (3) means the hit is the fourth item, and so on. If the
hit does not exist in the QueryResult, a `ValueError` will be raised.
>>> qresult.index('gi|262205317|ref|NR_030195.1|')
0
>>> qresult.index('gi|262205330|ref|NR_030198.1|')
5
>>> qresult.index('gi|262380031|ref|NR_023426.1|')
Traceback (most recent call last):
...
ValueError: ...
To ease working with a large number of hits, QueryResult has several
`filter` and `map` methods, analogous to Python's built-in functions of
the same names. Separate `filter` and `map` methods are available for
operating on Hit objects and on HSP objects. As an example, here we
use the `hit_map` method to rename all hit IDs within a QueryResult:
>>> def renamer(hit):
...     hit.id = hit.id.split('|')[3]
...     return hit
...
>>> mapped_qresult = qresult.hit_map(renamer)
>>> print(mapped_qresult)
Program: blastn (2.2.27+)
  Query: 33211 (61)
         mir_1
 Target: refseq_rna
   Hits: ----  -----  ----------------------------------------------------------
            #  # HSP  ID + description
         ----  -----  ----------------------------------------------------------
            0      1  NR_030195.1  Homo sapiens microRNA 520b (MIR520B), micr...
            1      1  NR_035856.1  Pan troglodytes microRNA mir-520b (MIR520B...
            2      1  NR_032573.1  Macaca mulatta microRNA mir-519a (MIR519A)...
            ...
The other `map` and `filter` methods work on the same principle: each accepts
a function, applies it, and returns a new QueryResult object.
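The "accept a function, apply it, return a new object" pattern can be sketched in plain Python. This is only an illustration under stated assumptions, not SearchIO's code: `ToyResult` is a hypothetical class, its hits are plain strings, and the shallow copy is just one way to carry the other attributes over to the new object.

```python
import copy


class ToyResult(object):
    """Hypothetical container; `hits` is a plain list of strings here."""

    def __init__(self, program, hits):
        self.program = program
        self.hits = list(hits)

    def _with_hits(self, hits):
        # Shallow-copy the container so other attributes carry over,
        # then attach the new list of hits to the copy.
        clone = copy.copy(self)
        clone.hits = list(hits)
        return clone

    def hit_filter(self, func):
        return self._with_hits(hit for hit in self.hits if func(hit))

    def hit_map(self, func):
        return self._with_hits(func(hit) for hit in self.hits)


result = ToyResult("blastn", ["Homo sapiens mir-520b", "Pan troglodytes mir-520b"])
human = result.hit_filter(lambda hit: hit.startswith("Homo sapiens"))
shouted = result.hit_map(lambda hit: hit.upper())

print(len(result.hits), len(human.hits), human.program)
print(shouted.hits)
```

The original container is left untouched in both calls; only the returned copies hold the filtered or mapped hits.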
There are also other methods useful for working with list-like objects:
`append`, `pop`, and `sort`. More details and examples are available in
the documentation of each method.
Finally, just like Python lists and dictionaries, QueryResult objects are
iterable. Iteration over QueryResults will yield Hit objects:
>>> for hit in qresult[:4]: # iterate over the first four items
...     hit
...
Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps)
Hit(id='gi|301171311|ref|NR_035856.1|', query_id='33211', 1 hsps)
Hit(id='gi|270133242|ref|NR_032573.1|', query_id='33211', 1 hsps)
Hit(id='gi|301171322|ref|NR_035857.1|', query_id='33211', 2 hsps)
If you need access to all the hits in a QueryResult object, you can get
them in a list using the `hits` property. Similarly, access to all hit IDs is
available through the `hit_keys` property.
>>> qresult.hits
[Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps), ...]
>>> qresult.hit_keys
['gi|262205317|ref|NR_030195.1|', 'gi|301171311|ref|NR_035856.1|', ...]

__init__(self, id='<unknown id>', hits=[], hit_key_function=<function <lambda> at 0xa95ea3c>)
    Initializes a QueryResult object.

iterhits(self)
    Returns an iterator over the Hit objects.

iterhit_keys(self)
    Returns an iterator over the IDs of the Hit objects.

iteritems(self)
    Returns an iterator yielding tuples of Hit ID and Hit objects.

hit_filter(self, func=None)
    Creates a new QueryResult object whose Hit objects pass the filter
    function.

hit_map(self, func=None)
    Creates a new QueryResult object, mapping the given function to its
    Hits.

hsp_filter(self, func=None)
    Creates a new QueryResult object whose HSP objects pass the filter
    function.

hsp_map(self, func=None)
    Creates a new QueryResult object, mapping the given function to its
    HSPs.

pop(self, hit_key=-1, default=object())
    Removes the specified hit key and returns the Hit object.

index(self, hit_key)
    Returns the index of a given hit key, zero-based.

sort(self, key=None, reverse=False, in_place=True)
    Sorts the Hit objects.

Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __setattr__

_NON_STICKY_ATTRS = ('_items')

__marker = object()

hits
    Hit objects contained in the QueryResult.

hit_keys
    Hit IDs of the Hit objects contained in the QueryResult.

items
    List of tuples of Hit IDs and Hit objects.

id
    QueryResult ID string

description
    QueryResult description

hsps
    HSP objects contained in the QueryResult.

fragments
    HSPFragment objects contained in the QueryResult.

Inherited from object: __class__

__init__(self, id='<unknown id>', hits=[], hit_key_function=<function <lambda> at 0xa95ea3c>)
(Constructor)

Initializes a QueryResult object.
Arguments:
id -- String of query sequence ID.
hits -- Iterator returning Hit objects.
hit_key_function -- Function to define hit keys, defaults to a function
                    that returns Hit object IDs.
Overrides: object.__init__

__repr__(self)

repr(x)
Overrides: object.__repr__ (inherited documentation)

__str__(self)
(Informal representation operator)

str(x)
Overrides: object.__str__ (inherited documentation)

absorb(self, hit)

Adds a Hit object to the end of QueryResult. If the QueryResult
already has a Hit with the same ID, appends the new Hit's HSPs to
the existing Hit.
Arguments:
hit -- Hit object to absorb.
This method is used for file formats that may output the same Hit in
separate places, such as BLAT or Exonerate. In both formats, Hits
on different strands are put in different places. However, SearchIO
considers them to be the same Hit, as a Hit object should represent all
database entries with the same ID, regardless of strand orientation.
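The merge-on-duplicate behaviour described above can be sketched with an ordered mapping from hit ID to a list of HSPs. This is a simplified illustration, not SearchIO's implementation: the free function `absorb`, the `records` list, and the string placeholders for HSPs are all hypothetical.

```python
from collections import OrderedDict


def absorb(hsps_by_hit_id, hit_id, hsps):
    """Add HSPs under hit_id, merging into an existing entry if there is one."""
    if hit_id in hsps_by_hit_id:
        hsps_by_hit_id[hit_id].extend(hsps)
    else:
        hsps_by_hit_id[hit_id] = list(hsps)


# Made-up records: the same target reported once per strand.
records = [
    ("chr1", ["chr1 plus-strand HSP"]),
    ("chr2", ["chr2 plus-strand HSP"]),
    ("chr1", ["chr1 minus-strand HSP"]),
]

hsps_by_hit_id = OrderedDict()
for hit_id, hsps in records:
    absorb(hsps_by_hit_id, hit_id, hsps)

print(list(hsps_by_hit_id.items()))
```

The second `chr1` record does not create a new entry; its HSP is appended to the entry created by the first one, so the mapping ends up with two hits.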

append(self, hit)

Adds a Hit object to the end of QueryResult.
Arguments:
hit -- Hit object to append.
Any Hit object appended must have the same `query_id` property as the
QueryResult's `id` property. If the hit key already exists, a
`ValueError` will be raised.
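The two checks performed on append can be sketched as follows. This is a hedged illustration rather than the library's code: `ToyHit` and the free function `append` are hypothetical, and using `ValueError` for the `query_id` mismatch is an assumption (the text above only names the exception for the duplicate-key case).

```python
from collections import OrderedDict, namedtuple

ToyHit = namedtuple("ToyHit", ["id", "query_id"])  # hypothetical stand-in


def append(items, result_id, hit):
    """Append hit to items (an OrderedDict keyed by hit ID)."""
    if hit.query_id != result_id:
        # The exception type for this case is an assumption of the sketch.
        raise ValueError("hit belongs to query %r, not %r"
                         % (hit.query_id, result_id))
    if hit.id in items:
        raise ValueError("hit key %r already exists" % hit.id)
    items[hit.id] = hit


items = OrderedDict()
append(items, "33211", ToyHit("hit_a", "33211"))

errors = []
for bad_hit in (ToyHit("hit_a", "33211"), ToyHit("hit_b", "other_query")):
    try:
        append(items, "33211", bad_hit)
    except ValueError as err:
        errors.append(str(err))

print(list(items), errors)
```

Unlike the merge-on-duplicate approach used for formats that split one Hit across several places, a duplicate key is treated as an error here.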

hit_filter(self, func=None)

Creates a new QueryResult object whose Hit objects pass the filter
function.
Arguments:
func -- Callback function that accepts a Hit object as its parameter,
        does a boolean check, and returns True or False.
Here is an example of using `hit_filter` to select Hits whose
description begins with the string 'Homo sapiens', case sensitive:
>>> from Bio import SearchIO
>>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml'))
>>> def desc_filter(hit):
...     return hit.description.startswith('Homo sapiens')
...
>>> len(qresult)
100
>>> filtered = qresult.hit_filter(desc_filter)
>>> len(filtered)
39
>>> print(filtered[:4])
Program: blastn (2.2.27+)
  Query: 33211 (61)
         mir_1
 Target: refseq_rna
   Hits: ----  -----  ----------------------------------------------------------
            #  # HSP  ID + description
         ----  -----  ----------------------------------------------------------
            0      1  gi|262205317|ref|NR_030195.1|  Homo sapiens microRNA 52...
            1      2  gi|262205330|ref|NR_030198.1|  Homo sapiens microRNA 52...
            2      1  gi|262205302|ref|NR_030191.1|  Homo sapiens microRNA 51...
            3      1  gi|262205451|ref|NR_030222.1|  Homo sapiens microRNA 51...
Note that instance attributes (other than the hits) from the unfiltered
QueryResult are retained in the filtered object.
>>> qresult.program == filtered.program
True
>>> qresult.target == filtered.target
True

hit_map(self, func=None)

Creates a new QueryResult object, mapping the given function to its
Hits.
Arguments:
func -- Callback function that accepts a Hit object as its parameter and
        also returns a Hit object.
Here is an example of using `hit_map` with a function that discards all
HSPs in a Hit except for the first one:
>>> from Bio import SearchIO
>>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml'))
>>> print(qresult[:8])
Program: blastn (2.2.27+)
  Query: 33211 (61)
         mir_1
 Target: refseq_rna
   Hits: ----  -----  ----------------------------------------------------------
            #  # HSP  ID + description
         ----  -----  ----------------------------------------------------------
            0      1  gi|262205317|ref|NR_030195.1|  Homo sapiens microRNA 52...
            1      1  gi|301171311|ref|NR_035856.1|  Pan troglodytes microRNA...
            2      1  gi|270133242|ref|NR_032573.1|  Macaca mulatta microRNA ...
            3      2  gi|301171322|ref|NR_035857.1|  Pan troglodytes microRNA...
            4      1  gi|301171267|ref|NR_035851.1|  Pan troglodytes microRNA...
            5      2  gi|262205330|ref|NR_030198.1|  Homo sapiens microRNA 52...
            6      1  gi|262205302|ref|NR_030191.1|  Homo sapiens microRNA 51...
            7      1  gi|301171259|ref|NR_035850.1|  Pan troglodytes microRNA...
>>> top_hsp = lambda hit: hit[:1]
>>> mapped_qresult = qresult.hit_map(top_hsp)
>>> print(mapped_qresult[:8])
Program: blastn (2.2.27+)
  Query: 33211 (61)
         mir_1
 Target: refseq_rna
   Hits: ----  -----  ----------------------------------------------------------
            #  # HSP  ID + description
         ----  -----  ----------------------------------------------------------
            0      1  gi|262205317|ref|NR_030195.1|  Homo sapiens microRNA 52...
            1      1  gi|301171311|ref|NR_035856.1|  Pan troglodytes microRNA...
            2      1  gi|270133242|ref|NR_032573.1|  Macaca mulatta microRNA ...
            3      1  gi|301171322|ref|NR_035857.1|  Pan troglodytes microRNA...
            4      1  gi|301171267|ref|NR_035851.1|  Pan troglodytes microRNA...
            5      1  gi|262205330|ref|NR_030198.1|  Homo sapiens microRNA 52...
            6      1  gi|262205302|ref|NR_030191.1|  Homo sapiens microRNA 51...
            7      1  gi|301171259|ref|NR_035850.1|  Pan troglodytes microRNA...

hsp_filter(self, func=None)

Creates a new QueryResult object whose HSP objects pass the filter
function.
`hsp_filter` is the same as `hit_filter`, except that it filters
directly on each HSP object in every Hit. If the filtering removes
all HSP objects in a given Hit, the entire Hit will be discarded. This
will result in the QueryResult having fewer Hits after filtering.
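How filtering at the HSP level can shrink the number of hits is easiest to see on a small mapping. The sketch below is an assumption-laden illustration, not SearchIO's code: the free function `hsp_filter` is hypothetical, and made-up E-values stand in for HSP objects.

```python
from collections import OrderedDict


def hsp_filter(hsps_by_hit_id, func):
    """Return a new mapping that keeps only HSPs for which func is true."""
    filtered = OrderedDict()
    for hit_id, hsps in hsps_by_hit_id.items():
        kept = [hsp for hsp in hsps if func(hsp)]
        if kept:  # a hit left without HSPs is dropped entirely
            filtered[hit_id] = kept
    return filtered


# Made-up E-values standing in for HSP objects.
evalues_by_hit_id = OrderedDict([
    ("hit_a", [1e-30, 0.5]),
    ("hit_b", [2.0]),
    ("hit_c", [1e-10]),
])

significant = hsp_filter(evalues_by_hit_id, lambda evalue: evalue < 1e-5)
print(list(significant.items()))
```

Here `hit_a` loses one of its two entries, while `hit_b` disappears because none of its entries pass the filter.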

hsp_map(self, func=None)

Creates a new QueryResult object, mapping the given function to its
HSPs.
`hsp_map` is the same as `hit_map`, except that it applies the given
function to all HSP objects in every Hit, instead of the Hit objects.

pop(self, hit_key=-1, default=object())

Removes the specified hit key and returns the Hit object.
Arguments:
hit_key -- Integer index or string of hit key that points to a Hit
           object.
default -- Value that will be returned if the Hit object with the
           specified index or hit key is not found.
By default, `pop` will remove and return the last Hit object in the
QueryResult object. To remove a specific Hit object, you can use its
integer index or hit key.
>>> from Bio import SearchIO
>>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml'))
>>> len(qresult)
100
>>> for hit in qresult[:5]:
...     print(hit.id)
...
gi|262205317|ref|NR_030195.1|
gi|301171311|ref|NR_035856.1|
gi|270133242|ref|NR_032573.1|
gi|301171322|ref|NR_035857.1|
gi|301171267|ref|NR_035851.1|
>>> # remove the last hit
>>> qresult.pop()
Hit(id='gi|397513516|ref|XM_003827011.1|', query_id='33211', 1 hsps)
>>> # remove the first hit
>>> qresult.pop(0)
Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps)
>>> # remove hit with the given ID
>>> qresult.pop('gi|301171322|ref|NR_035857.1|')
Hit(id='gi|301171322|ref|NR_035857.1|', query_id='33211', 2 hsps)

index(self, hit_key)

Returns the index of a given hit key, zero-based.
Arguments:
hit_key -- Hit ID string to look up.
This method is useful for finding out the integer index (usually
correlated with search rank) of a given hit key.
>>> from Bio import SearchIO
>>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml'))
>>> qresult.index('gi|301171259|ref|NR_035850.1|')
7

sort(self, key=None, reverse=False, in_place=True)

Sorts the Hit objects.
Arguments:
key -- Function used to sort the Hit objects.
reverse -- Boolean, whether to reverse the sorting or not.
in_place -- Boolean, whether to perform sorting in place (in the same
            object) or not (creating a new object).
`sort` defaults to sorting in place, to mimic Python's `list.sort`
method. If you set the `in_place` argument to False, it will
return a new, sorted QueryResult object and keep the initial one
unsorted.
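The difference between the two modes can be sketched on a plain list. This is an illustration only, not SearchIO's implementation: `sort_hits` is a hypothetical helper, and returning None from the in-place branch is an assumption modelled on `list.sort`.

```python
def sort_hits(hits, key=None, reverse=False, in_place=True):
    """Sort a list of hits in place, or return a sorted copy."""
    if in_place:
        hits.sort(key=key, reverse=reverse)
        return None  # mirrors list.sort, which returns None
    return sorted(hits, key=key, reverse=reverse)


hit_ids = ["NR_035856.1", "NR_030195.1", "NR_032573.1"]

# in_place=False: a new sorted list is returned, the input keeps its order.
sorted_copy = sort_hits(hit_ids, in_place=False)
untouched = list(hit_ids)

# Default (in_place=True): the input list itself is reordered.
sort_hits(hit_ids, reverse=True)
print(sorted_copy, untouched, hit_ids)
```

Passing `in_place=False` is the safer choice when the original search rank order is still needed afterwards.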

hits
    Hit objects contained in the QueryResult.
    Get Method: unreachable.hits(self)

hit_keys
    Hit IDs of the Hit objects contained in the QueryResult.
    Get Method: unreachable.hit_keys(self)

items
    List of tuples of Hit IDs and Hit objects.
    Get Method: unreachable.items(self)

id
    QueryResult ID string
    Get Method: unreachable.getter(self)
    Set Method: unreachable.setter(self, value)

description
    QueryResult description
    Get Method: unreachable.getter(self)
    Set Method: unreachable.setter(self, value)

hsps
    HSP objects contained in the QueryResult.
    Get Method: unreachable.hsps(self)

fragments
    HSPFragment objects contained in the QueryResult.
    Get Method: unreachable.fragments(self)