
Source Code for Module Bio.SearchIO._model.query

# Copyright 2012 by Wibowo Arindrarto.  All rights reserved.
# This code is part of the Biopython distribution and governed by its
# license.  Please see the LICENSE file that should have been included
# as part of this package.

"""Bio.SearchIO object to model search results from a single query."""

from __future__ import print_function
from Bio._py3k import basestring

from copy import deepcopy
from itertools import chain

from Bio._py3k import OrderedDict
from Bio._py3k import filter

from Bio._utils import trim_str
from Bio.SearchIO._utils import optionalcascade

from ._base import _BaseSearchObject
from .hit import Hit


class QueryResult(_BaseSearchObject):
25 26 """Class representing search results from a single query. 27 28 QueryResult is the container object that stores all search hits from a 29 single search query. It is the top-level object returned by SearchIO's two 30 main functions, ``read`` and ``parse``. Depending on the search results and 31 search output format, a QueryResult object will contain zero or more Hit 32 objects (see Hit). 33 34 You can take a quick look at a QueryResult's contents and attributes by 35 invoking ``print`` on it:: 36 37 >>> from Bio import SearchIO 38 >>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml')) 39 >>> print(qresult) 40 Program: blastn (2.2.27+) 41 Query: 33211 (61) 42 mir_1 43 Target: refseq_rna 44 Hits: ---- ----- ---------------------------------------------------------- 45 # # HSP ID + description 46 ---- ----- ---------------------------------------------------------- 47 0 1 gi|262205317|ref|NR_030195.1| Homo sapiens microRNA 52... 48 1 1 gi|301171311|ref|NR_035856.1| Pan troglodytes microRNA... 49 2 1 gi|270133242|ref|NR_032573.1| Macaca mulatta microRNA ... 50 3 2 gi|301171322|ref|NR_035857.1| Pan troglodytes microRNA... 51 4 1 gi|301171267|ref|NR_035851.1| Pan troglodytes microRNA... 52 5 2 gi|262205330|ref|NR_030198.1| Homo sapiens microRNA 52... 53 6 1 gi|262205302|ref|NR_030191.1| Homo sapiens microRNA 51... 54 7 1 gi|301171259|ref|NR_035850.1| Pan troglodytes microRNA... 55 8 1 gi|262205451|ref|NR_030222.1| Homo sapiens microRNA 51... 56 9 2 gi|301171447|ref|NR_035871.1| Pan troglodytes microRNA... 57 10 1 gi|301171276|ref|NR_035852.1| Pan troglodytes microRNA... 58 11 1 gi|262205290|ref|NR_030188.1| Homo sapiens microRNA 51... 59 ... 60 61 If you just want to know how many hits a QueryResult has, you can invoke 62 ``len`` on it. 
Alternatively, you can simply type its name in the interpreter:: 63 64 >>> len(qresult) 65 100 66 >>> qresult 67 QueryResult(id='33211', 100 hits) 68 69 QueryResult behaves like a hybrid of Python's built-in list and dictionary. 70 You can retrieve its items (Hit objects) using the integer index of the 71 item, just like regular Python lists:: 72 73 >>> first_hit = qresult[0] 74 >>> first_hit 75 Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps) 76 77 You can slice QueryResult objects as well. Slicing will return a new 78 QueryResult object containing only the sliced hits:: 79 80 >>> sliced_qresult = qresult[:3] # slice the first three hits 81 >>> len(qresult) 82 100 83 >>> len(sliced_qresult) 84 3 85 >>> print(sliced_qresult) 86 Program: blastn (2.2.27+) 87 Query: 33211 (61) 88 mir_1 89 Target: refseq_rna 90 Hits: ---- ----- ---------------------------------------------------------- 91 # # HSP ID + description 92 ---- ----- ---------------------------------------------------------- 93 0 1 gi|262205317|ref|NR_030195.1| Homo sapiens microRNA 52... 94 1 1 gi|301171311|ref|NR_035856.1| Pan troglodytes microRNA... 95 2 1 gi|270133242|ref|NR_032573.1| Macaca mulatta microRNA ... 96 97 Like Python dictionaries, you can also retrieve hits using the hit's ID. 98 This is useful for retrieving hits that you know should exist in a given 99 search:: 100 101 >>> hit = qresult['gi|262205317|ref|NR_030195.1|'] 102 >>> hit 103 Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps) 104 105 You can also replace a Hit in QueryResult with another Hit using either the 106 integer index or hit key string. Note that the replacing object must be a 107 Hit that has the same ``query_id`` property as the QueryResult object. 
108 109 If you're not sure whether a QueryResult contains a particular hit, you can 110 use the hit ID to check for membership first:: 111 112 >>> 'gi|262205317|ref|NR_030195.1|' in qresult 113 True 114 >>> 'gi|262380031|ref|NR_023426.1|' in qresult 115 False 116 117 Or, if you just want to know the rank / position of a given hit, you can 118 use the hit ID as an argument for the ``index`` method. Note that the values 119 returned will be zero-based. So zero (0) means the hit is the first in the 120 QueryResult, three (3) means the hit is the fourth item, and so on. If the 121 hit does not exist in the QueryResult, a ``ValueError`` will be raised. 122 123 >>> qresult.index('gi|262205317|ref|NR_030195.1|') 124 0 125 >>> qresult.index('gi|262205330|ref|NR_030198.1|') 126 5 127 >>> qresult.index('gi|262380031|ref|NR_023426.1|') 128 Traceback (most recent call last): 129 ... 130 ValueError: ... 131 132 To ease working with a large number of hits, QueryResult has several 133 ``filter`` and ``map`` methods, analogous to Python's built-in functions with 134 the same names. There are ``filter`` and ``map`` methods available for 135 operations over both Hit objects or HSP objects. As an example, here we are 136 using the ``hit_map`` method to rename all hit IDs within a QueryResult:: 137 138 >>> def renamer(hit): 139 ... hit.id = hit.id.split('|')[3] 140 ... return hit 141 >>> mapped_qresult = qresult.hit_map(renamer) 142 >>> print(mapped_qresult) 143 Program: blastn (2.2.27+) 144 Query: 33211 (61) 145 mir_1 146 Target: refseq_rna 147 Hits: ---- ----- ---------------------------------------------------------- 148 # # HSP ID + description 149 ---- ----- ---------------------------------------------------------- 150 0 1 NR_030195.1 Homo sapiens microRNA 520b (MIR520B), micr... 151 1 1 NR_035856.1 Pan troglodytes microRNA mir-520b (MIR520B... 152 2 1 NR_032573.1 Macaca mulatta microRNA mir-519a (MIR519A)... 153 ... 

    The principle for the other ``map`` and ``filter`` methods is similar: they
    accept a function, apply it, and return a new QueryResult object.

    There are also other methods useful for working with list-like objects:
    ``append``, ``pop``, and ``sort``. More details and examples are available in
    their respective documentation.

    Finally, just like Python lists and dictionaries, QueryResult objects are
    iterable. Iteration over QueryResults will yield Hit objects::

        >>> for hit in qresult[:4]:     # iterate over the first four items
        ...     hit
        ...
        Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps)
        Hit(id='gi|301171311|ref|NR_035856.1|', query_id='33211', 1 hsps)
        Hit(id='gi|270133242|ref|NR_032573.1|', query_id='33211', 1 hsps)
        Hit(id='gi|301171322|ref|NR_035857.1|', query_id='33211', 2 hsps)

    If you need access to all the hits in a QueryResult object, you can get
    them in a list using the ``hits`` property. Similarly, access to all hit IDs is
    available through the ``hit_keys`` property.

        >>> qresult.hits
        [Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps), ...]
        >>> qresult.hit_keys
        ['gi|262205317|ref|NR_030195.1|', 'gi|301171311|ref|NR_035856.1|', ...]

    """

    # attributes we don't want to transfer when creating a new QueryResult class
    # from this one
    _NON_STICKY_ATTRS = ('_items', '__alt_hit_ids', )

    def __init__(self, hits=(), id=None,
                 hit_key_function=lambda hit: hit.id):
        """Initializes a QueryResult object.

        :param id: query sequence ID
        :type id: string
        :param hits: iterator yielding Hit objects
        :type hits: iterable
        :param hit_key_function: function to define hit keys
        :type hit_key_function: callable, accepts Hit objects, returns string

        """
        # default values
        self._id = id
        self._hit_key_function = hit_key_function
        self._items = OrderedDict()
        self._description = None
        self.__alt_hit_ids = {}
        self.program = '<unknown program>'
        self.target = '<unknown target>'
        self.version = '<unknown version>'

        # validate Hit objects and fill up self._items
        for hit in hits:
            # validation is handled by __setitem__
            self.append(hit)

    # handle Python 2 OrderedDict behavior
    if hasattr(OrderedDict, 'iteritems'):

        def __iter__(self):
            return self.iterhits()

        @property
        def hits(self):
            """Hit objects contained in the QueryResult."""
            return self._items.values()

        @property
        def hit_keys(self):
            """Hit IDs of the Hit objects contained in the QueryResult."""
            return self._items.keys()

        @property
        def items(self):
            """List of tuples of Hit IDs and Hit objects."""
            return self._items.items()

        def iterhits(self):
            """Returns an iterator over the Hit objects."""
            for hit in self._items.itervalues():
                yield hit

        def iterhit_keys(self):
            """Returns an iterator over the ID of the Hit objects."""
            for hit_id in self._items:
                yield hit_id

        def iteritems(self):
            """Returns an iterator yielding tuples of Hit ID and Hit objects."""
            for item in self._items.iteritems():
                yield item

    else:

        def __iter__(self):
            return iter(self.hits)

        @property
        def hits(self):
            """Hit objects contained in the QueryResult."""
            return list(self._items.values())

        @property
        def hit_keys(self):
            """Hit IDs of the Hit objects contained in the QueryResult."""
            return list(self._items.keys())

        @property
        def items(self):
            """List of tuples of Hit IDs and Hit objects."""
            return list(self._items.items())

        def iterhits(self):
            """Returns an iterator over the Hit objects."""
            for hit in self._items.values():
                yield hit

        def iterhit_keys(self):
            """Returns an iterator over the ID of the Hit objects."""
            for hit_id in self._items:
                yield hit_id

        def iteritems(self):
            """Returns an iterator yielding tuples of Hit ID and Hit objects."""
            for item in self._items.items():
                yield item

    def __contains__(self, hit_key):
        if isinstance(hit_key, Hit):
            return self._hit_key_function(hit_key) in self._items
        return hit_key in self._items or hit_key in self.__alt_hit_ids

    def __len__(self):
        return len(self._items)

    # Python 3:
    def __bool__(self):
        return bool(self._items)

    # Python 2:
    __nonzero__ = __bool__

    def __repr__(self):
        return "QueryResult(id=%r, %r hits)" % (self.id, len(self))

304 - def __str__(self):
305 lines = [] 306 307 # set program and version line 308 lines.append('Program: %s (%s)' % (self.program, self.version)) 309 310 # set query id line 311 qid_line = ' Query: %s' % self.id 312 if hasattr(self, 'seq_len'): 313 qid_line += ' (%i)' % self.seq_len 314 if self.description: 315 qid_line += trim_str('\n %s' % self.description, 80, '...') 316 lines.append(qid_line) 317 318 # set target line 319 lines.append(' Target: %s' % self.target) 320 321 # set hit lines 322 if not self.hits: 323 lines.append(' Hits: 0') 324 else: 325 lines.append(' Hits: %s %s %s' % ('-' * 4, '-' * 5, '-' * 58)) 326 pattern = '%13s %5s %s' 327 lines.append(pattern % ('#', '# HSP', 'ID + description')) 328 lines.append(pattern % ('-' * 4, '-' * 5, '-' * 58)) 329 for idx, hit in enumerate(self.hits): 330 if idx < 30: 331 hid_line = '%s %s' % (hit.id, hit.description) 332 if len(hid_line) > 58: 333 hid_line = hid_line[:55] + '...' 334 lines.append(pattern % (idx, str(len(hit)), hid_line)) 335 elif idx > len(self.hits) - 4: 336 hid_line = '%s %s' % (hit.id, hit.description) 337 if len(hid_line) > 58: 338 hid_line = hid_line[:55] + '...' 339 lines.append(pattern % (idx, str(len(hit)), hid_line)) 340 elif idx == 30: 341 lines.append('%14s' % '~~~') 342 343 return '\n'.join(lines)

    def __getitem__(self, hit_key):
        # retrieval using slice objects returns another QueryResult object
        if isinstance(hit_key, slice):
            # should we return just a list of Hits instead of a full blown
            # QueryResult object if it's a slice?
            hits = list(self.hits)[hit_key]
            obj = self.__class__(hits, self.id, self._hit_key_function)
            self._transfer_attrs(obj)
            return obj

        # if key is an int, then retrieve the Hit at the int index
        elif isinstance(hit_key, int):
            length = len(self)
            if 0 <= hit_key < length:
                for idx, item in enumerate(self.iterhits()):
                    if idx == hit_key:
                        return item
            elif -1 * length <= hit_key < 0:
                for idx, item in enumerate(self.iterhits()):
                    if length + hit_key == idx:
                        return item
            raise IndexError("list index out of range")

        # if key is a string, then do a regular dictionary retrieval
        # falling back on alternative hit IDs
        try:
            return self._items[hit_key]
        except KeyError:
            return self._items[self.__alt_hit_ids[hit_key]]

    def __setitem__(self, hit_key, hit):
        # only accept string keys
        if not isinstance(hit_key, basestring):
            raise TypeError("QueryResult object keys must be a string.")
        # hit must be a Hit object
        if not isinstance(hit, Hit):
            raise TypeError("QueryResult objects can only contain Hit objects.")
        qid = self.id
        hqid = hit.query_id
        # and it must have the same query ID as this object's ID,
        # unless this object's ID is None (the default for empty objects), in
        # which case we adopt the hit's query ID as the query ID
        if qid is not None:
            if hqid != qid:
                raise ValueError("Expected Hit with query ID %r, found %r "
                                 "instead." % (qid, hqid))
        else:
            self.id = hqid
        # same thing with descriptions
        qdesc = self.description
        hqdesc = hit.query_description
        if qdesc is not None:
            if hqdesc != qdesc:
                raise ValueError("Expected Hit with query description %r, "
                                 "found %r instead." % (qdesc, hqdesc))
        else:
            self.description = hqdesc

        # remove existing alt_id references, if hit_key already exists
        if hit_key in self._items:
            for alt_key in self._items[hit_key].id_all[1:]:
                del self.__alt_hit_ids[alt_key]

        # if hit_key is already present as an alternative ID
        # delete it from the alternative ID dict
        if hit_key in self.__alt_hit_ids:
            del self.__alt_hit_ids[hit_key]

        self._items[hit_key] = hit
        for alt_id in hit.id_all[1:]:
            self.__alt_hit_ids[alt_id] = hit_key

    def __delitem__(self, hit_key):
        # if hit_key is an integer, get the corresponding key first
        # and put it into a list
        if isinstance(hit_key, int):
            hit_keys = [list(self.hit_keys)[hit_key]]
        # the same, if it's a slice
        elif isinstance(hit_key, slice):
            hit_keys = list(self.hit_keys)[hit_key]
        # otherwise put it in a list
        else:
            hit_keys = [hit_key]

        for key in hit_keys:
            deleted = False
            if key in self._items:
                del self._items[key]
                deleted = True
            if key in self.__alt_hit_ids:
                del self._items[self.__alt_hit_ids[key]]
                del self.__alt_hit_ids[key]
                deleted = True
            if not deleted:
                # report the missing key itself; '%r'.format(key) would
                # always produce the literal string '%r'
                raise KeyError(repr(key))
        return

    # properties #
    id = optionalcascade('_id', 'query_id', """QueryResult ID string""")
    description = optionalcascade('_description', 'query_description',
                                  """QueryResult description""")

    @property
    def hsps(self):
        """HSP objects contained in the QueryResult."""
        return [hsp for hsp in chain(*self.hits)]

    @property
    def fragments(self):
        """HSPFragment objects contained in the QueryResult."""
        return [frag for frag in chain(*self.hsps)]

    # public methods #
    def absorb(self, hit):
        """Adds a Hit object to the end of QueryResult. If the QueryResult
        already has a Hit with the same ID, append the new Hit's HSPs into
        the existing Hit.

        :param hit: object to absorb
        :type hit: Hit

        This method is used for file formats that may output the same Hit in
        separate places, such as BLAT or Exonerate. In both formats, Hits
        on different strands are reported in different places. However,
        SearchIO considers them to be the same Hit, since a Hit object should
        represent all database entries with the same ID, regardless of strand
        orientation.

        """
        try:
            self.append(hit)
        except ValueError:
            assert hit.id in self
            for hsp in hit:
                self[hit.id].append(hsp)

    def append(self, hit):
        """Adds a Hit object to the end of QueryResult.

        :param hit: object to append
        :type hit: Hit

        Any Hit object appended must have the same ``query_id`` property as the
        QueryResult's ``id`` property. If the hit key already exists, a
        ``ValueError`` will be raised.

        """
        # if a custom hit_key_function is supplied, use it to define the hit key
        if self._hit_key_function is not None:
            hit_key = self._hit_key_function(hit)
        else:
            hit_key = hit.id

        if hit_key not in self and all(pid not in self for pid in hit.id_all[1:]):
            self[hit_key] = hit
        else:
            raise ValueError("The ID or alternative IDs of Hit %r exists in "
                             "this QueryResult." % hit_key)
502
503 - def hit_filter(self, func=None):
504 """Creates a new QueryResult object whose Hit objects pass the filter 505 function. 506 507 :param func: filter function 508 :type func: callable, accepts Hit, returns bool 509 510 Here is an example of using ``hit_filter`` to select Hits whose 511 description begins with the string 'Homo sapiens', case sensitive:: 512 513 >>> from Bio import SearchIO 514 >>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml')) 515 >>> def desc_filter(hit): 516 ... return hit.description.startswith('Homo sapiens') 517 ... 518 >>> len(qresult) 519 100 520 >>> filtered = qresult.hit_filter(desc_filter) 521 >>> len(filtered) 522 39 523 >>> print(filtered[:4]) 524 Program: blastn (2.2.27+) 525 Query: 33211 (61) 526 mir_1 527 Target: refseq_rna 528 Hits: ---- ----- ---------------------------------------------------------- 529 # # HSP ID + description 530 ---- ----- ---------------------------------------------------------- 531 0 1 gi|262205317|ref|NR_030195.1| Homo sapiens microRNA 52... 532 1 2 gi|262205330|ref|NR_030198.1| Homo sapiens microRNA 52... 533 2 1 gi|262205302|ref|NR_030191.1| Homo sapiens microRNA 51... 534 3 1 gi|262205451|ref|NR_030222.1| Homo sapiens microRNA 51... 535 536 Note that instance attributes (other than the hits) from the unfiltered 537 QueryResult are retained in the filtered object. 538 539 >>> qresult.program == filtered.program 540 True 541 >>> qresult.target == filtered.target 542 True 543 544 """ 545 hits = list(filter(func, self.hits)) 546 obj = self.__class__(hits, self.id, self._hit_key_function) 547 self._transfer_attrs(obj) 548 return obj
549
550 - def hit_map(self, func=None):
551 """Creates a new QueryResult object, mapping the given function to its 552 Hits. 553 554 :param func: map function 555 :type func: callable, accepts Hit, returns Hit 556 557 Here is an example of using ``hit_map`` with a function that discards all 558 HSPs in a Hit except for the first one:: 559 560 >>> from Bio import SearchIO 561 >>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml')) 562 >>> print(qresult[:8]) 563 Program: blastn (2.2.27+) 564 Query: 33211 (61) 565 mir_1 566 Target: refseq_rna 567 Hits: ---- ----- ---------------------------------------------------------- 568 # # HSP ID + description 569 ---- ----- ---------------------------------------------------------- 570 0 1 gi|262205317|ref|NR_030195.1| Homo sapiens microRNA 52... 571 1 1 gi|301171311|ref|NR_035856.1| Pan troglodytes microRNA... 572 2 1 gi|270133242|ref|NR_032573.1| Macaca mulatta microRNA ... 573 3 2 gi|301171322|ref|NR_035857.1| Pan troglodytes microRNA... 574 4 1 gi|301171267|ref|NR_035851.1| Pan troglodytes microRNA... 575 5 2 gi|262205330|ref|NR_030198.1| Homo sapiens microRNA 52... 576 6 1 gi|262205302|ref|NR_030191.1| Homo sapiens microRNA 51... 577 7 1 gi|301171259|ref|NR_035850.1| Pan troglodytes microRNA... 578 579 >>> top_hsp = lambda hit: hit[:1] 580 >>> mapped_qresult = qresult.hit_map(top_hsp) 581 >>> print(mapped_qresult[:8]) 582 Program: blastn (2.2.27+) 583 Query: 33211 (61) 584 mir_1 585 Target: refseq_rna 586 Hits: ---- ----- ---------------------------------------------------------- 587 # # HSP ID + description 588 ---- ----- ---------------------------------------------------------- 589 0 1 gi|262205317|ref|NR_030195.1| Homo sapiens microRNA 52... 590 1 1 gi|301171311|ref|NR_035856.1| Pan troglodytes microRNA... 591 2 1 gi|270133242|ref|NR_032573.1| Macaca mulatta microRNA ... 592 3 1 gi|301171322|ref|NR_035857.1| Pan troglodytes microRNA... 593 4 1 gi|301171267|ref|NR_035851.1| Pan troglodytes microRNA... 
594 5 1 gi|262205330|ref|NR_030198.1| Homo sapiens microRNA 52... 595 6 1 gi|262205302|ref|NR_030191.1| Homo sapiens microRNA 51... 596 7 1 gi|301171259|ref|NR_035850.1| Pan troglodytes microRNA... 597 598 """ 599 hits = [deepcopy(hit) for hit in self.hits] 600 if func is not None: 601 hits = [func(x) for x in hits] 602 obj = self.__class__(hits, self.id, self._hit_key_function) 603 self._transfer_attrs(obj) 604 return obj

    def hsp_filter(self, func=None):
        """Creates a new QueryResult object whose HSP objects pass the filter
        function.

        ``hsp_filter`` is the same as ``hit_filter``, except that it filters
        directly on each HSP object in every Hit. If the filtering removes
        all HSP objects in a given Hit, the entire Hit will be discarded. This
        will result in the QueryResult having fewer Hit objects after
        filtering.

        """
        hits = [x for x in (hit.filter(func) for hit in self.hits) if x]
        obj = self.__class__(hits, self.id, self._hit_key_function)
        self._transfer_attrs(obj)
        return obj

    def hsp_map(self, func=None):
        """Creates a new QueryResult object, mapping the given function to its
        HSPs.

        ``hsp_map`` is the same as ``hit_map``, except that it applies the given
        function to all HSP objects in every Hit, instead of the Hit objects.

        """
        hits = [x for x in (hit.map(func) for hit in list(self.hits)[:]) if x]
        obj = self.__class__(hits, self.id, self._hit_key_function)
        self._transfer_attrs(obj)
        return obj

    # marker for default self.pop() return value
    # this method is adapted from Python's built in OrderedDict.pop
    # implementation
    __marker = object()

    def pop(self, hit_key=-1, default=__marker):
        """Removes the Hit with the specified hit key and returns it.

        :param hit_key: key of the Hit object to return
        :type hit_key: int or string
        :param default: return value if no Hit exists with the given key
        :type default: object

        By default, ``pop`` will remove and return the last Hit object in the
        QueryResult object. To remove a specific Hit object, you can use its
        integer index or hit key.

            >>> from Bio import SearchIO
            >>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml'))
            >>> len(qresult)
            100
            >>> for hit in qresult[:5]:
            ...     print(hit.id)
            ...
            gi|262205317|ref|NR_030195.1|
            gi|301171311|ref|NR_035856.1|
            gi|270133242|ref|NR_032573.1|
            gi|301171322|ref|NR_035857.1|
            gi|301171267|ref|NR_035851.1|

            # remove the last hit
            >>> qresult.pop()
            Hit(id='gi|397513516|ref|XM_003827011.1|', query_id='33211', 1 hsps)

            # remove the first hit
            >>> qresult.pop(0)
            Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps)

            # remove hit with the given ID
            >>> qresult.pop('gi|301171322|ref|NR_035857.1|')
            Hit(id='gi|301171322|ref|NR_035857.1|', query_id='33211', 2 hsps)

        """
        # if key is an integer (index)
        # get the ID for the Hit object at that index
        if isinstance(hit_key, int):
            # raise the appropriate error if there is no hit
            if not self:
                raise IndexError("pop from empty list")
            hit_key = list(self.hit_keys)[hit_key]

        try:
            hit = self._items.pop(hit_key)
            # remove all alternative IDs of the popped hit
            for alt_id in hit.id_all[1:]:
                try:
                    del self.__alt_hit_ids[alt_id]
                except KeyError:
                    pass
            return hit
        except KeyError:
            if hit_key in self.__alt_hit_ids:
                return self.pop(self.__alt_hit_ids[hit_key], default)
            # if key doesn't exist and no default is set, raise a KeyError
            if default is self.__marker:
                raise KeyError(hit_key)
        # if key doesn't exist but a default is set, return the default value
        return default

    def index(self, hit_key):
        """Returns the index of a given hit key, zero-based.

        :param hit_key: hit ID
        :type hit_key: string

        This method is useful for finding out the integer index (usually
        correlated with search rank) of a given hit key.

            >>> from Bio import SearchIO
            >>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml'))
            >>> qresult.index('gi|301171259|ref|NR_035850.1|')
            7

        """
        if isinstance(hit_key, Hit):
            return list(self.hit_keys).index(hit_key.id)
        try:
            return list(self.hit_keys).index(hit_key)
        except ValueError:
            if hit_key in self.__alt_hit_ids:
                return self.index(self.__alt_hit_ids[hit_key])
            raise

    def sort(self, key=None, reverse=False, in_place=True):
        # no cmp argument to make sort more Python 3-like
        """Sorts the Hit objects.

        :param key: sorting function
        :type key: callable, accepts Hit, returns key for sorting
        :param reverse: whether to reverse the sorting results or not
        :type reverse: bool
        :param in_place: whether to sort in place or not
        :type in_place: bool

        ``sort`` defaults to sorting in place, to mimic Python's ``list.sort``
        method. If you set the ``in_place`` argument to False, it will
        return a new, sorted QueryResult object and keep the initial one
        unsorted.

        """
        if key is None:
            # if reverse is True, reverse the hits
            if reverse:
                sorted_hits = list(self.hits)[::-1]
            # otherwise (default options) make a copy of the hits
            else:
                sorted_hits = list(self.hits)[:]
        else:
            sorted_hits = sorted(self.hits, key=key, reverse=reverse)

        # if sorting is in-place, don't create a new QueryResult object
        if in_place:
            new_hits = OrderedDict()
            for hit in sorted_hits:
                new_hits[self._hit_key_function(hit)] = hit
            self._items = new_hits
        # otherwise, return a new sorted QueryResult object
        else:
            obj = self.__class__(sorted_hits, self.id, self._hit_key_function)
            self._transfer_attrs(obj)
            return obj


# if not used as a module, run the doctest
if __name__ == "__main__":
    from Bio._utils import run_doctest
    run_doctest()
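
The list/dictionary hybrid lookup used by ``__getitem__`` and ``__delitem__`` above can be tried without Biopython installed. The following is only a minimal sketch: ``MiniResult`` is a hypothetical stand-in invented for this illustration, it stores plain values instead of Hit objects, and it leaves out query ID validation and alternative hit IDs. It is not the Biopython implementation.

```python
from collections import OrderedDict


class MiniResult(object):
    """Hypothetical stand-in for QueryResult; not part of Biopython."""

    def __init__(self, pairs=()):
        # insertion order doubles as the list-like position of each item
        self._items = OrderedDict(pairs)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, key):
        # a slice returns a new container holding only the selected items
        if isinstance(key, slice):
            return MiniResult(list(self._items.items())[key])
        # an integer is treated as a position, as in a list
        if isinstance(key, int):
            return list(self._items.values())[key]
        # anything else is treated as a dictionary key
        return self._items[key]

    def __delitem__(self, key):
        # translate positions and slices into string keys first
        if isinstance(key, int):
            keys = [list(self._items)[key]]
        elif isinstance(key, slice):
            keys = list(self._items)[key]
        else:
            keys = [key]
        for k in keys:
            del self._items[k]


result = MiniResult([("hit_a", 1), ("hit_b", 2), ("hit_c", 3)])
first = result[0]         # positional access
by_key = result["hit_b"]  # key access
sliced = result[:2]       # slicing yields another MiniResult
del result[-1]            # removes the entry stored under "hit_c"
```

Converting the keys to a list on every positional access keeps the sketch short at the cost of a linear-time lookup, a trade-off the class above also makes when it calls ``list(self.hit_keys)``.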