
Source Code for Module Bio.SearchIO._model.query

# Copyright 2012 by Wibowo Arindrarto.  All rights reserved.
# This code is part of the Biopython distribution and governed by its
# license.  Please see the LICENSE file that should have been included
# as part of this package.

"""Bio.SearchIO object to model search results from a single query."""

from __future__ import print_function

from copy import deepcopy
from itertools import chain
from collections import OrderedDict

from Bio._py3k import filter
from Bio._py3k import basestring

from Bio._utils import trim_str
from Bio.SearchIO._utils import optionalcascade

from ._base import _BaseSearchObject
from .hit import Hit


class QueryResult(_BaseSearchObject):
    """Class representing search results from a single query.

    QueryResult is the container object that stores all search hits from a
    single search query. It is the top-level object returned by SearchIO's two
    main functions, ``read`` and ``parse``. Depending on the search results and
    search output format, a QueryResult object will contain zero or more Hit
    objects (see Hit).

    You can take a quick look at a QueryResult's contents and attributes by
    invoking ``print`` on it::

        >>> from Bio import SearchIO
        >>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml'))
        >>> print(qresult)
        Program: blastn (2.2.27+)
          Query: 33211 (61)
                 mir_1
         Target: refseq_rna
           Hits: ----  -----  ----------------------------------------------------------
                    #  # HSP  ID + description
                 ----  -----  ----------------------------------------------------------
                    0      1  gi|262205317|ref|NR_030195.1|  Homo sapiens microRNA 52...
                    1      1  gi|301171311|ref|NR_035856.1|  Pan troglodytes microRNA...
                    2      1  gi|270133242|ref|NR_032573.1|  Macaca mulatta microRNA ...
                    3      2  gi|301171322|ref|NR_035857.1|  Pan troglodytes microRNA...
                    4      1  gi|301171267|ref|NR_035851.1|  Pan troglodytes microRNA...
                    5      2  gi|262205330|ref|NR_030198.1|  Homo sapiens microRNA 52...
                    6      1  gi|262205302|ref|NR_030191.1|  Homo sapiens microRNA 51...
                    7      1  gi|301171259|ref|NR_035850.1|  Pan troglodytes microRNA...
                    8      1  gi|262205451|ref|NR_030222.1|  Homo sapiens microRNA 51...
                    9      2  gi|301171447|ref|NR_035871.1|  Pan troglodytes microRNA...
                   10      1  gi|301171276|ref|NR_035852.1|  Pan troglodytes microRNA...
                   11      1  gi|262205290|ref|NR_030188.1|  Homo sapiens microRNA 51...
        ...

    If you just want to know how many hits a QueryResult has, you can invoke
    ``len`` on it. Alternatively, you can simply type its name in the
    interpreter::

        >>> len(qresult)
        100
        >>> qresult
        QueryResult(id='33211', 100 hits)

    QueryResult behaves like a hybrid of Python's built-in list and dictionary.
    You can retrieve its items (Hit objects) using the integer index of the
    item, just like regular Python lists::

        >>> first_hit = qresult[0]
        >>> first_hit
        Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps)

    You can slice QueryResult objects as well. Slicing will return a new
    QueryResult object containing only the sliced hits::

        >>> sliced_qresult = qresult[:3]    # slice the first three hits
        >>> len(qresult)
        100
        >>> len(sliced_qresult)
        3
        >>> print(sliced_qresult)
        Program: blastn (2.2.27+)
          Query: 33211 (61)
                 mir_1
         Target: refseq_rna
           Hits: ----  -----  ----------------------------------------------------------
                    #  # HSP  ID + description
                 ----  -----  ----------------------------------------------------------
                    0      1  gi|262205317|ref|NR_030195.1|  Homo sapiens microRNA 52...
                    1      1  gi|301171311|ref|NR_035856.1|  Pan troglodytes microRNA...
                    2      1  gi|270133242|ref|NR_032573.1|  Macaca mulatta microRNA ...

    Like Python dictionaries, you can also retrieve hits using the hit's ID.
    This is useful for retrieving hits that you know should exist in a given
    search::

        >>> hit = qresult['gi|262205317|ref|NR_030195.1|']
        >>> hit
        Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps)

    You can also replace a Hit in QueryResult with another Hit using either the
    integer index or hit key string. Note that the replacing object must be a
    Hit that has the same ``query_id`` property as the QueryResult object.

    If you're not sure whether a QueryResult contains a particular hit, you can
    use the hit ID to check for membership first::

        >>> 'gi|262205317|ref|NR_030195.1|' in qresult
        True
        >>> 'gi|262380031|ref|NR_023426.1|' in qresult
        False

    Or, if you just want to know the rank / position of a given hit, you can
    use the hit ID as an argument for the ``index`` method. Note that the values
    returned will be zero-based. So zero (0) means the hit is the first in the
    QueryResult, three (3) means the hit is the fourth item, and so on. If the
    hit does not exist in the QueryResult, a ``ValueError`` will be raised.

        >>> qresult.index('gi|262205317|ref|NR_030195.1|')
        0
        >>> qresult.index('gi|262205330|ref|NR_030198.1|')
        5
        >>> qresult.index('gi|262380031|ref|NR_023426.1|')
        Traceback (most recent call last):
        ...
        ValueError: ...

    To ease working with a large number of hits, QueryResult has several
    ``filter`` and ``map`` methods, analogous to Python's built-in functions with
    the same names. There are ``filter`` and ``map`` methods available for
    operations over both Hit objects and HSP objects. As an example, here we are
    using the ``hit_map`` method to rename all hit IDs within a QueryResult::

        >>> def renamer(hit):
        ...     hit.id = hit.id.split('|')[3]
        ...     return hit
        >>> mapped_qresult = qresult.hit_map(renamer)
        >>> print(mapped_qresult)
        Program: blastn (2.2.27+)
          Query: 33211 (61)
                 mir_1
         Target: refseq_rna
           Hits: ----  -----  ----------------------------------------------------------
                    #  # HSP  ID + description
                 ----  -----  ----------------------------------------------------------
                    0      1  NR_030195.1  Homo sapiens microRNA 520b (MIR520B), micr...
                    1      1  NR_035856.1  Pan troglodytes microRNA mir-520b (MIR520B...
                    2      1  NR_032573.1  Macaca mulatta microRNA mir-519a (MIR519A)...
        ...

    The principle behind the other ``map`` and ``filter`` methods is similar:
    they accept a function, apply it, and return a new QueryResult object.

    There are also other methods useful for working with list-like objects:
    ``append``, ``pop``, and ``sort``. More details and examples are available in
    their respective documentation.

    Finally, just like Python lists and dictionaries, QueryResult objects are
    iterable. Iteration over QueryResults will yield Hit objects::

        >>> for hit in qresult[:4]:     # iterate over the first four items
        ...     hit
        ...
        Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps)
        Hit(id='gi|301171311|ref|NR_035856.1|', query_id='33211', 1 hsps)
        Hit(id='gi|270133242|ref|NR_032573.1|', query_id='33211', 1 hsps)
        Hit(id='gi|301171322|ref|NR_035857.1|', query_id='33211', 2 hsps)

    If you need access to all the hits in a QueryResult object, you can get
    them in a list using the ``hits`` property. Similarly, access to all hit IDs is
    available through the ``hit_keys`` property.

        >>> qresult.hits
        [Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps), ...]
        >>> qresult.hit_keys
        ['gi|262205317|ref|NR_030195.1|', 'gi|301171311|ref|NR_035856.1|', ...]

    """

    # attributes we don't want to transfer when creating a new QueryResult class
    # from this one
    _NON_STICKY_ATTRS = ('_items', '__alt_hit_ids', )

    def __init__(self, hits=(), id=None, hit_key_function=None):
        """Initializes a QueryResult object.

        :param id: query sequence ID
        :type id: string
        :param hits: iterator yielding Hit objects
        :type hits: iterable
        :param hit_key_function: function to define hit keys
        :type hit_key_function: callable, accepts Hit objects, returns string

        """
        # default values
        self._id = id
        self._hit_key_function = hit_key_function or _hit_key_func
        self._items = OrderedDict()
        self._description = None
        self.__alt_hit_ids = {}
        self.program = '<unknown program>'
        self.target = '<unknown target>'
        self.version = '<unknown version>'

        # validate Hit objects and fill up self._items
        for hit in hits:
            # validation is handled by __setitem__
            self.append(hit)

    # handle Python 2 OrderedDict behavior
    if hasattr(OrderedDict, 'iteritems'):

        def __iter__(self):
            return self.iterhits()

        @property
        def hits(self):
            """Hit objects contained in the QueryResult."""
            return self._items.values()

        @property
        def hit_keys(self):
            """Hit IDs of the Hit objects contained in the QueryResult."""
            return self._items.keys()

        @property
        def items(self):
            """List of tuples of Hit IDs and Hit objects."""
            return self._items.items()

        def iterhits(self):
            """Returns an iterator over the Hit objects."""
            for hit in self._items.itervalues():
                yield hit

        def iterhit_keys(self):
            """Returns an iterator over the ID of the Hit objects."""
            for hit_id in self._items:
                yield hit_id

        def iteritems(self):
            """Returns an iterator yielding tuples of Hit ID and Hit objects."""
            for item in self._items.iteritems():
                yield item

    else:

        def __iter__(self):
            return iter(self.hits)

        @property
        def hits(self):
            """Hit objects contained in the QueryResult."""
            return list(self._items.values())

        @property
        def hit_keys(self):
            """Hit IDs of the Hit objects contained in the QueryResult."""
            return list(self._items.keys())

        @property
        def items(self):
            """List of tuples of Hit IDs and Hit objects."""
            return list(self._items.items())

        def iterhits(self):
            """Returns an iterator over the Hit objects."""
            for hit in self._items.values():
                yield hit

        def iterhit_keys(self):
            """Returns an iterator over the ID of the Hit objects."""
            for hit_id in self._items:
                yield hit_id

        def iteritems(self):
            """Returns an iterator yielding tuples of Hit ID and Hit objects."""
            for item in self._items.items():
                yield item

    def __contains__(self, hit_key):
        if isinstance(hit_key, Hit):
            return self._hit_key_function(hit_key) in self._items
        return hit_key in self._items or hit_key in self.__alt_hit_ids

    def __len__(self):
        return len(self._items)

    # Python 3:
    def __bool__(self):
        return bool(self._items)

    # Python 2:
    __nonzero__ = __bool__

    def __repr__(self):
        return "QueryResult(id=%r, %r hits)" % (self.id, len(self))

    def __str__(self):
        lines = []

        # set program and version line
        lines.append('Program: %s (%s)' % (self.program, self.version))

        # set query id line
        qid_line = '  Query: %s' % self.id
        if hasattr(self, 'seq_len'):
            qid_line += ' (%i)' % self.seq_len
        if self.description:
            qid_line += trim_str('\n         %s' % self.description, 80, '...')
        lines.append(qid_line)

        # set target line
        lines.append(' Target: %s' % self.target)

        # set hit lines
        if not self.hits:
            lines.append('   Hits: 0')
        else:
            lines.append('   Hits: %s  %s  %s' % ('-' * 4, '-' * 5, '-' * 58))
            pattern = '%13s  %5s  %s'
            lines.append(pattern % ('#', '# HSP', 'ID + description'))
            lines.append(pattern % ('-' * 4, '-' * 5, '-' * 58))
            for idx, hit in enumerate(self.hits):
                if idx < 30:
                    hid_line = '%s  %s' % (hit.id, hit.description)
                    if len(hid_line) > 58:
                        hid_line = hid_line[:55] + '...'
                    lines.append(pattern % (idx, str(len(hit)), hid_line))
                elif idx > len(self.hits) - 4:
                    hid_line = '%s  %s' % (hit.id, hit.description)
                    if len(hid_line) > 58:
                        hid_line = hid_line[:55] + '...'
                    lines.append(pattern % (idx, str(len(hit)), hid_line))
                elif idx == 30:
                    lines.append('%14s' % '~~~')

        return '\n'.join(lines)

    def __getitem__(self, hit_key):
        # retrieval using slice objects returns another QueryResult object
        if isinstance(hit_key, slice):
            # should we return just a list of Hits instead of a full blown
            # QueryResult object if it's a slice?
            hits = list(self.hits)[hit_key]
            obj = self.__class__(hits, self.id, self._hit_key_function)
            self._transfer_attrs(obj)
            return obj

        # if key is an int, then retrieve the Hit at the int index
        elif isinstance(hit_key, int):
            length = len(self)
            if 0 <= hit_key < length:
                for idx, item in enumerate(self.iterhits()):
                    if idx == hit_key:
                        return item
            elif -1 * length <= hit_key < 0:
                for idx, item in enumerate(self.iterhits()):
                    if length + hit_key == idx:
                        return item
            raise IndexError("list index out of range")

        # if key is a string, then do a regular dictionary retrieval
        # falling back on alternative hit IDs
        try:
            return self._items[hit_key]
        except KeyError:
            return self._items[self.__alt_hit_ids[hit_key]]

    def __setitem__(self, hit_key, hit):
        # only accept string keys
        if not isinstance(hit_key, basestring):
            raise TypeError("QueryResult object keys must be a string.")
        # hit must be a Hit object
        if not isinstance(hit, Hit):
            raise TypeError("QueryResult objects can only contain Hit objects.")
        qid = self.id
        hqid = hit.query_id
        # and it must have the same query ID as this object's ID
        # unless the query ID is None (default for empty objects), in which
        # case we want to use the hit's query ID as the query ID
        if qid is not None:
            if hqid != qid:
                raise ValueError("Expected Hit with query ID %r, found %r "
                                 "instead." % (qid, hqid))
        else:
            self.id = hqid
        # same thing with descriptions
        qdesc = self.description
        hqdesc = hit.query_description
        if qdesc is not None:
            if hqdesc != qdesc:
                raise ValueError("Expected Hit with query description %r, "
                                 "found %r instead." % (qdesc, hqdesc))
        else:
            self.description = hqdesc

        # remove existing alt_id references, if hit_key already exists
        if hit_key in self._items:
            for alt_key in self._items[hit_key].id_all[1:]:
                del self.__alt_hit_ids[alt_key]

        # if hit_key is already present as an alternative ID
        # delete it from the alternative ID dict
        if hit_key in self.__alt_hit_ids:
            del self.__alt_hit_ids[hit_key]

        self._items[hit_key] = hit
        for alt_id in hit.id_all[1:]:
            self.__alt_hit_ids[alt_id] = hit_key

    def __delitem__(self, hit_key):
        # if hit_key an integer or slice, get the corresponding key first
        # and put it into a list
        if isinstance(hit_key, int):
            hit_keys = [list(self.hit_keys)[hit_key]]
        # the same, if it's a slice
        elif isinstance(hit_key, slice):
            hit_keys = list(self.hit_keys)[hit_key]
        # otherwise put it in a list
        else:
            hit_keys = [hit_key]

        for key in hit_keys:
            deleted = False
            if key in self._items:
                del self._items[key]
                deleted = True
            if key in self.__alt_hit_ids:
                del self._items[self.__alt_hit_ids[key]]
                del self.__alt_hit_ids[key]
                deleted = True
            if not deleted:
                raise KeyError(repr(key))
        return

    # properties #
    id = optionalcascade('_id', 'query_id', """QueryResult ID string""")
    description = optionalcascade('_description', 'query_description',
                                  """QueryResult description""")

    @property
    def hsps(self):
        """HSP objects contained in the QueryResult."""
        return [hsp for hsp in chain(*self.hits)]

    @property
    def fragments(self):
        """HSPFragment objects contained in the QueryResult."""
        return [frag for frag in chain(*self.hsps)]

    # public methods #
    def absorb(self, hit):
        """Adds a Hit object to the end of QueryResult. If the QueryResult
        already has a Hit with the same ID, append the new Hit's HSPs into
        the existing Hit.

        :param hit: object to absorb
        :type hit: Hit

        This method is used for file formats that may output the same Hit in
        separate places, such as BLAT or Exonerate. In both formats, Hits
        with different strands are put in different places. However, SearchIO
        considers them to be the same, as a Hit object should be all database
        entries with the same ID, regardless of strand orientation.

        """
        try:
            self.append(hit)
        except ValueError:
            assert hit.id in self
            for hsp in hit:
                self[hit.id].append(hsp)

    def append(self, hit):
        """Adds a Hit object to the end of QueryResult.

        :param hit: object to append
        :type hit: Hit

        Any Hit object appended must have the same ``query_id`` property as the
        QueryResult's ``id`` property. If the hit key already exists, a
        ``ValueError`` will be raised.

        """
        # if a custom hit_key_function is supplied, use it to define the hit key
        if self._hit_key_function is not None:
            hit_key = self._hit_key_function(hit)
        else:
            hit_key = hit.id

        if hit_key not in self and all(pid not in self for pid in hit.id_all[1:]):
            self[hit_key] = hit
        else:
            raise ValueError("The ID or alternative IDs of Hit %r exists in "
                             "this QueryResult." % hit_key)

    def hit_filter(self, func=None):
        """Creates a new QueryResult object whose Hit objects pass the filter
        function.

        :param func: filter function
        :type func: callable, accepts Hit, returns bool

        Here is an example of using ``hit_filter`` to select Hits whose
        description begins with the string 'Homo sapiens', case sensitive::

            >>> from Bio import SearchIO
            >>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml'))
            >>> def desc_filter(hit):
            ...     return hit.description.startswith('Homo sapiens')
            ...
            >>> len(qresult)
            100
            >>> filtered = qresult.hit_filter(desc_filter)
            >>> len(filtered)
            39
            >>> print(filtered[:4])
            Program: blastn (2.2.27+)
              Query: 33211 (61)
                     mir_1
             Target: refseq_rna
               Hits: ----  -----  ----------------------------------------------------------
                        #  # HSP  ID + description
                     ----  -----  ----------------------------------------------------------
                        0      1  gi|262205317|ref|NR_030195.1|  Homo sapiens microRNA 52...
                        1      2  gi|262205330|ref|NR_030198.1|  Homo sapiens microRNA 52...
                        2      1  gi|262205302|ref|NR_030191.1|  Homo sapiens microRNA 51...
                        3      1  gi|262205451|ref|NR_030222.1|  Homo sapiens microRNA 51...

        Note that instance attributes (other than the hits) from the unfiltered
        QueryResult are retained in the filtered object.

            >>> qresult.program == filtered.program
            True
            >>> qresult.target == filtered.target
            True

        """
        hits = list(filter(func, self.hits))
        obj = self.__class__(hits, self.id, self._hit_key_function)
        self._transfer_attrs(obj)
        return obj

    def hit_map(self, func=None):
        """Creates a new QueryResult object, mapping the given function to its
        Hits.

        :param func: map function
        :type func: callable, accepts Hit, returns Hit

        Here is an example of using ``hit_map`` with a function that discards all
        HSPs in a Hit except for the first one::

            >>> from Bio import SearchIO
            >>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml'))
            >>> print(qresult[:8])
            Program: blastn (2.2.27+)
              Query: 33211 (61)
                     mir_1
             Target: refseq_rna
               Hits: ----  -----  ----------------------------------------------------------
                        #  # HSP  ID + description
                     ----  -----  ----------------------------------------------------------
                        0      1  gi|262205317|ref|NR_030195.1|  Homo sapiens microRNA 52...
                        1      1  gi|301171311|ref|NR_035856.1|  Pan troglodytes microRNA...
                        2      1  gi|270133242|ref|NR_032573.1|  Macaca mulatta microRNA ...
                        3      2  gi|301171322|ref|NR_035857.1|  Pan troglodytes microRNA...
                        4      1  gi|301171267|ref|NR_035851.1|  Pan troglodytes microRNA...
                        5      2  gi|262205330|ref|NR_030198.1|  Homo sapiens microRNA 52...
                        6      1  gi|262205302|ref|NR_030191.1|  Homo sapiens microRNA 51...
                        7      1  gi|301171259|ref|NR_035850.1|  Pan troglodytes microRNA...

            >>> top_hsp = lambda hit: hit[:1]
            >>> mapped_qresult = qresult.hit_map(top_hsp)
            >>> print(mapped_qresult[:8])
            Program: blastn (2.2.27+)
              Query: 33211 (61)
                     mir_1
             Target: refseq_rna
               Hits: ----  -----  ----------------------------------------------------------
                        #  # HSP  ID + description
                     ----  -----  ----------------------------------------------------------
                        0      1  gi|262205317|ref|NR_030195.1|  Homo sapiens microRNA 52...
                        1      1  gi|301171311|ref|NR_035856.1|  Pan troglodytes microRNA...
                        2      1  gi|270133242|ref|NR_032573.1|  Macaca mulatta microRNA ...
                        3      1  gi|301171322|ref|NR_035857.1|  Pan troglodytes microRNA...
                        4      1  gi|301171267|ref|NR_035851.1|  Pan troglodytes microRNA...
                        5      1  gi|262205330|ref|NR_030198.1|  Homo sapiens microRNA 52...
                        6      1  gi|262205302|ref|NR_030191.1|  Homo sapiens microRNA 51...
                        7      1  gi|301171259|ref|NR_035850.1|  Pan troglodytes microRNA...

        """
        hits = [deepcopy(hit) for hit in self.hits]
        if func is not None:
            hits = [func(x) for x in hits]
        obj = self.__class__(hits, self.id, self._hit_key_function)
        self._transfer_attrs(obj)
        return obj

    def hsp_filter(self, func=None):
        """Creates a new QueryResult object whose HSP objects pass the filter
        function.

        ``hsp_filter`` is the same as ``hit_filter``, except that it filters
        directly on each HSP object in every Hit. If the filtering removes
        all HSP objects in a given Hit, the entire Hit will be discarded. This
        will result in the QueryResult having fewer Hits after filtering.

        """
        hits = [x for x in (hit.filter(func) for hit in self.hits) if x]
        obj = self.__class__(hits, self.id, self._hit_key_function)
        self._transfer_attrs(obj)
        return obj

    def hsp_map(self, func=None):
        """Creates a new QueryResult object, mapping the given function to its
        HSPs.

        ``hsp_map`` is the same as ``hit_map``, except that it applies the given
        function to all HSP objects in every Hit, instead of the Hit objects.

        """
        hits = [x for x in (hit.map(func) for hit in list(self.hits)[:]) if x]
        obj = self.__class__(hits, self.id, self._hit_key_function)
        self._transfer_attrs(obj)
        return obj

    # marker for default self.pop() return value
    # this method is adapted from Python's built in OrderedDict.pop
    # implementation
    __marker = object()

    def pop(self, hit_key=-1, default=__marker):
        """Removes the specified hit key and returns the Hit object.

        :param hit_key: key of the Hit object to return
        :type hit_key: int or string
        :param default: return value if no Hit exists with the given key
        :type default: object

        By default, ``pop`` will remove and return the last Hit object in the
        QueryResult object. To remove a specific Hit object, you can use its
        integer index or hit key.

            >>> from Bio import SearchIO
            >>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml'))
            >>> len(qresult)
            100
            >>> for hit in qresult[:5]:
            ...     print(hit.id)
            ...
            gi|262205317|ref|NR_030195.1|
            gi|301171311|ref|NR_035856.1|
            gi|270133242|ref|NR_032573.1|
            gi|301171322|ref|NR_035857.1|
            gi|301171267|ref|NR_035851.1|

            # remove the last hit
            >>> qresult.pop()
            Hit(id='gi|397513516|ref|XM_003827011.1|', query_id='33211', 1 hsps)

            # remove the first hit
            >>> qresult.pop(0)
            Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps)

            # remove hit with the given ID
            >>> qresult.pop('gi|301171322|ref|NR_035857.1|')
            Hit(id='gi|301171322|ref|NR_035857.1|', query_id='33211', 2 hsps)

        """
        # if key is an integer (index)
        # get the ID for the Hit object at that index
        if isinstance(hit_key, int):
            # raise the appropriate error if there is no hit
            if not self:
                raise IndexError("pop from empty list")
            hit_key = list(self.hit_keys)[hit_key]

        try:
            hit = self._items.pop(hit_key)
            # remove all alternative IDs of the popped hit
            for alt_id in hit.id_all[1:]:
                try:
                    del self.__alt_hit_ids[alt_id]
                except KeyError:
                    pass
            return hit
        except KeyError:
            if hit_key in self.__alt_hit_ids:
                return self.pop(self.__alt_hit_ids[hit_key], default)
            # if key doesn't exist and no default is set, raise a KeyError
            if default is self.__marker:
                raise KeyError(hit_key)
        # if key doesn't exist but a default is set, return the default value
        return default

    def index(self, hit_key):
        """Returns the index of a given hit key, zero-based.

        :param hit_key: hit ID
        :type hit_key: string

        This method is useful for finding out the integer index (usually
        correlated with search rank) of a given hit key.

            >>> from Bio import SearchIO
            >>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml'))
            >>> qresult.index('gi|301171259|ref|NR_035850.1|')
            7

        """
        if isinstance(hit_key, Hit):
            return list(self.hit_keys).index(hit_key.id)
        try:
            return list(self.hit_keys).index(hit_key)
        except ValueError:
            if hit_key in self.__alt_hit_ids:
                return self.index(self.__alt_hit_ids[hit_key])
            raise

    def sort(self, key=None, reverse=False, in_place=True):
        # no cmp argument to make sort more Python 3-like
        """Sorts the Hit objects.

        :param key: sorting function
        :type key: callable, accepts Hit, returns key for sorting
        :param reverse: whether to reverse sorting results or not
        :type reverse: bool
        :param in_place: whether to do in-place sorting or not
        :type in_place: bool

        ``sort`` defaults to sorting in-place, to mimic Python's ``list.sort``
        method. If you set the ``in_place`` argument to False, it will
        return a new, sorted QueryResult object and keep the initial one
        unsorted.

        """
        if key is None:
            # if reverse is True, reverse the hits
            if reverse:
                sorted_hits = list(self.hits)[::-1]
            # otherwise (default options) make a copy of the hits
            else:
                sorted_hits = list(self.hits)[:]
        else:
            sorted_hits = sorted(self.hits, key=key, reverse=reverse)

        # if sorting is in-place, don't create a new QueryResult object
        if in_place:
            new_hits = OrderedDict()
            for hit in sorted_hits:
                new_hits[self._hit_key_function(hit)] = hit
            self._items = new_hits
        # otherwise, return a new sorted QueryResult object
        else:
            obj = self.__class__(sorted_hits, self.id, self._hit_key_function)
            self._transfer_attrs(obj)
            return obj


def _hit_key_func(hit):
    """Default hit key function for QueryResult.__init__ (PRIVATE)."""
    return hit.id


# if not used as a module, run the doctest
if __name__ == "__main__":
    from Bio._utils import run_doctest
    run_doctest()
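
The list/dictionary hybrid lookup used by ``QueryResult.__getitem__`` can be tried out without Biopython or a BLAST output file. The block below is only a minimal sketch of that idea and is not part of the module: ``MiniResult``, its ``_alt_ids`` attribute, and the sample keys are hypothetical names invented for illustration, and plain strings stand in for Hit objects.

```python
from collections import OrderedDict


class MiniResult(object):
    """Hypothetical, simplified stand-in for QueryResult (illustration only)."""

    def __init__(self, entries=()):
        # primary key -> value, kept in insertion order
        self._items = OrderedDict()
        # alternative key -> primary key
        self._alt_ids = {}
        for key, value, alt_keys in entries:
            self._items[key] = value
            for alt in alt_keys:
                self._alt_ids[alt] = key

    def __len__(self):
        return len(self._items)

    def __contains__(self, key):
        return key in self._items or key in self._alt_ids

    def __getitem__(self, key):
        # a slice returns a new container holding only the selected entries
        if isinstance(key, slice):
            keys = list(self._items)[key]
            return MiniResult(
                (k, self._items[k],
                 [a for a, p in self._alt_ids.items() if p == k])
                for k in keys)
        # an integer behaves like list indexing, negative values included
        if isinstance(key, int):
            return list(self._items.values())[key]
        # anything else is a dictionary lookup, falling back on alternative keys
        try:
            return self._items[key]
        except KeyError:
            return self._items[self._alt_ids[key]]


result = MiniResult([
    ('hit_a', 'first', ['alias_a']),
    ('hit_b', 'second', []),
    ('hit_c', 'third', []),
])
```

With this sketch, ``result[0]``, ``result['hit_a']`` and ``result['alias_a']`` all resolve to the same entry, and ``result[:2]`` is a new two-item container. The real class additionally validates query IDs and descriptions in ``__setitem__`` and carries over attributes such as ``program`` and ``target`` to sliced objects, which this sketch leaves out.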