Talk:Retrieve nonmatching blast queries

Revision as of 12:04, 5 June 2009 by Giles.weaver (Talk | contribs)

re Discussion

Hmmm, perhaps the 'one record at a time' approach I talked about isn't any faster. Doing this:

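from Bio.Blast import NCBIXML

# q_dict maps each query ID to its record (built earlier in the recipe)
for record in NCBIXML.parse(open("BLAST_RESULTS.out", 'r')):
  recID = record.query.split()[0]
  # test membership on the dict itself rather than on q_dict.keys()
  if recID in q_dict:
    del q_dict[recID]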

takes about 30 seconds on my little laptop with the files I used for the recipe, whereas the recipe version takes more like 20 seconds. Is there another way to do this?

I think the check on each BLAST record should be conditional (e.g. if record.alignments) in case the NCBI do give an empty record when there are no hits. I think they do this already when there is a single query. Peter

For now at least, multi-query searches completely skip 'no hits', but I guess that could change. I've added a check that record.alignments is longer than zero (each BLAST record is born with an empty list, so this should work) to 'future proof' it, and added (probably too many) comments to that effect. --Davidw 01:39, 12 May 2009 (UTC)
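
For reference, here is a minimal sketch of the conditional check discussed above. It is not the recipe's actual code. The function name queries_without_hits and the stand-in records are hypothetical, used only so the example runs without a BLAST output file. It assumes each record has a query string and an alignments list, as Bio.Blast.NCBIXML records do.

```python
from types import SimpleNamespace


def queries_without_hits(query_ids, blast_records):
    """Return the set of query IDs with no aligned BLAST record.

    Hypothetical helper: blast_records is any iterable of objects that
    have .query (str) and .alignments (list) attributes.
    """
    remaining = set(query_ids)
    for record in blast_records:
        # Only treat the query as matched if the record really has
        # alignments, so an empty 'no hits' record leaves it in place.
        if record.alignments:
            remaining.discard(record.query.split()[0])
    return remaining


# Stand-in records for illustration only (assumed shape, not real BLAST output).
records = [
    SimpleNamespace(query="seq1 some description", alignments=["hit"]),
    SimpleNamespace(query="seq2 another description", alignments=[]),
]
no_hits = queries_without_hits(["seq1", "seq2", "seq3"], records)
print(sorted(no_hits))
```

With real data, the second argument would presumably be NCBIXML.parse(handle) on the BLAST XML output. Using a set and discard() means a query ID that is absent or repeated in the results needs no separate membership test.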