Source Code for Package Bio.Medline

# Copyright 1999 by Jeffrey Chang.  All rights reserved.
# This code is part of the Biopython distribution and governed by its
# license.  Please see the LICENSE file that should have been included
# as part of this package.

"""Code to work with Medline from the NCBI.

Classes:
 - Record           A dictionary holding Medline data.

Functions:
 - read             Reads one Medline record
 - parse            Allows you to iterate over a bunch of Medline records

"""


class Record(dict):
    """A dictionary holding information from a Medline record.

    All data are stored under the mnemonic appearing in the Medline
    file. These mnemonics have the following interpretations:

    ========= ==============================
    Mnemonic  Description
    --------- ------------------------------
    AB        Abstract
    CI        Copyright Information
    AD        Affiliation
    IRAD      Investigator Affiliation
    AID       Article Identifier
    AU        Author
    FAU       Full Author
    CN        Corporate Author
    DCOM      Date Completed
    DA        Date Created
    LR        Date Last Revised
    DEP       Date of Electronic Publication
    DP        Date of Publication
    EDAT      Entrez Date
    GS        Gene Symbol
    GN        General Note
    GR        Grant Number
    IR        Investigator Name
    FIR       Full Investigator Name
    IS        ISSN
    IP        Issue
    TA        Journal Title Abbreviation
    JT        Journal Title
    LA        Language
    LID       Location Identifier
    MID       Manuscript Identifier
    MHDA      MeSH Date
    MH        MeSH Terms
    JID       NLM Unique ID
    RF        Number of References
    OAB       Other Abstract
    OCI       Other Copyright Information
    OID       Other ID
    OT        Other Term
    OTO       Other Term Owner
    OWN       Owner
    PG        Pagination
    PS        Personal Name as Subject
    FPS       Full Personal Name as Subject
    PL        Place of Publication
    PHST      Publication History Status
    PST       Publication Status
    PT        Publication Type
    PUBM      Publishing Model
    PMC       PubMed Central Identifier
    PMID      PubMed Unique Identifier
    RN        Registry Number/EC Number
    NM        Substance Name
    SI        Secondary Source ID
    SO        Source
    SFM       Space Flight Mission
    STAT      Status
    SB        Subset
    TI        Title
    TT        Transliterated Title
    VI        Volume
    CON       Comment on
    CIN       Comment in
    EIN       Erratum in
    EFR       Erratum for
    CRI       Corrected and Republished in
    CRF       Corrected and Republished from
    PRIN      Partial retraction in
    PROF      Partial retraction of
    RPI       Republished in
    RPF       Republished from
    RIN       Retraction in
    ROF       Retraction of
    UIN       Update in
    UOF       Update of
    SPIN      Summary for patients in
    ORI       Original report in
    ========= ==============================

    """


def parse(handle):
    """Read Medline records one by one from the handle.

    The handle is either a Medline file, a file-like object, or a list
    of lines describing one or more Medline records.

    Typical usage::

        from Bio import Medline
        with open("mymedlinefile") as handle:
            records = Medline.parse(handle)
            for record in records:
                print(record['TI'])

    """
    # TODO - Turn that into a working doctest
    # These keys point to string values
    textkeys = ("ID", "PMID", "SO", "RF", "NI", "JC", "TA", "IS", "CY", "TT",
                "CA", "IP", "VI", "DP", "YR", "PG", "LID", "DA", "LR", "OWN",
                "STAT", "DCOM", "PUBM", "DEP", "PL", "JID", "SB", "PMC",
                "EDAT", "MHDA", "PST", "AB", "AD", "EA", "TI", "JT")
    handle = iter(handle)

    key = ""
    record = Record()
    for line in handle:
        line = line.rstrip()
        if line[:6] == "      ":  # continuation line
            if key == "MH":
                # Multi-line MESH term, want to append to last entry in list
                record[key][-1] += line[5:]  # including space using line[5:]
            else:
                record[key].append(line[6:])
        elif line:
            key = line[:4].rstrip()
            if key not in record:
                record[key] = []
            record[key].append(line[6:])
        elif record:
            # Join each list of strings into one string.
            for key in record:
                if key in textkeys:
                    record[key] = " ".join(record[key])
            yield record
            record = Record()
    if record:  # catch last one
        for key in record:
            if key in textkeys:
                record[key] = " ".join(record[key])
        yield record


def read(handle):
    """Read a single Medline record from the handle.

    The handle is either a Medline file, a file-like object, or a list
    of lines describing a Medline record.

    Typical usage:

        >>> from Bio import Medline
        >>> with open("mymedlinefile") as handle:
        ...     record = Medline.read(handle)
        ...     print(record['TI'])

    """
    # TODO - Turn that into a working doctest
    records = parse(handle)
    return next(records)
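
The parsing rules in parse() can be tried on an in-memory list of lines, since the handle only needs to be iterable. The block below is a minimal standalone sketch rather than the Biopython code itself: parse_medline_lines, _finish and TEXT_KEYS are hypothetical names, TEXT_KEYS is a shortened subset of the textkeys tuple, plain dicts stand in for Record, and the sample record is made up. It assumes the same layout as the listing: mnemonic in the first four columns, value from the seventh, six leading spaces for a continuation line, and a blank line between records.

```python
# Standalone sketch of the parsing rules in parse(); not Biopython API.
# TEXT_KEYS is an illustrative subset of the textkeys tuple in the listing.
TEXT_KEYS = ("PMID", "TI", "AB", "DP", "TA")


def _finish(record):
    """Join the line fragments of single-valued fields into one string."""
    for key in record:
        if key in TEXT_KEYS:
            record[key] = " ".join(record[key])
    return record


def parse_medline_lines(lines):
    """Yield one dict per blank-line-separated Medline record."""
    key = ""
    record = {}
    for line in lines:
        line = line.rstrip()
        if line[:6] == " " * 6:  # continuation of the previous field
            if key == "MH":
                # Keep one leading space so the MeSH term stays a single entry.
                record[key][-1] += line[5:]
            else:
                record[key].append(line[6:])
        elif line:  # new field: mnemonic in columns 1-4, value from column 7
            key = line[:4].rstrip()
            record.setdefault(key, []).append(line[6:])
        elif record:  # a blank line ends the current record
            yield _finish(record)
            record = {}
    if record:  # the last record may not be followed by a blank line
        yield _finish(record)


# Made-up sample data in Medline layout.
sample = [
    "PMID- 12345",
    "TI  - A title that wraps",
    "      onto a second line.",
    "AU  - Doe J",
    "AU  - Roe R",
    "",
    "PMID- 67890",
    "TI  - Another title.",
]

records = list(parse_medline_lines(sample))
print(records[0]["TI"])  # A title that wraps onto a second line.
print(records[0]["AU"])  # ['Doe J', 'Roe R']
print(len(records))      # 2
```

Fields listed in TEXT_KEYS come back as a single string, while repeatable fields such as AU stay as lists, which mirrors how parse() treats keys inside and outside textkeys.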