Publications using Biopython.

There is a separate shorter listing of Biopython papers you may wish to cite.

This is a list of papers citing, referencing or using Biopython, grouped by year and sorted alphabetically by author. In many cases these citations are just via the website, which is now discouraged following the publication of the Biopython application note (Cock et al., 2009). Regrettably, in some cases the manuscripts themselves do not directly mention Biopython, but its use has been confirmed explicitly by an author.

Publications from 2010 onwards

We don’t plan to compile a listing manually, since most publications should be citing our application note (Cock et al., 2009) explicitly (and/or one of the module-specific papers).

Publications from 2009

  1. Armano G and Manconi A (2009) ProDaMa: an open source Python library to generate protein structure datasets. BMC Res Notes, 2, 202

    Used Bio.PDB

  2. Banach M, Stapor K and Roterman I (2009) Chaperonin structure: the large multi-subunit protein complex. Int J Mol Sci, 10, 844-861

    Used Bio.PDB and Bio.KDTree

  3. Berkholz DS, Krenesky PB, Davidson JR and Karplus PA (2009) Protein Geometry Database: a flexible engine to explore backbone conformations and their relationships to covalent geometry. Nucleic Acids Res, 38, D320-D325

    Used Bio.PDB

  4. Chi SW, Zang JB, Mele A and Darnell RB (2009) Argonaute HITS-CLIP decodes microRNA-mRNA interaction maps. Nature, 460, 479-86

    General bioinformatics analysis including in silico random CLIP (crosslinking immunoprecipitation)

  5. Bouvier G, Evrard-Todeschi N, Girault JP and Bertho G (2009) Automatic clustering of docking poses in virtual screening process using self-organizing map. Bioinformatics, 26, 53-60

    Used Bio.PDB

  6. Cock PJ, Antao T, Chang JT, Chapman BA, Cox CJ, Dalke A, Friedberg I, Hamelryck T, Kauff F, Wilczynski B and de Hoon MJ (2009) Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics, 25, 1422-3

    This application note covers the whole of Biopython

  7. Cock PJ, Fields CJ, Goto N, Heuer ML and Rice PM (2009) The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants. Nucleic Acids Res., 38, 1767-71

    This describes the FASTQ file format as supported in Biopython, BioPerl, BioRuby, BioJava and EMBOSS

  8. Cox CJ, Foster PG, Hirt RP, Harris SR and Embley TM (2008) The archaebacterial origin of eukaryotes. Proc. Natl. Acad. Sci. U.S.A., 105, 20356-61

    Used Bio.Blast, Bio.Fasta and Bio.Nexus

  9. Daily MD and Gray JJ (2009) Allosteric communication occurs via networks of tertiary and quaternary motions in proteins. PLoS Comput. Biol., 5, e1000293

    Used Bio.SVDSuperimposer (and probably Bio.PDB as well)

  10. Donaire L, Wang Y, Gonzalez-Ibeas D, Mayer KF, Aranda MA and Llave C (2009) Deep-sequencing of plant viral small RNAs reveals effective and widespread targeting of viral genomes. Virology, 392, 203-14

    Used Biopython for removing adaptors from 454 sequencing reads

  11. Dudley JT and Butte AJ (2009) A quick guide for developing effective bioinformatics programming skills. PLoS Comput. Biol., 5, e1000589

    A bioinformatics review citing Biopython and other projects

  12. Garbino A, van Oort RJ, Dixit SS, Landstrom AP, Ackerman MJ and Wehrens XH (2009) Molecular evolution of the junctophilin gene family. Physiol. Genomics, 37, 175-86

  13. Gould CM, Diella F, Via A, Puntervoll P, Gemünd C, Chabanis-Davidson S, Michael S, Sayadi A, Bryne JC, Chica C, Seiler M, Davey NE, Haslam N, Weatheritt RJ, Budd A, Hughes T, Pas J, Rychlewski L, Travé G, Aasland R, Helmer-Citterich M, Linding R and Gibson TJ (2009) ELM: the status of the 2010 eukaryotic linear motif resource. Nucleic Acids Res., 38, D167-80

    Used Biopython to retrieve information from SWISS-PROT and PubMed

  14. Han MV and Zmasek CM (2009) phyloXML: XML for evolutionary biology and comparative genomics. BMC Bioinformatics, 10, 356

  15. Holden N, Pritchard L and Toth I (2009) Colonization outwith the colon: plants as an alternative environmental reservoir for human pathogenic enterobacteria. FEMS Microbiol. Rev., 33, 689-703

    Used Bio.SeqIO, BioSQL and GenomeDiagram

  16. Ihekwaba AE, Nguyen PT and Priami C (2009) Elucidation of functional consequences of signalling pathway interactions. BMC Bioinformatics, 10, 370

    Data mining

  17. Jankun-Kelly TJ, Lindeman AD and Bridges SM (2009) Exploratory visual analysis of conserved domains on multiple sequence alignments. BMC Bioinformatics, 10 Suppl 11, S7

    Used Biopython for working with sequence alignments

  18. Jones JT, Kumar A, Pylypenko LA, Thirugnanasambandam A, Castelli L, Chapman S, Cock PJ, Grenier E, Lilley CJ, Phillips MS and Blok VC (2009) Identification and functional characterization of effectors in expressed sequence tags from various life cycle stages of the potato cyst nematode Globodera pallida. Mol. Plant Pathol., 10, 815-28

    Used Biopython for sequence manipulation

  19. Korhonen J, Martinmäki P, Pizzi C, Rastas P and Ukkonen E (2009) MOODS: fast search for position weight matrix matches in DNA sequences. Bioinformatics, 25, 3181-2

    Includes a Python wrapper with examples using it with Biopython

  20. Lundborg M, Modhukur V and Widmalm G (2009) Glycosyltransferase functions of E. coli O-antigens. Glycobiology, 20, 366-8

  21. Macdonald N, Parks D and Beiko R (2009) SeqMonitor: influenza analysis pipeline and visualization. PLoS Curr, 1, RRN1040

  22. Miles LG, Isberg SR, Glenn TC, Lance SL, Dalzell P, Thomson PC and Moran C (2009) A genetic linkage map for the saltwater crocodile (Crocodylus porosus). BMC Genomics, 10, 339

    Used Biopython to work with genotypes for microsatellite loci

  23. Muller B, Richards AJ, Jin B and Lu X (2009) GOGrapher: A Python library for GO graph representation and analysis. BMC Res Notes, 2, 122

  24. Narayanan A, Sellers BD and Jacobson MP (2009) Energy-based analysis and prediction of the orientation between light- and heavy-chain antibody variable domains. J. Mol. Biol., 388, 941-53

    Used Bio.SVDSuperimposer (and probably Bio.PDB as well)

  25. Parisien M, Cruz JA, Westhof E and Major F (2009) New metrics for comparing and assessing discrepancies between RNA 3D structures and models. RNA, 15, 1875-85

    Used Bio.PDB

  26. Schanda P (2009) Fast-pulsing longitudinal relaxation optimized techniques: Enriching the toolbox of fast biomolecular NMR spectroscopy. Progress in Nuclear Magnetic Resonance Spectroscopy, 55, 238-264

    Used Bio.PDB

  27. Smith SA, Beaulieu JM and Donoghue MJ (2009) Mega-phylogeny approach for comparative biology: an alternative to supertree and supermatrix approaches. BMC Evol. Biol., 9, 37

    Used Biopython to implement a pipeline with BioSQL

  28. Stivala A, Wirth A and Stuckey PJ (2009) Tableau-based protein substructure search using quadratic programming. BMC Bioinformatics, 10, 153

    PDB, SCOP and ASTRAL data

  29. Sun C, Wang X and Lin L (2009) A multi-level disambiguation framework for gene name normalization. Acta Automatica Sinica, 35, 193-197

    Used Biopython to access MEDLINE

  30. Szabó TG, Palotai R, Antal P, Tokatly I, Tóthfalusi L, Lund O, Nagy G, Falus A and Buzás EI (2009) Critical role of glycosylation in determining the length and structure of T cell epitopes. Immunome Res, 5, 4

    Used Biopython for sequence manipulation and analysis

  31. Tanaka LY, Herskovic JR, Iyengar MS and Bernstam EV (2009) Sequential result refinement for searching the biomedical literature. J Biomed Inform, 42, 678-84

    Used Biopython to access MEDLINE

  32. Thomson RC (2009) PhyLIS: a simple GNU/Linux distribution for phylogenetics and phyloinformatics. Evol. Bioinform. Online, 5, 91-5

    A Linux distribution including Biopython

  33. Torrance GM, Leader DP, Gilbert DR and Milner-White EJ (2008) A novel main chain motif in proteins bridged by cationic groups: the niche. J. Mol. Biol., 385, 1076-86

    Used Biopython for the k-means algorithm

  34. Van der Auwera GA, Król JE, Suzuki H, Foster B, Van Houdt R, Brown CJ, Mergeay M and Top EM (2009) Plasmids captured in C. metallidurans CH34: defining the PromA family of broad-host-range plasmids. Antonie Van Leeuwenhoek, 96, 193-204

    Used Biopython and GenomeDiagram for plasmid map and alignment figures

  35. Weil P, Hoffgaard F and Hamacher K (2009) Estimating sufficient statistics in co-evolutionary analysis by mutual information. Comput Biol Chem, 33, 440-4

    Used Biopython for sequence manipulation

  36. Wiwanitkit V (2009) Weak linkage in androgen receptor: identification of mutation-prone points. Fertil. Steril., 91, e1-3

    Used Biopython to work with ExPASy

Publications from 2008

  1. Wiwanitkit V (2008) FHM3 in familial hemiplegic migraine is more resistant to mutation than FHM1 and FHM2. J. Neurol. Sci., 277, 76-9

    Used Biopython to work with ExPASy

  2. Antao T, Lopes A, Lopes RJ, Beja-Pereira A and Luikart G (2008) LOSITAN: a workbench to detect molecular adaptation based on a Fst-outlier method. BMC Bioinformatics, 9, 323

    Used Bio.PopGen

  3. Cardona G, Rosselló F and Valiente G (2008) Extended Newick: it is time for a standard representation of phylogenetic networks. BMC Bioinformatics, 9, 532

  4. Diella F, Gould CM, Chica C, Via A and Gibson TJ (2007) Phospho.ELM: a database of phosphorylation sites - update 2008. Nucleic Acids Res., 36, D240-4

    Used Biopython with UniProt and PubMed

  5. Faircloth BC (2008) MSATCOMMANDER: detection of microsatellite repeat arrays and automated, locus-specific primer design. Molecular Ecology Resources, 8, 92-94

    Used Bio.SeqIO

  6. Feldhahn M, Thiel P, Schuler MM, Hillen N, Stevanovic S, Rammensee HG and Kohlbacher O (2008) EpiToolKit - a web server for computational immunomics. Nucleic Acids Res., 36, W519-22

    Used Biopython for several unspecified tasks

  7. Fourment M and Gillings MR (2008) A comparison of common programming languages used in bioinformatics. BMC Bioinformatics, 9, 82

  8. Frelinger J, Kepler TB and Chan C (2008) Flow: Statistics, visualization and informatics for flow cytometry. Source Code Biol Med, 3, 10

    Used PyCluster / Bio.Cluster

  9. Geraci F, Pellegrini M and Renda ME (2008) AMIC@: All MIcroarray Clusterings @ once. Nucleic Acids Res., 36, W315-9

    Used PyCluster / Bio.Cluster and other unspecified parts of Biopython

  10. Gront D and Kolinski A (2008) Utility library for structural bioinformatics. Bioinformatics, 24, 584-5

  11. Higa RH and Tozzi CL (2008) A simple and efficient method for predicting protein-protein interaction sites. Genet. Mol. Res., 7, 898-909

    Used Bio.PDB

  12. Kim N and Lee C (2008) Bioinformatics detection of alternative splicing. Methods Mol. Biol., 452, 179-97

  13. Langkilde A, Kristensen SM, Lo Leggio L, Mølgaard A, Jensen JH, Houk AR, Navarro Poulsen JC, Kauppinen S and Larsen S (2008) Short strong hydrogen bonds in proteins: a case study of rhamnogalacturonan acetylesterase. Acta Crystallogr. D Biol. Crystallogr., D64, 851-63

    Used Bio.PDB

  14. Koczyk G and Berezovsky IN (2008) Domain Hierarchy and closed Loops (DHcL): a server for exploring hierarchy of protein domain structure. Nucleic Acids Res., 36, W239-45

    Used Bio.PDB

  15. Munteanu CR, González-Díaz H and Magalhães AL (2008) Enzymes/non-enzymes classification model complexity based on composition, sequence, 3D and topological indices. J. Theor. Biol., 254, 476-82

    Used Bio.PDB

  16. Park D, Kim BC, Cho SW, Park SJ, Choi JS, Kim SI, Bhak J and Lee S (2008) MassNet: a functional annotation service for protein mass spectrometry data. Nucleic Acids Res., 36, W491-5

    Used Biopython to calculate hydropathy profiles

  17. Ponty Y, Istrate R, Porcelli E and Clote P (2008) LocalMove: computing on-lattice fits for biopolymers. Nucleic Acids Res., 36, W216-22

    Used Biopython to work with lattice structures, including superimposition

  18. Raman K, Yeturu K and Chandra N (2008) targetTB: a target identification pipeline for Mycobacterium tuberculosis through an interactome, reactome and genome-scale structural analysis. BMC Syst Biol, 2, 109

    Used Bio.Blast

  19. Singh S (2008) India takes an open source approach to drug discovery. Cell, 133, 201-3

  20. Song J, Tan H, Takemoto K and Akutsu T (2008) HSEpred: predict half-sphere exposure from protein sequences. Bioinformatics, 24, 1489-97

    Used Bio.PDB

  21. Southey BR, Sweedler JV and Rodriguez-Zas SL (2008) A python analytical pipeline to identify prohormone precursors and predict prohormone cleavage sites. Front Neuroinform, 2, 7

  22. Walters J, Binkley E, Haygood R and Romano LA (2008) Evolutionary analysis of the cis-regulatory region of the spicule matrix gene SM50 in strongylocentrotid sea urchins. Dev. Biol., 315, 567-78

    Used Biopython for sequence manipulation

  23. Whitworth DE and Cock PJ (2008) Two-component systems of the myxobacteria: structure, diversity and evolutionary relationships. Microbiology (Reading, Engl.), 154, 360-72

    Used Bio.SeqIO, Bio.AlignIO and Bio.Blast

  24. Wiwanitkit V (2008) Identification of weak points prone for mutation in ferredoxin of Trichomonas vaginalis. Indian J Med Microbiol, 26, 158-9

    Used Biopython for working with ExPASy

  25. Wiwanitkit V (2007) Mutation-prone points in thrombin receptor. J. Thromb. Thrombolysis, 25, 190-2

    Used Biopython for working with ExPASy

Publications from 2007

  1. Antao T, Beja-Pereira A and Luikart G (2007) MODELER4SIMCOAL2: a user-friendly, extensible modeler of demography and linked loci for coalescent simulations. Bioinformatics, 23, 1848-50

  2. Avrova AO, Whisson SC, Pritchard L, Venter E, De Luca S, Hein I and Birch PR (2007) A novel non-protein-coding infection-specific gene family is clustered throughout the genome of Phytophthora infestans. Microbiology (Reading, Engl.), 153, 747-59

  3. Bassi S (2007) A primer on python for life science researchers. PLoS Comput. Biol., 3, e199

  4. Bernauer J, Azé J, Janin J and Poupon A (2007) A new protein-protein docking scoring function based on interface residue properties. Bioinformatics, 23, 555-62

  5. Domingues FS, Rahnenführer J and Lengauer T (2007) Conformational analysis of alternative protein structures. Bioinformatics, 23, 3131-8

  6. Chapman MA, Chang J, Weisman D, Kesseli RV and Burke JM (2007) Universal markers for comparative mapping and phylogenetic analysis in the Asteraceae (Compositae). Theor. Appl. Genet., 115, 747-55

  7. Cock PJ and Whitworth DE (2007) Evolution of gene overlaps: relative reading frame bias in prokaryotic two-component system genes. J. Mol. Evol., 64, 457-62

  8. Cock PJ and Whitworth DE (2007) Evolution of prokaryotic two-component system signaling pathways: gene fusions and fissions. Mol. Biol. Evol., 24, 2355-7

  9. Craddock T, Harwood CR, Hallinan J and Wipat A (2008) e-Science: relieving bottlenecks in large-scale genome analyses. Nat. Rev. Microbiol., 6, 948-54

  10. Ferrè F, Ponty Y, Lorenz WA and Clote P (2007) DIAL: a web server for the pairwise alignment of two RNA three-dimensional structures using nucleotide, dihedral angle and base-pairing similarities. Nucleic Acids Res., 35, W659-68

  11. Gentle IE, Perry AJ, Alcock FH, Likić VA, Dolezal P, Ng ET, Purcell AW, McConnville M, Naderer T, Chanez AL, Charrière F, Aschinger C, Schneider A, Tokatlidis K and Lithgow T (2007) Conserved motifs reveal details of ancestry and structure in the small TIM chaperones of the mitochondrial intermembrane space. Mol. Biol. Evol., 24, 1149-60

  12. Grünberg R, Nilges M and Leckner J (2007) Biskit - a software platform for structural bioinformatics. Bioinformatics, 23, 769-70

  13. Hackney JA, Ehrenkaufer GM and Singh U (2007) Identification of putative transcriptional regulatory networks in Entamoeba histolytica using Bayesian inference. Nucleic Acids Res., 35, 2141-52

    Describes the Bio.MEME module

  14. Kauff F, Cox CJ and Lutzoni F (2007) WASABI: an automated sequence processing system for multigene phylogenies. Syst. Biol., 56, 523-31

  15. Ma BG, Chen LL and Zhang HY (2007) What determines protein folding type? An investigation of intrinsic structural properties and its implications for understanding folding mechanisms. J. Mol. Biol., 370, 439-48

  16. Picardi E, Regina TM, Brennicke A and Quagliariello C (2006) REDIdb: the RNA editing database. Nucleic Acids Res., 35, D173-7

  17. Pietal MJ, Tuszynska I and Bujnicki JM (2007) PROTMAP2D: visualization, comparison and analysis of 2D maps of protein structure. Bioinformatics, 23, 1429-30

  18. Rattei T, Ott S, Gutacker M, Rupp J, Maass M, Schreiber S, Solbach W, Wirth T and Gieffers J (2007) Genetic diversity of the obligate intracellular bacterium Chlamydophila pneumoniae by genome-wide analysis of single nucleotide polymorphisms: evidence for highly clonal population structure. BMC Genomics, 8, 355

  19. Sander O, Sing T, Sommer I, Low AJ, Cheung PK, Harrigan PR, Lengauer T and Domingues FS (2007) Structural descriptors of gp120 V3 loop for the prediction of HIV-1 coreceptor usage. PLoS Comput. Biol., 3, e58

  20. Tagami Y, Inaba N, Kutsuna N, Kurihara Y and Watanabe Y (2007) Specific enrichment of miRNAs in Arabidopsis thaliana infected with Tobacco mosaic virus. DNA Res., 14, 227-33

  21. Watanabe H, Enomoto T and Tanaka S (2007) Ab initio study of molecular interactions in higher plant and Galdieria partita Rubiscos with the fragment molecular orbital method. Biochem. Biophys. Res. Commun., 361, 367-72

  22. Whisson SC, Boevink PC, Moleleki L, Avrova AO, Morales JG, Gilroy EM, Armstrong MR, Grouffaud S, van West P, Chapman S, Hein I, Toth IK, Pritchard L and Birch PR (2007) A translocation signal for delivery of oomycete effector proteins into host plant cells. Nature, 450, 115-8

  23. Won KJ, Hamelryck T, Prügel-Bennett A and Krogh A (2007) An evolutionary method for learning HMM structure: prediction of protein secondary structure. BMC Bioinformatics, 8, 357

Publications from 2006

  1. Benita Y, Wise MJ, Lok MC, Humphery-Smith I and Oosting RS (2006) Analysis of high throughput protein expression in Escherichia coli. Mol. Cell Proteomics, 5, 1567-80

    This describes some of the Bio.SeqUtils module

  2. Bonis J, Furlong LI and Sanz F (2006) OSIRIS: a tool for retrieving literature about sequence variants. Bioinformatics, 22, 2567-9

  3. Casbon JA, Crooks GE and Saqi MA (2006) A high level interface to SCOP and ASTRAL implemented in python. BMC Bioinformatics, 7, 10

    Describes additions to the Bio.SCOP module

  4. Croce O, Lamarre M and Christen R (2006) Querying the public databases for sequences using complex keywords contained in the feature lines. BMC Bioinformatics, 7, 45

  5. Ferreira AO, Myers CR, Gordon JS, Martin GB, Vencato M, Collmer A, Wehling MD, Alfano JR, Moreno-Hagelsieb G, Lamboy WF, DeClerck G, Schneider DJ and Cartinhour SW (2006) Whole-genome expression profiling defines the HrpL regulon of Pseudomonas syringae pv. tomato DC3000, allows de novo reconstruction of the Hrp cis element, and identifies novel coregulated genes. Mol. Plant Microbe Interact., 19, 1167-79

  6. Friedberg I, Harder T and Godzik A (2006) JAFA: a protein function annotation meta-server. Nucleic Acids Res., 34, W379-81

  7. Gront D and Kolinski A (2006) BioShell - a package of tools for structural biology computations. Bioinformatics, 22, 621-2

  8. Hegedűs T and Riordan JR (2006) Search for proteins with similarity to the CFTR R domain using an optimized RDBMS solution, mBioSQL. Central European Journal of Biology, 1, 29-42

  9. Lee KT, Park EW, Moon S, Park HS, Kim HY, Jang GW, Choi BH, Chung HY, Lee JW, Cheong IC, Oh SJ, Kim H, Suh DS and Kim TH (2005) Genomic sequence analysis of a potential QTL region for fat trait on pig chromosome 6. Genomics, 87, 218-24

  10. Mattingly CJ, Rosenstein MC, Davis AP, Colby GT, Forrest JN and Boyer JL (2006) The comparative toxicogenomics database: a cross-species resource for building chemical-gene interaction networks. Toxicol. Sci., 92, 587-95

  11. Munos B (2006) Can open-source R&D reinvigorate drug research? Nat Rev Drug Discov, 5, 723-9

  12. Nilsen H, Hayes B, Berg PR, Roseth A, Sundsaasen KK, Nilsen K and Lien S (2008) Construction of a dense SNP map for bovine chromosome 6 to assist the assembly of the bovine genome sequence. Anim. Genet., 39, 97-104

  13. Pritchard L, White JA, Birch PR and Toth IK (2005) GenomeDiagram: a python package for the visualization of large-scale genomic data. Bioinformatics, 22, 616-7

    This describes GenomeDiagram, now part of the Bio.Graphics module

  14. Stajich JE and Lapp H (2006) Open source tools and toolkits for bioinformatics: significance, and where are we? Brief. Bioinformatics, 7, 287-96

  15. Taylor J and Provart NJ (2006) CapsID: a web-based tool for developing parsimonious sets of CAPS molecular markers for genotyping. BMC Genet., 7, 27

    This describes the Bio.CAPS module

  16. Ternes P, Sperling P, Albrecht S, Franke S, Cregg JM, Warnecke D and Heinz E (2005) Identification of fungal sphingolipid C9-methyltransferases by phylogenetic profiling. J. Biol. Chem., 281, 5582-92

  17. Toth IK, Pritchard L and Birch PR (2006) Comparative genomics reveals what makes an enterobacterial plant pathogen. Annu Rev Phytopathol, 44, 305-36

  18. Windsor AJ, Schranz ME, Formanová N, Gebauer-Jung S, Bishop JG, Schnabelrauch D, Kroymann J and Mitchell-Olds T (2006) Partial shotgun sequencing of the Boechera stricta genome reveals extensive microsynteny and promoter conservation with Arabidopsis. Plant Physiol., 140, 1169-82

  19. Wuchty S (2006) Topology and weights in a protein domain interaction network - a novel way to predict protein interactions. BMC Genomics, 7, 122

  20. Zapala MA and Schork NJ (2006) Multivariate regression analysis of distance matrices for testing associations between gene expression patterns and related variables. Proc. Natl. Acad. Sci. U.S.A., 103, 19430-5

  21. Zotenko E, O’Leary DP and Przytycka TM (2006) Secondary structure spatial conformation footprint: a novel method for fast protein structure comparison and classification. BMC Struct. Biol., 6, 12

Publications from 2005

  1. Boomsma W and Hamelryck T (2005) Full cyclic coordinate descent: solving the protein loop closure problem in Cα space. BMC Bioinformatics, 6, 159

  2. Armstrong MR, Whisson SC, Pritchard L, Bos JI, Venter E, Avrova AO, Rehmany AP, Böhme U, Brooks K, Cherevach I, Hamlin N, White B, Fraser A, Lord A, Quail MA, Churcher C, Hall N, Berriman M, Huang S, Kamoun S, Beynon JL and Birch PR (2005) An ancestral oomycete locus contains late blight avirulence gene Avr3a, encoding a protein that is recognized in the host cytoplasm. Proc. Natl. Acad. Sci. U.S.A., 102, 7766-71

  3. Curk T, Demsar J, Xu Q, Leban G, Petrovic U, Bratko I, Shaulsky G and Zupan B (2004) Microarray data mining with visual programming. Bioinformatics, 21, 396-8

  4. Dimmic MW, Hubisz MJ, Bustamante CD and Nielsen R (2005) Detecting coevolving amino acid sites using Bayesian mutational mapping. Bioinformatics, 21 Suppl 1, i126-35

  5. Feder M and Bujnicki JM (2005) Identification of a new family of putative PD-(D/E)XK nucleases with unusual phylogenomic distribution and a new type of the active site. BMC Genomics, 6, 21

  6. Friedberg I and Godzik A (2005) Fragnostic: walking through protein structure space. Nucleic Acids Res., 33, W249-51

  7. Herskovic JR and Bernstam EV (2005) Using incomplete citation data for MEDLINE results ranking. AMIA Annu Symp. Proc, 316-320

  8. Hamelryck T (2005) An amino acid has two sides: a new 2D measure provides a different view of solvent exposure. Proteins, 59, 38-48

  9. Johannessen BR, Skov LK, Kastrup JS, Kristensen O, Bolwig C, Larsen JN, Spangfort M, Lund K and Gajhede M (2005) Structure of the house dust mite allergen Der f 2: implications for function and molecular basis of IgE cross-reactivity. FEBS Lett., 579, 1208-12

  10. Majumdar I, Krishna SS and Grishin NV (2005) PALSSE: a program to delineate linear secondary structural elements from protein structures. BMC Bioinformatics, 6, 202

  11. Neerincx PB and Leunissen JA (2005) Evolution of web services in bioinformatics. Brief. Bioinformatics, 6, 178-88

  12. Trissl S, Rother K, Müller H, Steinke T, Koch I, Preissner R, Frömmel C and Leser U (2005) Columba: an integrated database of proteins, structures, and annotations. BMC Bioinformatics, 6, 81

  13. Olsen HG, Lien S, Gautier M, Nilsen H, Roseth A, Berg PR, Sundsaasen KK, Svendsen M and Meuwissen TH (2005) Mapping of a milk production quantitative trait locus to a 420-kb region on bovine chromosome 6. Genetics, 169, 275-83

  14. Whisson SC, Avrova AO, Lavrova O and Pritchard L (2005) Families of short interspersed elements in the genome of the oomycete plant pathogen, Phytophthora infestans. Fungal Genet. Biol., 42, 351-65

Publications from 2004

  1. Bell KS, Sebaihia M, Pritchard L, Holden MT, Hyman LJ, Holeva MC, Thomson NR, Bentley SD, Churcher LJ, Mungall K, Atkin R, Bason N, Brooks K, Chillingworth T, Clark K, Doggett J, Fraser A, Hance Z, Hauser H, Jagels K, Moule S, Norbertczak H, Ormond D, Price C, Quail MA, Sanders M, Walker D, Whitehead S, Salmond GP, Birch PR, Parkhill J and Toth IK (2004) Genome sequence of the enterobacterial phytopathogen Erwinia carotovora subsp. atroseptica and characterization of virulence factors. Proc. Natl. Acad. Sci. U.S.A., 101, 11105-10

  2. Chapman BA, Bowers JE, Schulze SR and Paterson AH (2004) A comparative phylogenetic approach for dating whole genome duplication events. Bioinformatics, 20, 180-5

  3. Flanagan K, Stevens R, Pocock M, Lee P and Wipat A (2004) Ontology for genome comparison and genomic rearrangements. Comp. Funct. Genomics, 5, 537-44

  4. de Hoon MJ, Imoto S, Nolan J and Miyano S (2004) Open source clustering software. Bioinformatics, 20, 1453-4

    This describes the Bio.Cluster module

  5. Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, Ellis B, Gautier L, Ge Y, Gentry J, Hornik K, Hothorn T, Huber W, Iacus S, Irizarry R, Leisch F, Li C, Maechler M, Rossini AJ, Sawitzki G, Smith C, Smyth G, Tierney L, Yang JY and Zhang J (2004) Bioconductor: open software development for computational biology and bioinformatics. Genome Biol., 5, R80

  6. King GJ (2004) Bioinformatics: harvesting information for plant and crop science. Semin. Cell Dev. Biol., 15, 721-31

  7. Swart EC, Hide WA and Seoighe C (2004) FRAGS: estimation of coding sequence substitution rates from fragmentary data. BMC Bioinformatics, 5, 8

Publications from 2003

  1. Bowers JE, Chapman BA, Rong J and Paterson AH (2003) Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature, 422, 433-8

  2. Ernst P, Glatting KH and Suhai S (2003) A task framework for the web interface W2H. Bioinformatics, 19, 278-82

  3. Goto N, Nakao NC, Kawashima S, Katayama T and Kanehisa T (2003) BioRuby: open-source bioinformatics library. Genome Informatics, 14, 629-630

    This paper describes BioRuby

  4. Hamelryck T and Manderick B (2003) PDB file parser and structure class implemented in Python. Bioinformatics, 19, 2308-10

    This describes the Bio.PDB module

  5. De Hoon MJL, Chapman BA and Friedberg I (2003) Bioinformatics and computational biology with Biopython. Genome Informatics, 14, 298-299

  6. Horner DS and Pesole G (2003) The estimation of relative site variability among aligned homologous protein sequences. Bioinformatics, 19, 600-6

  7. Kummerfeld SK, Weiss AS, Fekete A and Jermiin LS (2003) AMID: autonomous modeler of intragenic duplication. Appl. Bioinformatics, 2, 169-76

  8. Linding R, Russell RB, Neduva V and Gibson TJ (2003) GlobPlot: Exploring protein sequences for globularity and disorder. Nucleic Acids Res., 31, 3701-8

  9. Sugawara H and Miyazaki S (2003) Biological SOAP servers and web services provided by the public sequence data bank. Nucleic Acids Res., 31, 3836-9

  10. Wroe CJ, Stevens R, Goble CA and Ashburner M (2003) A methodology to migrate the gene ontology to a description logic environment using DAML+OIL. Pac Symp Biocomput, 624-635

Publications from 2002

  1. Chang JT, Schütze H and Altman RB (2002) Creating an online dictionary of abbreviations from MEDLINE. J Am Med Inform Assoc, 9, 612-20

  2. Lenhard B and Wasserman WW (2002) TFBS: Computational framework for transcription factor binding site analysis. Bioinformatics, 18, 1135-6

  3. Mangalam H (2002) The Bio* toolkits - a brief overview. Brief. Bioinformatics, 3, 296-302

  4. Raychaudhuri S, Chang JT, Sutphin PD and Altman RB (2002) Associating genes with gene ontology codes using a maximum entropy analysis of biomedical literature. Genome Res., 12, 203-14

  5. Raychaudhuri S, Schütze H and Altman RB (2002) Using text analysis to identify functionally coherent gene groups. Genome Res., 12, 1582-90

  6. Stajich JE, Block D, Boulez K, Brenner SE, Chervitz SA, Dagdigian C, Fuellen G, Gilbert JG, Korf I, Lapp H, Lehväslaiho H, Matsalla C, Mungall CJ, Osborne BI, Pocock MR, Schattner P, Senger M, Stein LD, Stupka E, Wilkinson MD and Birney E (2002) The Bioperl toolkit: Perl modules for the life sciences. Genome Res., 12, 1611-8

    This paper describes BioPerl

  7. Stein L (2002) Creating a bioinformatics nation. Nature, 417, 119-20

Publications from 2001

  1. Achard F, Vaysseix G and Barillot E (2001) XML, bioinformatics and data integration. Bioinformatics, 17, 115-25

  2. Chang JT, Raychaudhuri S and Altman RB (2001) Including biological literature improves homology search. Pac Symp Biocomput, 374-83

  3. Gemünd C, Ramu C, Altenberg-Greulich B and Gibson TJ (2001) Gene2EST: a BLAST2 server for searching expressed sequence tag (EST) databases with eukaryotic gene-sized queries. Nucleic Acids Res., 29, 1272-7

  4. Kawagashira N, Ohtomo Y, Murakami K, Matsubara K, Kawai J, Carninci P, Hayashizaki Y, Kikuchi S and Higo K (2001) Multiple zinc finger motifs with comparison of plant and insect. Genome Informatics, 12, 368-369

  5. Ramu C (2001) SIR: a simple indexing and retrieval system for biological flat file databases. Bioinformatics, 17, 756-8

  6. Woodwark KC (2001) Meeting review: the Bioinformatics Open Source Conference 2001 (BOSC 2001). Comp. Funct. Genomics, 2, 327-9

Publications from 2000

  1. Chapman BA and Chang JT (2000) Biopython: Python tools for computational biology. ACM SIGBIO Newsletter, 20, 15-19

    This serves as the official project announcement

  2. Ramu C, Gemünd C and Gibson TJ (2000) Object-oriented parsing of biological databases with Python. Bioinformatics, 16, 628-38

Publications from 1999

  1. Sanner MF (1999) Python: a programming language for software integration and development. J. Mol. Graph. Model., 17, 57-61

Note that the Biopython project started in 1999.