This page provides a central location to collect references to active projects. This is a good place to start if you are interested in contributing to Biopython and want to find larger projects in progress. For developers, use this to reference git branches or other projects which you will be working on for an extended period of time. Please keep it up to date as projects are finished and integrated into Biopython.
It has been proposed that we port Biopython’s documentation from the existing combination of LaTeX (cookbook and main docs) and Epydoc (API docs) to Sphinx. This is a multi-step process which we’re proceeding through incrementally, and help is always appreciated.
João Rodrigues’s Summer of Code project aimed to introduce several new features to the Bio.PDB structural biology module: functions for adding polar hydrogens to structures, probing for SS bridges based on structural information and annotations, renumbering residues, coarse-graining a structure, etc. A more comprehensive layout of the project is available on this wiki, and the code is on a GitHub branch. New code for Bio.PDB was also written for GSoC 2011. João and Eric are now working to integrate this new code into Biopython.
Giovanni and Tiago are working on expanding population genetics code in Biopython. See the PopGen development page for more details.
Brad is working on a Biopython GFF parser. Source code is available from GitHub. Documentation is in progress at GFF Parsing. See blog posts on the initial implementation and MapReduce parallel version.
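For readers new to the format, the following is a minimal sketch of what parsing a single GFF3 feature line involves. It uses only the Python standard library and is not Brad's parser or its API. The names `GffFeature` and `parse_gff3_line` are hypothetical and chosen for illustration, and the sketch assumes the standard nine tab-separated GFF3 columns.

```python
from typing import Dict, List, NamedTuple, Optional
from urllib.parse import unquote


class GffFeature(NamedTuple):
    """One GFF3 feature line (hypothetical container, not a Biopython class)."""
    seqid: str
    source: str
    type: str
    start: int            # 1-based, inclusive
    end: int              # 1-based, inclusive
    score: Optional[float]
    strand: str
    phase: Optional[int]
    attributes: Dict[str, List[str]]


def parse_gff3_line(line: str) -> Optional[GffFeature]:
    """Parse one line of GFF3 text; return None for blank lines and comments."""
    line = line.rstrip("\r\n")
    if not line or line.startswith("#"):
        return None  # directives such as "##gff-version 3" are skipped here

    cols = line.split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    seqid, source, ftype, start, end, score, strand, phase, attrs = cols

    # Column 9 holds ';'-separated key=value pairs; values may be
    # comma-separated lists and are percent-encoded.
    attributes: Dict[str, List[str]] = {}
    if attrs != ".":
        for pair in attrs.split(";"):
            if not pair:
                continue
            key, _, value = pair.partition("=")
            attributes[unquote(key)] = [unquote(v) for v in value.split(",")]

    return GffFeature(
        seqid=unquote(seqid),
        source=source,
        type=ftype,
        start=int(start),
        end=int(end),
        score=None if score == "." else float(score),  # "." means undefined
        strand=strand,
        phase=None if phase == "." else int(phase),
        attributes=attributes,
    )
```

Calling `parse_gff3_line` on each line of a file and discarding the `None` results gives a simple stream of features. A real parser also has to handle directives, embedded FASTA sections, and parent/child relationships between features, which this sketch leaves out.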
Nick Matzke developed a biogeography module for Biopython as a Google Summer of Code project through NESCent’s Phyloinformatics Summer of Code 2009. See the project proposal at: Biogeographical Phylogenetics for BioPython. The mentors were Stephen Smith (primary), Brad Chapman, and David Kidd. The new module is documented on this wiki as BioGeography.
The code currently lives in the Bio/Geography directory of Nick’s Geography branch on GitHub, and Eric is preparing it for integration into the Biopython trunk on another branch.
Several branches for working with RNA data have been made available by Kristian Rother. They can be used mainly for parsing RNA secondary structures and working with Bio.Sequence objects that represent modified RNA nucleotides.
A discussion about a new version of Biopython with restructured internals.
The open GitHub issues and GitHub pull requests are worth looking at.
The Biopython network diagram at github.com shows all public branches of our repository on GitHub, and therefore lets you see what is being worked on. Sadly, it no longer gives a good overview now that more and more people are contributing to Biopython; there is no way to zoom out, for example.
Please add any ideas or proposals for new additions to Biopython. Bugs and enhancements for current code should be discussed through GitHub.
Bio.Phylo.PAML. Presently only the main output files are parsed, but the supplementary output files contain potentially useful information for ancestral sequence reconstruction, Bayes posterior probability estimates for positive selection at all sites, etc. The format is generally different from the main output files and it probably varies even more widely between models than the main output, so it will require a lot of care.
Maintaining software involves incremental improvements for new format changes and removal of bugs. Please see the GitHub pull requests, open issues list and our old Issue Tracker for an overview.
Post to the developer mailing list if you are interested in tackling any open issues.