PopGen dev


Revision as of 15:59, 7 April 2009


Development page for the PopGen module.


The PopGen module contains submodules to handle population genetics data, applications and algorithms.

History and philosophy

Most existing Bio.PopGen features cover non-core population genetics functionality. This was deliberate (a feature, not a bug): it let us start building the module with functionality where crass beginner errors would not have dramatic consequences. With the experience accumulated since, it is now possible and desirable to concentrate on core population genetics functionality (i.e., statistics).

It is also worth noting that we wrap existing functionality whenever possible. For instance, we don't provide our own coalescent simulator; instead we provide wrappers to an existing one that is established and widely used (SIMCOAL2).

Future Goals

The fundamental goal is to have support for "classic" population genetics operations (statistics). This should be provided in an extensible, easy-to-use and future-proof framework. Code exists (see below for how to find it), but it will probably be refactored. Below there is also a wish list where you can add your desired features.

In the pipeline

Currently (i.e., for the near term) the following new functionality can be expected:

  • STRUCTURE support
  • LDNe support
  • Statistics

Code and contributing

The official production code is available on CVS.

If you would like to contribute, we suggest the following:

  1. Please have a look at the General Biopython contribution guidelines.
  2. Join us on the biopython-devel mailing list and tell us about your ideas, so that we know who is working on what and can discuss the viability of including your contribution in the official release.
  3. Current development of Bio.PopGen happens on GitHub. For Biopython's introduction to Git, check this page. Most probably you will want to fork from the main development line at http://github.com/tiagoantao/biopython-popgen-test/tree/master (I don't like this being associated with my personal account - any suggestions?)
  4. You are completely free to work on your own branch (but if you want your changes to go into the official distribution, don't forget to go to biopython-dev and discuss what you are doing).
  5. When you feel your contribution is ready and you would like to propose it for the official distribution, your branch will have to be merged with the main development one. Contact the mailing list for help with doing this. You are expected to have production-quality code (this includes unit tests and documentation). If you have doubts about unit testing or producing documentation, don't hesitate to contact the mailing list.

Existing development branches

The fundamental branch to start developing from is http://github.com/tiagoantao/biopython-popgen-test/tree/master (this is the real starting point if you want to develop new functionality). Even so, we would like to have a notion of who is working on what, to avoid overlap and allow for coordination.

Existing development branches are documented here. These branches are informal places where developers are creating new functionality, correcting bugs, etc. Feel free to add yours (or fork from existing ones). If you are interested in any of them, contact the author directly or go to the mailing list.

Purpose                             | URL                                                            | Who
Statistics (He, Fst, Tajima D, ...) | http://github.com/tiagoantao/biopython-popgen-test/tree/stats   | Tiago Antao
Genepop (parser and application)    | http://github.com/tiagoantao/biopython-popgen-test/tree/genepop | Tiago Antao
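
As an illustration of the kind of "classic" statistic listed above, here is a minimal sketch of expected heterozygosity (He) at a single locus, computed as one minus the sum of squared allele frequencies. This is not the Bio.PopGen API and not the code in the stats branch: the function name, its parameters and the input layout (a flat list of allele labels, one per sampled gene copy) are assumptions made for this example only.

```python
from collections import Counter


def expected_heterozygosity(alleles, unbiased=False):
    """Expected heterozygosity (gene diversity) at one locus.

    Hypothetical helper for illustration; not part of Bio.PopGen.

    alleles:  iterable of allele labels, one entry per sampled gene copy.
    unbiased: if True, apply the n / (n - 1) small-sample correction.
    """
    counts = Counter(alleles)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("at least one allele is required")
    # He = 1 - sum of squared allele frequencies
    he = 1.0 - sum((count / n) ** 2 for count in counts.values())
    if unbiased:
        if n < 2:
            raise ValueError("the unbiased estimator needs at least two alleles")
        he *= n / (n - 1)
    return he
```

For example, `expected_heterozygosity(["A", "A", "a", "a"])` gives 0.5, since both alleles have frequency 0.5. A real implementation would probably read its data through a parser such as the Genepop one rather than take a plain list.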

Wish list

  • support for a binary format - like HDF5 or this one: snpfile
  • support for databases: analyses are frequently carried out at a large scale, so it is common to use databases to store data