Bio.PopGen.GenePop.FileParser module
Code to parse BIG GenePop files.
The difference between this class and the standard Bio.PopGen.GenePop.Record class is that this one does not read the whole file into memory. It provides an iterator interface, which is slower but consumes much less memory. It should be used with big files (thousands of markers and individuals).
See http://wbiomed.curtin.edu.au/genepop/ ; the format is documented at http://wbiomed.curtin.edu.au/genepop/help_input.html .
Classes:
- FileRecord Holds GenePop data.
Functions:
- Bio.PopGen.GenePop.FileParser.read(fname)
Parse a GenePop file and return a FileRecord.
fname is the name of a file that contains a GenePop record.
- class Bio.PopGen.GenePop.FileParser.FileRecord(fname)
Bases: object
Hold information from a GenePop record.
Attributes:
- marker_len The marker length (2 or 3 digit code per allele).
- comment_line Comment line.
- loci_list List of loci names.
Methods:
- get_individual Returns the next individual of the current population.
- skip_population Skips the current population.
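skip_population skips the remaining individuals of the current population and returns True if there are more populations.
get_individual returns the next individual of the current population. At the end of a population or of the file it returns a boolean instead, as described under get_individual() below.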
Each individual is a pair composed of the individual name and a list of alleles (2 per marker, or 1 for haploid data). Examples:
('Ind1', [(1,2), (3,3), (200,201)])
('Ind2', [(2,None), (3,3), (None,None)])
('Other1', [(1,1), (4,3), (200,200)])
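As a minimal sketch of how this pair structure might be consumed, the snippet below counts missing alleles (those recorded as None) per individual. The helper name count_missing and the type aliases are illustrative assumptions, not part of the module; the sample data are the three examples above.

```python
from typing import List, Optional, Tuple

# Hypothetical type aliases, introduced only for this sketch.
Genotype = Tuple[Optional[int], ...]
Individual = Tuple[str, List[Genotype]]


def count_missing(individual: Individual) -> int:
    """Count alleles recorded as None (missing) for one individual."""
    _name, genotypes = individual
    return sum(allele is None for genotype in genotypes for allele in genotype)


individuals = [
    ("Ind1", [(1, 2), (3, 3), (200, 201)]),
    ("Ind2", [(2, None), (3, 3), (None, None)]),
    ("Other1", [(1, 1), (4, 3), (200, 200)]),
]

# Map each individual name to its number of missing alleles.
missing_by_name = {ind[0]: count_missing(ind) for ind in individuals}
```

Because each genotype is an ordinary tuple, the same loop works for haploid data, where each tuple holds a single allele.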
- __init__(fname)
Initialize the class.
- __str__()
Return (reconstruct) a GenePop textual representation.
This might take a lot of memory. Marker length will be 3.
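To illustrate what a marker length of 3 means for the output, here is a rough sketch that renders one individual with zero-padded 3-digit allele codes. The function format_individual is hypothetical and the exact spacing after the comma is an assumption; it is not the library's implementation of __str__.

```python
from typing import List, Optional, Tuple


def format_individual(name: str,
                      genotypes: List[Tuple[Optional[int], ...]],
                      marker_len: int = 3) -> str:
    """Render one individual as a GenePop-style line with zero-padded codes."""
    fields = []
    for genotype in genotypes:
        # A missing allele (None) is written as an all-zero code.
        codes = [str(0 if allele is None else allele).zfill(marker_len)
                 for allele in genotype]
        fields.append("".join(codes))
    # Assumed layout: name, a comma, then space-separated genotype fields.
    return name + ", " + " ".join(fields)


line = format_individual("Ind2", [(2, None), (3, 3), (None, None)])
```

With 3-digit codes each diploid genotype occupies six characters, which is part of why reconstructing the text of a large file can be memory-hungry.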
- start_read()
Start parsing a GenePop file.
- skip_header()
Skip the header. To be done after the file is re-opened.
- seek_position(pop, indiv)
Seek a certain position in the file.
- Arguments:
pop - population position (0 is first)
indiv - individual within the population
- skip_population()
Skip the current population. Returns True if there is another population.
- get_individual()
Get the next individual.
Returns individual information if there are more individuals in the current population.
Returns True if there are no more individuals in the current population but there are more populations. The next read will then be from the following population.
Returns False if at end of file.
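The three return values described above lend themselves to a small streaming loop. The following sketch wraps them in a generator that tags each individual with its population index. The name iter_individuals is an assumption made for illustration; the function relies only on the documented get_individual() behaviour and accepts any object that provides it.

```python
from typing import Iterator, List, Tuple


def iter_individuals(record) -> Iterator[Tuple[int, str, List[tuple]]]:
    """Yield (population index, name, genotypes) for every individual.

    `record` is any object whose get_individual() returns an individual
    pair, True at a population boundary, or False at end of file.
    """
    pop_index = 0
    while True:
        item = record.get_individual()
        if item is False:
            # End of file: stop iterating.
            return
        if item is True:
            # Population boundary: the next read comes from the next population.
            pop_index += 1
            continue
        name, genotypes = item
        yield pop_index, name, genotypes
```

With a record obtained from read(fname), a loop such as `for pop, name, genotypes in iter_individuals(rec)` would visit each individual once while keeping only one individual in memory at a time.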
- remove_population(pos, fname)
Remove a population (by position).
- Arguments:
pos - position of the population to remove
fname - file to be created with population removed
- remove_locus_by_position(pos, fname)
Remove a locus by position.
- Arguments:
pos - position of the locus to remove
fname - file to be created with locus removed
- remove_loci_by_position(positions, fname)
Remove a set of loci by position.
- Arguments:
positions - positions of the loci to remove
fname - file to be created with loci removed
- remove_locus_by_name(name, fname)
Remove a locus by name.
- Arguments:
name - name of the locus to remove
fname - file to be created with locus removed
- remove_loci_by_name(names, fname)
Remove a list of loci (by name).
- Arguments:
names - names of the loci to remove
fname - file to be created with loci removed
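To make the effect of the locus-removal methods concrete, the sketch below applies the same idea to in-memory data: dropping loci by position removes the matching entries from both the loci names and each individual's genotype list. This is only an illustration, not the library's implementation (which writes a new file to fname). The helper drop_loci, the locus names, and the 0-based positions are assumptions.

```python
from typing import List, Sequence, Tuple


def drop_loci(loci_list: List[str],
              individual: Tuple[str, List[tuple]],
              positions: Sequence[int]) -> Tuple[List[str], Tuple[str, List[tuple]]]:
    """Return copies of the loci names and the individual without the given loci."""
    drop = set(positions)  # assumed 0-based positions
    kept_loci = [locus for i, locus in enumerate(loci_list) if i not in drop]
    name, genotypes = individual
    kept_genotypes = [g for i, g in enumerate(genotypes) if i not in drop]
    return kept_loci, (name, kept_genotypes)


# Hypothetical locus names paired with the 'Ind1' example data.
loci = ["Loc1", "Loc2", "Loc3"]
kept_loci, kept_ind = drop_loci(loci, ("Ind1", [(1, 2), (3, 3), (200, 201)]), [1])
```

Removal by name can be reduced to the same operation by first looking up each name's index in the loci list.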