Bio.PopGen.GenePop.Controller module¶
Module to control GenePop.
- class Bio.PopGen.GenePop.Controller.GenePopController(genepop_dir=None)¶
Bases:
object
Define a class to interface with the GenePop program.
- __init__(self, genepop_dir=None)¶
Initialize the controller.
genepop_dir is the directory where GenePop is.
The binary should be called Genepop (capital G)
- test_pop_hz_deficiency(self, fname, enum_test=True, dememorization=10000, batches=20, iterations=5000)¶
Use Hardy-Weinberg test for heterozygote deficiency.
Returns a population iterator containing a dictionary wehre dictionary[locus]=(P-val, SE, Fis-WC, Fis-RH, steps).
Some loci have a None if the info is not available. SE might be none (for enumerations).
- test_pop_hz_excess(self, fname, enum_test=True, dememorization=10000, batches=20, iterations=5000)¶
Use Hardy-Weinberg test for heterozygote deficiency.
Returns a population iterator containing a dictionary where dictionary[locus]=(P-val, SE, Fis-WC, Fis-RH, steps).
Some loci have a None if the info is not available. SE might be none (for enumerations).
- test_pop_hz_prob(self, fname, ext, enum_test=False, dememorization=10000, batches=20, iterations=5000)¶
Use Hardy-Weinberg test based on probability.
Returns 2 iterators and a final tuple:
- Returns a loci iterator containing:
A dictionary[pop_pos]=(P-val, SE, Fis-WC, Fis-RH, steps). Some pops have a None if the info is not available. SE might be none (for enumerations).
Result of Fisher’s test (Chi2, deg freedom, prob).
- Returns a population iterator containing:
A dictionary[locus]=(P-val, SE, Fis-WC, Fis-RH, steps). Some loci have a None if the info is not available. SE might be none (for enumerations).
Result of Fisher’s test (Chi2, deg freedom, prob).
Final tuple (Chi2, deg freedom, prob).
- test_global_hz_deficiency(self, fname, enum_test=True, dememorization=10000, batches=20, iterations=5000)¶
Use Global Hardy-Weinberg test for heterozygote deficiency.
- Returns a triple with:
An list per population containing (pop_name, P-val, SE, switches). Some pops have a None if the info is not available. SE might be none (for enumerations).
An list per loci containing (locus_name, P-val, SE, switches). Some loci have a None if the info is not available. SE might be none (for enumerations).
Overall results (P-val, SE, switches).
- test_global_hz_excess(self, fname, enum_test=True, dememorization=10000, batches=20, iterations=5000)¶
Use Global Hardy-Weinberg test for heterozygote excess.
- Returns a triple with:
A list per population containing (pop_name, P-val, SE, switches). Some pops have a None if the info is not available. SE might be none (for enumerations).
A list per loci containing (locus_name, P-val, SE, switches). Some loci have a None if the info is not available. SE might be none (for enumerations).
Overall results (P-val, SE, switches)
- test_ld(self, fname, dememorization=10000, batches=20, iterations=5000)¶
Test for linkage disequilibrium on each pair of loci in each population.
- create_contingency_tables(self, fname)¶
Provision for creating Genotypic contingency tables.
- test_genic_diff_all(self, fname, dememorization=10000, batches=20, iterations=5000)¶
Provision for Genic differentiation for all populations.
- test_genic_diff_pair(self, fname, dememorization=10000, batches=20, iterations=5000)¶
Provision for Genic differentiation for all population pairs.
- test_genotypic_diff_all(self, fname, dememorization=10000, batches=20, iterations=5000)¶
Provision for Genotypic differentiation for all populations.
- test_genotypic_diff_pair(self, fname, dememorization=10000, batches=20, iterations=5000)¶
Provision for Genotypic differentiation for all population pairs.
- estimate_nm(self, fname)¶
Estimate the Number of Migrants.
- Parameters:
fname - file name
- Returns
Mean sample size
Mean frequency of private alleles
Number of migrants for Ne=10
Number of migrants for Ne=25
Number of migrants for Ne=50
Number of migrants after correcting for expected size
- calc_allele_genotype_freqs(self, fname)¶
Calculate allele and genotype frequencies per locus and per sample.
- Parameters:
fname - file name
- Returns tuple with 2 elements:
Population iterator with
population name
Locus dictionary with key = locus name and content tuple as Genotype List with (Allele1, Allele2, observed, expected) (expected homozygotes, observed hm, expected heterozygotes, observed ht) Allele frequency/Fis dictionary with allele as key and (count, frequency, Fis Weir & Cockerham)
Totals as a pair
count
Fis Weir & Cockerham,
Fis Robertson & Hill
Locus iterator with
Locus name
allele list
Population list with a triple
population name
list of allele frequencies in the same order as allele list above
number of genes
Will create a file called fname.INF
- calc_diversities_fis_with_identity(self, fname)¶
Compute identity-base Gene diversities and Fis.
- calc_diversities_fis_with_size(self, fname)¶
Provision to Computer Allele size-based Gene diversities and Fis.
- calc_fst_all(self, fname)¶
Execute GenePop and gets Fst/Fis/Fit (all populations).
- Parameters:
fname - file name
- Returns:
(multiLocusFis, multiLocusFst, multiLocus Fit),
Iterator of tuples (Locus name, Fis, Fst, Fit, Qintra, Qinter)
Will create a file called
fname.FST
.This does not return the genotype frequencies.
- calc_fst_pair(self, fname)¶
Estimate spatial structure from Allele identity for all population pairs.
- calc_rho_all(self, fname)¶
Provision for estimating spatial structure from Allele size for all populations.
- calc_rho_pair(self, fname)¶
Provision for estimating spatial structure from Allele size for all population pairs.
- calc_ibd_diplo(self, fname, stat='a', scale='Log', min_dist=1e-05)¶
Calculate isolation by distance statistics for diploid data.
See _calc_ibd for parameter details.
Note that each pop can only have a single individual and the individual name has to be the sample coordinates.
- calc_ibd_haplo(self, fname, stat='a', scale='Log', min_dist=1e-05)¶
Calculate isolation by distance statistics for haploid data.
See _calc_ibd for parameter details.
Note that each pop can only have a single individual and the individual name has to be the sample coordinates.