Bio.PopGen.GenePop.Controller module

Module to control GenePop.

class Bio.PopGen.GenePop.Controller.GenePopController(genepop_dir=None)

Bases: object

Define a class to interface with the GenePop program.

__init__(self, genepop_dir=None)

Initialize the controller.

genepop_dir is the directory where GenePop is.

The binary should be called Genepop (capital G)

test_pop_hz_deficiency(self, fname, enum_test=True, dememorization=10000, batches=20, iterations=5000)

Use Hardy-Weinberg test for heterozygote deficiency.

Returns a population iterator containing a dictionary wehre dictionary[locus]=(P-val, SE, Fis-WC, Fis-RH, steps).

Some loci have a None if the info is not available. SE might be none (for enumerations).

test_pop_hz_excess(self, fname, enum_test=True, dememorization=10000, batches=20, iterations=5000)

Use Hardy-Weinberg test for heterozygote deficiency.

Returns a population iterator containing a dictionary where dictionary[locus]=(P-val, SE, Fis-WC, Fis-RH, steps).

Some loci have a None if the info is not available. SE might be none (for enumerations).

test_pop_hz_prob(self, fname, ext, enum_test=False, dememorization=10000, batches=20, iterations=5000)

Use Hardy-Weinberg test based on probability.

Returns 2 iterators and a final tuple:

  1. Returns a loci iterator containing:
    • A dictionary[pop_pos]=(P-val, SE, Fis-WC, Fis-RH, steps). Some pops have a None if the info is not available. SE might be none (for enumerations).

    • Result of Fisher’s test (Chi2, deg freedom, prob).

  2. Returns a population iterator containing:
    • A dictionary[locus]=(P-val, SE, Fis-WC, Fis-RH, steps). Some loci have a None if the info is not available. SE might be none (for enumerations).

    • Result of Fisher’s test (Chi2, deg freedom, prob).

  3. Final tuple (Chi2, deg freedom, prob).

test_global_hz_deficiency(self, fname, enum_test=True, dememorization=10000, batches=20, iterations=5000)

Use Global Hardy-Weinberg test for heterozygote deficiency.

Returns a triple with:
  • An list per population containing (pop_name, P-val, SE, switches). Some pops have a None if the info is not available. SE might be none (for enumerations).

  • An list per loci containing (locus_name, P-val, SE, switches). Some loci have a None if the info is not available. SE might be none (for enumerations).

  • Overall results (P-val, SE, switches).

test_global_hz_excess(self, fname, enum_test=True, dememorization=10000, batches=20, iterations=5000)

Use Global Hardy-Weinberg test for heterozygote excess.

Returns a triple with:
  • A list per population containing (pop_name, P-val, SE, switches). Some pops have a None if the info is not available. SE might be none (for enumerations).

  • A list per loci containing (locus_name, P-val, SE, switches). Some loci have a None if the info is not available. SE might be none (for enumerations).

  • Overall results (P-val, SE, switches)

test_ld(self, fname, dememorization=10000, batches=20, iterations=5000)

Test for linkage disequilibrium on each pair of loci in each population.

create_contingency_tables(self, fname)

Provision for creating Genotypic contingency tables.

test_genic_diff_all(self, fname, dememorization=10000, batches=20, iterations=5000)

Provision for Genic differentiation for all populations.

test_genic_diff_pair(self, fname, dememorization=10000, batches=20, iterations=5000)

Provision for Genic differentiation for all population pairs.

test_genotypic_diff_all(self, fname, dememorization=10000, batches=20, iterations=5000)

Provision for Genotypic differentiation for all populations.

test_genotypic_diff_pair(self, fname, dememorization=10000, batches=20, iterations=5000)

Provision for Genotypic differentiation for all population pairs.

estimate_nm(self, fname)

Estimate the Number of Migrants.

Parameters:
  • fname - file name

Returns
  • Mean sample size

  • Mean frequency of private alleles

  • Number of migrants for Ne=10

  • Number of migrants for Ne=25

  • Number of migrants for Ne=50

  • Number of migrants after correcting for expected size

calc_allele_genotype_freqs(self, fname)

Calculate allele and genotype frequencies per locus and per sample.

Parameters:
  • fname - file name

Returns tuple with 2 elements:
  • Population iterator with

    • population name

    • Locus dictionary with key = locus name and content tuple as Genotype List with (Allele1, Allele2, observed, expected) (expected homozygotes, observed hm, expected heterozygotes, observed ht) Allele frequency/Fis dictionary with allele as key and (count, frequency, Fis Weir & Cockerham)

    • Totals as a pair

    • count

    • Fis Weir & Cockerham,

    • Fis Robertson & Hill

  • Locus iterator with

    • Locus name

    • allele list

    • Population list with a triple

      • population name

      • list of allele frequencies in the same order as allele list above

      • number of genes

Will create a file called fname.INF

calc_diversities_fis_with_identity(self, fname)

Compute identity-base Gene diversities and Fis.

calc_diversities_fis_with_size(self, fname)

Provision to Computer Allele size-based Gene diversities and Fis.

calc_fst_all(self, fname)

Execute GenePop and gets Fst/Fis/Fit (all populations).

Parameters:
  • fname - file name

Returns:
  • (multiLocusFis, multiLocusFst, multiLocus Fit),

  • Iterator of tuples (Locus name, Fis, Fst, Fit, Qintra, Qinter)

Will create a file called fname.FST.

This does not return the genotype frequencies.

calc_fst_pair(self, fname)

Estimate spatial structure from Allele identity for all population pairs.

calc_rho_all(self, fname)

Provision for estimating spatial structure from Allele size for all populations.

calc_rho_pair(self, fname)

Provision for estimating spatial structure from Allele size for all population pairs.

calc_ibd_diplo(self, fname, stat='a', scale='Log', min_dist=1e-05)

Calculate isolation by distance statistics for diploid data.

See _calc_ibd for parameter details.

Note that each pop can only have a single individual and the individual name has to be the sample coordinates.

calc_ibd_haplo(self, fname, stat='a', scale='Log', min_dist=1e-05)

Calculate isolation by distance statistics for haploid data.

See _calc_ibd for parameter details.

Note that each pop can only have a single individual and the individual name has to be the sample coordinates.