Bio.Phylo.Applications package
Module contents
Phylogenetics command line tool wrappers (OBSOLETE).
We have decided to remove this module in future, and instead recommend building your command and invoking it via the subprocess module directly.
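As a rough illustration of the recommended approach, the sketch below runs an external program through subprocess.run with an argument list and returns its captured output. The helper name run_tool is hypothetical, and the current Python interpreter stands in for a real phylogenetics binary so that the snippet runs without any tool installed.

```python
import subprocess
import sys


def run_tool(args):
    """Run an external command given as a list of arguments.

    Returns (stdout, stderr) as text and raises CalledProcessError on a
    non-zero exit status. `run_tool` is a hypothetical helper name.
    """
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout, result.stderr


# Stand-in command: the Python interpreter instead of phyml/raxmlHPC/fasttree.
out, err = run_tool([sys.executable, "-c", "print('tree inference done')"])
print(out.strip())
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.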
- class Bio.Phylo.Applications.PhymlCommandline(cmd='phyml', **kwargs)
Bases: AbstractCommandline
Command-line wrapper for the tree inference program PhyML.
Homepage: http://www.atgc-montpellier.fr/phyml
References
Guindon S, Gascuel O. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Systematic Biology, 2003 Oct;52(5):696-704. PubMed PMID: 14530136.
Guindon S, Dufayard JF, Lefort V, Anisimova M, Hordijk W, Gascuel O. New Algorithms and Methods to Estimate Maximum-Likelihood Phylogenies: Assessing the Performance of PhyML 3.0. Systematic Biology, 2010 59(3):307-21.
- __init__(cmd='phyml', **kwargs)
Initialize the class.
- property alpha
Distribution of the gamma distribution shape parameter.
Can be a fixed positive value, or ‘e’ to get the maximum-likelihood estimate.
This controls the addition of the -a parameter and its associated value. Set this property to the argument value required.
- property bootstrap
Number of bootstrap replicates, if value is > 0.
Otherwise:
0: neither approximate likelihood ratio test nor bootstrap values are computed.
-1: approximate likelihood ratio test returning aLRT statistics.
-2: approximate likelihood ratio test returning Chi2-based parametric branch supports.
-4: SH-like branch supports alone.
This controls the addition of the -b parameter and its associated value. Set this property to the argument value required.
- property datatype
Datatype ‘nt’ for nucleotide (default) or ‘aa’ for amino-acids.
This controls the addition of the -d parameter and its associated value. Set this property to the argument value required.
- property frequencies
Character frequencies.
-f e, m, or “fA fC fG fT”
e : Empirical frequencies, determined as follows :
Nucleotide sequences: (Empirical) the equilibrium base frequencies are estimated by counting the occurrence of the different bases in the alignment.
Amino-acid sequences: (Empirical) the equilibrium amino-acid frequencies are estimated by counting the occurrence of the different amino-acids in the alignment.
m : ML/model-based frequencies, determined as follows :
Nucleotide sequences: (ML) the equilibrium base frequencies are estimated using maximum likelihood.
Amino-acid sequences: (Model) the equilibrium amino-acid frequencies are estimated using the frequencies defined by the substitution model.
“fA fC fG fT” : only valid for nucleotide-based models. fA, fC, fG and fT are floating-point numbers that correspond to the frequencies of A, C, G and T, respectively.
This controls the addition of the -f parameter and its associated value. Set this property to the argument value required.
- property input
PHYLIP format input nucleotide or amino-acid sequence filename.
This controls the addition of the -i parameter and its associated value. Set this property to the argument value required.
- property input_tree
Starting tree filename. The tree must be in Newick format.
This controls the addition of the -u parameter and its associated value. Set this property to the argument value required.
- property model
Substitution model name.
Nucleotide-based models:
HKY85 (default) | JC69 | K80 | F81 | F84 | TN93 | GTR | custom
For the custom option, a string of six digits identifies the model. For instance, 000000 corresponds to F81 (or JC69, provided the distribution of nucleotide frequencies is uniform). 012345 corresponds to GTR. This option can be used for encoding any model that is nested within GTR.
Amino-acid based models:
LG (default) | WAG | JTT | MtREV | Dayhoff | DCMut | RtREV | CpREV | VT | Blosum62 | MtMam | MtArt | HIVw | HIVb | custom
This controls the addition of the -m parameter and its associated value. Set this property to the argument value required.
- property multiple
Number of data sets to analyse (integer).
This controls the addition of the -n parameter and its associated value. Set this property to the argument value required.
- property n_rand_starts
Number of initial random trees to be used.
Only valid if SPR searches are to be performed.
This controls the addition of the --n_rand_starts parameter and its associated value. Set this property to the argument value required.
- property nclasses
Number of relative substitution rate categories.
Default 1. Must be a positive integer.
This controls the addition of the -c parameter and its associated value. Set this property to the argument value required.
- property optimize
Specific parameter optimisation.
tlr : tree topology (t), branch length (l) and rate parameters (r) are optimised.
tl : tree topology and branch length are optimised.
lr : branch length and rate parameters are optimised.
l : branch lengths are optimised.
r : rate parameters are optimised.
n : no parameter is optimised.
This controls the addition of the -o parameter and its associated value. Set this property to the argument value required.
- property pars
Use a minimum parsimony starting tree.
This option is taken into account when the ‘-u’ option is absent and when tree topology modifications are to be done.
This property controls the addition of the -p switch, treat this property as a boolean.
- property print_site_lnl
Print the likelihood for each site in file *_phyml_lk.txt.
This property controls the addition of the --print_site_lnl switch, treat this property as a boolean.
- property print_trace
Print each phylogeny explored during the tree search process in file *_phyml_trace.txt.
This property controls the addition of the --print_trace switch, treat this property as a boolean.
- property prop_invar
Proportion of invariable sites.
Can be a fixed value in the range [0,1], or ‘e’ to get the maximum-likelihood estimate.
This controls the addition of the -v parameter and its associated value. Set this property to the argument value required.
- property quiet
No interactive questions (for running in batch mode).
This property controls the addition of the --quiet switch, treat this property as a boolean.
- property r_seed
Seed used to initiate the random number generator.
Must be an integer.
This controls the addition of the --r_seed parameter and its associated value. Set this property to the argument value required.
- property rand_start
Sets the initial tree to random.
Only valid if SPR searches are to be performed.
This property controls the addition of the --rand_start switch, treat this property as a boolean.
- property run_id
Append the given string at the end of each PhyML output file.
This option may be useful when running simulations involving PhyML.
This controls the addition of the --run_id parameter and its associated value. Set this property to the argument value required.
- property search
Tree topology search operation option.
Can be one of:
NNI : default, fast
SPR : a bit slower than NNI
BEST : best of NNI and SPR search
This controls the addition of the -s parameter and its associated value. Set this property to the argument value required.
- property sequential
Changes interleaved format (default) to sequential format.
This property controls the addition of the -q switch, treat this property as a boolean.
- property ts_tv_ratio
Transition/transversion ratio. (DNA sequences only.)
Can be a fixed positive value (e.g. 4.0) or ‘e’ to get the maximum-likelihood estimate.
This controls the addition of the -t parameter and its associated value. Set this property to the argument value required.
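Since the wrapper is obsolete, the same PhyML command line can be assembled by hand. The sketch below is one possible way to do so, using only flags documented in the properties above (-i, -d, -m, -b, -a, -s, --quiet). The function name phyml_args and the file name alignment.phy are hypothetical.

```python
import shlex


def phyml_args(input_file, datatype="nt", model=None, bootstrap=None,
               alpha=None, search=None, quiet=True, cmd="phyml"):
    """Build a PhyML argument list (hypothetical helper, not part of Biopython)."""
    args = [cmd, "-i", input_file, "-d", datatype]
    if model is not None:
        args += ["-m", model]            # substitution model name
    if bootstrap is not None:
        args += ["-b", str(bootstrap)]   # > 0 replicates, or 0/-1/-2/-4
    if alpha is not None:
        args += ["-a", str(alpha)]       # fixed positive value or "e"
    if search is not None:
        args += ["-s", search]           # NNI, SPR or BEST
    if quiet:
        args.append("--quiet")           # no interactive questions
    return args


# "alignment.phy" is a placeholder PHYLIP file name.
args = phyml_args("alignment.phy", model="GTR", bootstrap=-4,
                  alpha="e", search="BEST")
print(shlex.join(args))
```

With PhyML installed, the list can be passed to subprocess.run(args, check=True).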
- class Bio.Phylo.Applications.RaxmlCommandline(cmd='raxmlHPC', **kwargs)
Bases: AbstractCommandline
Command-line wrapper for the tree inference program RAxML.
The required parameters are ‘sequences’ (-s), ‘model’ (-m) and ‘name’ (-n). The parameter ‘parsimony_seed’ (-p) must also be set for RAxML, but if you do not specify it, this wrapper will set the seed to 10000 for you.
References
Stamatakis A. RAxML-VI-HPC: Maximum Likelihood-based Phylogenetic Analyses with Thousands of Taxa and Mixed Models. Bioinformatics 2006, 22(21):2688-2690.
Homepage: http://sco.h-its.org/exelixis/software.html
Examples
>>> from Bio.Phylo.Applications import RaxmlCommandline
>>> raxml_cline = RaxmlCommandline(sequences="Tests/Phylip/interlaced2.phy",
...                                model="PROTCATWAG", name="interlaced2")
>>> print(raxml_cline)
raxmlHPC -m PROTCATWAG -n interlaced2 -p 10000 -s Tests/Phylip/interlaced2.phy
You would typically run the command line with raxml_cline() or via the Python subprocess module, as described in the Biopython tutorial.
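For comparison, the command printed in the example above can be written directly as an argument list for the subprocess module, without the wrapper. This is only a sketch: the actual subprocess.run call is left commented out because it needs the raxmlHPC binary and the input file to be present.

```python
import shlex

# Same arguments the wrapper example prints, including the default
# parsimony seed of 10000 that the wrapper would fill in.
args = [
    "raxmlHPC",
    "-m", "PROTCATWAG",
    "-n", "interlaced2",
    "-p", "10000",
    "-s", "Tests/Phylip/interlaced2.phy",
]
command_line = shlex.join(args)
print(command_line)

# To actually run it (requires RAxML on the PATH):
# import subprocess
# subprocess.run(args, check=True)
```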
- __init__(cmd='raxmlHPC', **kwargs)
Initialize the class.
- property parsimony_seed
Random number seed for the parsimony inferences. This allows you to reproduce your results and will help developers debug the program. This option HAS NO EFFECT in the parallel MPI version.
This controls the addition of the -p parameter and its associated value. Set this property to the argument value required.
- property algorithm
Select algorithm:
a: Rapid Bootstrap analysis and search for best-scoring ML tree in one program run.
b: Draw bipartition information on a tree provided with ‘-t’ based on multiple trees (e.g. from a bootstrap) in a file specified by ‘-z’.
c: Check if the alignment can be properly read by RAxML.
d: New rapid hill-climbing (DEFAULT).
e: Optimize model+branch lengths for given input tree under GAMMA/GAMMAI only.
g: Compute per site log Likelihoods for one or more trees passed via ‘-z’ and write them to a file that can be read by CONSEL.
h: Compute log likelihood test (SH-test) between best tree passed via ‘-t’ and a bunch of other trees passed via ‘-z’.
i: Perform a really thorough bootstrap, refinement of final bootstrap tree under GAMMA and a more exhaustive algorithm.
j: Generate a bunch of bootstrapped alignment files from an original alignment file.
m: Compare bipartitions between two bunches of trees passed via ‘-t’ and ‘-z’ respectively. This will return the Pearson correlation between all bipartitions found in the two tree files. A file called RAxML_bipartitionFrequencies.outputFileName will be printed that contains the pair-wise bipartition frequencies of the two sets.
n: Compute the log likelihood score of all trees contained in a tree file provided by ‘-z’ under GAMMA or GAMMA+P-Invar.
o: Old and slower rapid hill-climbing.
p: Perform pure stepwise MP addition of new sequences to an incomplete starting tree.
s: Split up a multi-gene partitioned alignment into the respective subalignments.
t: Do randomized tree searches on one fixed starting tree.
w: Compute ELW test on a bunch of trees passed via ‘-z’.
x: Compute pair-wise ML distances, ML model parameters will be estimated on an MP starting tree or a user-defined tree passed via ‘-t’, only allowed for GAMMA-based models of rate heterogeneity.
This controls the addition of the -f parameter and its associated value. Set this property to the argument value required.
- property binary_constraint
File name of a binary constraint tree. This tree does not need to be comprehensive, i.e. contain all taxa.
This controls the addition of the -r parameter and its associated value. Set this property to the argument value required.
- property bipartition_filename
Name of a file containing multiple trees, e.g. from a bootstrap run, that shall be used to draw bipartition values onto a tree provided with ‘-t’. It can also be used to compute per-site log likelihoods in combination with ‘-f g’, and to read a bunch of trees for a couple of other options (‘-f h’, ‘-f m’, ‘-f n’).
This controls the addition of the -z parameter and its associated value. Set this property to the argument value required.
- property bootstrap_branch_lengths
Print bootstrapped trees with branch lengths. The bootstraps will run a bit longer, because model parameters will be optimized at the end of each run. Use with CATMIX/PROTMIX or GAMMA/GAMMAI.
This property controls the addition of the -k switch, treat this property as a boolean.
- property bootstrap_seed
Random seed for bootstrapping.
This controls the addition of the -b parameter and its associated value. Set this property to the argument value required.
- property checkpoints
Write checkpoints (intermediate tree topologies).
This property controls the addition of the -j switch, treat this property as a boolean.
- property cluster_threshold
Threshold for sequence similarity clustering. RAxML will then print out an alignment to a file called sequenceFileName.reducedBy.threshold that only contains sequences <= the specified threshold, which must be between 0.0 and 1.0. RAxML uses the QT-clustering algorithm to perform this task. In addition, a file called RAxML_reducedList.outputFileName will be written that contains clustering information.
This controls the addition of the -l parameter and its associated value. Set this property to the argument value required.
- property cluster_threshold_fast
Same functionality as ‘-l’, but uses a less exhaustive and thus faster clustering algorithm. This is intended for very large datasets with more than 20,000-30,000 sequences.
This controls the addition of the -L parameter and its associated value. Set this property to the argument value required.
- property epsilon
Set model optimization precision in log likelihood units for final optimization of tree topology under MIX/MIXI or GAMMA/GAMMAI. Default: 0.1 for models not using proportion of invariant sites estimate; 0.001 for models using proportion of invariant sites estimate.
This controls the addition of the -e parameter and its associated value. Set this property to the argument value required.
- property exclude_filename
An exclude file name, containing a specification of alignment positions you wish to exclude. Format is similar to Nexus, the file shall contain entries like ‘100-200 300-400’; to exclude a single column write, e.g., ‘100-100’. If you use a mixed model, an appropriately adapted model file will be written.
This controls the addition of the -E parameter and its associated value. Set this property to the argument value required.
- property grouping_constraint
File name of a multifurcating constraint tree. This tree does not need to be comprehensive, i.e. contain all taxa.
This controls the addition of the -g parameter and its associated value. Set this property to the argument value required.
- property model
Model of Nucleotide or Amino Acid Substitution:
NUCLEOTIDES:
GTRCAT : GTR + Optimization of substitution rates + Optimization of site-specific evolutionary rates, which are categorized into numberOfCategories distinct rate categories for greater computational efficiency. If you do a multiple analysis with ‘-#’ or ‘-N’ but without bootstrapping, the program will use GTRMIX instead.
GTRGAMMA : GTR + Optimization of substitution rates + GAMMA model of rate heterogeneity (alpha parameter will be estimated)
GTRMIX : Inference of the tree under GTRCAT and thereafter evaluation of the final tree topology under GTRGAMMA
GTRCAT_GAMMA : Inference of the tree with site-specific evolutionary rates. However, here rates are categorized using the 4 discrete GAMMA rates. Evaluation of the final tree topology under GTRGAMMA
GTRGAMMAI : Same as GTRGAMMA, but with estimate of proportion of invariable sites
GTRMIXI : Same as GTRMIX, but with estimate of proportion of invariable sites
GTRCAT_GAMMAI : Same as GTRCAT_GAMMA, but with estimate of proportion of invariable sites
AMINO ACIDS:
PROTCATmatrixName[F] : specified AA matrix + Optimization of substitution rates + Optimization of site-specific evolutionary rates, which are categorized into numberOfCategories distinct rate categories for greater computational efficiency. If you do a multiple analysis with ‘-#’ or ‘-N’ but without bootstrapping, the program will use PROTMIX… instead.
PROTGAMMAmatrixName[F] : specified AA matrix + Optimization of substitution rates + GAMMA model of rate heterogeneity (alpha parameter will be estimated)
PROTMIXmatrixName[F] : Inference of the tree under specified AA matrix + CAT and thereafter evaluation of the final tree topology under specified AA matrix + GAMMA
PROTCAT_GAMMAmatrixName[F] : Inference of the tree under specified AA matrix and site-specific evolutionary rates. However, here rates are categorized using the 4 discrete GAMMA rates. Evaluation of the final tree topology under specified AA matrix + GAMMA
PROTGAMMAImatrixName[F] : Same as PROTGAMMAmatrixName[F], but with estimate of proportion of invariable sites
PROTMIXImatrixName[F] : Same as PROTMIXmatrixName[F], but with estimate of proportion of invariable sites
PROTCAT_GAMMAImatrixName[F] : Same as PROTCAT_GAMMAmatrixName[F], but with estimate of proportion of invariable sites
Available AA substitution models: DAYHOFF, DCMUT, JTT, MTREV, WAG, RTREV, CPREV, VT, BLOSUM62, MTMAM, GTR. With the optional ‘F’ appendix you can specify that you want to use empirical base frequencies. Please note that for mixed models you can in addition specify the per-gene AA model in the mixed model file (see manual for details).
This controls the addition of the -m parameter and its associated value. Set this property to the argument value required.
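The amino-acid model names follow the pattern PROT + rate model + matrix name + optional F. A small sketch of how such a name could be composed and checked against the lists given above is shown below. The helper raxml_protein_model and the two constant names are hypothetical, and the check covers only the names listed in this documentation.

```python
# Rate-model infixes and matrices as listed in the property description.
RATE_MODELS = {"CAT", "GAMMA", "MIX", "CAT_GAMMA",
               "GAMMAI", "MIXI", "CAT_GAMMAI"}
AA_MATRICES = {"DAYHOFF", "DCMUT", "JTT", "MTREV", "WAG", "RTREV",
               "CPREV", "VT", "BLOSUM62", "MTMAM", "GTR"}


def raxml_protein_model(rate_model, matrix, empirical_freqs=False):
    """Compose a value for -m, e.g. PROTGAMMAWAGF (hypothetical helper)."""
    if rate_model not in RATE_MODELS:
        raise ValueError(f"unknown rate model: {rate_model!r}")
    if matrix not in AA_MATRICES:
        raise ValueError(f"unknown AA matrix: {matrix!r}")
    suffix = "F" if empirical_freqs else ""  # 'F' = empirical frequencies
    return "PROT" + rate_model + matrix + suffix


print(raxml_protein_model("CAT", "WAG"))
print(raxml_protein_model("GAMMAI", "JTT", empirical_freqs=True))
```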
- property name
Name used in the output files.
This controls the addition of the -n parameter and its associated value. Set this property to the argument value required.
- property num_bootstrap_searches
Number of multiple bootstrap searches per replicate. Use this to obtain better ML trees for each replicate. Default: 1 ML search per bootstrap replicate.
This controls the addition of the -u parameter and its associated value. Set this property to the argument value required.
- property num_categories
Number of distinct rate categories for RAxML when evolution model is set to GTRCAT or GTRMIX. Individual per-site rates are categorized into this many rate categories to accelerate computations. Default: 25.
This controls the addition of the -c parameter and its associated value. Set this property to the argument value required.
- property num_replicates
Number of alternative runs on distinct starting trees. In combination with the ‘-b’ option, this will invoke a multiple bootstrap analysis. DEFAULT: 1 single analysis. Note that ‘-N’ has been added as an alternative since ‘-#’ sometimes caused problems with certain MPI job submission systems, since ‘-#’ is often used to start comments.
This controls the addition of the -N parameter and its associated value. Set this property to the argument value required.
- property outgroup
Name of a single outgroup or a comma-separated list of outgroups, eg ‘-o Rat’ or ‘-o Rat,Mouse’. In case that multiple outgroups are not monophyletic the first name in the list will be selected as outgroup. Don’t leave spaces between taxon names!
This controls the addition of the -o parameter and its associated value. Set this property to the argument value required.
- property parsimony
Only compute a parsimony starting tree, then exit.
This property controls the addition of the -y switch, treat this property as a boolean.
- property partition_branch_lengths
Switch on estimation of individual per-partition branch lengths. Only has effect when used in combination with ‘partition_filename’ (‘-q’). Branch lengths for individual partitions will be printed to separate files. A weighted average of the branch lengths is computed by using the respective partition lengths.
This property controls the addition of the -M switch, treat this property as a boolean.
- property partition_filename
File name containing the assignment of models to alignment partitions for multiple models of substitution. For the syntax of this file please consult the RAxML manual.
This controls the addition of the -q parameter and its associated value. Set this property to the argument value required.
- property protein_model
File name of a user-defined AA (Protein) substitution model. This file must contain 420 entries, the first 400 being the AA substitution rates (this must be a symmetric matrix) and the last 20 are the empirical base frequencies.
This controls the addition of the -P parameter and its associated value. Set this property to the argument value required.
- property random_starting_tree
Start ML optimization from random starting tree.
This property controls the addition of the -d switch, treat this property as a boolean.
- property rapid_bootstrap_seed
Random seed for rapid bootstrapping.
This controls the addition of the -x parameter and its associated value. Set this property to the argument value required.
- property rearrangements
Initial rearrangement setting for the subsequent application of topological changes phase.
This controls the addition of the -i parameter and its associated value. Set this property to the argument value required.
- property sequences
Name of the alignment data file, in PHYLIP format.
This controls the addition of the -s parameter and its associated value. Set this property to the argument value required.
- property starting_tree
File name of a user starting tree, in Newick format.
This controls the addition of the -t parameter and its associated value. Set this property to the argument value required.
- property threads
Number of threads to run. PTHREADS VERSION ONLY! Make sure to set this at most the number of CPUs you have on your machine, otherwise, there will be a huge performance decrease!
This controls the addition of the -T parameter and its associated value. Set this property to the argument value required.
- property version
Display version information.
This property controls the addition of the -v switch, treat this property as a boolean.
- property weight_filename
Name of a column weight file to assign individual weights to each column of the alignment. Those weights must be integers separated by any type and number of whitespaces within a separate file.
This controls the addition of the -a parameter and its associated value. Set this property to the argument value required.
- property working_dir
Name of the working directory where RAxML will write its output files. Default: current directory.
This controls the addition of the -w parameter and its associated value. Set this property to the argument value required.
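To show how several of the RAxML options above combine, the following sketch assembles an argument list for algorithm ‘a’ (rapid bootstrap plus best-scoring ML tree search in one run). It uses only flags documented for this class (-f, -m, -n, -N, -p, -x, -s). The function name, the file and run names, the replicate count of 100 and the bootstrap seed of 12345 are arbitrary illustrative choices, not recommendations.

```python
def raxml_rapid_bootstrap_args(sequences, name, model="GTRGAMMA",
                               replicates=100, parsimony_seed=10000,
                               bootstrap_seed=12345, cmd="raxmlHPC"):
    """Build a RAxML argument list for '-f a' (hypothetical helper)."""
    return [
        cmd,
        "-f", "a",                   # algorithm: rapid bootstrap + ML search
        "-m", model,                 # substitution model
        "-n", name,                  # name used in the output files
        "-N", str(replicates),       # number of runs/replicates
        "-p", str(parsimony_seed),   # seed for the parsimony inferences
        "-x", str(bootstrap_seed),   # seed for rapid bootstrapping
        "-s", sequences,             # PHYLIP alignment file
    ]


# "alignment.phy" and "run1" are placeholder names.
args = raxml_rapid_bootstrap_args("alignment.phy", "run1")
print(" ".join(args))
```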
- class Bio.Phylo.Applications.FastTreeCommandline(cmd='fasttree', **kwargs)
Bases: AbstractCommandline
Command-line wrapper for FastTree.
Only the input and out parameters are mandatory.
From the terminal command line use fasttree.exe -help or fasttree.exe -expert for more explanation of usage options.
Homepage: http://www.microbesonline.org/fasttree/
References
Price, M.N., Dehal, P.S., and Arkin, A.P. (2010) FastTree 2 – Approximately Maximum-Likelihood Trees for Large Alignments. PLoS ONE, 5(3):e9490. https://doi.org/10.1371/journal.pone.0009490.
Examples
This is an example on Windows:
from Bio.Phylo.Applications import FastTreeCommandline
fasttree_exe = r"C:\FasttreeWin32\fasttree.exe"
cmd = FastTreeCommandline(fasttree_exe,
                          input=r'C:\Input\ExampleAlignment.fsa',
                          out=r'C:\Output\ExampleTree.tree')
print(cmd)
out, err = cmd()
print(out)
print(err)
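A wrapper-free version of the Windows example might look like the sketch below. It assumes that the wrapper's out parameter corresponds to FastTree's -out option and that the alignment file is given as the last, positional argument. The helper name fasttree_args is hypothetical, and the optional switches are those described in the properties of this class.

```python
def fasttree_args(input_file, out_file, nucleotide=False, gtr=False,
                  gamma=False, boot=None, cmd="fasttree"):
    """Build a FastTree argument list (hypothetical helper)."""
    args = [cmd]
    if nucleotide:
        args.append("-nt")               # nucleotide instead of protein input
    if gtr:
        args.append("-gtr")              # GTR model (nucleotides only)
    if gamma:
        args.append("-gamma")            # report likelihood under gamma model
    if boot is not None:
        args += ["-boot", str(boot)]     # number of resamples for supports
    # Assumption: output tree via -out, alignment file as last argument.
    args += ["-out", out_file, input_file]
    return args


args = fasttree_args(r"C:\Input\ExampleAlignment.fsa",
                     r"C:\Output\ExampleTree.tree",
                     cmd=r"C:\FasttreeWin32\fasttree.exe")
print(args)

# To run it (requires FastTree at the given path):
# import subprocess
# result = subprocess.run(args, capture_output=True, text=True, check=True)
```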
- __init__(cmd='fasttree', **kwargs)
Initialize the class.
- property bionj
Join options: weighted joins as in BIONJ.
FastTree will also weight joins during NNIs.
This property controls the addition of the -bionj switch, treat this property as a boolean.
- property boot
Specify the number of resamples for support values.
Support value options: By default, FastTree computes local support values by resampling the site likelihoods 1,000 times and the Shimodaira Hasegawa test. If you specify -nome, it will compute minimum-evolution bootstrap supports instead. In either case, the support values are proportions ranging from 0 to 1.
Use -nosupport to turn off support values or -boot 100 to use just 100 resamples.
This controls the addition of the -boot parameter and its associated value. Set this property to the argument value required.
- property cat
Maximum likelihood model options.
Specify the number of rate categories of sites (default 20).
This controls the addition of the -cat parameter and its associated value. Set this property to the argument value required.
- property close
Modify the close heuristic for the top-hit list
Top-hit heuristics: By default, FastTree uses a top-hit list to speed up search. -close 0.75 – modify the close heuristic; lower is more conservative.
This controls the addition of the -close parameter and its associated value. Set this property to the argument value required.
- property constraintWeight
Weight strength of constraints in topology searching.
Constrained topology search options: -constraintWeight – how strongly to weight the constraints. A value of 1 means a penalty of 1 in tree length for violating a constraint. Default: 100.0.
This controls the addition of the -constraintWeight parameter and its associated value. Set this property to the argument value required.
- property constraints
Specifies an alignment file for use with constrained topology searching
Constrained topology search options: -constraints alignmentfile – an alignment with values of 0, 1, and -. Not all sequences need be present. A column of 0s and 1s defines a constrained split. Some constraints may be violated (see ‘violating constraints:’ in standard error).
This controls the addition of the -constraints parameter and its associated value. Set this property to the argument value required.
- property expert
Show the expert level help.
This property controls the addition of the -expert switch, treat this property as a boolean.
- property fastest
Search the visible set (the top hit for each node) only.
Searching for the best join: By default, FastTree combines the ‘visible set’ of fast neighbor-joining with local hill-climbing as in relaxed neighbor-joining. -fastest – search the visible set (the top hit for each node) only. Unlike the original fast neighbor-joining, -fastest updates visible(C) after joining A and B if join(AB,C) is better than join(C,visible(C)). -fastest also updates out-distances in a very lazy way. -fastest sets -2nd on as well; use -fastest -no2nd to avoid this.
This property controls the addition of the -fastest switch, treat this property as a boolean.
- property gamma
Report the likelihood under the discrete gamma model.
Maximum likelihood model options: -gamma – after the final round of optimizing branch lengths with the CAT model, report the likelihood under the discrete gamma model with the same number of categories. FastTree uses the same branch lengths but optimizes the gamma shape parameter and the scale of the lengths. The final tree will have rescaled lengths. Used with -log, this also generates per-site likelihoods for use with CONSEL, see GammaLogToPaup.pl and documentation on the FastTree web site.
This property controls the addition of the -gamma switch, treat this property as a boolean.
- property gtr
Maximum likelihood model options.
Use generalized time-reversible instead of (default) Jukes-Cantor (nt only)
This property controls the addition of the -gtr switch, treat this property as a boolean.
- property gtrfreq
-gtrfreq A C G T
This controls the addition of the -gtrfreq parameter and its associated value. Set this property to the argument value required.
- property gtrrates
-gtrrates ac ag at cg ct gt
This controls the addition of the -gtrrates parameter and its associated value. Set this property to the argument value required.
- property help
Show the help.
This property controls the addition of the -help switch, treat this property as a boolean.
- property input
Enter <input file>
An input file of sequence alignments in fasta or phylip format is needed. By default FastTree expects protein alignments, use -nt for nucleotides.
This controls the addition of the input parameter and its associated value. Set this property to the argument value required.
- property intree
-intree newickfile – read the starting tree in from newickfile.
Any branch lengths in the starting trees are ignored. -intree with -n will read a separate starting tree for each alignment.
This controls the addition of the -intree parameter and its associated value. Set this property to the argument value required.
- property intree1
-intree1 newickfile – read the same starting tree for each alignment.
This controls the addition of the -intree1 parameter and its associated value. Set this property to the argument value required.
- property log
Create log files of data such as intermediate trees and per-site rates
-log logfile – save intermediate trees so you can extract the trees and restart long-running jobs if they crash. -log also reports the per-site rates (1 means slowest category).
This controls the addition of the -log parameter and its associated value. Set this property to the argument value required.
- property makematrix
-makematrix [alignment]
This controls the addition of the -makematrix parameter and its associated value. Set this property to the argument value required.
- property matrix
Specify a matrix for nucleotide or amino acid distances
Distances: Default: For protein sequences, log-corrected distances and an amino acid dissimilarity matrix derived from BLOSUM45; for nucleotide sequences, Jukes-Cantor distances. To specify a different matrix, use -matrix FilePrefix or -nomatrix.
This controls the addition of the -matrix parameter and its associated value. Set this property to the argument value required.
- property mlacc
Option for optimization of branches at each NNI.
Topology refinement: By default, FastTree tries to improve the tree with up to 4*log2(N) rounds of minimum-evolution nearest-neighbor interchanges (NNI), where N is the number of unique sequences, 2 rounds of subtree-prune-regraft (SPR) moves (also min. evo.), and up to 2*log(N) rounds of maximum-likelihood NNIs. Use -nni to set the number of rounds of min. evo. NNIs, and -spr to set the rounds of SPRs. Use -mlacc 2 or -mlacc 3 to always optimize all 5 branches at each NNI, and to optimize all 5 branches in 2 or 3 rounds.
This controls the addition of the -mlacc parameter and its associated value. Set this property to the argument value required.
- property mllen
Optimize branch lengths on a fixed topology.
Topology refinement: By default, FastTree tries to improve the tree with up to 4*log2(N) rounds of minimum-evolution nearest-neighbor interchanges (NNI), where N is the number of unique sequences, 2 rounds of subtree-prune-regraft (SPR) moves (also min. evo.), and up to 2*log(N) rounds of maximum-likelihood NNIs. Use -nni to set the number of rounds of min. evo. NNIs, and -spr to set the rounds of SPRs. Use -mllen to optimize branch lengths without ML NNIs. Use -mllen -nome with -intree to optimize branch lengths on a fixed topology.
This property controls the addition of the -mllen switch, treat this property as a boolean.
- property mlnni
Set the number of rounds of maximum-likelihood NNIs.
Topology refinement: By default, FastTree tries to improve the tree with up to 4*log2(N) rounds of minimum-evolution nearest-neighbor interchanges (NNI), where N is the number of unique sequences, 2 rounds of subtree-prune-regraft (SPR) moves (also min. evo.), and up to 2*log(N) rounds of maximum-likelihood NNIs. Use -nni to set the number of rounds of min. evo. NNIs, and -spr to set the rounds of SPRs. Use -mlnni to set the number of rounds of maximum-likelihood NNIs.
This controls the addition of the -mlnni parameter and its associated value. Set this property to the argument value required.
- property n
-n – read N multiple alignments in.
This only works with phylip interleaved format. For example, you can use it with the output from phylip’s seqboot. If you use -n, FastTree will write 1 tree per line to standard output.
This controls the addition of the -n parameter and its associated value. Set this property to the argument value required.
- property nj
Join options: regular (unweighted) neighbor-joining (default)
This property controls the addition of the -nj switch, treat this property as a boolean.
- property nni
Set the rounds of minimum-evolution nearest-neighbor interchanges
Topology refinement: By default, FastTree tries to improve the tree with up to 4*log2(N) rounds of minimum-evolution nearest-neighbor interchanges (NNI), where N is the number of unique sequences, 2 rounds of subtree-prune-regraft (SPR) moves (also min. evo.), and up to 2*log(N) rounds of maximum-likelihood NNIs. Use -nni to set the number of rounds of min. evo. NNIs.
This controls the addition of the -nni parameter and its associated value. Set this property to the argument value required.
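As a worked instance of the 4*log2(N) bound quoted above, the following sketch computes the upper limit on default minimum-evolution NNI rounds for a given number of unique sequences. How FastTree rounds this value is not stated here, so the function returns the unrounded figure; the function name is hypothetical.

```python
import math


def max_me_nni_rounds(n_unique_seqs):
    """Upper bound on default min-evo NNI rounds: 4*log2(N)."""
    if n_unique_seqs < 1:
        raise ValueError("need at least one sequence")
    return 4 * math.log2(n_unique_seqs)


for n in (16, 1024):
    print(n, max_me_nni_rounds(n))
# 16 sequences  -> 16.0 rounds at most
# 1024 sequences -> 40.0 rounds at most
```

Setting -nni explicitly replaces this default with a fixed number of rounds.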
- property no2nd
Turn 2nd-level top hits heuristic off.
Top-hit heuristics: By default, FastTree uses a top-hit list to speed up search. Use -notop (or -slow) to turn this feature off and compare all leaves to each other, and all new joined nodes to each other.
Use -2nd or -no2nd to turn the 2nd-level top hits heuristic on or off. The heuristic reduces memory usage and running time but may lead to marginal reductions in tree quality. (By default, -fastest turns on -2nd.)
This property controls the addition of the -no2nd switch, treat this property as a boolean.
- property nocat
Maximum likelihood model options: No CAT model (just 1 category)
This property controls the addition of the -nocat switch, treat this property as a boolean.
- property nomatrix
Specify that no matrix should be used for nucleotide or amino acid distances
Distances: Default: For protein sequences, log-corrected distances and an amino acid dissimilarity matrix derived from BLOSUM45; for nucleotide sequences, Jukes-Cantor distances. To specify a different matrix, use -matrix FilePrefix or -nomatrix.
This property controls the addition of the -nomatrix switch, treat this property as a boolean.
- property nome
Changes support values calculation to a minimum-evolution bootstrap method.
Topology refinement: By default, FastTree tries to improve the tree with up to 4*log2(N) rounds of minimum-evolution nearest-neighbor interchanges (NNI), where N is the number of unique sequences, 2 rounds of subtree-prune-regraft (SPR) moves (also min. evo.), and up to 2*log(N) rounds of maximum-likelihood NNIs. Use -nni to set the number of rounds of min. evo. NNIs, and -spr to set the rounds of SPRs. Use -mllen to optimize branch lengths without ML NNIs. Use -mllen -nome with -intree to optimize branch lengths on a fixed topology.
Support value options: By default, FastTree computes local support values by resampling the site likelihoods 1,000 times and the Shimodaira Hasegawa test. If you specify -nome, it will compute minimum-evolution bootstrap supports instead. In either case, the support values are proportions ranging from 0 to 1.
This property controls the addition of the -nome switch, treat this property as a boolean.
- property noml
Deactivate min-evo NNIs and SPRs.
Topology refinement: By default, FastTree tries to improve the tree with up to 4*log2(N) rounds of minimum-evolution nearest-neighbor interchanges (NNI), where N is the number of unique sequences, 2 rounds of subtree-prune-regraft (SPR) moves (also min. evo.), and up to 2*log(N) rounds of maximum-likelihood NNIs. Use -nni to set the number of rounds of min. evo. NNIs, and -spr to set the rounds of SPRs. Use -noml to turn off both min-evo NNIs and SPRs (useful if refining an approximately maximum-likelihood tree with further NNIs).
This property controls the addition of the -noml switch, treat this property as a boolean.
- property nopr
-nopr – do not write the progress indicator to stderr.
This property controls the addition of the -nopr switch, treat this property as a boolean.
- property nosupport
Turn off support values.
Support value options: By default, FastTree computes local support values by resampling the site likelihoods 1,000 times and the Shimodaira Hasegawa test. If you specify -nome, it will compute minimum-evolution bootstrap supports instead. In either case, the support values are proportions ranging from 0 to 1.
Use -nosupport to turn off support values or -boot 100 to use just 100 resamples.
This property controls the addition of the -nosupport switch, treat this property as a boolean.
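The choice between -nosupport and -boot can be sketched as a small helper that emits the matching arguments. Treating the two options as mutually exclusive is an assumption of this sketch, and the helper name is hypothetical.

```python
def support_args(boot=None, nosupport=False):
    """Return the support-value arguments for a FastTree command line.

    boot: number of resamples (e.g. 100), or None for the default.
    nosupport: True to turn support values off.
    """
    # Assumption: asking for both at once is a caller error.
    if nosupport and boot is not None:
        raise ValueError("choose either -nosupport or -boot, not both")
    if nosupport:
        return ["-nosupport"]
    if boot is not None:
        return ["-boot", str(int(boot))]
    return []  # keep the default of 1,000 resamples


print(support_args(boot=100))
print(support_args(nosupport=True))
```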
- property notop
Turn off the top-hit list that is otherwise used to speed up search.
Top-hit heuristics: By default, FastTree uses a top-hit list to speed up search. Use -notop (or -slow) to turn this feature off and compare all leaves to each other, and all new joined nodes to each other.
This property controls the addition of the -notop switch, treat this property as a boolean.
- property nt
By default, FastTree expects protein alignments; use -nt for nucleotides.
This property controls the addition of the -nt switch, treat this property as a boolean.
- property out
Enter <output file>
The path to a Newick Tree output file needs to be specified.
This controls the addition of the -out parameter and its associated value. Set this property to the argument value required.
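As a hedged sketch of the recommended subprocess approach, the following builds a command that writes the tree to a file with -out and only runs it if the executable can be found. The executable name FastTree, the file names, and the helper names are assumptions for illustration; the alignment path is assumed to come last on the command line.

```python
import shutil
import subprocess


def fasttree_args(alignment, out, exe="FastTree", nt=False, quiet=False):
    """Assemble a FastTree command line as a list of strings."""
    args = [exe]
    if nt:
        args.append("-nt")  # nucleotide alignment instead of protein
    if quiet:
        args.append("-quiet")  # suppress normal messages on stderr
    args += ["-out", out, alignment]
    return args


def run_if_available(args):
    """Run the command, or return None if the executable is not on PATH."""
    if shutil.which(args[0]) is None:
        return None
    return subprocess.run(args, capture_output=True, text=True, check=True)


args = fasttree_args("aln.fasta", "tree.nwk", nt=True, quiet=True)
print(args)
```

With check=True, a non-zero exit status raises subprocess.CalledProcessError, so failures are not silently ignored.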
- property pseudo
-pseudo [weight] – Pseudocounts are used with sequence distance estimation.
Use pseudocounts to estimate distances between sequences with little or no overlap. (Off by default.) Recommended if the alignment has sequences with little or no overlap. If the weight is not specified, it is 1.0.
This controls the addition of the -pseudo parameter and its associated value. Set this property to the argument value required.
- property quiet
-quiet – do not write to standard error during normal operation
(no progress indicator, no options summary, no likelihood values, etc.)
This property controls the addition of the -quiet switch, treat this property as a boolean.
- property quote
-quote – add quotes to sequence names in output.
Quote sequence names in the output and allow spaces, commas, parentheses, and colons in them but not ‘ characters (fasta files only).
This property controls the addition of the -quote switch, treat this property as a boolean.
- property rawdist
Turn off or adjust log-correction in AA or NT distances.
Use -rawdist to turn the log-correction off or to use %different instead of Jukes-Cantor in AA or NT distances.
Distances: Default: For protein sequences, log-corrected distances and an amino acid dissimilarity matrix derived from BLOSUM45; for nucleotide sequences, Jukes-Cantor distances. To specify a different matrix, use -matrix FilePrefix or -nomatrix.
This property controls the addition of the -rawdist switch, treat this property as a boolean.
- property refresh
Parameter for conditions that joined nodes are compared to other nodes
Top-hit heuristics: By default, FastTree uses a top-hit list to speed up search. -refresh 0.8 – compare a joined node to all other nodes if its top-hit list is less than 80% of the desired length, or if the age of the top-hit list is log2(m) or greater.
This controls the addition of the -refresh parameter and its associated value. Set this property to the argument value required.
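The refresh rule above can be written as a small predicate. This is a sketch of the stated condition only, not FastTree's implementation; it assumes that m is the desired top-hit list length and that the "age" of a list is supplied by the caller as a number.

```python
import math


def needs_refresh(list_len, desired_len, age, refresh=0.8):
    """Return True if a joined node should be compared to all other nodes.

    list_len: current length of the node's top-hit list.
    desired_len: desired list length (assumed to be m).
    age: age of the top-hit list (its definition is not given here).
    """
    too_short = list_len < refresh * desired_len
    too_old = age >= math.log2(desired_len)
    return too_short or too_old


print(needs_refresh(70, 100, age=0))  # shorter than 80% of 100
print(needs_refresh(90, 100, age=0))  # long enough and fresh
print(needs_refresh(90, 100, age=7))  # log2(100) is about 6.64
```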
- property second
Turn 2nd-level top hits heuristic on.
Top-hit heuristics: By default, FastTree uses a top-hit list to speed up search. Use -notop (or -slow) to turn this feature off and compare all leaves to each other, and all new joined nodes to each other.
Use -2nd or -no2nd to turn the 2nd-level top hits heuristic on or off. The heuristic reduces memory usage and running time but may lead to marginal reductions in tree quality. (By default, -fastest turns on -2nd.)
This property controls the addition of the -2nd switch, treat this property as a boolean.
- property seed
Use -seed to initialize the random number generator.
Support value options: By default, FastTree computes local support values by resampling the site likelihoods 1,000 times and the Shimodaira Hasegawa test. If you specify -nome, it will compute minimum-evolution bootstrap supports instead. In either case, the support values are proportions ranging from 0 to 1.
This controls the addition of the -seed parameter and its associated value. Set this property to the argument value required.
- property slow
Use an exhaustive search.
Searching for the best join: By default, FastTree combines the ‘visible set’ of fast neighbor-joining with local hill-climbing as in relaxed neighbor-joining. -slow – exhaustive search (like NJ or BIONJ, but different gap handling). -slow takes half an hour instead of 8 seconds for 1,250 proteins.
This property controls the addition of the -slow switch, treat this property as a boolean.
- property slownni
Turn off heuristics to avoid constant subtrees with NNIs.
Topology refinement: By default, FastTree tries to improve the tree with up to 4*log2(N) rounds of minimum-evolution nearest-neighbor interchanges (NNI), where N is the number of unique sequences, 2 rounds of subtree-prune-regraft (SPR) moves (also min. evo.), and up to 2*log(N) rounds of maximum-likelihood NNIs. Use -nni to set the number of rounds of min. evo. NNIs, and -spr to set the rounds of SPRs. Use -slownni to turn off heuristics to avoid constant subtrees (affects both ML and ME NNIs).
This property controls the addition of the -slownni switch, treat this property as a boolean.
- property spr
Set the rounds of subtree-prune-regraft moves
Topology refinement: By default, FastTree tries to improve the tree with up to 4*log2(N) rounds of minimum-evolution nearest-neighbor interchanges (NNI), where N is the number of unique sequences, 2 rounds of subtree-prune-regraft (SPR) moves (also min. evo.), and up to 2*log(N) rounds of maximum-likelihood NNIs. Use -nni to set the number of rounds of min. evo. NNIs, and -spr to set the rounds of SPRs.
This controls the addition of the -spr parameter and its associated value. Set this property to the argument value required.
- property sprlength
Set maximum SPR move length in topology refinement (default 10).
Topology refinement: By default, FastTree tries to improve the tree with up to 4*log2(N) rounds of minimum-evolution nearest-neighbor interchanges (NNI), where N is the number of unique sequences, 2 rounds of subtree-prune-regraft (SPR) moves (also min. evo.), and up to 2*log(N) rounds of maximum-likelihood NNIs. Use -nni to set the number of rounds of min. evo. NNIs, and -spr to set the rounds of SPRs.
This controls the addition of the -sprlength parameter and its associated value. Set this property to the argument value required.
- property top
Use a top-hit list to speed up search.
Top-hit heuristics: By default, FastTree uses a top-hit list to speed up search. Use -notop (or -slow) to turn this feature off and compare all leaves to each other, and all new joined nodes to each other.
This property controls the addition of the -top switch, treat this property as a boolean.
- property topm
Change the top hits calculation method
Top-hit heuristics: By default, FastTree uses a top-hit list to speed up search. -topm 1.0 – set the top-hit list size to parameter*sqrt(N). FastTree estimates the top m hits of a leaf from the top 2*m hits of a ‘close’ neighbor, where close is defined as d(seed,close) < 0.75 * d(seed, hit of rank 2*m), and updates the top-hits as joins proceed.
This controls the addition of the -topm parameter and its associated value. Set this property to the argument value required.
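The two formulas in the -topm description can be checked with a short calculation. This sketch only restates the list size parameter*sqrt(N) and the ‘close’ neighbor test; it is not FastTree's code, the function names are hypothetical, and any rounding FastTree applies to the list size is not modeled.

```python
import math


def top_hit_list_size(n_seqs, topm=1.0):
    """Top-hit list size m = topm * sqrt(N), left unrounded."""
    return topm * math.sqrt(n_seqs)


def is_close(d_seed_close, d_seed_rank_2m):
    """'Close' neighbor test: d(seed, close) < 0.75 * d(seed, hit of rank 2*m)."""
    return d_seed_close < 0.75 * d_seed_rank_2m


print(top_hit_list_size(10000))         # 100.0 for 10,000 sequences
print(top_hit_list_size(10000, 2.0))    # 200.0 with -topm 2.0
print(is_close(0.10, 0.20))             # 0.10 < 0.15
```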
- property wag
Maximum likelihood model options.
Whelan-and-Goldman 2001 model instead of (default) Jones-Taylor-Thornton 1992 model (a.a. only).
This property controls the addition of the -wag switch, treat this property as a boolean.