Bio.Phylo.Applications package¶
Module contents¶
Phylogenetics command line tool wrappers.
-
class
Bio.Phylo.Applications.
PhymlCommandline
(cmd='phyml', **kwargs)¶ Bases:
Bio.Application.AbstractCommandline
Command-line wrapper for the tree inference program PhyML.
Homepage: http://www.atgc-montpellier.fr/phyml
References
Guindon S, Gascuel O. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Systematic Biology, 2003 Oct;52(5):696-704. PubMed PMID: 14530136.
Guindon S, Dufayard JF, Lefort V, Anisimova M, Hordijk W, Gascuel O. New Algorithms and Methods to Estimate Maximum-Likelihood Phylogenies: Assessing the Performance of PhyML 3.0. Systematic Biology, 2010 59(3):307-21.
-
__init__
(self, cmd='phyml', **kwargs)¶ Initialize the class.
-
property
alpha
¶ Shape parameter of the gamma distribution.
Can be a fixed positive value, or ‘e’ to get the maximum-likelihood estimate.
This controls the addition of the -a parameter and its associated value. Set this property to the argument value required.
-
property
bootstrap
¶ Number of bootstrap replicates, if value is > 0.
Otherwise:
0 : neither approximate likelihood ratio test nor bootstrap values are computed.
-1 : approximate likelihood ratio test returning aLRT statistics.
-2 : approximate likelihood ratio test returning Chi2-based parametric branch supports.
-4 : SH-like branch supports alone.
This controls the addition of the -b parameter and its associated value. Set this property to the argument value required.
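The meaning of each bootstrap value can be restated as a small lookup. This is an illustrative sketch only: `describe_bootstrap` and `SPECIAL_MODES` are hypothetical names, not part of Biopython or PhyML, and the helper simply mirrors the list above.

```python
# Hypothetical helper that restates the documented meanings of the -b value.
SPECIAL_MODES = {
    0: "no aLRT and no bootstrap values",
    -1: "aLRT statistics",
    -2: "Chi2-based parametric branch supports",
    -4: "SH-like branch supports",
}


def describe_bootstrap(value: int) -> str:
    """Return a short description of what a given -b value requests."""
    if value > 0:
        return f"{value} bootstrap replicates"
    if value in SPECIAL_MODES:
        return SPECIAL_MODES[value]
    raise ValueError(f"-b value {value} is not listed in the documentation")


print(describe_bootstrap(100))
print(describe_bootstrap(-4))
```

Values that are not listed (for example -3) are rejected by the sketch rather than guessed at.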
-
property
datatype
¶ Datatype ‘nt’ for nucleotide (default) or ‘aa’ for amino-acids.
This controls the addition of the -d parameter and its associated value. Set this property to the argument value required.
-
property
frequencies
¶ Character frequencies.
-f e, m, or “fA fC fG fT”
e : Empirical frequencies, determined as follows :
Nucleotide sequences: (Empirical) the equilibrium base frequencies are estimated by counting the occurrence of the different bases in the alignment.
Amino-acid sequences: (Empirical) the equilibrium amino-acid frequencies are estimated by counting the occurrence of the different amino-acids in the alignment.
m : ML/model-based frequencies, determined as follows :
Nucleotide sequences: (ML) the equilibrium base frequencies are estimated using maximum likelihood
Amino-acid sequences: (Model) the equilibrium amino-acid frequencies are estimated using the frequencies defined by the substitution model.
“fA fC fG fT” : only valid for nucleotide-based models.
fA, fC, fG and fT are floating-point numbers that correspond to the frequencies of A, C, G and T, respectively.
This controls the addition of the -f parameter and its associated value. Set this property to the argument value required.
-
property
input
¶ PHYLIP format input nucleotide or amino-acid sequence filename.
This controls the addition of the -i parameter and its associated value. Set this property to the argument value required.
-
property
input_tree
¶ Starting tree filename. The tree must be in Newick format.
This controls the addition of the -u parameter and its associated value. Set this property to the argument value required.
-
property
model
¶ Substitution model name.
Nucleotide-based models:
HKY85 (default) | JC69 | K80 | F81 | F84 | TN93 | GTR | custom
For the custom option, a string of six digits identifies the model. For instance, 000000 corresponds to F81 (or JC69, provided the distribution of nucleotide frequencies is uniform). 012345 corresponds to GTR. This option can be used for encoding any model that is nested within GTR.
Amino-acid based models:
LG (default) | WAG | JTT | MtREV | Dayhoff | DCMut | RtREV | CpREV | VT | Blosum62 | MtMam | MtArt | HIVw | HIVb | custom
This controls the addition of the -m parameter and its associated value. Set this property to the argument value required.
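The six-digit custom code can be checked before it is passed as the model value. The sketch below assumes the usual PhyML convention that positions sharing a digit share a substitution rate; `custom_model_rate_classes` is a hypothetical helper, not a Biopython function.

```python
def custom_model_rate_classes(code: str) -> int:
    """Count the distinct rate classes in a six-digit custom model code.

    Assumption: positions that carry the same digit share one rate.
    """
    if len(code) != 6 or not (code.isascii() and code.isdigit()):
        raise ValueError("custom model code must be exactly six digits")
    return len(set(code))


print(custom_model_rate_classes("000000"))  # one shared rate (F81 or JC69)
print(custom_model_rate_classes("012345"))  # six separate rates (GTR)
```

Any code with between one and six distinct digits describes a model nested within GTR, which is why the check only validates the format.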
-
property
multiple
¶ Number of data sets to analyse (integer).
This controls the addition of the -n parameter and its associated value. Set this property to the argument value required.
-
property
n_rand_starts
¶ Number of initial random trees to be used.
Only valid if SPR searches are to be performed.
This controls the addition of the --n_rand_starts parameter and its associated value. Set this property to the argument value required.
-
property
nclasses
¶ Number of relative substitution rate categories.
Default 1. Must be a positive integer.
This controls the addition of the -c parameter and its associated value. Set this property to the argument value required.
-
property
optimize
¶ Specific parameter optimisation.
tlr : tree topology (t), branch length (l) and rate parameters (r) are optimised.
tl : tree topology and branch length are optimised.
lr : branch length and rate parameters are optimised.
l : branch lengths are optimised.
r : rate parameters are optimised.
n : no parameter is optimised.
This controls the addition of the -o parameter and its associated value. Set this property to the argument value required.
-
property
pars
¶ Use a minimum parsimony starting tree.
This option is taken into account when the ‘-u’ option is absent and when tree topology modifications are to be done.
This property controls the addition of the -p switch, treat this property as a boolean.
-
property
print_site_lnl
¶ Print the likelihood for each site in file *_phyml_lk.txt.
This property controls the addition of the --print_site_lnl switch, treat this property as a boolean.
-
property
print_trace
¶ Print each phylogeny explored during the tree search process in file *_phyml_trace.txt.
This property controls the addition of the --print_trace switch, treat this property as a boolean.
-
property
prop_invar
¶ Proportion of invariable sites.
Can be a fixed value in the range [0,1], or ‘e’ to get the maximum-likelihood estimate.
This controls the addition of the -v parameter and its associated value. Set this property to the argument value required.
-
property
quiet
¶ No interactive questions (for running in batch mode).
This property controls the addition of the --quiet switch, treat this property as a boolean.
-
property
r_seed
¶ Seed used to initiate the random number generator.
Must be an integer.
This controls the addition of the --r_seed parameter and its associated value. Set this property to the argument value required.
-
property
rand_start
¶ Sets the initial tree to random.
Only valid if SPR searches are to be performed.
This property controls the addition of the --rand_start switch, treat this property as a boolean.
-
property
run_id
¶ Append the given string at the end of each PhyML output file.
This option may be useful when running simulations involving PhyML.
This controls the addition of the --run_id parameter and its associated value. Set this property to the argument value required.
-
property
search
¶ Tree topology search operation option.
Can be one of:
NNI : default, fast
SPR : a bit slower than NNI
BEST : best of NNI and SPR search
This controls the addition of the -s parameter and its associated value. Set this property to the argument value required.
-
property
sequential
¶ Changes interleaved format (default) to sequential format.
This property controls the addition of the -q switch, treat this property as a boolean.
-
property
ts_tv_ratio
¶ Transition/transversion ratio. (DNA sequences only.)
Can be a fixed positive value (e.g. 4.0) or ‘e’ to get the maximum-likelihood estimate.
This controls the addition of the -t parameter and its associated value. Set this property to the argument value required.
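The PhyML properties documented above each map to one command-line flag. As a rough illustration of that mapping, the standard-library sketch below builds an argument list from the same property names. It is not the wrapper's implementation: `build_phyml_args`, `VALUE_FLAGS` and `SWITCH_FLAGS` are hypothetical names, the file name is a placeholder, and the flag order simply follows the order of the keyword arguments.

```python
import shlex

# Flags that take a value, keyed by the property names documented above.
VALUE_FLAGS = {
    "input": "-i", "datatype": "-d", "model": "-m", "frequencies": "-f",
    "alpha": "-a", "prop_invar": "-v", "ts_tv_ratio": "-t", "nclasses": "-c",
    "bootstrap": "-b", "search": "-s", "optimize": "-o", "input_tree": "-u",
    "multiple": "-n", "n_rand_starts": "--n_rand_starts",
    "r_seed": "--r_seed", "run_id": "--run_id",
}
# Flags that are plain switches (treated as booleans).
SWITCH_FLAGS = {
    "pars": "-p", "sequential": "-q", "quiet": "--quiet",
    "rand_start": "--rand_start", "print_trace": "--print_trace",
    "print_site_lnl": "--print_site_lnl",
}


def build_phyml_args(cmd="phyml", **options):
    """Build an argument list for PhyML from documented property names."""
    args = [cmd]
    for name, value in options.items():
        if name in VALUE_FLAGS:
            args += [VALUE_FLAGS[name], str(value)]
        elif name in SWITCH_FLAGS:
            if value:  # a switch is emitted only when it is truthy
                args.append(SWITCH_FLAGS[name])
        else:
            raise ValueError(f"unknown option: {name}")
    return args


args = build_phyml_args(input="alignment.phy", datatype="aa", model="WAG",
                        alpha="e", bootstrap=100, quiet=True)
print(shlex.join(args))
```

With PhyML installed, a list built this way could be handed to `subprocess.run(args, check=True)`; passing a list avoids shell quoting problems with file names.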
-
-
class
Bio.Phylo.Applications.
RaxmlCommandline
(cmd='raxmlHPC', **kwargs)¶ Bases:
Bio.Application.AbstractCommandline
Command-line wrapper for the tree inference program RAxML.
The required parameters are ‘sequences’ (-s), ‘model’ (-m) and ‘name’ (-n). The parameter ‘parsimony_seed’ (-p) must also be set for RAxML, but if you do not specify it, this wrapper will set the seed to 10000 for you.
References
Stamatakis A. RAxML-VI-HPC: Maximum Likelihood-based Phylogenetic Analyses with Thousands of Taxa and Mixed Models. Bioinformatics 2006, 22(21):2688-2690.
Homepage: http://sco.h-its.org/exelixis/software.html
Examples
>>> from Bio.Phylo.Applications import RaxmlCommandline
>>> raxml_cline = RaxmlCommandline(sequences="Tests/Phylip/interlaced2.phy",
...                                model="PROTCATWAG", name="interlaced2")
>>> print(raxml_cline)
raxmlHPC -m PROTCATWAG -n interlaced2 -p 10000 -s Tests/Phylip/interlaced2.phy
You would typically run the command line with raxml_cline() or via the Python subprocess module, as described in the Biopython tutorial.
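For the subprocess route, the same command can be assembled as an argument list. The sketch below uses only the standard library and mimics the default parsimony seed of 10000 described above; `build_raxml_args` is a hypothetical helper and does not reproduce the wrapper's internals.

```python
import shlex


def build_raxml_args(sequences, model, name, parsimony_seed=None,
                     cmd="raxmlHPC"):
    """Assemble a RAxML argument list with the required options."""
    if parsimony_seed is None:
        parsimony_seed = 10000  # default seed described in the text above
    return [cmd, "-m", model, "-n", name,
            "-p", str(parsimony_seed), "-s", sequences]


args = build_raxml_args("Tests/Phylip/interlaced2.phy", "PROTCATWAG",
                        "interlaced2")
print(shlex.join(args))
```

If RAxML is installed, `subprocess.run(args, check=True)` would execute the command and raise an exception on a non-zero exit status.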
-
__init__
(self, cmd='raxmlHPC', **kwargs)¶ Initialize the class.
-
property
parsimony_seed
¶ Random number seed for the parsimony inferences. This allows you to reproduce your results and will help developers debug the program. This option HAS NO EFFECT in the parallel MPI version.
This controls the addition of the -p parameter and its associated value. Set this property to the argument value required.
-
property
algorithm
¶ Select algorithm:
a: Rapid Bootstrap analysis and search for best-scoring ML tree in one program run.
b: Draw bipartition information on a tree provided with ‘-t’ based on multiple trees (e.g. from a bootstrap) in a file specified by ‘-z’.
c: Check if the alignment can be properly read by RAxML.
d: New rapid hill-climbing (DEFAULT).
e: Optimize model+branch lengths for given input tree under GAMMA/GAMMAI only.
g: Compute per site log Likelihoods for one or more trees passed via ‘-z’ and write them to a file that can be read by CONSEL.
h: Compute log likelihood test (SH-test) between best tree passed via ‘-t’ and a bunch of other trees passed via ‘-z’.
i: Perform a really thorough bootstrap, refinement of final bootstrap tree under GAMMA and a more exhaustive algorithm.
j: Generate a bunch of bootstrapped alignment files from an original alignment file.
m: Compare bipartitions between two bunches of trees passed via ‘-t’ and ‘-z’ respectively. This will return the Pearson correlation between all bipartitions found in the two tree files. A file called RAxML_bipartitionFrequencies.outputFileName will be printed that contains the pair-wise bipartition frequencies of the two sets.
n: Compute the log likelihood score of all trees contained in a tree file provided by ‘-z’ under GAMMA or GAMMA+P-Invar.
o: Old and slower rapid hill-climbing.
p: Perform pure stepwise MP addition of new sequences to an incomplete starting tree.
s: Split up a multi-gene partitioned alignment into the respective subalignments.
t: Do randomized tree searches on one fixed starting tree.
w: Compute ELW test on a bunch of trees passed via ‘-z’.
x: Compute pair-wise ML distances, ML model parameters will be estimated on an MP starting tree or a user-defined tree passed via ‘-t’, only allowed for GAMMA-based models of rate heterogeneity.
This controls the addition of the -f parameter and its associated value. Set this property to the argument value required.
-
property
binary_constraint
¶ File name of a binary constraint tree. This tree does not need to be comprehensive, i.e. contain all taxa.
This controls the addition of the -r parameter and its associated value. Set this property to the argument value required.
-
property
bipartition_filename
¶ Name of a file containing multiple trees, e.g. from a bootstrap run, that shall be used to draw bipartition values onto a tree provided with ‘-t’. It can also be used to compute per-site log likelihoods in combination with ‘-f g’, and to read a bunch of trees for a couple of other options (‘-f h’, ‘-f m’, ‘-f n’).
This controls the addition of the -z parameter and its associated value. Set this property to the argument value required.
-
property
bootstrap_branch_lengths
¶ Print bootstrapped trees with branch lengths. The bootstraps will run a bit longer, because model parameters will be optimized at the end of each run. Use with CATMIX/PROTMIX or GAMMA/GAMMAI.
This property controls the addition of the -k switch, treat this property as a boolean.
-
property
bootstrap_seed
¶ Random seed for bootstrapping.
This controls the addition of the -b parameter and its associated value. Set this property to the argument value required.
-
property
checkpoints
¶ Write checkpoints (intermediate tree topologies).
This property controls the addition of the -j switch, treat this property as a boolean.
-
property
cluster_threshold
¶ Threshold for sequence similarity clustering. RAxML will then print out an alignment to a file called sequenceFileName.reducedBy.threshold that only contains sequences <= the specified threshold that must be between 0.0 and 1.0. RAxML uses the QT-clustering algorithm to perform this task. In addition, a file called RAxML_reducedList.outputFileName will be written that contains clustering information.
This controls the addition of the -l parameter and its associated value. Set this property to the argument value required.
-
property
cluster_threshold_fast
¶ Same functionality as ‘-l’, but uses a less exhaustive and thus faster clustering algorithm. This is intended for very large datasets with more than 20,000-30,000 sequences.
This controls the addition of the -L parameter and its associated value. Set this property to the argument value required.
-
property
epsilon
¶ Set model optimization precision in log likelihood units for final optimization of tree topology under MIX/MIXI or GAMMA/GAMMAI. Default: 0.1 for models not using proportion of invariant sites estimate; 0.001 for models using proportion of invariant sites estimate.
This controls the addition of the -e parameter and its associated value. Set this property to the argument value required.
-
property
exclude_filename
¶ An exclude file name, containing a specification of alignment positions you wish to exclude. Format is similar to Nexus, the file shall contain entries like ‘100-200 300-400’; to exclude a single column write, e.g., ‘100-100’. If you use a mixed model, an appropriately adapted model file will be written.
This controls the addition of the -E parameter and its associated value. Set this property to the argument value required.
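The exclude-file entries described above can be generated from a list of column ranges. This is a minimal sketch: `format_exclude_ranges` is a hypothetical helper, and it assumes positions are 1-based and inclusive, which the text above does not state explicitly.

```python
def format_exclude_ranges(ranges):
    """Format (start, end) column pairs as 'start-end' entries.

    Assumption: positions are 1-based and both ends are inclusive.
    """
    entries = []
    for start, end in ranges:
        if start < 1 or end < start:
            raise ValueError(f"invalid range: {start}-{end}")
        entries.append(f"{start}-{end}")
    return " ".join(entries)


# Two blocks plus a single column, written as 450-450.
line = format_exclude_ranges([(100, 200), (300, 400), (450, 450)])
print(line)
```

The resulting line can be written to a text file whose name is then given as the `exclude_filename` value.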
-
property
grouping_constraint
¶ File name of a multifurcating constraint tree. This tree does not need to be comprehensive, i.e. contain all taxa.
This controls the addition of the -g parameter and its associated value. Set this property to the argument value required.
-
property
model
¶ Model of Nucleotide or Amino Acid Substitution:
NUCLEOTIDES:
GTRCAT : GTR + Optimization of substitution rates + Optimization of site-specific evolutionary rates which are categorized into numberOfCategories distinct rate categories for greater computational efficiency. If you do a multiple analysis with ‘-#’ or ‘-N’ but without bootstrapping, the program will use GTRMIX instead.
GTRGAMMA : GTR + Optimization of substitution rates + GAMMA model of rate heterogeneity (alpha parameter will be estimated).
GTRMIX : Inference of the tree under GTRCAT and thereafter evaluation of the final tree topology under GTRGAMMA.
GTRCAT_GAMMA : Inference of the tree with site-specific evolutionary rates. However, here rates are categorized using the 4 discrete GAMMA rates. Evaluation of the final tree topology under GTRGAMMA.
GTRGAMMAI : Same as GTRGAMMA, but with estimate of proportion of invariable sites.
GTRMIXI : Same as GTRMIX, but with estimate of proportion of invariable sites.
GTRCAT_GAMMAI : Same as GTRCAT_GAMMA, but with estimate of proportion of invariable sites.
AMINO ACIDS:
PROTCATmatrixName[F] : specified AA matrix + Optimization of substitution rates + Optimization of site-specific evolutionary rates which are categorized into numberOfCategories distinct rate categories for greater computational efficiency. If you do a multiple analysis with ‘-#’ or ‘-N’ but without bootstrapping, the program will use PROTMIX… instead.
PROTGAMMAmatrixName[F] : specified AA matrix + Optimization of substitution rates + GAMMA model of rate heterogeneity (alpha parameter will be estimated).
PROTMIXmatrixName[F] : Inference of the tree under specified AA matrix + CAT and thereafter evaluation of the final tree topology under specified AA matrix + GAMMA.
PROTCAT_GAMMAmatrixName[F] : Inference of the tree under specified AA matrix and site-specific evolutionary rates. However, here rates are categorized using the 4 discrete GAMMA rates. Evaluation of the final tree topology under specified AA matrix + GAMMA.
PROTGAMMAImatrixName[F] : Same as PROTGAMMAmatrixName[F], but with estimate of proportion of invariable sites.
PROTMIXImatrixName[F] : Same as PROTMIXmatrixName[F], but with estimate of proportion of invariable sites.
PROTCAT_GAMMAImatrixName[F] : Same as PROTCAT_GAMMAmatrixName[F], but with estimate of proportion of invariable sites.
Available AA substitution models: DAYHOFF, DCMUT, JTT, MTREV, WAG, RTREV, CPREV, VT, BLOSUM62, MTMAM, GTR.
With the optional ‘F’ appendix you can specify if you want to use empirical base frequencies.
Please note that for mixed models you can in addition specify the per-gene AA model in the mixed model file (see manual for details).
This controls the addition of the -m parameter and its associated value. Set this property to the argument value required.
-
property
name
¶ Name used in the output files.
This controls the addition of the -n parameter and its associated value. Set this property to the argument value required.
-
property
num_bootstrap_searches
¶ Number of multiple bootstrap searches per replicate. Use this to obtain better ML trees for each replicate. Default: 1 ML search per bootstrap replicate.
This controls the addition of the -u parameter and its associated value. Set this property to the argument value required.
-
property
num_categories
¶ Number of distinct rate categories for RAxML when evolution model is set to GTRCAT or GTRMIX. Individual per-site rates are categorized into this many rate categories to accelerate computations. Default: 25.
This controls the addition of the -c parameter and its associated value. Set this property to the argument value required.
-
property
num_replicates
¶ Number of alternative runs on distinct starting trees. In combination with the ‘-b’ option, this will invoke a multiple bootstrap analysis. DEFAULT: 1 single analysis. Note that ‘-N’ has been added as an alternative since ‘-#’ sometimes caused problems with certain MPI job submission systems, since ‘-#’ is often used to start comments.
This controls the addition of the -N parameter and its associated value. Set this property to the argument value required.
-
property
outgroup
¶ Name of a single outgroup or a comma-separated list of outgroups, e.g. ‘-o Rat’ or ‘-o Rat,Mouse’. If multiple outgroups are not monophyletic, the first name in the list will be selected as outgroup. Don’t leave spaces between taxon names!
This controls the addition of the -o parameter and its associated value. Set this property to the argument value required.
-
property
parsimony
¶ Only compute a parsimony starting tree, then exit.
This property controls the addition of the -y switch, treat this property as a boolean.
-
property
partition_branch_lengths
¶ Switch on estimation of individual per-partition branch lengths. Only has effect when used in combination with ‘partition_filename’ (‘-q’). Branch lengths for individual partitions will be printed to separate files. A weighted average of the branch lengths is computed by using the respective partition lengths.
This property controls the addition of the -M switch, treat this property as a boolean.
-
property
partition_filename
¶ File name containing the assignment of models to alignment partitions for multiple models of substitution. For the syntax of this file please consult the RAxML manual.
This controls the addition of the -q parameter and its associated value. Set this property to the argument value required.
-
property
protein_model
¶ File name of a user-defined AA (Protein) substitution model. This file must contain 420 entries, the first 400 being the AA substitution rates (this must be a symmetric matrix) and the last 20 are the empirical base frequencies.
This controls the addition of the -P parameter and its associated value. Set this property to the argument value required.
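The 420-entry layout can be sanity-checked before the file is handed to RAxML. The sketch below is an assumption-laden illustration: `split_protein_model` is a hypothetical helper, it takes the numbers as an already-parsed flat list, and it assumes the 400 rates are laid out row by row as a 20 x 20 matrix.

```python
def split_protein_model(values):
    """Split 420 numbers into a 20x20 rate matrix and 20 frequencies."""
    if len(values) != 420:
        raise ValueError(f"expected 420 entries, got {len(values)}")
    # Assumption: the first 400 entries form the matrix row by row.
    rates = [values[row * 20:(row + 1) * 20] for row in range(20)]
    freqs = values[400:]
    # The documentation requires the rate matrix to be symmetric.
    for i in range(20):
        for j in range(i + 1, 20):
            if rates[i][j] != rates[j][i]:
                raise ValueError(f"rate matrix is not symmetric at ({i}, {j})")
    return rates, freqs


# Toy data: every rate is 1.0 and every frequency is 0.05.
rates, freqs = split_protein_model([1.0] * 400 + [0.05] * 20)
print(len(rates), len(rates[0]), len(freqs))
```

The symmetry check uses exact equality; for values that were computed rather than typed in, a tolerance-based comparison such as `math.isclose` may be more appropriate.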
-
property
random_starting_tree
¶ Start ML optimization from random starting tree.
This property controls the addition of the -d switch, treat this property as a boolean.
-
property
rapid_bootstrap_seed
¶ Random seed for rapid bootstrapping.
This controls the addition of the -x parameter and its associated value. Set this property to the argument value required.
-
property
rearrangements
¶ Initial rearrangement setting for the subsequent application of topological changes phase.
This controls the addition of the -i parameter and its associated value. Set this property to the argument value required.
-
property
sequences
¶ Name of the alignment data file, in PHYLIP format.
This controls the addition of the -s parameter and its associated value. Set this property to the argument value required.
-
property
starting_tree
¶ File name of a user starting tree, in Newick format.
This controls the addition of the -t parameter and its associated value. Set this property to the argument value required.
-
property
threads
¶ Number of threads to run. PTHREADS VERSION ONLY! Make sure to set this to at most the number of CPUs you have on your machine; otherwise, there will be a huge performance decrease!
This controls the addition of the -T parameter and its associated value. Set this property to the argument value required.
-
property
version
¶ Display version information.
This property controls the addition of the -v switch, treat this property as a boolean.
-
property
weight_filename
¶ Name of a column weight file to assign individual weights to each column of the alignment. Those weights must be integers separated by any type and number of whitespaces within a separate file.
This controls the addition of the -a parameter and its associated value. Set this property to the argument value required.
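A weight file in the format described above can be read back with a few lines of Python. This is only a sketch: `parse_column_weights` is a hypothetical helper, and the optional length check assumes one weight per alignment column.

```python
def parse_column_weights(text, n_columns=None):
    """Parse whitespace-separated integer weights from file contents."""
    # str.split() with no argument splits on any run of whitespace.
    weights = [int(token) for token in text.split()]
    if n_columns is not None and len(weights) != n_columns:
        raise ValueError(
            f"expected {n_columns} weights, found {len(weights)}"
        )
    return weights


print(parse_column_weights("1 1 2\n5\t3", n_columns=5))
```

A non-integer token makes `int` raise `ValueError`, which matches the requirement that the weights be integers.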
-
property
working_dir
¶ Name of the working directory where RAxML will write its output files. Default: current directory.
This controls the addition of the -w parameter and its associated value. Set this property to the argument value required.
-
-
class
Bio.Phylo.Applications.
FastTreeCommandline
(cmd='fasttree', **kwargs)¶ Bases:
Bio.Application.AbstractCommandline
Command-line wrapper for FastTree.
Only the input and out parameters are mandatory.
From the terminal command line use fasttree.exe -help or fasttree.exe -expert for more explanation of usage options.
Homepage: http://www.microbesonline.org/fasttree/
References
Price, M.N., Dehal, P.S., and Arkin, A.P. (2010) FastTree 2 – Approximately Maximum-Likelihood Trees for Large Alignments. PLoS ONE, 5(3):e9490. https://doi.org/10.1371/journal.pone.0009490.
Examples
This is an example on Windows:
from Bio.Phylo.Applications import FastTreeCommandline
fasttree_exe = r"C:\FasttreeWin32\fasttree.exe"
cmd = FastTreeCommandline(fasttree_exe,
                          input=r'C:\Input\ExampleAlignment.fsa',
                          out=r'C:\Output\ExampleTree.tree')
print(cmd)
out, err = cmd()
print(out)
print(err)
-
__init__
(self, cmd='fasttree', **kwargs)¶ Initialize the class.
-
property
bionj
¶ Join options: weighted joins as in BIONJ.
FastTree will also weight joins during NNIs.
This property controls the addition of the -bionj switch, treat this property as a boolean.
-
property
boot
¶ Specify the number of resamples for support values.
Support value options: By default, FastTree computes local support values by resampling the site likelihoods 1,000 times and the Shimodaira-Hasegawa test. If you specify -nome, it will compute minimum-evolution bootstrap supports instead. In either case, the support values are proportions ranging from 0 to 1.
Use -nosupport to turn off support values or -boot 100 to use just 100 resamples.
This controls the addition of the -boot parameter and its associated value. Set this property to the argument value required.
-
property
cat
¶ Maximum likelihood model options.
Specify the number of rate categories of sites (default 20).
This controls the addition of the -cat parameter and its associated value. Set this property to the argument value required.
-
property
close
¶ Modify the close heuristic for the top-hit list
Top-hit heuristics: By default, FastTree uses a top-hit list to speed up search. -close 0.75 – modify the close heuristic, lower is more conservative.
This controls the addition of the -close parameter and its associated value. Set this property to the argument value required.
-
property
constraintWeight
¶ Weight strength of constraints in topology searching.
Constrained topology search options: -constraintWeight – how strongly to weight the constraints. A value of 1 means a penalty of 1 in tree length for violating a constraint. Default: 100.0
This controls the addition of the -constraintWeight parameter and its associated value. Set this property to the argument value required.
-
property
constraints
¶ Specifies an alignment file for use with constrained topology searching
Constrained topology search options: -constraints alignmentfile – an alignment with values of 0, 1, and -. Not all sequences need be present. A column of 0s and 1s defines a constrained split. Some constraints may be violated (see ‘violating constraints:’ in standard error).
This controls the addition of the -constraints parameter and its associated value. Set this property to the argument value required.
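To make the idea of a column defining a split concrete, the sketch below reads one column of a constraint alignment held in a dictionary. It is illustrative only: `constrained_split` and the taxon names are hypothetical, and treating ‘-’ as "this sequence is not constrained by this column" is an assumption rather than something stated above.

```python
def constrained_split(rows, column):
    """Return the two sides of the split defined by one constraint column."""
    zeros, ones = set(), set()
    for name, states in rows.items():
        state = states[column]
        if state == "0":
            zeros.add(name)
        elif state == "1":
            ones.add(name)
        elif state != "-":
            # Only 0, 1 and - are valid characters in a constraint alignment.
            raise ValueError(f"unexpected character {state!r} for {name}")
        # Assumption: '-' leaves the sequence out of this constraint.
    return zeros, ones


rows = {"human": "1-", "chimp": "11", "mouse": "00", "rat": "-0"}
zeros, ones = constrained_split(rows, 0)
print(sorted(zeros), sorted(ones))
```

Each column is handled independently, so an alignment with several columns describes several constrained splits.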
-
property
expert
¶ Show the expert level help.
This property controls the addition of the -expert switch, treat this property as a boolean.
-
property
fastest
¶ Search the visible set (the top hit for each node) only.
Searching for the best join: By default, FastTree combines the ‘visible set’ of fast neighbor-joining with local hill-climbing as in relaxed neighbor-joining. -fastest – search the visible set (the top hit for each node) only. Unlike the original fast neighbor-joining, -fastest updates visible(C) after joining A and B if join(AB,C) is better than join(C,visible(C)). -fastest also updates out-distances in a very lazy way. -fastest sets -2nd on as well; use -fastest -no2nd to avoid this.
This property controls the addition of the -fastest switch, treat this property as a boolean.
-
property
gamma
¶ Report the likelihood under the discrete gamma model.
Maximum likelihood model options: -gamma – after the final round of optimizing branch lengths with the CAT model, report the likelihood under the discrete gamma model with the same number of categories. FastTree uses the same branch lengths but optimizes the gamma shape parameter and the scale of the lengths. The final tree will have rescaled lengths. Used with -log, this also generates per-site likelihoods for use with CONSEL, see GammaLogToPaup.pl and documentation on the FastTree web site.
This property controls the addition of the -gamma switch, treat this property as a boolean.
-
property
gtr
¶ Maximum likelihood model options.
Use generalized time-reversible instead of (default) Jukes-Cantor (nt only)
This property controls the addition of the -gtr switch, treat this property as a boolean.
-
property
gtrfreq
¶ -gtrfreq A C G T
This controls the addition of the -gtrfreq parameter and its associated value. Set this property to the argument value required.
-
property
gtrrates
¶ -gtrrates ac ag at cg ct gt
This controls the addition of the -gtrrates parameter and its associated value. Set this property to the argument value required.
-
property
help
¶ Show the help.
This property controls the addition of the -help switch, treat this property as a boolean.
-
property
input
¶ Enter <input file>
An input file of sequence alignments in fasta or phylip format is needed. By default FastTree expects protein alignments, use -nt for nucleotides.
This controls the addition of the input parameter and its associated value. Set this property to the argument value required.
-
property
intree
¶ -intree newickfile – read the starting tree in from newickfile.
Any branch lengths in the starting trees are ignored. -intree with -n will read a separate starting tree for each alignment.
This controls the addition of the -intree parameter and its associated value. Set this property to the argument value required.
-
property
intree1
¶ -intree1 newickfile – read the same starting tree for each alignment.
This controls the addition of the -intree1 parameter and its associated value. Set this property to the argument value required.
-
property
log
¶ Create log files of data such as intermediate trees and per-site rates
-log logfile – save intermediate trees so you can extract the trees and restart long-running jobs if they crash. -log also reports the per-site rates (1 means slowest category).
This controls the addition of the -log parameter and its associated value. Set this property to the argument value required.
-
property
makematrix
¶ -makematrix [alignment]
This controls the addition of the -makematrix parameter and its associated value. Set this property to the argument value required.
-
property
matrix
¶ Specify a matrix for nucleotide or amino acid distances
Distances: Default: For protein sequences, log-corrected distances and an amino acid dissimilarity matrix derived from BLOSUM45, or for nucleotide sequences, Jukes-Cantor distances. To specify a different matrix, use -matrix FilePrefix or -nomatrix.
This controls the addition of the -matrix parameter and its associated value. Set this property to the argument value required.
-
property
mlacc
¶ Option for optimization of branches at each NNI.
Topology refinement: By default, FastTree tries to improve the tree with up to 4*log2(N) rounds of minimum-evolution nearest-neighbor interchanges (NNI), where N is the number of unique sequences, 2 rounds of subtree-prune-regraft (SPR) moves (also min. evo.), and up to 2*log(N) rounds of maximum-likelihood NNIs. Use -nni to set the number of rounds of min. evo. NNIs, and -spr to set the rounds of SPRs. Use -mlacc 2 or -mlacc 3 to always optimize all 5 branches at each NNI, and to optimize all 5 branches in 2 or 3 rounds.
This controls the addition of the -mlacc parameter and its associated value. Set this property to the argument value required.
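The default limits on refinement rounds quoted above grow only logarithmically with the number of sequences, which a quick calculation makes visible. In this sketch `default_refinement_bounds` is a hypothetical helper; the text writes the ML bound as 2*log(N) without naming the base, so the base is exposed as a parameter and its default of 2 is an assumption. How FastTree rounds these values is not stated, so they are returned unrounded.

```python
import math


def default_refinement_bounds(n_unique, ml_log_base=2.0):
    """Upper bounds on default refinement rounds for N unique sequences."""
    if n_unique < 2:
        raise ValueError("need at least two unique sequences")
    return {
        # Up to 4*log2(N) rounds of minimum-evolution NNIs.
        "me_nni": 4 * math.log2(n_unique),
        # A fixed 2 rounds of minimum-evolution SPR moves.
        "spr": 2,
        # Up to 2*log(N) rounds of ML NNIs; the log base is an assumption.
        "ml_nni": 2 * math.log2(n_unique) / math.log2(ml_log_base),
    }


bounds = default_refinement_bounds(1024)
print(bounds["me_nni"], bounds["spr"])
```

Going from 1,024 to about a million unique sequences only doubles the minimum-evolution NNI bound, from 40 to 80 rounds.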
-
property
mllen
¶ Optimize branch lengths on a fixed topology.
Topology refinement: By default, FastTree tries to improve the tree with up to 4*log2(N) rounds of minimum-evolution nearest-neighbor interchanges (NNI), where N is the number of unique sequences, 2 rounds of subtree-prune-regraft (SPR) moves (also min. evo.), and up to 2*log(N) rounds of maximum-likelihood NNIs. Use -nni to set the number of rounds of min. evo. NNIs, and -spr to set the rounds of SPRs. Use -mllen to optimize branch lengths without ML NNIs. Use -mllen -nome with -intree to optimize branch lengths on a fixed topology.
This property controls the addition of the -mllen switch, treat this property as a boolean.
-
property
mlnni
¶ Set the number of rounds of maximum-likelihood NNIs.
Topology refinement: By default, FastTree tries to improve the tree with up to 4*log2(N) rounds of minimum-evolution nearest-neighbor interchanges (NNI), where N is the number of unique sequences, 2 rounds of subtree-prune-regraft (SPR) moves (also min. evo.), and up to 2*log(N) rounds of maximum-likelihood NNIs. Use -nni to set the number of rounds of min. evo. NNIs, and -spr to set the rounds of SPRs. Use -mlnni to set the number of rounds of maximum-likelihood NNIs.
This controls the addition of the -mlnni parameter and its associated value. Set this property to the argument value required.
-
property
n
¶ -n – read N multiple alignments in.
This only works with phylip interleaved format. For example, you can use it with the output from phylip’s seqboot. If you use -n, FastTree will write 1 tree per line to standard output.
This controls the addition of the -n parameter and its associated value. Set this property to the argument value required.
-
property
nj
¶ Join options: regular (unweighted) neighbor-joining (default)
This property controls the addition of the -nj switch, treat this property as a boolean.
-
property
nni
¶ Set the rounds of minimum-evolution nearest-neighbor interchanges
Topology refinement: By default, FastTree tries to improve the tree with up to 4*log2(N) rounds of minimum-evolution nearest-neighbor interchanges (NNI), where N is the number of unique sequences, 2 rounds of subtree-prune-regraft (SPR) moves (also min. evo.), and up to 2*log(N) rounds of maximum-likelihood NNIs. Use -nni to set the number of rounds of min. evo. NNIs.
This controls the addition of the -nni parameter and its associated value. Set this property to the argument value required.
-
property
no2nd
¶ Turn 2nd-level top hits heuristic off.
Top-hit heuristics: By default, FastTree uses a top-hit list to speed up search. Use -notop (or -slow) to turn this feature off and compare all leaves to each other, and all new joined nodes to each other.
Use -2nd or -no2nd to turn the 2nd-level top hits heuristic on or off. The 2nd-level heuristic reduces memory usage and running time but may lead to marginal reductions in tree quality. (By default, -fastest turns on -2nd.)
This property controls the addition of the -no2nd switch, treat this property as a boolean.
-
property
nocat
¶ Maximum likelihood model options: No CAT model (just 1 category)
This property controls the addition of the -nocat switch, treat this property as a boolean.
-
property
nomatrix
¶ Specify that no matrix should be used for nucleotide or amino acid distances
Distances: Default: For protein sequences, log-corrected distances and an amino acid dissimilarity matrix derived from BLOSUM45; for nucleotide sequences, Jukes-Cantor distances. To specify a different matrix, use -matrix FilePrefix or -nomatrix.
This property controls the addition of the -nomatrix switch, treat this property as a boolean.
-
property
nome
¶ Changes support values calculation to a minimum-evolution bootstrap method.
Topology refinement: By default, FastTree tries to improve the tree with up to 4*log2(N) rounds of minimum-evolution nearest-neighbor interchanges (NNI), where N is the number of unique sequences, 2 rounds of subtree-prune-regraft (SPR) moves (also min. evo.), and up to 2*log(N) rounds of maximum-likelihood NNIs. Use -nni to set the number of rounds of min. evo. NNIs, and -spr to set the rounds of SPRs. Use -mllen to optimize branch lengths without ML NNIs. Use -mllen -nome with -intree to optimize branch lengths on a fixed topology.
Support value options: By default, FastTree computes local support values by resampling the site likelihoods 1,000 times and the Shimodaira Hasegawa test. If you specify -nome, it will compute minimum-evolution bootstrap supports instead. In either case, the support values are proportions ranging from 0 to 1.
This property controls the addition of the -nome switch, treat this property as a boolean.
-
property
noml
¶ Deactivate min-evo NNIs and SPRs.
Topology refinement: By default, FastTree tries to improve the tree with up to 4*log2(N) rounds of minimum-evolution nearest-neighbor interchanges (NNI), where N is the number of unique sequences, 2 rounds of subtree-prune-regraft (SPR) moves (also min. evo.), and up to 2*log(N) rounds of maximum-likelihood NNIs. Use -nni to set the number of rounds of min. evo. NNIs, and -spr to set the rounds of SPRs. Use -noml to turn off both min-evo NNIs and SPRs (useful if refining an approximately maximum-likelihood tree with further NNIs).
This property controls the addition of the -noml switch, treat this property as a boolean.
-
property
nopr
¶ -nopr – do not write the progress indicator to stderr.
This property controls the addition of the -nopr switch, treat this property as a boolean.
-
property
nosupport
¶ Turn off support values.
Support value options: By default, FastTree computes local support values by resampling the site likelihoods 1,000 times and the Shimodaira Hasegawa test. If you specify -nome, it will compute minimum-evolution bootstrap supports instead. In either case, the support values are proportions ranging from 0 to 1.
Use -nosupport to turn off support values or -boot 100 to use just 100 resamples.
This property controls the addition of the -nosupport switch, treat this property as a boolean.
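Since support values are proportions between 0 and 1, they often need converting before being displayed as percentages. The sketch below assumes the supports appear as internal node labels in the Newick output. The example tree and the names SUPPORT_LABEL and support_values are hypothetical.

```python
import re

# Hypothetical Newick string with support values as internal node labels.
newick = "((A:0.1,B:0.2)0.95:0.3,(C:0.1,D:0.1)0.80:0.2);"

# An internal node label directly follows a closing parenthesis.
SUPPORT_LABEL = re.compile(r"\)([0-9]*\.?[0-9]+)(?=[:;,)])")


def support_values(tree_text):
    """Return internal node labels parsed as floats."""
    return [float(label) for label in SUPPORT_LABEL.findall(tree_text)]


values = support_values(newick)
# Proportions are expected to lie in [0, 1]; scale by 100 for percentages.
percentages = [round(v * 100) for v in values]
print(percentages)
```

A tree produced with nosupport set would carry no such labels, so the list would be empty.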
-
property
notop
¶ Turn off the top-hit list heuristic that is otherwise used to speed up search.
Top-hit heuristics: By default, FastTree uses a top-hit list to speed up search. Use -notop (or -slow) to turn this feature off and compare all leaves to each other, and all new joined nodes to each other.
This property controls the addition of the -notop switch, treat this property as a boolean.
-
property
nt
¶ By default FastTree expects protein alignments; use -nt for nucleotides.
This property controls the addition of the -nt switch, treat this property as a boolean.
-
property
out
¶ Enter <output file>
The path to a Newick Tree output file needs to be specified.
This controls the addition of the -out parameter and its associated value. Set this property to the argument value required.
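To make the distinction between boolean switches (such as -nt) and parameters with values (such as -out) concrete, the sketch below assembles an argument list by hand. It is not Biopython's implementation. The helper build_fasttree_args, the executable name "fasttree", and the file names are placeholders, and the order (options first, alignment file last) is an assumption.

```python
def build_fasttree_args(exe, input_path, out=None, nt=False, quiet=False):
    """Assemble an argument list: switches, then -out, then the input file."""
    args = [exe]
    if nt:
        args.append("-nt")           # boolean switch: present or absent
    if quiet:
        args.append("-quiet")        # boolean switch
    if out is not None:
        args.extend(["-out", out])   # parameter followed by its value
    args.append(input_path)          # alignment file as the final argument
    return args


cmd = build_fasttree_args("fasttree", "alignment.fasta", out="tree.nwk", nt=True)
print(" ".join(cmd))
```

Such a list could be passed to subprocess.run(cmd, check=True) if the executable is installed.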
-
property
pseudo
¶ -pseudo [weight] – Pseudocounts are used with sequence distance estimation.
Use pseudocounts to estimate distances between sequences with little or no overlap. (Off by default.) Recommended if the alignment has sequences with little or no overlap. If the weight is not specified, it is 1.0.
This controls the addition of the -pseudo parameter and its associated value. Set this property to the argument value required.
-
property
quiet
¶ -quiet – do not write to standard error during normal operation
(no progress indicator, no options summary, no likelihood values, etc.)
This property controls the addition of the -quiet switch, treat this property as a boolean.
-
property
quote
¶ -quote – add quotes to sequence names in output.
Quote sequence names in the output and allow spaces, commas, parentheses, and colons in them but not ‘ characters (fasta files only).
This property controls the addition of the -quote switch, treat this property as a boolean.
-
property
rawdist
¶ Turn off or adjust log-correction in AA or NT distances.
Use -rawdist to turn the log-correction off or to use %different instead of Jukes-Cantor in AA or NT distances.
Distances: Default: For protein sequences, log-corrected distances and an amino acid dissimilarity matrix derived from BLOSUM45; for nucleotide sequences, Jukes-Cantor distances. To specify a different matrix, use -matrix FilePrefix or -nomatrix.
This property controls the addition of the -rawdist switch, treat this property as a boolean.
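The difference between %different and Jukes-Cantor distances for nucleotides can be sketched with the standard Jukes-Cantor formula d = -3/4 * ln(1 - 4p/3), where p is the proportion of differing sites. This is a textbook illustration under simple assumptions (positions with a gap in either sequence are skipped), not FastTree's own distance code. The function names are hypothetical.

```python
import math


def p_distance(seq_a, seq_b):
    """Fraction of differing positions, skipping positions with a gap."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no overlapping positions")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for a proportion p in [0, 0.75)."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


p = p_distance("ACGTACGTAC", "ACGTTCGTAA")
print(f"raw: {p:.3f}  Jukes-Cantor: {jukes_cantor(p):.3f}")
```

The corrected distance is always at least as large as the raw proportion, and the gap widens as sequences diverge.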
-
property
refresh
¶ Threshold that determines when a joined node is compared to all other nodes.
Top-hit heuristics: By default, FastTree uses a top-hit list to speed up search. Use -refresh 0.8 to compare a joined node to all other nodes if its top-hit list is less than 80% of the desired length, or if the age of the top-hit list is log2(m) or greater.
This controls the addition of the -refresh parameter and its associated value. Set this property to the argument value required.
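The refresh condition described above can be written as a small predicate. This is only a reading of the documented rule, not FastTree's code. The function and argument names are hypothetical, and m is assumed to be the top-hit list size.

```python
import math


def needs_refresh(list_len, desired_len, age, m, refresh=0.8):
    """Return True if a joined node should be compared to all other nodes."""
    too_short = list_len < refresh * desired_len  # below 80% of desired length
    too_old = age >= math.log2(m)                 # list age is log2(m) or more
    return too_short or too_old


print(needs_refresh(list_len=7, desired_len=10, age=0, m=16))
print(needs_refresh(list_len=9, desired_len=10, age=3, m=16))
```

Raising the refresh value makes the first condition trigger more often, which trades running time for more thorough comparisons.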
-
property
second
¶ Turn 2nd-level top hits heuristic on.
Top-hit heuristics: By default, FastTree uses a top-hit list to speed up search. Use -notop (or -slow) to turn this feature off and compare all leaves to each other, and all new joined nodes to each other.
Use -2nd or -no2nd to turn the 2nd-level top hits heuristic on or off. The 2nd-level heuristic reduces memory usage and running time but may lead to marginal reductions in tree quality. (By default, -fastest turns on -2nd.)
This property controls the addition of the -2nd switch, treat this property as a boolean.
-
property
seed
¶ Use -seed to initialize the random number generator.
Support value options: By default, FastTree computes local support values by resampling the site likelihoods 1,000 times and the Shimodaira Hasegawa test. If you specify -nome, it will compute minimum-evolution bootstrap supports instead. In either case, the support values are proportions ranging from 0 to 1.
This controls the addition of the -seed parameter and its associated value. Set this property to the argument value required.
-
property
slow
¶ Use an exhaustive search.
Searching for the best join: By default, FastTree combines the ‘visible set’ of fast neighbor-joining with local hill-climbing as in relaxed neighbor-joining. Use -slow for an exhaustive search (like NJ or BIONJ, but with different gap handling). -slow takes half an hour instead of 8 seconds for 1,250 proteins.
This property controls the addition of the -slow switch, treat this property as a boolean.
-
property
slownni
¶ Turn off heuristics to avoid constant subtrees with NNIs.
Topology refinement: By default, FastTree tries to improve the tree with up to 4*log2(N) rounds of minimum-evolution nearest-neighbor interchanges (NNI), where N is the number of unique sequences, 2 rounds of subtree-prune-regraft (SPR) moves (also min. evo.), and up to 2*log(N) rounds of maximum-likelihood NNIs. Use -nni to set the number of rounds of min. evo. NNIs, and -spr to set the rounds of SPRs. Use -slownni to turn off heuristics to avoid constant subtrees (affects both ML and ME NNIs).
This property controls the addition of the -slownni switch, treat this property as a boolean.
-
property
spr
¶ Set the rounds of subtree-prune-regraft moves
Topology refinement: By default, FastTree tries to improve the tree with up to 4*log2(N) rounds of minimum-evolution nearest-neighbor interchanges (NNI), where N is the number of unique sequences, 2 rounds of subtree-prune-regraft (SPR) moves (also min. evo.), and up to 2*log(N) rounds of maximum-likelihood NNIs. Use -nni to set the number of rounds of min. evo. NNIs, and -spr to set the rounds of SPRs.
This controls the addition of the -spr parameter and its associated value. Set this property to the argument value required.
-
property
sprlength
¶ Set maximum SPR move length in topology refinement (default 10).
Topology refinement: By default, FastTree tries to improve the tree with up to 4*log2(N) rounds of minimum-evolution nearest-neighbor interchanges (NNI), where N is the number of unique sequences, 2 rounds of subtree-prune-regraft (SPR) moves (also min. evo.), and up to 2*log(N) rounds of maximum-likelihood NNIs. Use -nni to set the number of rounds of min. evo. NNIs, and -spr to set the rounds of SPRs.
This controls the addition of the -sprlength parameter and its associated value. Set this property to the argument value required.
-
property
top
¶ Top-hit list to speed up search
Top-hit heuristics: By default, FastTree uses a top-hit list to speed up search. Use -notop (or -slow) to turn this feature off and compare all leaves to each other, and all new joined nodes to each other.
This property controls the addition of the -top switch, treat this property as a boolean.
-
property
topm
¶ Change the top hits calculation method
Top-hit heuristics: By default, FastTree uses a top-hit list to speed up search. Use -topm 1.0 to set the top-hit list size to parameter*sqrt(N). FastTree estimates the top m hits of a leaf from the top 2*m hits of a ‘close’ neighbor, where close is defined as d(seed,close) < 0.75 * d(seed, hit of rank 2*m), and updates the top-hits as joins proceed.
This controls the addition of the -topm parameter and its associated value. Set this property to the argument value required.
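The two formulas in the description above, the list size parameter*sqrt(N) and the 0.75 closeness threshold, are sketched below. The function names are hypothetical and the size is left unrounded, since the text does not say how FastTree rounds it.

```python
import math


def top_hit_list_size(n_sequences, topm=1.0):
    """Top-hit list size as topm * sqrt(N), returned as an unrounded float."""
    return topm * math.sqrt(n_sequences)


def is_close_neighbor(d_seed_close, d_seed_rank_2m, factor=0.75):
    """True if d(seed, close) < factor * d(seed, hit of rank 2*m)."""
    return d_seed_close < factor * d_seed_rank_2m


print(top_hit_list_size(10000))       # list size for N = 10,000, topm = 1.0
print(is_close_neighbor(0.10, 0.20))  # 0.10 compared against 0.75 * 0.20
```

Because the list size grows with sqrt(N), doubling topm doubles the list size for any alignment.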
-
property
wag
¶ Maximum likelihood model options.
Whelan-and-Goldman 2001 model instead of (default) Jones-Taylor-Thornton 1992 model (a.a. only)
This property controls the addition of the -wag switch, treat this property as a boolean.
-