Bio.Phylo.Applications package

Module contents

Phylogenetics command line tool wrappers.

class Bio.Phylo.Applications.PhymlCommandline(cmd='phyml', **kwargs)

Bases: Bio.Application.AbstractCommandline

Command-line wrapper for the tree inference program PhyML.

Homepage: http://www.atgc-montpellier.fr/phyml

References

Guindon S, Gascuel O. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Systematic Biology, 2003 Oct;52(5):696-704. PubMed PMID: 14530136.

Guindon S, Dufayard JF, Lefort V, Anisimova M, Hordijk W, Gascuel O. New Algorithms and Methods to Estimate Maximum-Likelihood Phylogenies: Assessing the Performance of PhyML 3.0. Systematic Biology, 2010 59(3):307-21.

__init__(self, cmd='phyml', **kwargs)

Initialize the class.

property alpha

Shape parameter of the gamma distribution.

Can be a fixed positive value, or ‘e’ to get the maximum-likelihood estimate.

This controls the addition of the -a parameter and its associated value. Set this property to the argument value required.

property bootstrap

Number of bootstrap replicates, if value is > 0.

Otherwise:

0: neither approximate likelihood ratio test nor bootstrap values are computed.

-1: approximate likelihood ratio test returning aLRT statistics.

-2: approximate likelihood ratio test returning Chi2-based parametric branch supports.

-4: SH-like branch supports alone.

This controls the addition of the -b parameter and its associated value. Set this property to the argument value required.
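The meaning of a -b value can be spelled out in a few lines. The following is a minimal sketch, assuming a hypothetical helper named describe_bootstrap that is not part of Biopython or PhyML; it covers only the values listed above and rejects anything else.

```python
def describe_bootstrap(value: int) -> str:
    """Describe what a PhyML -b value requests (hypothetical helper)."""
    # Non-positive values listed in the property description above.
    special = {
        0: "no aLRT and no bootstrap values",
        -1: "aLRT statistics",
        -2: "Chi2-based parametric branch supports",
        -4: "SH-like branch supports alone",
    }
    if value > 0:
        return f"{value} bootstrap replicates"
    if value in special:
        return special[value]
    # Anything else is not documented in this section, so refuse it.
    raise ValueError(f"-b value not documented here: {value}")


print(describe_bootstrap(100))
print(describe_bootstrap(-4))
```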

property datatype

Datatype ‘nt’ for nucleotide (default) or ‘aa’ for amino-acids.

This controls the addition of the -d parameter and its associated value. Set this property to the argument value required.

property frequencies

Character frequencies.

-f e, m, or “fA fC fG fT”

e : Empirical frequencies, determined as follows :

  • Nucleotide sequences: (Empirical) the equilibrium base frequencies are estimated by counting the occurrence of the different bases in the alignment.

  • Amino-acid sequences: (Empirical) the equilibrium amino-acid frequencies are estimated by counting the occurrence of the different amino-acids in the alignment.

m : ML/model-based frequencies, determined as follows :

  • Nucleotide sequences: (ML) the equilibrium base frequencies are estimated using maximum likelihood

  • Amino-acid sequences: (Model) the equilibrium amino-acid frequencies are estimated using the frequencies defined by the substitution model.

“fA fC fG fT” : only valid for nucleotide-based models.

fA, fC, fG and fT are floating-point numbers that correspond to the frequencies of A, C, G and T, respectively.

This controls the addition of the -f parameter and its associated value. Set this property to the argument value required.

property input

PHYLIP format input nucleotide or amino-acid sequence filename.

This controls the addition of the -i parameter and its associated value. Set this property to the argument value required.

property input_tree

Starting tree filename. The tree must be in Newick format.

This controls the addition of the -u parameter and its associated value. Set this property to the argument value required.

property model

Substitution model name.

Nucleotide-based models:

HKY85 (default) | JC69 | K80 | F81 | F84 | TN93 | GTR | custom

For the custom option, a string of six digits identifies the model. For instance, 000000 corresponds to F81 (or JC69, provided the distribution of nucleotide frequencies is uniform). 012345 corresponds to GTR. This option can be used for encoding any model that is nested within GTR.

Amino-acid based models:

LG (default) | WAG | JTT | MtREV | Dayhoff | DCMut | RtREV | CpREV | VT | Blosum62 | MtMam | MtArt | HIVw | HIVb | custom

This controls the addition of the -m parameter and its associated value. Set this property to the argument value required.
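A custom nucleotide model code can be checked for shape before it is passed to -m. This is a minimal sketch, assuming a hypothetical helper named is_custom_model_code (not part of Biopython or PhyML). It verifies only the six-digit form described above, not whether PhyML accepts the particular code.

```python
import re

# Exactly six ASCII digits, as described for the custom option.
_SIX_DIGITS = re.compile(r"[0-9]{6}")


def is_custom_model_code(code: str) -> bool:
    """Return True if code is a string of exactly six digits."""
    return _SIX_DIGITS.fullmatch(code) is not None


print(is_custom_model_code("000000"))  # F81, or JC69 with uniform frequencies
print(is_custom_model_code("012345"))  # GTR
print(is_custom_model_code("HKY85"))   # a named model, not a custom code
```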

property multiple

Number of data sets to analyse (integer).

This controls the addition of the -n parameter and its associated value. Set this property to the argument value required.

property n_rand_starts

Number of initial random trees to be used.

Only valid if SPR searches are to be performed.

This controls the addition of the --n_rand_starts parameter and its associated value. Set this property to the argument value required.

property nclasses

Number of relative substitution rate categories.

Default 1. Must be a positive integer.

This controls the addition of the -c parameter and its associated value. Set this property to the argument value required.

property optimize

Specific parameter optimisation.

tlr : tree topology (t), branch length (l) and rate parameters (r) are optimised.

tl : tree topology and branch length are optimised.

lr : branch length and rate parameters are optimised.

l : branch lengths are optimised.

r : rate parameters are optimised.

n : no parameter is optimised.

This controls the addition of the -o parameter and its associated value. Set this property to the argument value required.

property pars

Use a minimum parsimony starting tree.

This option is taken into account when the ‘-u’ option is absent and when tree topology modifications are to be done.

This property controls the addition of the -p switch, treat this property as a boolean.

property print_site_lnl

Print the likelihood for each site in file *_phyml_lk.txt.

This property controls the addition of the --print_site_lnl switch, treat this property as a boolean.

property print_trace

Print each phylogeny explored during the tree search process in file *_phyml_trace.txt.

This property controls the addition of the --print_trace switch, treat this property as a boolean.

property prop_invar

Proportion of invariable sites.

Can be a fixed value in the range [0,1], or ‘e’ to get the maximum-likelihood estimate.

This controls the addition of the -v parameter and its associated value. Set this property to the argument value required.

property quiet

No interactive questions (for running in batch mode).

This property controls the addition of the --quiet switch, treat this property as a boolean.

property r_seed

Seed used to initiate the random number generator.

Must be an integer.

This controls the addition of the --r_seed parameter and its associated value. Set this property to the argument value required.

property rand_start

Sets the initial tree to random.

Only valid if SPR searches are to be performed.

This property controls the addition of the --rand_start switch, treat this property as a boolean.

property run_id

Append the given string at the end of each PhyML output file.

This option may be useful when running simulations involving PhyML.

This controls the addition of the --run_id parameter and its associated value. Set this property to the argument value required.

property search

Tree topology search operation option.

Can be one of:

NNI : default, fast

SPR : a bit slower than NNI

BEST : best of NNI and SPR search

This controls the addition of the -s parameter and its associated value. Set this property to the argument value required.

property sequential

Changes interleaved format (default) to sequential format.

This property controls the addition of the -q switch, treat this property as a boolean.

property ts_tv_ratio

Transition/transversion ratio. (DNA sequences only.)

Can be a fixed positive value (e.g. 4.0) or ‘e’ to get the maximum-likelihood estimate.

This controls the addition of the -t parameter and its associated value. Set this property to the argument value required.
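The PhyML properties above each map to one command-line flag. The sketch below assembles such a flag list with the standard library only, without Biopython. The function name build_phyml_args and the file name alignment.phy are hypothetical, and the flag order is arbitrary; it is not claimed to match the order the wrapper emits.

```python
def build_phyml_args(alignment, datatype="nt", model="HKY85",
                     bootstrap=0, alpha=None, search="NNI", quiet=True):
    """Assemble a PhyML argument list from flags documented above."""
    if datatype not in ("nt", "aa"):
        raise ValueError("datatype must be 'nt' or 'aa'")
    if search not in ("NNI", "SPR", "BEST"):
        raise ValueError("search must be 'NNI', 'SPR' or 'BEST'")
    args = ["phyml",
            "-i", alignment,       # PHYLIP input file
            "-d", datatype,        # nucleotide or amino-acid data
            "-m", model,           # substitution model name
            "-b", str(bootstrap),  # bootstrap replicates or aLRT mode
            "-s", search]          # tree topology search operation
    if alpha is not None:
        args += ["-a", str(alpha)]  # fixed positive value or 'e'
    if quiet:
        args.append("--quiet")      # no interactive questions
    return args


args = build_phyml_args("alignment.phy", model="GTR", bootstrap=100, alpha="e")
print(" ".join(args))
```

A list of this form can be handed to subprocess.run; keeping the arguments as a list avoids shell quoting problems with file names that contain spaces.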

class Bio.Phylo.Applications.RaxmlCommandline(cmd='raxmlHPC', **kwargs)

Bases: Bio.Application.AbstractCommandline

Command-line wrapper for the tree inference program RAxML.

The required parameters are ‘sequences’ (-s), ‘model’ (-m) and ‘name’ (-n). The parameter ‘parsimony_seed’ (-p) must also be set for RAxML, but if you do not specify it, this wrapper will set the seed to 10000 for you.

References

Stamatakis A. RAxML-VI-HPC: Maximum Likelihood-based Phylogenetic Analyses with Thousands of Taxa and Mixed Models. Bioinformatics 2006, 22(21):2688-2690.

Homepage: http://sco.h-its.org/exelixis/software.html

Examples

>>> from Bio.Phylo.Applications import RaxmlCommandline
>>> raxml_cline = RaxmlCommandline(sequences="Tests/Phylip/interlaced2.phy",
...                                model="PROTCATWAG", name="interlaced2")
>>> print(raxml_cline)
raxmlHPC -m PROTCATWAG -n interlaced2 -p 10000 -s Tests/Phylip/interlaced2.phy

You would typically run the command line with raxml_cline() or via the Python subprocess module, as described in the Biopython tutorial.
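Running the command through the subprocess module might look like the following sketch. It starts from the command string printed in the example above; the helper name run_if_installed is hypothetical, and the helper does nothing when the executable is not found on PATH.

```python
import shlex
import shutil
import subprocess

# The command string printed by the example above.
command = ("raxmlHPC -m PROTCATWAG -n interlaced2 -p 10000 "
           "-s Tests/Phylip/interlaced2.phy")
args = shlex.split(command)  # split into an argument list, no shell involved


def run_if_installed(args):
    """Run the command if its executable is on PATH, else return None."""
    if shutil.which(args[0]) is None:
        return None
    # check=False: inspect returncode and stderr instead of raising.
    return subprocess.run(args, capture_output=True, text=True, check=False)


print(args)
```

Calling run_if_installed(args) returns a subprocess.CompletedProcess whose stdout, stderr and returncode attributes can be inspected, or None when RAxML is not installed.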

__init__(self, cmd='raxmlHPC', **kwargs)

Initialize the class.

property parsimony_seed

Random number seed for the parsimony inferences. This allows you to reproduce your results and will help developers debug the program. This option HAS NO EFFECT in the parallel MPI version.

This controls the addition of the -p parameter and its associated value. Set this property to the argument value required.

property algorithm

Select algorithm:

a: Rapid Bootstrap analysis and search for best-scoring ML tree in one program run.

b: Draw bipartition information on a tree provided with ‘-t’ based on multiple trees (e.g. from a bootstrap) in a file specified by ‘-z’.

c: Check if the alignment can be properly read by RAxML.

d: New rapid hill-climbing (DEFAULT).

e: Optimize model+branch lengths for given input tree under GAMMA/GAMMAI only.

g: Compute per site log Likelihoods for one or more trees passed via ‘-z’ and write them to a file that can be read by CONSEL.

h: Compute log likelihood test (SH-test) between best tree passed via ‘-t’ and a bunch of other trees passed via ‘-z’.

i: Perform a really thorough bootstrap, refinement of final bootstrap tree under GAMMA and a more exhaustive algorithm.

j: Generate a bunch of bootstrapped alignment files from an original alignment file.

m: Compare bipartitions between two bunches of trees passed via ‘-t’ and ‘-z’ respectively. This will return the Pearson correlation between all bipartitions found in the two tree files. A file called RAxML_bipartitionFrequencies.outputFileName will be printed that contains the pair-wise bipartition frequencies of the two sets.

n: Compute the log likelihood score of all trees contained in a tree file provided by ‘-z’ under GAMMA or GAMMA+P-Invar.

o: Old and slower rapid hill-climbing.

p: Perform pure stepwise MP addition of new sequences to an incomplete starting tree.

s: Split up a multi-gene partitioned alignment into the respective subalignments.

t: Do randomized tree searches on one fixed starting tree.

w: Compute ELW test on a bunch of trees passed via ‘-z’.

x: Compute pair-wise ML distances, ML model parameters will be estimated on an MP starting tree or a user-defined tree passed via ‘-t’, only allowed for GAMMA-based models of rate heterogeneity.

This controls the addition of the -f parameter and its associated value. Set this property to the argument value required.

property binary_constraint

File name of a binary constraint tree. This tree does not need to be comprehensive, i.e. contain all taxa.

This controls the addition of the -r parameter and its associated value. Set this property to the argument value required.

property bipartition_filename

Name of a file containing multiple trees, e.g. from a bootstrap run, that shall be used to draw bipartition values onto a tree provided with ‘-t’. It can also be used to compute per-site log likelihoods in combination with ‘-f g’, and to read a bunch of trees for a couple of other options (‘-f h’, ‘-f m’, ‘-f n’).

This controls the addition of the -z parameter and its associated value. Set this property to the argument value required.

property bootstrap_branch_lengths

Print bootstrapped trees with branch lengths. The bootstraps will run a bit longer, because model parameters will be optimized at the end of each run. Use with CATMIX/PROTMIX or GAMMA/GAMMAI.

This property controls the addition of the -k switch, treat this property as a boolean.

property bootstrap_seed

Random seed for bootstrapping.

This controls the addition of the -b parameter and its associated value. Set this property to the argument value required.

property checkpoints

Write checkpoints (intermediate tree topologies).

This property controls the addition of the -j switch, treat this property as a boolean.

property cluster_threshold

Threshold for sequence similarity clustering. RAxML will then print out an alignment to a file called sequenceFileName.reducedBy.threshold that only contains sequences <= the specified threshold that must be between 0.0 and 1.0. RAxML uses the QT-clustering algorithm to perform this task. In addition, a file called RAxML_reducedList.outputFileName will be written that contains clustering information.

This controls the addition of the -l parameter and its associated value. Set this property to the argument value required.

property cluster_threshold_fast

Same functionality as ‘-l’, but uses a less exhaustive and thus faster clustering algorithm. This is intended for very large datasets with more than 20,000-30,000 sequences.

This controls the addition of the -L parameter and its associated value. Set this property to the argument value required.

property epsilon

Set model optimization precision in log likelihood units for final optimization of tree topology under MIX/MIXI or GAMMA/GAMMAI. Default: 0.1 for models not using proportion of invariant sites estimate; 0.001 for models using proportion of invariant sites estimate.

This controls the addition of the -e parameter and its associated value. Set this property to the argument value required.

property exclude_filename

An exclude file name, containing a specification of alignment positions you wish to exclude. Format is similar to Nexus, the file shall contain entries like ‘100-200 300-400’; to exclude a single column write, e.g., ‘100-100’. If you use a mixed model, an appropriately adapted model file will be written.

This controls the addition of the -E parameter and its associated value. Set this property to the argument value required.
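The exclude-file entries described above can be generated from position pairs. This is a minimal sketch, assuming a hypothetical helper named format_exclude_ranges; it only produces the ‘start-end’ text and does not check positions against a real alignment.

```python
def format_exclude_ranges(ranges):
    """Format (start, end) column pairs as 'start-end' entries."""
    parts = []
    for start, end in ranges:
        if end < start:
            raise ValueError(f"range end before start: {start}-{end}")
        parts.append(f"{start}-{end}")
    # Entries are separated by single spaces, as in '100-200 300-400'.
    return " ".join(parts)


# A single column is written as a range whose ends coincide.
line = format_exclude_ranges([(100, 200), (300, 400), (512, 512)])
print(line)  # 100-200 300-400 512-512
```

The resulting line can be written to a text file whose name is then assigned to exclude_filename.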

property grouping_constraint

File name of a multifurcating constraint tree. This tree does not need to be comprehensive, i.e. contain all taxa.

This controls the addition of the -g parameter and its associated value. Set this property to the argument value required.

property model

Model of Nucleotide or Amino Acid Substitution:

NUCLEOTIDES:

GTRCAT : GTR + Optimization of substitution rates + Optimization of site-specific evolutionary rates which are categorized into numberOfCategories distinct rate categories for greater computational efficiency. If you do a multiple analysis with ‘-#’ or ‘-N’ but without bootstrapping, the program will use GTRMIX instead.

GTRGAMMA : GTR + Optimization of substitution rates + GAMMA model of rate heterogeneity (alpha parameter will be estimated).

GTRMIX : Inference of the tree under GTRCAT and thereafter evaluation of the final tree topology under GTRGAMMA.

GTRCAT_GAMMA : Inference of the tree with site-specific evolutionary rates. However, here rates are categorized using the 4 discrete GAMMA rates. Evaluation of the final tree topology under GTRGAMMA.

GTRGAMMAI : Same as GTRGAMMA, but with estimate of proportion of invariable sites.

GTRMIXI : Same as GTRMIX, but with estimate of proportion of invariable sites.

GTRCAT_GAMMAI : Same as GTRCAT_GAMMA, but with estimate of proportion of invariable sites.

AMINO ACIDS:

PROTCATmatrixName[F] : specified AA matrix + Optimization of substitution rates + Optimization of site-specific evolutionary rates which are categorized into numberOfCategories distinct rate categories for greater computational efficiency. If you do a multiple analysis with ‘-#’ or ‘-N’ but without bootstrapping, the program will use PROTMIX… instead.

PROTGAMMAmatrixName[F] : specified AA matrix + Optimization of substitution rates + GAMMA model of rate heterogeneity (alpha parameter will be estimated).

PROTMIXmatrixName[F] : Inference of the tree under specified AA matrix + CAT and thereafter evaluation of the final tree topology under specified AA matrix + GAMMA.

PROTCAT_GAMMAmatrixName[F] : Inference of the tree under specified AA matrix and site-specific evolutionary rates. However, here rates are categorized using the 4 discrete GAMMA rates. Evaluation of the final tree topology under specified AA matrix + GAMMA.

PROTGAMMAImatrixName[F] : Same as PROTGAMMAmatrixName[F], but with estimate of proportion of invariable sites.

PROTMIXImatrixName[F] : Same as PROTMIXmatrixName[F], but with estimate of proportion of invariable sites.

PROTCAT_GAMMAImatrixName[F] : Same as PROTCAT_GAMMAmatrixName[F], but with estimate of proportion of invariable sites.

Available AA substitution models: DAYHOFF, DCMUT, JTT, MTREV, WAG, RTREV, CPREV, VT, BLOSUM62, MTMAM, GTR.

With the optional ‘F’ appendix you can specify if you want to use empirical base frequencies.

Please note that for mixed models you can in addition specify the per-gene AA model in the mixed model file (see manual for details).

This controls the addition of the -m parameter and its associated value. Set this property to the argument value required.
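Amino-acid model names are built by concatenating a prefix, a matrix name and an optional ‘F’. The sketch below illustrates that naming scheme; the helper protein_model_name and the two tuples are hypothetical names, filled in from the prefixes and matrices listed above.

```python
# Matrix names and prefixes as listed in the property description above.
AA_MATRICES = ("DAYHOFF", "DCMUT", "JTT", "MTREV", "WAG", "RTREV",
               "CPREV", "VT", "BLOSUM62", "MTMAM", "GTR")
PROTEIN_PREFIXES = ("PROTCAT", "PROTGAMMA", "PROTMIX", "PROTCAT_GAMMA",
                    "PROTGAMMAI", "PROTMIXI", "PROTCAT_GAMMAI")


def protein_model_name(prefix, matrix, empirical_freqs=False):
    """Compose a name of the form <prefix><matrixName>[F]."""
    if prefix not in PROTEIN_PREFIXES:
        raise ValueError(f"unknown prefix: {prefix}")
    if matrix not in AA_MATRICES:
        raise ValueError(f"unknown AA matrix: {matrix}")
    # The optional trailing 'F' requests empirical frequencies.
    return prefix + matrix + ("F" if empirical_freqs else "")


print(protein_model_name("PROTCAT", "WAG"))          # PROTCATWAG
print(protein_model_name("PROTGAMMA", "JTT", True))  # PROTGAMMAJTTF
```

The first call reproduces the PROTCATWAG value used in the class example.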

property name

Name used in the output files.

This controls the addition of the -n parameter and its associated value. Set this property to the argument value required.

property num_bootstrap_searches

Number of multiple bootstrap searches per replicate. Use this to obtain better ML trees for each replicate. Default: 1 ML search per bootstrap replicate.

This controls the addition of the -u parameter and its associated value. Set this property to the argument value required.

property num_categories

Number of distinct rate categories for RAxML when evolution model is set to GTRCAT or GTRMIX. Individual per-site rates are categorized into this many rate categories to accelerate computations. Default: 25.

This controls the addition of the -c parameter and its associated value. Set this property to the argument value required.

property num_replicates

Number of alternative runs on distinct starting trees. In combination with the ‘-b’ option, this will invoke a multiple bootstrap analysis. DEFAULT: 1 single analysis. Note that ‘-N’ has been added as an alternative since ‘-#’ sometimes caused problems with certain MPI job submission systems, since ‘-#’ is often used to start comments.

This controls the addition of the -N parameter and its associated value. Set this property to the argument value required.

property outgroup

Name of a single outgroup or a comma-separated list of outgroups, e.g. ‘-o Rat’ or ‘-o Rat,Mouse’. If multiple outgroups are not monophyletic, the first name in the list will be selected as outgroup. Don’t leave spaces between taxon names!

This controls the addition of the -o parameter and its associated value. Set this property to the argument value required.

property parsimony

Only compute a parsimony starting tree, then exit.

This property controls the addition of the -y switch, treat this property as a boolean.

property partition_branch_lengths

Switch on estimation of individual per-partition branch lengths. Only has effect when used in combination with ‘partition_filename’ (‘-q’). Branch lengths for individual partitions will be printed to separate files. A weighted average of the branch lengths is computed by using the respective partition lengths.

This property controls the addition of the -M switch, treat this property as a boolean.

property partition_filename

File name containing the assignment of models to alignment partitions for multiple models of substitution. For the syntax of this file please consult the RAxML manual.

This controls the addition of the -q parameter and its associated value. Set this property to the argument value required.

property protein_model

File name of a user-defined AA (Protein) substitution model. This file must contain 420 entries, the first 400 being the AA substitution rates (this must be a symmetric matrix) and the last 20 are the empirical base frequencies.

This controls the addition of the -P parameter and its associated value. Set this property to the argument value required.

property random_starting_tree

Start ML optimization from random starting tree.

This property controls the addition of the -d switch, treat this property as a boolean.

property rapid_bootstrap_seed

Random seed for rapid bootstrapping.

This controls the addition of the -x parameter and its associated value. Set this property to the argument value required.

property rearrangements

Initial rearrangement setting for the subsequent application of topological changes phase.

This controls the addition of the -i parameter and its associated value. Set this property to the argument value required.

property sequences

Name of the alignment data file, in PHYLIP format.

This controls the addition of the -s parameter and its associated value. Set this property to the argument value required.

property starting_tree

File name of a user starting tree, in Newick format.

This controls the addition of the -t parameter and its associated value. Set this property to the argument value required.

property threads

Number of threads to run. PTHREADS VERSION ONLY! Make sure to set this to at most the number of CPUs you have on your machine, otherwise there will be a huge performance decrease!

This controls the addition of the -T parameter and its associated value. Set this property to the argument value required.

property version

Display version information.

This property controls the addition of the -v switch, treat this property as a boolean.

property weight_filename

Name of a column weight file to assign individual weights to each column of the alignment. Those weights must be integers separated by any type and number of whitespaces within a separate file.

This controls the addition of the -a parameter and its associated value. Set this property to the argument value required.

property working_dir

Name of the working directory where RAxML will write its output files. Default: current directory.

This controls the addition of the -w parameter and its associated value. Set this property to the argument value required.
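Several of the RAxML properties above are commonly combined, for example algorithm ‘a’ (-f a) with a rapid bootstrap seed (-x) and a replicate count (-N). The sketch below builds such an argument list with the standard library only. The function name rapid_bootstrap_args, the file name and the seed 12345 are hypothetical; 10000 is the parsimony seed the wrapper falls back to.

```python
def rapid_bootstrap_args(sequences, model, name, replicates=100,
                         parsimony_seed=10000, bootstrap_seed=12345):
    """Build a RAxML argument list for rapid bootstrap plus ML search."""
    return ["raxmlHPC",
            "-f", "a",                   # rapid bootstrap + best-scoring ML tree
            "-m", model,                 # substitution model
            "-n", name,                  # name used in the output files
            "-N", str(replicates),       # number of runs / replicates
            "-p", str(parsimony_seed),   # parsimony inference seed
            "-x", str(bootstrap_seed),   # rapid bootstrap seed
            "-s", sequences]             # PHYLIP alignment


args = rapid_bootstrap_args("alignment.phy", "GTRGAMMA", "run1")
print(" ".join(args))
```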

class Bio.Phylo.Applications.FastTreeCommandline(cmd='fasttree', **kwargs)

Bases: Bio.Application.AbstractCommandline

Command-line wrapper for FastTree.

Only the input and out parameters are mandatory.

From the terminal command line use fasttree.exe -help or fasttree.exe -expert for more explanation of usage options.

Homepage: http://www.microbesonline.org/fasttree/

References

Price, M.N., Dehal, P.S., and Arkin, A.P. (2010) FastTree 2 – Approximately Maximum-Likelihood Trees for Large Alignments. PLoS ONE, 5(3):e9490. https://doi.org/10.1371/journal.pone.0009490.

Examples

>>> from Bio.Phylo.Applications import FastTreeCommandline
>>> fasttree_exe = r"C:\FasttreeWin32\fasttree.exe"
>>> cmd = FastTreeCommandline(fasttree_exe,
...                           input=r'C:\Input\ExampleAlignment.fsa',
...                           out=r'C:\Output\ExampleTree.tree')
>>> print(cmd)
>>> out, err = cmd()
>>> print(out)
>>> print(err)

__init__(self, cmd='fasttree', **kwargs)

Initialize the class.

property bionj

Join options: weighted joins as in BIONJ.

FastTree will also weight joins during NNIs.

This property controls the addition of the -bionj switch, treat this property as a boolean.

property boot

Specify the number of resamples for support values.

Support value options: By default, FastTree computes local support values by resampling the site likelihoods 1,000 times and the Shimodaira Hasegawa test. If you specify -nome, it will compute minimum-evolution bootstrap supports instead. In either case, the support values are proportions ranging from 0 to 1.

Use -nosupport to turn off support values or -boot 100 to use just 100 resamples.

This controls the addition of the -boot parameter and its associated value. Set this property to the argument value required.

property cat

Maximum likelihood model options.

Specify the number of rate categories of sites (default 20).

This controls the addition of the -cat parameter and its associated value. Set this property to the argument value required.

property close

Modify the close heuristic for the top-hit list.

Top-hit heuristics: By default, FastTree uses a top-hit list to speed up search. -close 0.75: modify the close heuristic, lower is more conservative.

This controls the addition of the -close parameter and its associated value. Set this property to the argument value required.

property constraintWeight

Weight strength of constraints in topology searching.

Constrained topology search options: -constraintWeight: how strongly to weight the constraints. A value of 1 means a penalty of 1 in tree length for violating a constraint. Default: 100.0

This controls the addition of the -constraintWeight parameter and its associated value. Set this property to the argument value required.

property constraints

Specifies an alignment file for use with constrained topology searching.

Constrained topology search options: -constraints alignmentfile: an alignment with values of 0, 1, and ‘-’. Not all sequences need be present. A column of 0s and 1s defines a constrained split. Some constraints may be violated (see ‘violating constraints:’ in standard error).

This controls the addition of the -constraints parameter and its associated value. Set this property to the argument value required.

property expert

Show the expert level help.

This property controls the addition of the -expert switch, treat this property as a boolean.

property fastest

Search the visible set (the top hit for each node) only.

Searching for the best join: By default, FastTree combines the ‘visible set’ of fast neighbor-joining with local hill-climbing as in relaxed neighbor-joining. -fastest: search the visible set (the top hit for each node) only. Unlike the original fast neighbor-joining, -fastest updates visible(C) after joining A and B if join(AB,C) is better than join(C,visible(C)). -fastest also updates out-distances in a very lazy way. -fastest sets -2nd on as well; use -fastest -no2nd to avoid this.

This property controls the addition of the -fastest switch, treat this property as a boolean.

property gamma

Report the likelihood under the discrete gamma model.

Maximum likelihood model options: -gamma – after the final round of optimizing branch lengths with the CAT model, report the likelihood under the discrete gamma model with the same number of categories. FastTree uses the same branch lengths but optimizes the gamma shape parameter and the scale of the lengths. The final tree will have rescaled lengths. Used with -log, this also generates per-site likelihoods for use with CONSEL, see GammaLogToPaup.pl and documentation on the FastTree web site.

This property controls the addition of the -gamma switch, treat this property as a boolean.

property gtr

Maximum likelihood model options.

Use generalized time-reversible instead of (default) Jukes-Cantor (nt only)

This property controls the addition of the -gtr switch, treat this property as a boolean.

property gtrfreq

-gtrfreq A C G T

This controls the addition of the -gtrfreq parameter and its associated value. Set this property to the argument value required.

property gtrrates

-gtrrates ac ag at cg ct gt

This controls the addition of the -gtrrates parameter and its associated value. Set this property to the argument value required.

property help

Show the help.

This property controls the addition of the -help switch, treat this property as a boolean.

property input

Enter <input file>

An input file of sequence alignments in fasta or phylip format is needed. By default FastTree expects protein alignments, use -nt for nucleotides.

This controls the addition of the input parameter and its associated value. Set this property to the argument value required.
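Because the input file has no flag of its own, it goes last on the command line, after any switches. The sketch below builds a FastTree argument list from options described in this section (-nt, -gtr, -boot); the function name build_fasttree_args and the file name are hypothetical, and the default executable name simply mirrors the wrapper's cmd default.

```python
def build_fasttree_args(alignment, nucleotide=False, gtr=False,
                        boot=None, exe="fasttree"):
    """Assemble a FastTree argument list; the alignment comes last."""
    args = [exe]
    if nucleotide:
        args.append("-nt")   # protein alignments are expected by default
    if gtr:
        if not nucleotide:
            raise ValueError("-gtr applies to nucleotide alignments only")
        args.append("-gtr")  # generalized time-reversible model
    if boot is not None:
        args += ["-boot", str(boot)]  # number of resamples for support values
    args.append(alignment)   # positional input file
    return args


args = build_fasttree_args("alignment.fasta", nucleotide=True, gtr=True, boot=100)
print(" ".join(args))
```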

property intree

-intree newickfile – read the starting tree in from newickfile.

Any branch lengths in the starting trees are ignored. -intree with -n will read a separate starting tree for each alignment.

This controls the addition of the -intree parameter and its associated value. Set this property to the argument value required.

property intree1

-intree1 newickfile – read the same starting tree for each alignment.

This controls the addition of the -intree1 parameter and its associated value. Set this property to the argument value required.

property log

Create log files of data such as intermediate trees and per-site rates.

-log logfile – save intermediate trees so you can extract the trees and restart long-running jobs if they crash. -log also reports the per-site rates (1 means slowest category).

This controls the addition of the -log parameter and its associated value. Set this property to the argument value required.

property makematrix

-makematrix [alignment]

This controls the addition of the -makematrix parameter and its associated value. Set this property to the argument value required.

property matrix

Specify a matrix for nucleotide or amino acid distances.

Distances: Default: For protein sequences, log-corrected distances and an amino acid dissimilarity matrix derived from BLOSUM45, or for nucleotide sequences, Jukes-Cantor distances. To specify a different matrix, use -matrix FilePrefix or -nomatrix.

This controls the addition of the -matrix parameter and its associated value. Set this property to the argument value required.

property mlacc

Option for optimization of branches at each NNI.

Topology refinement: By default, FastTree tries to improve the tree with up to 4*log2(N) rounds of minimum-evolution nearest-neighbor interchanges (NNI), where N is the number of unique sequences, 2 rounds of subtree-prune-regraft (SPR) moves (also min. evo.), and up to 2*log(N) rounds of maximum-likelihood NNIs. Use -nni to set the number of rounds of min. evo. NNIs, and -spr to set the rounds of SPRs. Use -mlacc 2 or -mlacc 3 to always optimize all 5 branches at each NNI, and to optimize all 5 branches in 2 or 3 rounds.

This controls the addition of the -mlacc parameter and its associated value. Set this property to the argument value required.

property mllen

Optimize branch lengths on a fixed topology.

Topology refinement: By default, FastTree tries to improve the tree with up to 4*log2(N) rounds of minimum-evolution nearest-neighbor interchanges (NNI), where N is the number of unique sequences, 2 rounds of subtree-prune-regraft (SPR) moves (also min. evo.), and up to 2*log(N) rounds of maximum-likelihood NNIs. Use -nni to set the number of rounds of min. evo. NNIs, and -spr to set the rounds of SPRs. Use -mllen to optimize branch lengths without ML NNIs. Use -mllen -nome with -intree to optimize branch lengths on a fixed topology.

This property controls the addition of the -mllen switch, treat this property as a boolean.

property mlnni

Set the number of rounds of maximum-likelihood NNIs.

Topology refinement: By default, FastTree tries to improve the tree with up to 4*log2(N) rounds of minimum-evolution nearest-neighbor interchanges (NNI), where N is the number of unique sequences, 2 rounds of subtree-prune-regraft (SPR) moves (also min. evo.), and up to 2*log(N) rounds of maximum-likelihood NNIs. Use -nni to set the number of rounds of min. evo. NNIs, and -spr to set the rounds of SPRs. Use -mlnni to set the number of rounds of maximum-likelihood NNIs.

This controls the addition of the -mlnni parameter and its associated value. Set this property to the argument value required.

property n

-n – read N multiple alignments in.

This only works with phylip interleaved format. For example, you can use it with the output from phylip’s seqboot. If you use -n, FastTree will write 1 tree per line to standard output.

This controls the addition of the -n parameter and its associated value. Set this property to the argument value required.

property nj

Join options: regular (unweighted) neighbor-joining (default)

This property controls the addition of the -nj switch, treat this property as a boolean.

property nni

Set the number of rounds of minimum-evolution nearest-neighbor interchanges (NNIs).

Topology refinement: By default, FastTree tries to improve the tree with up to 4*log2(N) rounds of minimum-evolution nearest-neighbor interchanges (NNI), where N is the number of unique sequences, 2 rounds of subtree-prune-regraft (SPR) moves (also min. evo.), and up to 2*log(N) rounds of maximum-likelihood NNIs. Use -nni to set the number of rounds of min. evo. NNIs.

This controls the addition of the -nni parameter and its associated value. Set this property to the argument value required.
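To get a feel for the default limits quoted above, the formulas can be evaluated directly. This is only an illustrative sketch: the text writes the ML bound as 2*log(N) without naming the logarithm base, so the base is left as an explicit argument here, and no rounding is applied. The function name is hypothetical.

```python
import math


def default_refinement_rounds(n_unique, ml_log_base):
    """Upper bounds on default refinement rounds for n_unique sequences.

    ml_log_base is an assumption supplied by the caller: the documentation
    writes "2*log(N)" without stating the base of the logarithm.
    """
    return {
        "nni": 4 * math.log2(n_unique),                    # min-evo NNIs
        "spr": 2,                                          # SPR rounds
        "mlnni": 2 * math.log(n_unique, ml_log_base),      # ML NNIs
    }


rounds = default_refinement_rounds(1024, ml_log_base=2)
print(rounds)
```

Passing `nni`, `spr`, or `mlnni` explicitly overrides the corresponding default.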

property no2nd

Turn 2nd-level top hits heuristic off.

Top-hit heuristics: By default, FastTree uses a top-hit list to speed up search. Use -notop (or -slow) to turn this feature off and compare all leaves to each other, and all new joined nodes to each other.

Use -2nd or -no2nd to turn the 2nd-level top hits heuristic on or off. This reduces memory usage and running time but may lead to marginal reductions in tree quality. (By default, -fastest turns on -2nd.)

This property controls the addition of the -no2nd switch, treat this property as a boolean.

property nocat

Maximum likelihood model options: No CAT model (just 1 category)

This property controls the addition of the -nocat switch, treat this property as a boolean.

property nomatrix

Specify that no matrix should be used for nucleotide or amino acid distances

Distances: By default, FastTree uses log-corrected distances and an amino acid dissimilarity matrix derived from BLOSUM45 for protein sequences, or Jukes-Cantor distances for nucleotide sequences. To specify a different matrix, use -matrix FilePrefix; to use no matrix, use -nomatrix.

This property controls the addition of the -nomatrix switch, treat this property as a boolean.

property nome

Changes support values calculation to a minimum-evolution bootstrap method.

Topology refinement: By default, FastTree tries to improve the tree with up to 4*log2(N) rounds of minimum-evolution nearest-neighbor interchanges (NNI), where N is the number of unique sequences, 2 rounds of subtree-prune-regraft (SPR) moves (also min. evo.), and up to 2*log(N) rounds of maximum-likelihood NNIs. Use -nni to set the number of rounds of min. evo. NNIs, and -spr to set the rounds of SPRs. Use -mllen to optimize branch lengths without ML NNIs. Use -mllen -nome with -intree to optimize branch lengths on a fixed topology.

Support value options: By default, FastTree computes local support values by resampling the site likelihoods 1,000 times and the Shimodaira Hasegawa test. If you specify -nome, it will compute minimum-evolution bootstrap supports instead. In either case, the support values are proportions ranging from 0 to 1.

This property controls the addition of the -nome switch, treat this property as a boolean.

property noml

Deactivate min-evo NNIs and SPRs.

Topology refinement: By default, FastTree tries to improve the tree with up to 4*log2(N) rounds of minimum-evolution nearest-neighbor interchanges (NNI), where N is the number of unique sequences, 2 rounds of subtree-prune-regraft (SPR) moves (also min. evo.), and up to 2*log(N) rounds of maximum-likelihood NNIs. Use -nni to set the number of rounds of min. evo. NNIs, and -spr to set the rounds of SPRs. Use -noml to turn off both min-evo NNIs and SPRs (useful if refining an approximately maximum-likelihood tree with further NNIs).

This property controls the addition of the -noml switch, treat this property as a boolean.

property nopr

-nopr – do not write the progress indicator to stderr.

This property controls the addition of the -nopr switch, treat this property as a boolean.

property nosupport

Turn off support values.

Support value options: By default, FastTree computes local support values by resampling the site likelihoods 1,000 times and the Shimodaira Hasegawa test. If you specify -nome, it will compute minimum-evolution bootstrap supports instead. In either case, the support values are proportions ranging from 0 to 1.

Use -nosupport to turn off support values or -boot 100 to use just 100 resamples.

This property controls the addition of the -nosupport switch, treat this property as a boolean.
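Since support values are proportions between 0 and 1, they can be pulled out of a Newick string and checked or rescaled. The sketch below assumes supports appear as internal node labels written directly after a closing parenthesis, and that names are unquoted; the sample tree and the function name are invented for illustration. A full Newick parser is more robust for real data.

```python
import re

# A number directly after ")" is taken to be an internal-node support label.
SUPPORT_LABEL = re.compile(r"\)([0-9]*\.?[0-9]+)")


def extract_supports(newick):
    """Return support values found as internal node labels, in order."""
    return [float(label) for label in SUPPORT_LABEL.findall(newick)]


# Hypothetical tree with two supported internal branches.
tree = "((A:0.1,B:0.2)0.950:0.05,(C:0.3,D:0.4)0.712:0.02);"
supports = extract_supports(tree)
percentages = [round(100 * s, 1) for s in supports]
print(supports, percentages)
```

A tree produced with -nosupport carries no such labels, so the function returns an empty list for it.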

property notop

Turn off the top-hit list heuristic (which is otherwise used to speed up search).

Top-hit heuristics: By default, FastTree uses a top-hit list to speed up search. Use -notop (or -slow) to turn this feature off and compare all leaves to each other, and all new joined nodes to each other.

This property controls the addition of the -notop switch, treat this property as a boolean.

property nt

By default, FastTree expects protein alignments. Use -nt for nucleotide alignments.

This property controls the addition of the -nt switch, treat this property as a boolean.

property out

Enter <output file>

The path to a Newick Tree output file needs to be specified.

This controls the addition of the -out parameter and its associated value. Set this property to the argument value required.

property pseudo

-pseudo [weight] – Pseudocounts are used with sequence distance estimation.

Use pseudocounts to estimate distances between sequences with little or no overlap. (Off by default.) Recommended if the alignment contains sequences with little or no overlap. If the weight is not specified, it is 1.0.

This controls the addition of the -pseudo parameter and its associated value. Set this property to the argument value required.

property quiet

-quiet – do not write to standard error during normal operation

(no progress indicator, no options summary, no likelihood values, etc.)

This property controls the addition of the -quiet switch, treat this property as a boolean.

property quote

-quote – add quotes to sequence names in output.

Quote sequence names in the output and allow spaces, commas, parentheses, and colons in them but not ‘ characters (fasta files only).

This property controls the addition of the -quote switch, treat this property as a boolean.

property rawdist

Turn off or adjust log-correction in AA or NT distances.

Use -rawdist to turn the log-correction off or to use %different instead of Jukes-Cantor in AA or NT distances.

Distances: By default, FastTree uses log-corrected distances and an amino acid dissimilarity matrix derived from BLOSUM45 for protein sequences, or Jukes-Cantor distances for nucleotide sequences. To specify a different matrix, use -matrix FilePrefix; to use no matrix, use -nomatrix.

This property controls the addition of the -rawdist switch, treat this property as a boolean.

property refresh

Set the threshold that determines when a joined node is compared to all other nodes.

Top-hit heuristics: By default, FastTree uses a top-hit list to speed up search. -refresh 0.8 – compare a joined node to all other nodes if its top-hit list is less than 80% of the desired length, or if the age of the top-hit list is log2(m) or greater.

This controls the addition of the -refresh parameter and its associated value. Set this property to the argument value required.
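The refresh rule described above can be written as a small predicate. This is a sketch of the stated condition only, not FastTree's code; the function and argument names are hypothetical, and `m` is taken to be the top-hit list size.

```python
import math


def needs_full_comparison(list_length, desired_length, age, m, refresh=0.8):
    """Return True if a joined node should be compared to all other nodes.

    The node is refreshed when its top-hit list is shorter than
    refresh * desired_length, or when the list's age is log2(m) or greater.
    """
    too_short = list_length < refresh * desired_length
    too_old = age >= math.log2(m)
    return too_short or too_old


print(needs_full_comparison(list_length=70, desired_length=100, age=0, m=100))
```

Raising the `refresh` value makes full comparisons more frequent, since fewer lists count as long enough.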

property second

Turn 2nd-level top hits heuristic on.

Top-hit heuristics: By default, FastTree uses a top-hit list to speed up search. Use -notop (or -slow) to turn this feature off and compare all leaves to each other, and all new joined nodes to each other.

Use -2nd or -no2nd to turn the 2nd-level top hits heuristic on or off. This reduces memory usage and running time but may lead to marginal reductions in tree quality. (By default, -fastest turns on -2nd.)

This property controls the addition of the -2nd switch, treat this property as a boolean.

property seed

Use -seed to initialize the random number generator.

Support value options: By default, FastTree computes local support values by resampling the site likelihoods 1,000 times and the Shimodaira Hasegawa test. If you specify -nome, it will compute minimum-evolution bootstrap supports instead. In either case, the support values are proportions ranging from 0 to 1.

This controls the addition of the -seed parameter and its associated value. Set this property to the argument value required.

property slow

Use an exhaustive search.

Searching for the best join: By default, FastTree combines the ‘visible set’ of fast neighbor-joining with local hill-climbing as in relaxed neighbor-joining. Use -slow for an exhaustive search (like NJ or BIONJ, but with different gap handling). With -slow, a run takes half an hour instead of 8 seconds for 1,250 proteins.

This property controls the addition of the -slow switch, treat this property as a boolean.

property slownni

Turn off heuristics to avoid constant subtrees with NNIs.

Topology refinement: By default, FastTree tries to improve the tree with up to 4*log2(N) rounds of minimum-evolution nearest-neighbor interchanges (NNI), where N is the number of unique sequences, 2 rounds of subtree-prune-regraft (SPR) moves (also min. evo.), and up to 2*log(N) rounds of maximum-likelihood NNIs. Use -nni to set the number of rounds of min. evo. NNIs, and -spr to set the rounds of SPRs. Use -slownni to turn off heuristics to avoid constant subtrees (affects both ML and ME NNIs).

This property controls the addition of the -slownni switch, treat this property as a boolean.

property spr

Set the number of rounds of subtree-prune-regraft (SPR) moves.

Topology refinement: By default, FastTree tries to improve the tree with up to 4*log2(N) rounds of minimum-evolution nearest-neighbor interchanges (NNI), where N is the number of unique sequences, 2 rounds of subtree-prune-regraft (SPR) moves (also min. evo.), and up to 2*log(N) rounds of maximum-likelihood NNIs. Use -nni to set the number of rounds of min. evo. NNIs, and -spr to set the rounds of SPRs.

This controls the addition of the -spr parameter and its associated value. Set this property to the argument value required.

property sprlength

Set maximum SPR move length in topology refinement (default 10).

Topology refinement: By default, FastTree tries to improve the tree with up to 4*log2(N) rounds of minimum-evolution nearest-neighbor interchanges (NNI), where N is the number of unique sequences, 2 rounds of subtree-prune-regraft (SPR) moves (also min. evo.), and up to 2*log(N) rounds of maximum-likelihood NNIs. Use -nni to set the number of rounds of min. evo. NNIs, and -spr to set the rounds of SPRs.

This controls the addition of the -sprlength parameter and its associated value. Set this property to the argument value required.

property top

Use a top-hit list to speed up search.

Top-hit heuristics: By default, FastTree uses a top-hit list to speed up search. Use -notop (or -slow) to turn this feature off and compare all leaves to each other, and all new joined nodes to each other.

This property controls the addition of the -top switch, treat this property as a boolean.

property topm

Set the size of the top-hit list.

Top-hit heuristics: By default, FastTree uses a top-hit list to speed up search. -topm 1.0 – set the top-hit list size to parameter*sqrt(N). FastTree estimates the top m hits of a leaf from the top 2*m hits of a ‘close’ neighbor, where close is defined as d(seed,close) < 0.75 * d(seed, hit of rank 2*m), and updates the top-hits as joins proceed.

This controls the addition of the -topm parameter and its associated value. Set this property to the argument value required.
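The two quantities in the description above, the top-hit list size and the ‘close’ neighbor test, are simple enough to compute by hand. The sketch below evaluates them as written; the function names are hypothetical, the result is left unrounded, and the 0.75 factor is exposed as an argument only for illustration.

```python
import math


def top_hit_list_size(n_sequences, topm=1.0):
    """Top-hit list size m = topm * sqrt(N), left unrounded."""
    return topm * math.sqrt(n_sequences)


def is_close_neighbor(d_seed_close, d_seed_rank_2m, close=0.75):
    """True if d(seed, close) < close * d(seed, hit of rank 2*m)."""
    return d_seed_close < close * d_seed_rank_2m


m = top_hit_list_size(10000, topm=1.0)
print(m, is_close_neighbor(0.10, 0.20))
```

Because the list size grows with sqrt(N), quadrupling the number of sequences only doubles it.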

property wag

Maximum likelihood model options.

Whelan-and-Goldman 2001 model instead of the (default) Jones-Taylor-Thornton 1992 model (a.a. only).

This property controls the addition of the -wag switch, treat this property as a boolean.