Bio.Align.Applications package

Module contents

Alignment command line tool wrappers.

class Bio.Align.Applications.MuscleCommandline(cmd='muscle', **kwargs)

Bases: Bio.Application.AbstractCommandline

Command line wrapper for the multiple alignment program MUSCLE.

http://www.drive5.com/muscle/

Notes

Last checked against version: 3.7, briefly against 3.8

References

Edgar, Robert C. (2004), MUSCLE: multiple sequence alignment with high accuracy and high throughput, Nucleic Acids Research 32(5), 1792-97.

Edgar, R.C. (2004) MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics 5(1): 113.

Examples

>>> from Bio.Align.Applications import MuscleCommandline
>>> muscle_exe = r"C:\Program Files\Alignments\muscle3.8.31_i86win32.exe"
>>> in_file = r"C:\My Documents\unaligned.fasta"
>>> out_file = r"C:\My Documents\aligned.fasta"
>>> muscle_cline = MuscleCommandline(muscle_exe, input=in_file, out=out_file)
>>> print(muscle_cline)
"C:\Program Files\Alignments\muscle3.8.31_i86win32.exe" -in "C:\My Documents\unaligned.fasta" -out "C:\My Documents\aligned.fasta"

You would typically run the command line with muscle_cline() or via the Python subprocess module, as described in the Biopython tutorial.
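As a minimal sketch of the subprocess route, the same command can be passed to `subprocess.run` as an argument list, which avoids quoting problems with paths that contain spaces. The helper names `build_muscle_args` and `run_muscle` are hypothetical, and the sketch assumes a MUSCLE 3.x executable that accepts `-in` and `-out` as shown above.

```python
import shutil
import subprocess


def build_muscle_args(exe, in_file, out_file):
    """Return the MUSCLE 3.x command as an argument list (no shell quoting needed)."""
    return [exe, "-in", in_file, "-out", out_file]


def run_muscle(exe, in_file, out_file):
    """Run MUSCLE if the executable can be found; otherwise return None."""
    if shutil.which(exe) is None:
        return None  # executable not installed or not on PATH
    args = build_muscle_args(exe, in_file, out_file)
    # check=True raises CalledProcessError on a non-zero exit status.
    return subprocess.run(args, check=True, capture_output=True, text=True)


# File names here are placeholders for illustration only.
args = build_muscle_args("muscle", "unaligned.fasta", "aligned.fasta")
print(args)
```

Passing a list rather than a single string means no shell is involved, so the Windows paths in the example above would not need the surrounding double quotes.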

__init__(self, cmd='muscle', **kwargs)

Initialize the class.

property anchors

Use anchor optimisation in tree dependent refinement iterations

This property controls the addition of the -anchors switch, treat this property as a boolean.

property anchorspacing

Minimum spacing between anchor columns

This controls the addition of the -anchorspacing parameter and its associated value. Set this property to the argument value required.
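The two kinds of property on this page behave differently: a switch contributes only its flag when set to a true value, while a parameter contributes its flag followed by the value. The sketch below mimics that behaviour in plain Python. `render_options` is a hypothetical helper for illustration, not Biopython's implementation, and the value 32 is an arbitrary placeholder.

```python
def render_options(cmd, switches, parameters):
    """Build an argument list from boolean switches and valued parameters."""
    args = [cmd]
    for flag, enabled in switches.items():
        if enabled:  # a false switch contributes nothing
            args.append(flag)
    for flag, value in parameters.items():
        if value is not None:  # an unset parameter contributes nothing
            args.extend([flag, str(value)])
    return args


example = render_options(
    "muscle",
    switches={"-anchors": True, "-brenner": False},
    parameters={"-anchorspacing": 32, "-center": None},
)
print(" ".join(example))
```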

property brenner

Use Steve Brenner’s root alignment method

This property controls the addition of the -brenner switch, treat this property as a boolean.

property center

Center parameter - should be negative

This controls the addition of the -center parameter and its associated value. Set this property to the argument value required.

property cluster

Perform fast clustering of input sequences, use -tree1 to save tree

This property controls the addition of the -cluster switch, treat this property as a boolean.

property cluster1

Clustering method used in iteration 1

This controls the addition of the -cluster1 parameter and its associated value. Set this property to the argument value required.

property cluster2

Clustering method used in iteration 2

This controls the addition of the -cluster2 parameter and its associated value. Set this property to the argument value required.

property clw

Write output in CLUSTALW format (with a MUSCLE header)

This property controls the addition of the -clw switch, treat this property as a boolean.

property clwout

Write CLUSTALW output (with MUSCLE header) to specified filename

This controls the addition of the -clwout parameter and its associated value. Set this property to the argument value required.

property clwstrict

Write output in CLUSTALW format with version 1.81 header

This property controls the addition of the -clwstrict switch, treat this property as a boolean.

property clwstrictout

Write CLUSTALW output (with version 1.81 header) to specified filename

This controls the addition of the -clwstrictout parameter and its associated value. Set this property to the argument value required.

property core

Do not catch exceptions

This property controls the addition of the -core switch, treat this property as a boolean.

property diagbreak

Maximum distance between two diagonals that allows them to merge into one diagonal

This controls the addition of the -diagbreak parameter and its associated value. Set this property to the argument value required.

property diaglength

Minimum length of diagonal

This controls the addition of the -diaglength parameter and its associated value. Set this property to the argument value required.

property diagmargin

Discard this many positions at ends of diagonal

This controls the addition of the -diagmargin parameter and its associated value. Set this property to the argument value required.

property diags

Find diagonals (faster for similar sequences)

This property controls the addition of the -diags switch, treat this property as a boolean.

property dimer

Use faster (slightly less accurate) dimer approximation for the SP score

This property controls the addition of the -dimer switch, treat this property as a boolean.

property distance1

Distance measure for iteration 1

This controls the addition of the -distance1 parameter and its associated value. Set this property to the argument value required.

property distance2

Distance measure for iteration 2

This controls the addition of the -distance2 parameter and its associated value. Set this property to the argument value required.

property fasta

Write output in FASTA format

This property controls the addition of the -fasta switch, treat this property as a boolean.

property fastaout

Write FASTA format output to specified filename

This controls the addition of the -fastaout parameter and its associated value. Set this property to the argument value required.

property gapextend

Gap extension penalty

This controls the addition of the -gapextend parameter and its associated value. Set this property to the argument value required.

property gapopen

Gap open score - negative number

This controls the addition of the -gapopen parameter and its associated value. Set this property to the argument value required.

property group

Group similar sequences in output

This property controls the addition of the -group switch, treat this property as a boolean.

property html

Write output in HTML format

This property controls the addition of the -html switch, treat this property as a boolean.

property htmlout

Write HTML output to specified filename

This controls the addition of the -htmlout parameter and its associated value. Set this property to the argument value required.

property hydro

Window size for hydrophobic region

This controls the addition of the -hydro parameter and its associated value. Set this property to the argument value required.

property hydrofactor

Multiplier for gap penalties in hydrophobic regions

This controls the addition of the -hydrofactor parameter and its associated value. Set this property to the argument value required.

property in1

First input filename for profile alignment

This controls the addition of the -in1 parameter and its associated value. Set this property to the argument value required.

property in2

Second input filename for a profile alignment

This controls the addition of the -in2 parameter and its associated value. Set this property to the argument value required.
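A profile alignment combines the `profile` switch with the `in1` and `in2` parameters. As a rough sketch, the resulting MUSCLE 3.x command might be assembled as below. The file names are placeholders and `build_profile_args` is a hypothetical helper, not part of Biopython.

```python
def build_profile_args(first, second, out_file, exe="muscle"):
    """Return an argument list that aligns two existing alignments as profiles."""
    return [exe, "-profile", "-in1", first, "-in2", second, "-out", out_file]


profile_args = build_profile_args("family_a.fasta", "family_b.fasta", "merged.fasta")
print(" ".join(profile_args))
```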

property input

Input filename

This controls the addition of the -in parameter and its associated value. Set this property to the argument value required.

property le

Use log-expectation profile score (VTML240)

This property controls the addition of the -le switch, treat this property as a boolean.

property log

Log file name

This controls the addition of the -log parameter and its associated value. Set this property to the argument value required.

property loga

Log file name (append to existing file)

This controls the addition of the -loga parameter and its associated value. Set this property to the argument value required.

property matrix

Path to NCBI or WU-BLAST format protein substitution matrix. When using this, also set -gapopen, -gapextend and -center.

This controls the addition of the -matrix parameter and its associated value. Set this property to the argument value required.
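Because a custom matrix is meant to be accompanied by `-gapopen`, `-gapextend` and `-center`, one option is to build all four together so that none is forgotten. This is only a sketch: `build_matrix_args` is a hypothetical helper, and the matrix file name and numeric values are placeholders rather than recommended settings.

```python
def build_matrix_args(matrix, *, gapopen, gapextend, center, exe="muscle"):
    """Return MUSCLE 3.x arguments for a user-supplied substitution matrix.

    The three penalties are keyword-only and required, so omitting any of
    them raises TypeError before a command is ever built.
    """
    return [
        exe,
        "-matrix", matrix,
        "-gapopen", str(gapopen),
        "-gapextend", str(gapextend),
        "-center", str(center),
    ]


matrix_args = build_matrix_args(
    "my_matrix.txt", gapopen=-12.0, gapextend=-1.0, center=0.0
)
print(" ".join(matrix_args))
```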

property maxdiagbreak

Deprecated in v3.8, use -diagbreak instead.

This controls the addition of the -maxdiagbreak parameter and its associated value. Set this property to the argument value required.

property maxhours

Maximum time to run in hours

This controls the addition of the -maxhours parameter and its associated value. Set this property to the argument value required.

property maxiters

Maximum number of iterations

This controls the addition of the -maxiters parameter and its associated value. Set this property to the argument value required.

property maxtrees

Maximum number of trees to build in iteration 2

This controls the addition of the -maxtrees parameter and its associated value. Set this property to the argument value required.

property minbestcolscore

Minimum score a column must have to be an anchor

This controls the addition of the -minbestcolscore parameter and its associated value. Set this property to the argument value required.

property minsmoothscore

Minimum smoothed score a column must have to be an anchor

This controls the addition of the -minsmoothscore parameter and its associated value. Set this property to the argument value required.

property msf

Write output in MSF format

This property controls the addition of the -msf switch, treat this property as a boolean.

property msfout

Write MSF format output to specified filename

This controls the addition of the -msfout parameter and its associated value. Set this property to the argument value required.

property noanchors

Do not use anchor optimisation in tree dependent refinement iterations

This property controls the addition of the -noanchors switch, treat this property as a boolean.

property nocore

Catch exceptions

This property controls the addition of the -nocore switch, treat this property as a boolean.

property objscore

Objective score used by tree dependent refinement

This controls the addition of the -objscore parameter and its associated value. Set this property to the argument value required.

property out

Output filename

This controls the addition of the -out parameter and its associated value. Set this property to the argument value required.

property phyi

Write output in PHYLIP interleaved format

This property controls the addition of the -phyi switch, treat this property as a boolean.

property phyiout

Write PHYLIP interleaved output to specified filename

This controls the addition of the -phyiout parameter and its associated value. Set this property to the argument value required.

property phys

Write output in PHYLIP sequential format

This property controls the addition of the -phys switch, treat this property as a boolean.

property physout

Write PHYLIP sequential format to specified filename

This controls the addition of the -physout parameter and its associated value. Set this property to the argument value required.

property profile

Perform a profile alignment

This property controls the addition of the -profile switch, treat this property as a boolean.

property quiet

Do not display progress messages

This property controls the addition of the -quiet switch, treat this property as a boolean.

property refine

Only do tree dependent refinement

This property controls the addition of the -refine switch, treat this property as a boolean.

property refinew

Only do tree dependent refinement using sliding window approach

This property controls the addition of the -refinew switch, treat this property as a boolean.

property refinewindow

Length of window for -refinew

This controls the addition of the -refinewindow parameter and its associated value. Set this property to the argument value required.

property root1

Method used to root tree in iteration 1

This controls the addition of the -root1 parameter and its associated value. Set this property to the argument value required.

property root2

Method used to root tree in iteration 2

This controls the addition of the -root2 parameter and its associated value. Set this property to the argument value required.

property scorefile

Score file name, contains one line for each column in the alignment with average BLOSUM62 score

This controls the addition of the -scorefile parameter and its associated value. Set this property to the argument value required.

property seqtype

Sequence type

This controls the addition of the -seqtype parameter and its associated value. Set this property to the argument value required.

property smoothscoreceil

Maximum value of column score for smoothing

This controls the addition of the -smoothscoreceil parameter and its associated value. Set this property to the argument value required.

property smoothwindow

Window used for anchor column smoothing

This controls the addition of the -smoothwindow parameter and its associated value. Set this property to the argument value required.

property sp

Use sum-of-pairs protein profile score (PAM200)

This property controls the addition of the -sp switch, treat this property as a boolean.

property spn

Use sum-of-pairs nucleotide profile score

This property controls the addition of the -spn switch, treat this property as a boolean.

property spscore

Compute SP objective score of multiple alignment

This controls the addition of the -spscore parameter and its associated value. Set this property to the argument value required.

property stable

Do not group similar sequences in output (not supported in v3.8)

This property controls the addition of the -stable switch, treat this property as a boolean.

property sueff

Constant used in UPGMB clustering

This controls the addition of the -sueff parameter and its associated value. Set this property to the argument value required.

property sv

Use sum-of-pairs profile score (VTML240)

This property controls the addition of the -sv switch, treat this property as a boolean.

property tree1

Save Newick tree from iteration 1

This controls the addition of the -tree1 parameter and its associated value. Set this property to the argument value required.

property tree2

Save Newick tree from iteration 2

This controls the addition of the -tree2 parameter and its associated value. Set this property to the argument value required.

property usetree

Use given Newick tree as guide tree

This controls the addition of the -usetree parameter and its associated value. Set this property to the argument value required.

property verbose

Write parameter settings and progress

This property controls the addition of the -verbose switch, treat this property as a boolean.

property version

Write version string to stdout and exit

This property controls the addition of the -version switch, treat this property as a boolean.

property weight1

Weighting scheme used in iteration 1

This controls the addition of the -weight1 parameter and its associated value. Set this property to the argument value required.

property weight2

Weighting scheme used in iteration 2

This controls the addition of the -weight2 parameter and its associated value. Set this property to the argument value required.

class Bio.Align.Applications.ClustalwCommandline(cmd='clustalw', **kwargs)

Bases: Bio.Application.AbstractCommandline

Command line wrapper for clustalw (version one or two).

http://www.clustal.org/

Notes

Last checked against versions: 1.83 and 2.1

References

Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R, Thompson JD, Gibson TJ, Higgins DG. (2007). Clustal W and Clustal X version 2.0. Bioinformatics, 23, 2947-2948.

Examples

>>> from Bio.Align.Applications import ClustalwCommandline
>>> in_file = "unaligned.fasta"
>>> clustalw_cline = ClustalwCommandline("clustalw2", infile=in_file)
>>> print(clustalw_cline)
clustalw2 -infile=unaligned.fasta

You would typically run the command line with clustalw_cline() or via the Python subprocess module, as described in the Biopython tutorial.
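Unlike MUSCLE, ClustalW takes its options in `-name=value` form, as the printed command above shows. The sketch below builds that argument and also guesses the default output names, assuming ClustalW's usual behaviour of writing an alignment (`.aln`) and a guide tree (`.dnd`) named after the input file when `outfile` and `newtree` are not set. Both helper names are hypothetical.

```python
from pathlib import Path


def clustalw_args(in_file, exe="clustalw2"):
    """Return the ClustalW command as an argument list using -name=value syntax."""
    return [exe, f"-infile={in_file}"]


def default_output_names(in_file):
    """Guess the alignment and guide tree file names ClustalW writes by default."""
    path = Path(in_file)
    return str(path.with_suffix(".aln")), str(path.with_suffix(".dnd"))


print(clustalw_args("unaligned.fasta"))
print(default_output_names("unaligned.fasta"))
```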

__init__(self, cmd='clustalw', **kwargs)

Initialize the class.

property align

Do full multiple alignment.

This property controls the addition of the -align switch, treat this property as a boolean.

property bootlabels

Node OR branch position of bootstrap values in tree display

This controls the addition of the -bootlabels parameter and its associated value. Set this property to the argument value required.

property bootstrap

Bootstrap an NJ tree (n = number of bootstraps; default = 1000).

This controls the addition of the -bootstrap parameter and its associated value. Set this property to the argument value required.

property case

LOWER or UPPER (for GDE output only)

This controls the addition of the -case parameter and its associated value. Set this property to the argument value required.

property check

Outline the command line params.

This property controls the addition of the -check switch, treat this property as a boolean.

property clustering

NJ or UPGMA

This controls the addition of the -clustering parameter and its associated value. Set this property to the argument value required.

property convert

Output the input sequences in a different file format.

This property controls the addition of the -convert switch, treat this property as a boolean.

property dnamatrix

DNA weight matrix=IUB, CLUSTALW or filename

This controls the addition of the -dnamatrix parameter and its associated value. Set this property to the argument value required.

property endgaps

No end gap separation penalty.

This property controls the addition of the -endgaps switch, treat this property as a boolean.

property fullhelp

Output full help content.

This property controls the addition of the -fullhelp switch, treat this property as a boolean.

property gapdist

Gap separation penalty range

This controls the addition of the -gapdist parameter and its associated value. Set this property to the argument value required.

property gapext

Gap extension penalty

This controls the addition of the -gapext parameter and its associated value. Set this property to the argument value required.

property gapopen

Gap opening penalty

This controls the addition of the -gapopen parameter and its associated value. Set this property to the argument value required.

property helixendin

Number of residues inside helix to be treated as terminal

This controls the addition of the -helixendin parameter and its associated value. Set this property to the argument value required.

property helixendout

Number of residues outside helix to be treated as terminal

This controls the addition of the -helixendout parameter and its associated value. Set this property to the argument value required.

property helixgap

Gap penalty for helix core residues

This controls the addition of the -helixgap parameter and its associated value. Set this property to the argument value required.

property help

Outline the command line params.

This property controls the addition of the -help switch, treat this property as a boolean.

property hgapresidues

List hydrophilic residues.

This property controls the addition of the -hgapresidues switch, treat this property as a boolean.

property infile

Input sequences.

This controls the addition of the -infile parameter and its associated value. Set this property to the argument value required.

property iteration

NONE or TREE or ALIGNMENT

This controls the addition of the -iteration parameter and its associated value. Set this property to the argument value required.

property kimura

Use Kimura’s correction.

This property controls the addition of the -kimura switch, treat this property as a boolean.

property ktuple

Word size

This controls the addition of the -ktuple parameter and its associated value. Set this property to the argument value required.

property loopgap

Gap penalty for loop regions

This controls the addition of the -loopgap parameter and its associated value. Set this property to the argument value required.

property matrix

Protein weight matrix=BLOSUM, PAM, GONNET, ID or filename

This controls the addition of the -matrix parameter and its associated value. Set this property to the argument value required.

property maxdiv

Percent identity for delay

This controls the addition of the -maxdiv parameter and its associated value. Set this property to the argument value required.

property maxseqlen

Maximum allowed input sequence length

This controls the addition of the -maxseqlen parameter and its associated value. Set this property to the argument value required.

property negative

Protein alignment with negative values in matrix

This property controls the addition of the -negative switch, treat this property as a boolean.

property newtree

Output file name for newly created guide tree

This controls the addition of the -newtree parameter and its associated value. Set this property to the argument value required.

property newtree1

Output file name for new guide tree of profile1

This controls the addition of the -newtree1 parameter and its associated value. Set this property to the argument value required.

property newtree2

Output file for new guide tree of profile2

This controls the addition of the -newtree2 parameter and its associated value. Set this property to the argument value required.

property nohgap

Hydrophilic gaps off

This property controls the addition of the -nohgap switch, treat this property as a boolean.

property nopgap

Residue-specific gaps off

This property controls the addition of the -nopgap switch, treat this property as a boolean.

property nosecstr1

Do not use secondary structure-gap penalty mask for profile 1

This property controls the addition of the -nosecstr1 switch, treat this property as a boolean.

property nosecstr2

Do not use secondary structure-gap penalty mask for profile 2

This property controls the addition of the -nosecstr2 switch, treat this property as a boolean.

property noweights

Disable sequence weighting

This property controls the addition of the -noweights switch, treat this property as a boolean.

property numiter

Maximum number of iterations to perform

This controls the addition of the -numiter parameter and its associated value. Set this property to the argument value required.

property options

List the command line parameters

This property controls the addition of the -options switch, treat this property as a boolean.

property outfile

Output sequence alignment file name

This controls the addition of the -outfile parameter and its associated value. Set this property to the argument value required.

property outorder

Output taxon order: INPUT or ALIGNED

This controls the addition of the -outorder parameter and its associated value. Set this property to the argument value required.

property output

Output format: CLUSTAL(default), GCG, GDE, PHYLIP, PIR, NEXUS and FASTA

This controls the addition of the -output parameter and its associated value. Set this property to the argument value required.

property outputtree

nj OR phylip OR dist OR nexus

This controls the addition of the -outputtree parameter and its associated value. Set this property to the argument value required.

property pairgap

Gap penalty

This controls the addition of the -pairgap parameter and its associated value. Set this property to the argument value required.

property pim

Output percent identity matrix (while calculating the tree).

This property controls the addition of the -pim switch, treat this property as a boolean.

property profile

Merge two alignments by profile alignment

This property controls the addition of the -profile switch, treat this property as a boolean.

property profile1

Profiles (old alignment).

This controls the addition of the -profile1 parameter and its associated value. Set this property to the argument value required.

property profile2

Profiles (old alignment).

This controls the addition of the -profile2 parameter and its associated value. Set this property to the argument value required.

property pwdnamatrix

DNA weight matrix=IUB, CLUSTALW or filename

This controls the addition of the -pwdnamatrix parameter and its associated value. Set this property to the argument value required.

property pwgapext

Gap extension penalty

This controls the addition of the -pwgapext parameter and its associated value. Set this property to the argument value required.

property pwgapopen

Gap opening penalty

This controls the addition of the -pwgapopen parameter and its associated value. Set this property to the argument value required.

property pwmatrix

Protein weight matrix=BLOSUM, PAM, GONNET, ID or filename

This controls the addition of the -pwmatrix parameter and its associated value. Set this property to the argument value required.

property quicktree

Use FAST algorithm for the alignment guide tree

This property controls the addition of the -quicktree switch, treat this property as a boolean.

property quiet

Reduce console output to minimum

This property controls the addition of the -quiet switch, treat this property as a boolean.

property range

Sequence range to write, from m to m+n. Input as a string, e.g. ‘24,200’

This controls the addition of the -range parameter and its associated value. Set this property to the argument value required.
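Since the range is passed as a single comma-separated string, a small formatter can keep the two numbers together. This sketch assumes the string is `m,n` as in the example value; `format_range` is a hypothetical helper and the numbers are the ones from the description.

```python
def format_range(start, length):
    """Return the 'm,n' string expected by the range parameter."""
    return f"{int(start)},{int(length)}"


range_value = format_range(24, 200)
# ClustalW options use -name=value syntax.
print(f"-range={range_value}")
```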

property score

Either: PERCENT or ABSOLUTE

This controls the addition of the -score parameter and its associated value. Set this property to the argument value required.

property secstrout

STRUCTURE or MASK or BOTH or NONE output in alignment file

This controls the addition of the -secstrout parameter and its associated value. Set this property to the argument value required.

property seed

Seed number for bootstraps.

This controls the addition of the -seed parameter and its associated value. Set this property to the argument value required.

property seqno_range

OFF or ON (NEW- for all output formats)

This controls the addition of the -seqno_range parameter and its associated value. Set this property to the argument value required.

property seqnos

OFF or ON (for Clustal output only)

This controls the addition of the -seqnos parameter and its associated value. Set this property to the argument value required.

property sequences

Sequentially add profile2 sequences to profile1 alignment

This property controls the addition of the -sequences switch, treat this property as a boolean.

property stats

Log some alignment statistics to file

This controls the addition of the -stats parameter and its associated value. Set this property to the argument value required.

property strandendin

Number of residues inside strand to be treated as terminal

This controls the addition of the -strandendin parameter and its associated value. Set this property to the argument value required.

property strandendout

Number of residues outside strand to be treated as terminal

This controls the addition of the -strandendout parameter and its associated value. Set this property to the argument value required.

property strandgap

Gap penalty for strand core residues

This controls the addition of the -strandgap parameter and its associated value. Set this property to the argument value required.

property terminalgap

Gap penalty for structure termini

This controls the addition of the -terminalgap parameter and its associated value. Set this property to the argument value required.

property topdiags

Number of best diagonals.

This controls the addition of the -topdiags parameter and its associated value. Set this property to the argument value required.

property tossgaps

Ignore positions with gaps.

This property controls the addition of the -tossgaps switch, treat this property as a boolean.

property transweight

Transitions weighting

This controls the addition of the -transweight parameter and its associated value. Set this property to the argument value required.

property tree

Calculate NJ tree.

This property controls the addition of the -tree switch, treat this property as a boolean.

property type

PROTEIN or DNA sequences

This controls the addition of the -type parameter and its associated value. Set this property to the argument value required.

property usetree

File name of guide tree

This controls the addition of the -usetree parameter and its associated value. Set this property to the argument value required.

property usetree1

File name of guide tree for profile1

This controls the addition of the -usetree1 parameter and its associated value. Set this property to the argument value required.

property usetree2

File name of guide tree for profile2

This controls the addition of the -usetree2 parameter and its associated value. Set this property to the argument value required.

property window

Window around best diagonals.

This controls the addition of the -window parameter and its associated value. Set this property to the argument value required.

class Bio.Align.Applications.ClustalOmegaCommandline(cmd='clustalo', **kwargs)

Bases: Bio.Application.AbstractCommandline

Command line wrapper for clustal omega.

http://www.clustal.org/omega

Notes

Last checked against version: 1.2.0

References

Sievers F, Wilm A, Dineen DG, Gibson TJ, Karplus K, Li W, Lopez R, McWilliam H, Remmert M, Söding J, Thompson JD, Higgins DG (2011). Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Molecular Systems Biology 7:539 https://doi.org/10.1038/msb.2011.75

Examples

>>> from Bio.Align.Applications import ClustalOmegaCommandline
>>> in_file = "unaligned.fasta"
>>> out_file = "aligned.fasta"
>>> clustalomega_cline = ClustalOmegaCommandline(infile=in_file, outfile=out_file, verbose=True, auto=True)
>>> print(clustalomega_cline)
clustalo -i unaligned.fasta -o aligned.fasta --auto -v

You would typically run the command line with clustalomega_cline() or via the Python subprocess module, as described in the Biopython tutorial.
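If only the printed command string is available, one way to hand it to the subprocess module is to split it into an argument list first. The sketch below uses `shlex.split`, which follows POSIX quoting rules and is therefore not suitable for Windows paths containing backslashes. `run_if_available` is a hypothetical helper.

```python
import shlex
import shutil
import subprocess

# The string printed by the wrapper in the example above.
command_line = "clustalo -i unaligned.fasta -o aligned.fasta --auto -v"

# Split into a list so that no shell is needed to run it.
args = shlex.split(command_line)
print(args)


def run_if_available(args):
    """Run the command if its executable is on PATH; otherwise return None."""
    if shutil.which(args[0]) is None:
        return None
    # stdout and stderr are captured as text on the returned CompletedProcess.
    return subprocess.run(args, capture_output=True, text=True)
```

Checking `returncode` on the returned object is a simple way to detect a failed run before trying to read the output file.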

__init__(self, cmd='clustalo', **kwargs)

Initialize the class.

property auto

Set options automatically (might overwrite some of your options)

This property controls the addition of the --auto switch, treat this property as a boolean.

property clusteringout

Clustering output file

This controls the addition of the --clustering-out parameter and its associated value. Set this property to the argument value required.

property clustersize

Soft maximum of sequences in sub-clusters

This controls the addition of the --cluster-size parameter and its associated value. Set this property to the argument value required.

property dealign

Dealign input sequences

This property controls the addition of the --dealign switch, treat this property as a boolean.

property distmat_full

Use full distance matrix for guide-tree calculation (slow; mBed is default)

This property controls the addition of the --full switch, treat this property as a boolean.

property distmat_full_iter

Use full distance matrix for guide-tree calculation during iteration (mBed is default)

This property controls the addition of the --full-iter switch, treat this property as a boolean.

property distmat_in

Pairwise distance matrix input file (skips distance computation).

This controls the addition of the --distmat-in parameter and its associated value. Set this property to the argument value required.

property distmat_out

Pairwise distance matrix output file.

This controls the addition of the --distmat-out parameter and its associated value. Set this property to the argument value required.

property force

Force file overwriting.

This property controls the addition of the --force switch, treat this property as a boolean.

property guidetree_in

Guide tree input file (skips distance computation and guide-tree clustering step).

This controls the addition of the --guidetree-in parameter and its associated value. Set this property to the argument value required.

property guidetree_out

Guide tree output file.

This controls the addition of the --guidetree-out parameter and its associated value. Set this property to the argument value required.

property help

Print help and exit.

This property controls the addition of the -h switch, treat this property as a boolean.

property hmm_input

HMM input files

This controls the addition of the --hmm-in parameter and its associated value. Set this property to the argument value required.

property infile

Multiple sequence input file

This controls the addition of the -i parameter and its associated value. Set this property to the argument value required.

property infmt

Forced sequence input file format (default: auto)

Allowed values: a2m, fa[sta], clu[stal], msf, phy[lip], selex, st[ockholm], vie[nna]

This controls the addition of the --infmt parameter and its associated value. Set this property to the argument value required.

property isprofile

disable check if profile, force profile (default no)

This property controls the addition of the --is-profile switch, treat this property as a boolean.

property iterations

Number of (combined guide-tree/HMM) iterations

This controls the addition of the --iterations parameter and its associated value. Set this property to the argument value required.

property log

Log all non-essential output to this file.

This controls the addition of the -l parameter and its associated value. Set this property to the argument value required.

property long_version

Print long version information and exit

This property controls the addition of the --long-version switch, treat this property as a boolean.

property max_guidetree_iterations

Maximum number of guidetree iterations

This controls the addition of the --max-guidetree-iterations parameter and its associated value. Set this property to the argument value required.

property max_hmm_iterations

Maximum number of HMM iterations

This controls the addition of the --max-hmm-iterations parameter and its associated value. Set this property to the argument value required.

property maxnumseq

Maximum allowed number of sequences

This controls the addition of the --maxnumseq parameter and its associated value. Set this property to the argument value required.

property maxseqlen

Maximum allowed sequence length

This controls the addition of the --maxseqlen parameter and its associated value. Set this property to the argument value required.

property outfile

Multiple sequence alignment output file (default: stdout).

This controls the addition of the -o parameter and its associated value. Set this property to the argument value required.

property outfmt

MSA output file format: a2m=fa[sta],clu[stal],msf,phy[lip],selex,st[ockholm],vie[nna] (default: fasta).

This controls the addition of the --outfmt parameter and its associated value. Set this property to the argument value required.

property outputorder

MSA output order: as in the input file or as in the guide tree

This controls the addition of the --output-order parameter and its associated value. Set this property to the argument value required.

property percentid

convert distances into percent identities (default no)

This property controls the addition of the --percent-id switch, treat this property as a boolean.

property profile1

Pre-aligned multiple sequence file (aligned columns will be kept fixed).

This controls the addition of the --profile1 parameter and its associated value. Set this property to the argument value required.

property profile2

Pre-aligned multiple sequence file (aligned columns will be kept fixed).

This controls the addition of the --profile2 parameter and its associated value. Set this property to the argument value required.

property residuenumber

in Clustal format print residue numbers (default no)

This property controls the addition of the --residuenumber switch, treat this property as a boolean.

property seqtype

{Protein, RNA, DNA} Force a sequence type (default: auto).

This controls the addition of the -t parameter and its associated value. Set this property to the argument value required.

property threads

Number of processors to use

This controls the addition of the --threads parameter and its associated value. Set this property to the argument value required.

property usekimura

use Kimura distance correction for aligned sequences (default no)

This property controls the addition of the --use-kimura switch, treat this property as a boolean.

property verbose

Verbose output

This property controls the addition of the -v switch, treat this property as a boolean.

property version

Print version information and exit

This property controls the addition of the --version switch, treat this property as a boolean.

property wrap

number of residues before line-wrap in output

This controls the addition of the --wrap parameter and its associated value. Set this property to the argument value required.
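As a rough illustration of how the properties above map onto a command line, the following sketch assembles an equivalent argument list by hand using only the standard library. The helper build_clustalo_args, the executable name clustalo and the file names are assumptions made for this example; they are not part of Biopython.

```python
import shlex


def build_clustalo_args(infile, outfile, outfmt=None, threads=None,
                        force=False, exe="clustalo"):
    """Assemble an argument list from a few of the options documented above.

    The helper name, the default executable name and the file names used
    below are illustrative assumptions, not part of Biopython.
    """
    args = [exe, "-i", infile, "-o", outfile]
    if outfmt is not None:
        # Value option: the flag is followed by its argument.
        args += ["--outfmt", outfmt]
    if threads is not None:
        args += ["--threads", str(threads)]
    if force:
        # Boolean switch: present or absent, no value.
        args.append("--force")
    return args


args = build_clustalo_args("unaligned.fasta", "aligned.aln",
                           outfmt="clustal", threads=2, force=True)
# Show the command as it could be typed in a shell.
print(shlex.join(args))
```

A list built this way could be passed to subprocess.run(args, check=True), which avoids shell quoting problems with file names that contain spaces.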

class Bio.Align.Applications.PrankCommandline(cmd='prank', **kwargs)

Bases: Bio.Application.AbstractCommandline

Command line wrapper for the multiple alignment program PRANK.

http://www.ebi.ac.uk/goldman-srv/prank/prank/

Notes

Last checked against version: 081202

References

Loytynoja, A. and Goldman, N. 2005. An algorithm for progressive multiple alignment of sequences with insertions. Proceedings of the National Academy of Sciences, 102: 10557–10562.

Loytynoja, A. and Goldman, N. 2008. Phylogeny-aware gap placement prevents errors in sequence alignment and evolutionary analysis. Science, 320: 1632.

Examples

To align a FASTA file (unaligned.fasta) and write the result in aligned FASTA format, with output filenames starting with “aligned” (you can only choose the prefix, not the full filename) and with no tree or XML output, use:

>>> from Bio.Align.Applications import PrankCommandline
>>> prank_cline = PrankCommandline(d="unaligned.fasta",
...                                o="aligned", # prefix only!
...                                f=8, # FASTA output
...                                notree=True, noxml=True)
>>> print(prank_cline)
prank -d=unaligned.fasta -o=aligned -f=8 -noxml -notree

You would typically run the command line with prank_cline() or via the Python subprocess module, as described in the Biopython tutorial.

__init__(self, cmd='prank', **kwargs)

Initialize the class.

property F

Force insertions to be always skipped: same as +F

This property controls the addition of the -F switch, treat this property as a boolean.

property codon

Codon aware alignment or not

This property controls the addition of the -codon switch, treat this property as a boolean.

property convert

Convert input alignment to new format. Do not perform alignment

This property controls the addition of the -convert switch, treat this property as a boolean.

property d

Input filename

This controls the addition of the -d parameter and its associated value. Set this property to the argument value required.

property dnafreqs

DNA frequencies - ‘A,C,G,T’. eg ‘25,25,25,25’ as a quote surrounded string value. Default: empirical

This controls the addition of the -dnafreqs parameter and its associated value. Set this property to the argument value required.

property dots

Show insertion gaps as dots

This property controls the addition of the -dots switch, treat this property as a boolean.

property f

Output alignment format. Default: 8 (FASTA). Options are:

1. IG/Stanford
2. GenBank/GB
3. NBRF
4. EMBL
6. DNAStrider
7. Fitch
8. Pearson/Fasta
11. Phylip3.2
12. Phylip
14. PIR/CODATA
15. MSF
17. PAUP/NEXUS

This controls the addition of the -f parameter and its associated value. Set this property to the argument value required.
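Because the f property takes a numeric code rather than a format name, it can help to keep the codes in a small lookup table. The sketch below copies the codes listed above; the names PRANK_OUTPUT_FORMATS and prank_format_code are hypothetical helpers, not part of Biopython or PRANK.

```python
# Numeric codes accepted by PRANK's -f option, copied from the list above.
PRANK_OUTPUT_FORMATS = {
    1: "IG/Stanford",
    2: "GenBank/GB",
    3: "NBRF",
    4: "EMBL",
    6: "DNAStrider",
    7: "Fitch",
    8: "Pearson/Fasta",
    11: "Phylip3.2",
    12: "Phylip",
    14: "PIR/CODATA",
    15: "MSF",
    17: "PAUP/NEXUS",
}


def prank_format_code(name):
    """Return the numeric -f code for a format name (case-insensitive).

    Either half of a slash-separated label matches, so "fasta" and
    "pearson" both map to 8.
    """
    wanted = name.lower()
    for code, label in PRANK_OUTPUT_FORMATS.items():
        if wanted in label.lower().split("/"):
            return code
    raise ValueError(f"unknown PRANK output format: {name!r}")
```

The returned code could then be passed as the f argument, for example f=prank_format_code("fasta").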

property fixedbranches

Use fixed branch lengths of input value

This controls the addition of the -fixedbranches parameter and its associated value. Set this property to the argument value required.

property gapext

Gap extension probability. Default: dna 0.5 / prot 0.5

This controls the addition of the -gapext parameter and its associated value. Set this property to the argument value required.

property gaprate

Gap opening rate. Default: dna 0.025 prot 0.0025

This controls the addition of the -gaprate parameter and its associated value. Set this property to the argument value required.

property kappa

Transition/transversion ratio. Default: 2

This controls the addition of the -kappa parameter and its associated value. Set this property to the argument value required.

property longseq

Save space in pairwise alignments

This property controls the addition of the -longseq switch, treat this property as a boolean.

property m

User-defined alignment model filename. Default: HKY2/WAG

This controls the addition of the -m parameter and its associated value. Set this property to the argument value required.

property matinitsize

Matrix initial size multiplier

This controls the addition of the -matinitsize parameter and its associated value. Set this property to the argument value required.

property matresize

Matrix resizing multiplier

This controls the addition of the -matresize parameter and its associated value. Set this property to the argument value required.

property maxbranches

Use maximum branch lengths of input value

This controls the addition of the -maxbranches parameter and its associated value. Set this property to the argument value required.

property mttranslate

Translate to protein using mt table

This property controls the addition of the -mttranslate switch, treat this property as a boolean.

property nopost

Do not compute posterior support. Default: compute

This property controls the addition of the -nopost switch, treat this property as a boolean.

property notree

Do not output dnd tree files (PRANK versions earlier than v.120626)

This property controls the addition of the -notree switch, treat this property as a boolean.

property noxml

Do not output XML files (PRANK versions earlier than v.120626)

This property controls the addition of the -noxml switch, treat this property as a boolean.

property o

Output filenames prefix. Default: ‘output’

Will write: output.?.fas (depending on requested format), output.?.xml and output.?.dnd

This controls the addition of the -o parameter and its associated value. Set this property to the argument value required.
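Since only the prefix can be chosen, a script usually has to discover the actual output filenames after PRANK has finished. A minimal sketch, assuming the files sit in a known directory and follow the prefix-dot-suffix naming described above (find_prank_outputs is a hypothetical helper, not part of Biopython):

```python
import glob
import os


def find_prank_outputs(prefix, directory="."):
    """Return a sorted list of files named '<prefix>.<anything>' in directory."""
    # glob.escape protects prefixes or paths that contain glob metacharacters.
    pattern = os.path.join(glob.escape(directory), glob.escape(prefix) + ".*")
    return sorted(glob.glob(pattern))
```

For the example above, find_prank_outputs("aligned") would list whatever files PRANK wrote with the aligned prefix; the exact middle part of each name depends on the PRANK version and requested format.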

property once

Run only once. Default: twice if no guidetree given

This property controls the addition of the -once switch, treat this property as a boolean.

property printnodes

Output each node; mostly for debugging

This property controls the addition of the -printnodes switch, treat this property as a boolean.

property pwdist

Expected pairwise distance for computing guidetree. Default: dna 0.25 / prot 0.5

This controls the addition of the -pwdist parameter and its associated value. Set this property to the argument value required.

property pwgenomic

Do pairwise alignment, no guidetree

This property controls the addition of the -pwgenomic switch, treat this property as a boolean.

property pwgenomicdist

Distance for pairwise alignment. Default: 0.3

This controls the addition of the -pwgenomicdist parameter and its associated value. Set this property to the argument value required.

property quiet

Reduce verbosity

This property controls the addition of the -quiet switch, treat this property as a boolean.

property realbranches

Disable branch length truncation

This property controls the addition of the -realbranches switch, treat this property as a boolean.

property rho

Purine/pyrimidine ratio. Default: 1

This controls the addition of the -rho parameter and its associated value. Set this property to the argument value required.

property scalebranches

Scale branch lengths. Default: dna 1 / prot 2

This controls the addition of the -scalebranches parameter and its associated value. Set this property to the argument value required.

property shortnames

Truncate names at first space

This property controls the addition of the -shortnames switch, treat this property as a boolean.

property showtree

Output dnd tree files (PRANK v.120626 and later)

This property controls the addition of the -showtree switch, treat this property as a boolean.

property showxml

Output XML files (PRANK v.120626 and later)

This property controls the addition of the -showxml switch, treat this property as a boolean.

property skipins

Skip insertions in posterior support

This property controls the addition of the -skipins switch, treat this property as a boolean.

property t

Input guide tree filename

This controls the addition of the -t parameter and its associated value. Set this property to the argument value required.

property termgap

Penalise terminal gaps normally

This property controls the addition of the -termgap switch, treat this property as a boolean.

property translate

Translate to protein

This property controls the addition of the -translate switch, treat this property as a boolean.

property tree

Input guide tree as Newick string

This controls the addition of the -tree parameter and its associated value. Set this property to the argument value required.

property twice

Always run twice

This property controls the addition of the -twice switch, treat this property as a boolean.

property uselogs

Slower but should work for a greater number of sequences

This property controls the addition of the -uselogs switch, treat this property as a boolean.

property writeanc

Output ancestral sequences

This property controls the addition of the -writeanc switch, treat this property as a boolean.

class Bio.Align.Applications.MafftCommandline(cmd='mafft', **kwargs)

Bases: Bio.Application.AbstractCommandline

Command line wrapper for the multiple alignment program MAFFT.

http://align.bmr.kyushu-u.ac.jp/mafft/software/

Notes

Last checked against version: MAFFT v6.717b (2009/12/03)

References

Katoh, Toh (BMC Bioinformatics 9:212, 2008) Improved accuracy of multiple ncRNA alignment by incorporating structural information into a MAFFT-based framework (describes RNA structural alignment methods)

Katoh, Toh (Briefings in Bioinformatics 9:286-298, 2008) Recent developments in the MAFFT multiple sequence alignment program (outlines version 6)

Katoh, Toh (Bioinformatics 23:372-374, 2007) Errata PartTree: an algorithm to build an approximate tree from a large number of unaligned sequences (describes the PartTree algorithm)

Katoh, Kuma, Toh, Miyata (Nucleic Acids Res. 33:511-518, 2005) MAFFT version 5: improvement in accuracy of multiple sequence alignment (describes [ancestral versions of] the G-INS-i, L-INS-i and E-INS-i strategies)

Katoh, Misawa, Kuma, Miyata (Nucleic Acids Res. 30:3059-3066, 2002)

Examples

>>> from Bio.Align.Applications import MafftCommandline
>>> mafft_exe = "/opt/local/mafft"
>>> in_file = "../Doc/examples/opuntia.fasta"
>>> mafft_cline = MafftCommandline(mafft_exe, input=in_file)
>>> print(mafft_cline)
/opt/local/mafft ../Doc/examples/opuntia.fasta

If the mafft binary is on the path (typically the case on a Unix style operating system) then you don’t need to supply the executable location:

>>> from Bio.Align.Applications import MafftCommandline
>>> in_file = "../Doc/examples/opuntia.fasta"
>>> mafft_cline = MafftCommandline(input=in_file)
>>> print(mafft_cline)
mafft ../Doc/examples/opuntia.fasta

You would typically run the command line with mafft_cline() or via the Python subprocess module, as described in the Biopython tutorial.

Note that MAFFT will write the alignment to stdout, which you may want to save to a file and then parse, e.g.:

stdout, stderr = mafft_cline()
with open("aligned.fasta", "w") as handle:
    handle.write(stdout)
from Bio import AlignIO
align = AlignIO.read("aligned.fasta", "fasta")
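If you run the tool through the subprocess module instead, standard output can be redirected straight into the file without holding the alignment in memory. The following is only a sketch: run_to_file is a hypothetical helper, and the mafft call shown in the comment assumes the binary is on the path.

```python
import subprocess


def run_to_file(args, out_path):
    """Run a command and write its standard output directly to out_path.

    Raises subprocess.CalledProcessError if the command exits with a
    non-zero status.
    """
    with open(out_path, "w") as handle:
        subprocess.run(args, stdout=handle, stderr=subprocess.PIPE,
                       check=True, text=True)
    return out_path


# Hypothetical usage, assuming mafft is installed:
# run_to_file(["mafft", "../Doc/examples/opuntia.fasta"], "aligned.fasta")
```

The resulting file can then be parsed with AlignIO.read exactly as in the example above.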

Alternatively, to parse the output with AlignIO directly you can use StringIO to turn the string into a handle:

# On Python 3, StringIO lives in the io module
from io import StringIO
from Bio import AlignIO
stdout, stderr = mafft_cline()
align = AlignIO.read(StringIO(stdout), "fasta")

__init__(self, cmd='mafft', **kwargs)

Initialize the class.

property LEXP

Gap extension penalty to skip the alignment. Default: 0.00

This controls the addition of the --LEXP parameter and its associated value. Set this property to the argument value required.

property LOP

Gap opening penalty to skip the alignment. Default: -6.00

This controls the addition of the --LOP parameter and its associated value. Set this property to the argument value required.

property aamatrix

Use a user-defined AA scoring matrix. Default: BLOSUM62

This controls the addition of the --aamatrix parameter and its associated value. Set this property to the argument value required.

property adjustdirection

Adjust direction according to the first sequence. Default off.

This property controls the addition of the --adjustdirection switch, treat this property as a boolean.

property adjustdirectionaccurately

Adjust direction according to the first sequence, for highly diverged data; very slow. Default off.

This property controls the addition of the --adjustdirectionaccurately switch, treat this property as a boolean.

property amino

Assume the sequences are amino acid (True/False). Default: auto

This property controls the addition of the --amino switch, treat this property as a boolean.

property auto

Automatically select strategy. Default off.

This property controls the addition of the --auto switch, treat this property as a boolean.

property bl

BLOSUM number matrix is used. Default: 62

This controls the addition of the --bl parameter and its associated value. Set this property to the argument value required.

property clustalout

Output format: clustal (True) or fasta (False, default)

This property controls the addition of the --clustalout switch, treat this property as a boolean.

property dpparttree

The PartTree algorithm is used with distances based on DP. Default: off

This property controls the addition of the --dpparttree switch, treat this property as a boolean.

property ep

Offset value, which works like gap extension penalty, for group-to-group alignment. Default: 0.123

This controls the addition of the --ep parameter and its associated value. Set this property to the argument value required.

property fastapair

All pairwise alignments are computed with FASTA (Pearson and Lipman 1988). Default: off

This property controls the addition of the --fastapair switch, treat this property as a boolean.

property fastaparttree

The PartTree algorithm is used with distances based on FASTA. Default: off

This property controls the addition of the --fastaparttree switch, treat this property as a boolean.

property fft

Use FFT approximation in group-to-group alignment. Default: on

This property controls the addition of the --fft switch, treat this property as a boolean.

property fmodel

Incorporate the AA/nuc composition information into the scoring matrix (True) or not (False, default)

This property controls the addition of the --fmodel switch, treat this property as a boolean.

property genafpair

All pairwise alignments are computed with a local algorithm with the generalized affine gap cost (Altschul 1998). Default: off

This property controls the addition of the --genafpair switch, treat this property as a boolean.

property globalpair

All pairwise alignments are computed with the Needleman-Wunsch algorithm. Default: off

This property controls the addition of the --globalpair switch, treat this property as a boolean.

property groupsize

Do not make alignment larger than number sequences. Default: the number of input sequences

This property controls the addition of the --groupsize switch, treat this property as a boolean.

property input

Input file name

This controls the addition of the input parameter and its associated value. Set this property to the argument value required.

property input1

Second input file name for the mafft-profile command

This controls the addition of the input1 parameter and its associated value. Set this property to the argument value required.

property inputorder

Output order: same as input (True, default) or alignment based (False)

This property controls the addition of the --inputorder switch, treat this property as a boolean.

property jtt

JTT PAM number (Jones et al. 1992) matrix is used. number>0. Default: BLOSUM62

This controls the addition of the --jtt parameter and its associated value. Set this property to the argument value required.

property lep

Offset value at local pairwise alignment. Default: 0.1

This controls the addition of the --lep parameter and its associated value. Set this property to the argument value required.

property lexp

Gap extension penalty at local pairwise alignment. Default: -0.1

This controls the addition of the --lexp parameter and its associated value. Set this property to the argument value required.

property localpair

All pairwise alignments are computed with the Smith-Waterman algorithm. Default: off

This property controls the addition of the --localpair switch, treat this property as a boolean.

property lop

Gap opening penalty at local pairwise alignment. Default: 0.123

This controls the addition of the --lop parameter and its associated value. Set this property to the argument value required.

property maxiterate

Number of cycles of iterative refinement to perform. Default: 0

This controls the addition of the --maxiterate parameter and its associated value. Set this property to the argument value required.

property memsave

Use the Myers-Miller (1988) algorithm. Default: automatically turned on when the alignment length exceeds 10,000 (aa/nt).

This property controls the addition of the --memsave switch, treat this property as a boolean.

property namelength

Name length in CLUSTAL and PHYLIP output.

MAFFT v6.847 (2011) added --namelength for use with the --clustalout option for CLUSTAL output.

MAFFT v7.024 (2013) added support for this with the --phylipout option for PHYLIP output (default 10).

This controls the addition of the --namelength parameter and its associated value. Set this property to the argument value required.

property nofft

Do not use FFT approximation in group-to-group alignment. Default: off

This property controls the addition of the --nofft switch, treat this property as a boolean.

property noscore

Alignment score is not checked in the iterative refinement stage. Default: off (score is checked)

This property controls the addition of the --noscore switch, treat this property as a boolean.

property nuc

Assume the sequences are nucleotide (True/False). Default: auto

This property controls the addition of the --nuc switch, treat this property as a boolean.

property op

Gap opening penalty at group-to-group alignment. Default: 1.53

This controls the addition of the --op parameter and its associated value. Set this property to the argument value required.

property partsize

The number of partitions in the PartTree algorithm. Default: 50

This controls the addition of the --partsize parameter and its associated value. Set this property to the argument value required.

property parttree

Use a fast tree-building method with the 6mer distance. Default: off

This property controls the addition of the --parttree switch, treat this property as a boolean.

property phylipout

Output format: phylip (True), or fasta (False, default)

This property controls the addition of the --phylipout switch, treat this property as a boolean.

property quiet

Suppress progress reporting (True) or report progress (False, default).

This property controls the addition of the --quiet switch, treat this property as a boolean.

property reorder

Output order: aligned (True) or in input order (False, default)

This property controls the addition of the --reorder switch, treat this property as a boolean.

property retree

Guide tree is built number times in the progressive stage. Valid with 6mer distance. Default: 2

This controls the addition of the --retree parameter and its associated value. Set this property to the argument value required.

property seed

Seed alignments given in alignment_n (fasta format) are aligned with sequences in input.

This controls the addition of the --seed parameter and its associated value. Set this property to the argument value required.

property sixmerpair

Distance is calculated based on the number of shared 6mers. Default: on

This property controls the addition of the --6merpair switch, treat this property as a boolean.

property thread

Number of threads to use. Default: 1

This controls the addition of the --thread parameter and its associated value. Set this property to the argument value required.

property tm

Transmembrane PAM number (Jones et al. 1994) matrix is used. number>0. Default: BLOSUM62

This controls the addition of the --tm parameter and its associated value. Set this property to the argument value required.

property treeout

Guide tree is output to the input.tree file (True) or not (False, default)

This property controls the addition of the --treeout switch, treat this property as a boolean.

property weighti

Weighting factor for the consistency term calculated from pairwise alignments. Default: 2.7

This controls the addition of the --weighti parameter and its associated value. Set this property to the argument value required.
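Several of the switches above are normally combined to select one of MAFFT's named strategies (the G-INS-i, L-INS-i and E-INS-i methods mentioned in the references). The sketch below shows one way to keep those combinations in a table and build an argument list from them. The flag combinations are taken from the MAFFT manual as an assumption and should be checked against your installed version; MAFFT_STRATEGIES and build_mafft_args are hypothetical helpers, not part of Biopython.

```python
# Flag combinations assumed from the MAFFT manual for its named strategies.
MAFFT_STRATEGIES = {
    "L-INS-i": ["--localpair", "--maxiterate", "1000"],
    "G-INS-i": ["--globalpair", "--maxiterate", "1000"],
    "E-INS-i": ["--genafpair", "--maxiterate", "1000"],
}


def build_mafft_args(infile, strategy=None, thread=None, exe="mafft"):
    """Build an argument list with options first and the input file last."""
    args = [exe]
    if strategy is not None:
        args += MAFFT_STRATEGIES[strategy]
    if thread is not None:
        args += ["--thread", str(thread)]
    # MAFFT takes the input file as a positional argument after the options.
    args.append(infile)
    return args
```

With the wrapper class, the same L-INS-i selection would correspond to setting localpair=True and maxiterate=1000 together with the input property.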

class Bio.Align.Applications.DialignCommandline(cmd='dialign2-2', **kwargs)

Bases: Bio.Application.AbstractCommandline

Command line wrapper for the multiple alignment program DIALIGN2-2.

http://bibiserv.techfak.uni-bielefeld.de/dialign/welcome.html

Notes

Last checked against version: 2.2

References

B. Morgenstern (2004). DIALIGN: Multiple DNA and Protein Sequence Alignment at BiBiServ. Nucleic Acids Research 32, W33-W36.

Examples

To align a FASTA file (unaligned.fasta) with the output files names aligned.* including a FASTA output file (aligned.fa), use:

>>> from Bio.Align.Applications import DialignCommandline
>>> dialign_cline = DialignCommandline(input="unaligned.fasta",
...                                    fn="aligned", fa=True)
>>> print(dialign_cline)
dialign2-2 -fa -fn aligned unaligned.fasta

You would typically run the command line with dialign_cline() or via the Python subprocess module, as described in the Biopython tutorial.

__init__(self, cmd='dialign2-2', **kwargs)

Initialize the class.

property afc

Creates additional output file ‘*.afc’ containing data of all fragments considered for alignment WARNING: this file can be HUGE !

This property controls the addition of the -afc switch, treat this property as a boolean.

property afc_v

Like ‘-afc’ but verbose: fragments are explicitly printed. WARNING: this file can be EVEN BIGGER !

This property controls the addition of the -afc_v switch, treat this property as a boolean.

property anc

Anchored alignment. Requires a file <seq_file>.anc containing anchor points.

This property controls the addition of the -anc switch, treat this property as a boolean.

property cs

If segments are translated, not only the ‘Watson strand’ but also the ‘Crick strand’ is looked at.

This property controls the addition of the -cs switch, treat this property as a boolean.

property cw

Additional output file in CLUSTAL W format.

This property controls the addition of the -cw switch, treat this property as a boolean.

property ds

‘dna alignment speed up’ - non-translated nucleic acid fragments are taken into account only if they start with at least two matches. Speeds up DNA alignment at the expense of sensitivity.

This property controls the addition of the -ds switch, treat this property as a boolean.

property fa

Additional output file in FASTA format.

This property controls the addition of the -fa switch, treat this property as a boolean.

property ff

Creates file *.frg containing information about all fragments that are part of the respective optimal pairwise alignments plus information about consistency in the multiple alignment

This property controls the addition of the -ff switch, treat this property as a boolean.

property fn

Output files are named <out_file>.<extension>.

This controls the addition of the -fn parameter and its associated value. Set this property to the argument value required.

property fop

Creates file *.fop containing coordinates of all fragments that are part of the respective pairwise alignments.

This property controls the addition of the -fop switch, treat this property as a boolean.

property fsm

Creates file *.fsm containing coordinates of all fragments that are part of the final alignment

This property controls the addition of the -fsm switch, treat this property as a boolean.

property input

Input file name. Must be FASTA format

This controls the addition of the input parameter and its associated value. Set this property to the argument value required.

property iw

Overlap weights switched off (by default, overlap weights are used if up to 35 sequences are aligned). This option speeds up the alignment but may lead to reduced alignment quality.

This property controls the addition of the -iw switch, treat this property as a boolean.

property lgs

‘long genomic sequences’ - combines the following options: -ma, -thr 2, -lmax 30, -smin 8, -nta, -ff, -fop, -ff, -cs, -ds, -pst

This property controls the addition of the -lgs switch, treat this property as a boolean.

property lgs_t

Like ‘-lgs’ but with all segment pairs assessed at the peptide level (rather than ‘mixed alignments’ as with the ‘-lgs’ option). Therefore faster than -lgs but not very sensitive for non-coding regions.

This property controls the addition of the -lgs_t switch, treat this property as a boolean.

property lmax

Maximum fragment length = x (default: x = 40 or x = 120 for ‘translated’ fragments). Shorter x speeds up the program but may affect alignment quality.

This controls the addition of the -lmax parameter and its associated value. Set this property to the argument value required.

property lo

(Long Output) Additional file *.log with information about fragments selected for pairwise alignment and about consistency in multi-alignment procedure.

This property controls the addition of the -lo switch, treat this property as a boolean.

property ma

‘mixed alignments’ consisting of P-fragments and N-fragments if nucleic acid sequences are aligned.

This property controls the addition of the -ma switch, treat this property as a boolean.

property mask

Residues not belonging to selected fragments are replaced by ‘*’ characters in output alignment (rather than being printed in lower-case characters)

This property controls the addition of the -mask switch, treat this property as a boolean.

property mat

Creates file *.mat with substitution counts derived from the fragments that have been selected for alignment.

This property controls the addition of the -mat switch, treat this property as a boolean.

property mat_thr

Like ‘-mat’ but only fragments with weight score > t are considered

This property controls the addition of the -mat_thr switch, treat this property as a boolean.

property max_link

‘maximum linkage’ clustering used to construct sequence tree (instead of UPGMA).

This property controls the addition of the -max_link switch, treat this property as a boolean.

property min_link

‘minimum linkage’ clustering used.

This property controls the addition of the -min_link switch, treat this property as a boolean.

property mot

‘motif’ option.

This controls the addition of the -mot parameter and its associated value. Set this property to the argument value required.

property msf

Separate output file in MSF format.

This property controls the addition of the -msf switch, treat this property as a boolean.

property n

Input sequences are nucleic acid sequences. No translation of fragments.

This property controls the addition of the -n switch, treat this property as a boolean.

property nt

Input sequences are nucleic acid sequences and ‘nucleic acid segments’ are translated to ‘peptide segments’.

This property controls the addition of the -nt switch, treat this property as a boolean.

property nta

‘no textual alignment’ - textual alignment suppressed. This option makes sense if other output files are of interest – e.g. the fragment files created with -ff, -fop, -fsm or -lo.

This property controls the addition of the -nta switch, treat this property as a boolean.

property o

Fast version, resulting alignments may be slightly different.

This property controls the addition of the -o switch, treat this property as a boolean.

property ow

Overlap weights enforced (By default, overlap weights are used only if up to 35 sequences are aligned since calculating overlap weights is time consuming).

This property controls the addition of the -ow switch, treat this property as a boolean.

property pst

‘print status’. Creates and updates a file *.sta with information about the current status of the program run. This option is recommended if large data sets are aligned since it allows the user to estimate the remaining running time.

This property controls the addition of the -pst switch, treat this property as a boolean.

property smin

Minimum similarity value for first residue pair (or codon pair) in fragments. Speeds up protein alignment or alignment of translated DNA fragments at the expense of sensitivity.

This property controls the addition of the -smin switch, treat this property as a boolean.

property stars

Maximum number of ‘*’ characters indicating degree of local similarity among sequences. By default, no stars are used; numbers between 0 and 9 are printed instead.

This controls the addition of the -stars parameter and its associated value. Set this property to the argument value required.

property stdo

Results written to standard output.

This property controls the addition of the -stdo switch, treat this property as a boolean.

property ta

Standard textual alignment printed (overrides suppression of textual alignments in special options, e.g. -lgs)

This property controls the addition of the -ta switch, treat this property as a boolean.

property thr

Threshold T = x.

This controls the addition of the -thr parameter and its associated value. Set this property to the argument value required.

property xfr

‘exclude fragments’ - a list of fragments can be specified that are NOT considered for pairwise alignment

This property controls the addition of the -xfr switch, treat this property as a boolean.

class Bio.Align.Applications.ProbconsCommandline(cmd='probcons', **kwargs)

Bases: Bio.Application.AbstractCommandline

Command line wrapper for the multiple alignment program PROBCONS.

http://probcons.stanford.edu/

Notes

Last checked against version: 1.12

References

Do, C.B., Mahabhashyam, M.S.P., Brudno, M., and Batzoglou, S. 2005. PROBCONS: Probabilistic Consistency-based Multiple Sequence Alignment. Genome Research 15: 330-340.

Examples

To align a FASTA file (unaligned.fasta) with the output in ClustalW format, and otherwise default settings, use:

>>> from Bio.Align.Applications import ProbconsCommandline
>>> probcons_cline = ProbconsCommandline(input="unaligned.fasta",
...                                      clustalw=True)
>>> print(probcons_cline)
probcons -clustalw unaligned.fasta

You would typically run the command line with probcons_cline() or via the Python subprocess module, as described in the Biopython tutorial.
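
For the subprocess route, a minimal sketch using only the standard library might look like the following. The helper names build_probcons_args and run_probcons are hypothetical (they are not part of Biopython), and the sketch assumes an executable named probcons is available on the PATH. The argument order mirrors the command line printed in the example.

```python
import shutil
import subprocess


def build_probcons_args(input_file, clustalw=False, exe="probcons"):
    """Return an argument list matching the printed command line (hypothetical helper)."""
    args = [exe]
    if clustalw:
        args.append("-clustalw")  # CLUSTALW output format instead of MFA
    args.append(input_file)  # the input file is passed as a positional argument
    return args


def run_probcons(input_file, clustalw=False, exe="probcons"):
    """Run PROBCONS and return the alignment it writes to standard output."""
    if shutil.which(exe) is None:
        raise FileNotFoundError(f"{exe} was not found on PATH")
    result = subprocess.run(
        build_probcons_args(input_file, clustalw, exe),
        capture_output=True,  # PROBCONS writes the alignment to stdout
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout


print(build_probcons_args("unaligned.fasta", clustalw=True))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.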

Note that PROBCONS will write the alignment to stdout, which you may want to save to a file and then parse, e.g.:

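from Bio import AlignIO

stdout, stderr = probcons_cline()
# Save the alignment that PROBCONS wrote to stdout
with open("aligned.aln", "w") as handle:
    handle.write(stdout)
# Read the same file back; AlignIO's name for the ClustalW format is "clustal"
align = AlignIO.read("aligned.aln", "clustal")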

Alternatively, to parse the output with AlignIO directly you can use StringIO to turn the string into a handle:

from io import StringIO
from Bio import AlignIO

stdout, stderr = probcons_cline()
# Wrap the captured string in a file-like handle for AlignIO
align = AlignIO.read(StringIO(stdout), "clustal")

__init__(self, cmd='probcons', **kwargs)

Initialize the class.

property a

Print sequences in alignment order rather than input order (default: off)

This property controls the addition of the -a switch, treat this property as a boolean.

property annot

Write annotation for multiple alignment to FILENAME

This controls the addition of the -annot parameter and its associated value. Set this property to the argument value required.

property clustalw

Use CLUSTALW output format instead of MFA

This property controls the addition of the -clustalw switch, treat this property as a boolean.

property consistency

Use 0 <= REPS <= 5 (default: 2) passes of consistency transformation

This controls the addition of the -c parameter and its associated value. Set this property to the argument value required.

property emissions

Also reestimate emission probabilities (default: off)

This property controls the addition of the -e switch, treat this property as a boolean.

property input

Input file name. Must be in multiple FASTA alignment (MFA) format

This controls the addition of the input parameter and its associated value. Set this property to the argument value required.

property ir

Use 0 <= REPS <= 1000 (default: 100) passes of iterative-refinement

This controls the addition of the -ir parameter and its associated value. Set this property to the argument value required.

property pairs

Generate all-pairs pairwise alignments

This property controls the addition of the -pairs switch, treat this property as a boolean.

property paramfile

Read parameters from FILENAME

This controls the addition of the -p parameter and its associated value. Set this property to the argument value required.

property pre

Use 0 <= REPS <= 20 (default: 0) rounds of pretraining

This controls the addition of the -pre parameter and its associated value. Set this property to the argument value required.

property train

Compute EM transition probabilities, store in FILENAME (default: no training)

This controls the addition of the -t parameter and its associated value. Set this property to the argument value required.

property verbose

Report progress while aligning (default: off)

This property controls the addition of the -verbose switch, treat this property as a boolean.

property viterbi

Use Viterbi algorithm to generate all pairs (automatically enables -pairs)

This property controls the addition of the -viterbi switch, treat this property as a boolean.
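
The REPS ranges documented for the consistency, ir and pre properties can be checked before a command line is built. The sketch below is an illustration only: the helper names check_reps and reps_args are hypothetical, and the wrapper's own validation may be looser or stricter than this. The bounds and flags are copied from the property descriptions above.

```python
# (flag, minimum, maximum) for each REPS-style option, taken from the
# property descriptions for ProbconsCommandline.
REPS_OPTIONS = {
    "consistency": ("-c", 0, 5),
    "ir": ("-ir", 0, 1000),
    "pre": ("-pre", 0, 20),
}


def check_reps(name, value):
    """Return value if it is an integer inside the documented range."""
    _flag, low, high = REPS_OPTIONS[name]
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"{name} must be an integer, got {value!r}")
    if not low <= value <= high:
        raise ValueError(f"{name} must satisfy {low} <= REPS <= {high}, got {value}")
    return value


def reps_args(**settings):
    """Turn validated settings into command line arguments (hypothetical helper)."""
    args = []
    for name, value in settings.items():
        flag = REPS_OPTIONS[name][0]
        args += [flag, str(check_reps(name, value))]
    return args


print(reps_args(consistency=2, ir=100))
```

Validating early gives a Python exception with a clear message instead of an error from the external program.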

class Bio.Align.Applications.TCoffeeCommandline(cmd='t_coffee', **kwargs)

Bases: Bio.Application.AbstractCommandline

Commandline object for the TCoffee alignment program.

http://www.tcoffee.org/Projects_home_page/t_coffee_home_page.html

The T-Coffee command line tool has a lot of switches and options. This wrapper implements a VERY limited number of options - if you would like to help improve it please get in touch.

Notes

Last checked against: Version_6.92

References

Notredame, C., Higgins, D.G., and Heringa, J. 2000. T-Coffee: A novel method for fast and accurate multiple sequence alignment. Journal of Molecular Biology 302: 205-217.

Examples

To align a FASTA file (unaligned.fasta) with the output in ClustalW format (file aligned.aln), and otherwise default settings, use:

>>> from Bio.Align.Applications import TCoffeeCommandline
>>> tcoffee_cline = TCoffeeCommandline(infile="unaligned.fasta",
...                                    output="clustalw",
...                                    outfile="aligned.aln")
>>> print(tcoffee_cline)
t_coffee -output clustalw -infile unaligned.fasta -outfile aligned.aln

You would typically run the command line with tcoffee_cline() or via the Python subprocess module, as described in the Biopython tutorial.
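
As a rough sketch of the subprocess route, the same command can be assembled as an argument list with the standard library alone. The helpers build_tcoffee_args and run_tcoffee are hypothetical names, not Biopython functions; the sequence type check reuses the values of the SEQ_TYPES attribute documented for this class, and the position of -type in the argument list is an assumption.

```python
import subprocess

# Values of the SEQ_TYPES attribute documented for TCoffeeCommandline.
SEQ_TYPES = ["dna", "protein", "dna_protein"]


def build_tcoffee_args(infile, output=None, outfile=None, seq_type=None,
                       exe="t_coffee"):
    """Return an argument list in the order shown in the printed example."""
    args = [exe]
    if output is not None:
        args += ["-output", output]
    args += ["-infile", infile]
    if outfile is not None:
        args += ["-outfile", outfile]
    if seq_type is not None:
        if seq_type not in SEQ_TYPES:
            raise ValueError(f"type must be one of {SEQ_TYPES}, got {seq_type!r}")
        args += ["-type", seq_type]  # placement of -type is an assumption
    return args


def run_tcoffee(args):
    """Run T-Coffee; raises CalledProcessError on a non-zero exit status."""
    return subprocess.run(args, capture_output=True, text=True, check=True)


print(build_tcoffee_args("unaligned.fasta", output="clustalw",
                         outfile="aligned.aln"))
```

run_tcoffee is deliberately not called here, since it requires a t_coffee executable to be installed.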

SEQ_TYPES = ['dna', 'protein', 'dna_protein']

__init__(self, cmd='t_coffee', **kwargs)

Initialize the class.

property convert

Specify you want to perform a file conversion

This property controls the addition of the -convert switch, treat this property as a boolean.

property gapext

Indicates the penalty applied for extending a gap (negative integer)

This controls the addition of the -gapext parameter and its associated value. Set this property to the argument value required.

property gapopen

Indicates the penalty applied for opening a gap (negative integer)

This controls the addition of the -gapopen parameter and its associated value. Set this property to the argument value required.

property infile

Specify the input file.

This controls the addition of the -infile parameter and its associated value. Set this property to the argument value required.

property matrix

Specify the filename of the substitution matrix to use. Default: blosum62mt

This controls the addition of the -matrix parameter and its associated value. Set this property to the argument value required.

property mode

Specifies a special mode: genome, quickaln, dali, 3dcoffee

This controls the addition of the -mode parameter and its associated value. Set this property to the argument value required.

property outfile

Specify the output file. Default: <your sequences>.aln

This controls the addition of the -outfile parameter and its associated value. Set this property to the argument value required.

property outorder

Specify the order of sequences in the output. Either ‘input’, ‘aligned’ or the <filename> of a FASTA file giving the sequence order

This controls the addition of the -outorder parameter and its associated value. Set this property to the argument value required.

property output

Specify the output type.

One (or more separated by a comma) of: ‘clustalw_aln’, ‘clustalw’, ‘gcg’, ‘msf_aln’, ‘pir_aln’, ‘fasta_aln’, ‘phylip’, ‘pir_seq’, ‘fasta_seq’

Note that, of these, Biopython’s AlignIO module will only read clustalw, pir, and fasta.

This controls the addition of the -output parameter and its associated value. Set this property to the argument value required.
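
Because several output types can be requested at once, it can help to build the comma-separated value programmatically and to record which results AlignIO is expected to read. The sketch below is illustrative: output_value and alignio_name are hypothetical helpers, and the mapping from T-Coffee output names to AlignIO format names is an assumption based on the note above, not something taken from T-Coffee's documentation.

```python
# Output names quoted from the property description above.
TCOFFEE_OUTPUTS = [
    "clustalw_aln", "clustalw", "gcg", "msf_aln", "pir_aln",
    "fasta_aln", "phylip", "pir_seq", "fasta_seq",
]

# ASSUMPTION: which AlignIO format name corresponds to which T-Coffee name.
ALIGNIO_NAMES = {"clustalw": "clustal", "pir_aln": "pir", "fasta_aln": "fasta"}


def output_value(formats):
    """Join one or more output names into a single -output value."""
    unknown = [name for name in formats if name not in TCOFFEE_OUTPUTS]
    if unknown:
        raise ValueError(f"unrecognised output type(s): {unknown}")
    return ",".join(formats)


def alignio_name(tcoffee_name):
    """Return the assumed AlignIO format name, or None if not mapped."""
    return ALIGNIO_NAMES.get(tcoffee_name)


print(output_value(["clustalw", "fasta_aln"]))
print(alignio_name("clustalw"))
```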

property quiet

Turn off log output

This property controls the addition of the -quiet switch, treat this property as a boolean.

property type

Specify the type of sequence being aligned

This controls the addition of the -type parameter and its associated value. Set this property to the argument value required.

class Bio.Align.Applications.MSAProbsCommandline(cmd='msaprobs', **kwargs)

Bases: Bio.Application.AbstractCommandline

Command line wrapper for MSAProbs.

http://msaprobs.sourceforge.net

Notes

Last checked against version: 0.9.7

References

Yongchao Liu, Bertil Schmidt, Douglas L. Maskell: “MSAProbs: multiple sequence alignment based on pair hidden Markov models and partition function posterior probabilities”. Bioinformatics, 2010, 26(16): 1958-1964.

Examples

>>> from Bio.Align.Applications import MSAProbsCommandline
>>> in_file = "unaligned.fasta"
>>> out_file = "aligned.cla"
>>> cline = MSAProbsCommandline(infile=in_file, outfile=out_file, clustalw=True)
>>> print(cline)
msaprobs -o aligned.cla -clustalw unaligned.fasta

You would typically run the command line with cline() or via the Python subprocess module, as described in the Biopython tutorial.
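
If the subprocess module is used instead, the equivalent argument list can be assembled by hand. The following is a sketch under stated assumptions: build_msaprobs_args is a hypothetical helper, the position of -num_threads relative to the other arguments is a guess, and shlex.join requires Python 3.8 or later. The flags themselves come from the property descriptions for this class.

```python
import shlex


def build_msaprobs_args(infile, outfile=None, clustalw=False,
                        numthreads=None, exe="msaprobs"):
    """Return an argument list for MSAProbs (hypothetical helper)."""
    args = [exe]
    if outfile is not None:
        args += ["-o", outfile]  # output goes to STDOUT when omitted
    if numthreads is not None:
        args += ["-num_threads", str(int(numthreads))]  # position is assumed
    if clustalw:
        args.append("-clustalw")  # CLUSTALW output instead of FASTA
    args.append(infile)  # the input file is a positional argument
    return args


args = build_msaprobs_args("unaligned.fasta", outfile="aligned.cla",
                           clustalw=True)
# shlex.join renders the list as a shell-safe string for logging
print(shlex.join(args))
```

The list form is what would be handed to subprocess.run; the joined string is only for display.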

__init__(self, cmd='msaprobs', **kwargs)

Initialize the class.

property alignment_order

Print sequences in alignment order rather than input order (default: off)

This property controls the addition of the -a switch, treat this property as a boolean.

property annot

Write annotation for multiple alignment to FILENAME

This controls the addition of the -annot parameter and its associated value. Set this property to the argument value required.

property clustalw

Use CLUSTALW output format instead of FASTA format

This property controls the addition of the -clustalw switch, treat this property as a boolean.

property consistency

Use 0 <= REPS <= 5 (default: 2) passes of consistency transformation

This controls the addition of the -c parameter and its associated value. Set this property to the argument value required.

property infile

Multiple sequence input file

This controls the addition of the infile parameter and its associated value. Set this property to the argument value required.

property iterative_refinement

Use 0 <= REPS <= 1000 (default: 10) passes of iterative-refinement

This controls the addition of the -ir parameter and its associated value. Set this property to the argument value required.

property numthreads

Specify the number of threads to use; if not set, the number is detected automatically

This controls the addition of the -num_threads parameter and its associated value. Set this property to the argument value required.

property outfile

Specify the output file name (STDOUT by default)

This controls the addition of the -o parameter and its associated value. Set this property to the argument value required.

property verbose

Report progress while aligning (default: off)

This property controls the addition of the -v switch, treat this property as a boolean.

property version

Print out version of MSAPROBS

This controls the addition of the -version parameter and its associated value. Set this property to the argument value required.