Bio.Align.Applications package
Module contents
Alignment command line tool wrappers (OBSOLETE).
This module is scheduled for removal in a future release. We recommend building your command line yourself and invoking it directly via the subprocess module instead.
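As a minimal sketch of the subprocess approach recommended above, the command can be held as a list of arguments and passed to subprocess.run. The helper name run_command is hypothetical, and the Python interpreter stands in for the alignment tool so that the sketch runs without any aligner installed.

```python
import subprocess
import sys


def run_command(args):
    """Run an external command given as a list of arguments; return its stdout."""
    completed = subprocess.run(args, check=True, capture_output=True, text=True)
    return completed.stdout


# Stand-in command so the sketch runs anywhere. For a real alignment, replace
# it with a list such as ["muscle", "-in", "unaligned.fasta", "-out", "aligned.fasta"].
demo_args = [sys.executable, "-c", "print('alignment tool would run here')"]
demo_output = run_command(demo_args)
print(demo_output)
```

With check=True, a non-zero exit status raises subprocess.CalledProcessError, so a failed alignment is not silently ignored.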
- class Bio.Align.Applications.MuscleCommandline(cmd='muscle', **kwargs)
Bases: AbstractCommandline
Command line wrapper for the multiple alignment program MUSCLE.
Notes
Last checked against version: 3.7, briefly against 3.8
References
Edgar, Robert C. (2004), MUSCLE: multiple sequence alignment with high accuracy and high throughput, Nucleic Acids Research 32(5), 1792-97.
Edgar, R.C. (2004) MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics 5(1): 113.
Examples
>>> from Bio.Align.Applications import MuscleCommandline
>>> muscle_exe = r"C:\Program Files\Alignments\muscle3.8.31_i86win32.exe"
>>> in_file = r"C:\My Documents\unaligned.fasta"
>>> out_file = r"C:\My Documents\aligned.fasta"
>>> muscle_cline = MuscleCommandline(muscle_exe, input=in_file, out=out_file)
>>> print(muscle_cline)
"C:\Program Files\Alignments\muscle3.8.31_i86win32.exe" -in "C:\My Documents\unaligned.fasta" -out "C:\My Documents\aligned.fasta"
You would typically run the command line with muscle_cline() or via the Python subprocess module, as described in the Biopython tutorial.
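The same MUSCLE command can be assembled without the wrapper. The sketch below is an illustration only: muscle_args and run_muscle are hypothetical helper names, the file names are placeholders, and the -in and -out flags follow the MUSCLE 3.x syntax documented here, so check the help output of your installed version before relying on them.

```python
import shutil
import subprocess


def muscle_args(in_file, out_file, exe="muscle"):
    """Build a MUSCLE 3.x style argument list (-in / -out)."""
    return [exe, "-in", in_file, "-out", out_file]


def run_muscle(args):
    """Run MUSCLE, raising a clear error if the executable is missing."""
    if shutil.which(args[0]) is None:
        raise FileNotFoundError(f"executable not found: {args[0]}")
    subprocess.run(args, check=True)


# File names are placeholders; run_muscle(args) would perform the alignment.
args = muscle_args("unaligned.fasta", "aligned.fasta")
print(args)
```

Passing the arguments as a list means paths containing spaces, such as the Windows paths in the example above, need no manual quoting.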
- __init__(cmd='muscle', **kwargs)
Initialize the class.
- property anchors
Use anchor optimisation in tree dependent refinement iterations
This property controls the addition of the -anchors switch, treat this property as a boolean.
- property anchorspacing
Minimum spacing between anchor columns
This controls the addition of the -anchorspacing parameter and its associated value. Set this property to the argument value required.
- property brenner
Use Steve Brenner’s root alignment method
This property controls the addition of the -brenner switch, treat this property as a boolean.
- property center
Center parameter - should be negative
This controls the addition of the -center parameter and its associated value. Set this property to the argument value required.
- property cluster
Perform fast clustering of input sequences, use -tree1 to save tree
This property controls the addition of the -cluster switch, treat this property as a boolean.
- property cluster1
Clustering method used in iteration 1
This controls the addition of the -cluster1 parameter and its associated value. Set this property to the argument value required.
- property cluster2
Clustering method used in iteration 2
This controls the addition of the -cluster2 parameter and its associated value. Set this property to the argument value required.
- property clw
Write output in CLUSTALW format (with a MUSCLE header)
This property controls the addition of the -clw switch, treat this property as a boolean.
- property clwout
Write CLUSTALW output (with MUSCLE header) to specified filename
This controls the addition of the -clwout parameter and its associated value. Set this property to the argument value required.
- property clwstrict
Write output in CLUSTALW format with version 1.81 header
This property controls the addition of the -clwstrict switch, treat this property as a boolean.
- property clwstrictout
Write CLUSTALW output (with version 1.81 header) to specified filename
This controls the addition of the -clwstrictout parameter and its associated value. Set this property to the argument value required.
- property core
Do not catch exceptions
This property controls the addition of the -core switch, treat this property as a boolean.
- property diagbreak
Maximum distance between two diagonals that allows them to merge into one diagonal
This controls the addition of the -diagbreak parameter and its associated value. Set this property to the argument value required.
- property diaglength
Minimum length of diagonal
This controls the addition of the -diaglength parameter and its associated value. Set this property to the argument value required.
- property diagmargin
Discard this many positions at ends of diagonal
This controls the addition of the -diagmargin parameter and its associated value. Set this property to the argument value required.
- property diags
Find diagonals (faster for similar sequences)
This property controls the addition of the -diags switch, treat this property as a boolean.
- property dimer
Use faster (slightly less accurate) dimer approximation for the SP score
This property controls the addition of the -dimer switch, treat this property as a boolean.
- property distance1
Distance measure for iteration 1
This controls the addition of the -distance1 parameter and its associated value. Set this property to the argument value required.
- property distance2
Distance measure for iteration 2
This controls the addition of the -distance2 parameter and its associated value. Set this property to the argument value required.
- property fasta
Write output in FASTA format
This property controls the addition of the -fasta switch, treat this property as a boolean.
- property fastaout
Write FASTA format output to specified filename
This controls the addition of the -fastaout parameter and its associated value. Set this property to the argument value required.
- property gapextend
Gap extension penalty
This controls the addition of the -gapextend parameter and its associated value. Set this property to the argument value required.
- property gapopen
Gap open score - negative number
This controls the addition of the -gapopen parameter and its associated value. Set this property to the argument value required.
- property group
Group similar sequences in output
This property controls the addition of the -group switch, treat this property as a boolean.
- property html
Write output in HTML format
This property controls the addition of the -html switch, treat this property as a boolean.
- property htmlout
Write HTML output to specified filename
This controls the addition of the -htmlout parameter and its associated value. Set this property to the argument value required.
- property hydro
Window size for hydrophobic region
This controls the addition of the -hydro parameter and its associated value. Set this property to the argument value required.
- property hydrofactor
Multiplier for gap penalties in hydrophobic regions
This controls the addition of the -hydrofactor parameter and its associated value. Set this property to the argument value required.
- property in1
First input filename for profile alignment
This controls the addition of the -in1 parameter and its associated value. Set this property to the argument value required.
- property in2
Second input filename for a profile alignment
This controls the addition of the -in2 parameter and its associated value. Set this property to the argument value required.
- property input
Input filename
This controls the addition of the -in parameter and its associated value. Set this property to the argument value required.
- property le
Use log-expectation profile score (VTML240)
This property controls the addition of the -le switch, treat this property as a boolean.
- property log
Log file name
This controls the addition of the -log parameter and its associated value. Set this property to the argument value required.
- property loga
Log file name (append to existing file)
This controls the addition of the -loga parameter and its associated value. Set this property to the argument value required.
- property matrix
Path to NCBI or WU-BLAST format protein substitution matrix. When using this option, also set -gapopen, -gapextend and -center.
This controls the addition of the -matrix parameter and its associated value. Set this property to the argument value required.
- property maxdiagbreak
Deprecated in v3.8, use -diagbreak instead.
This controls the addition of the -maxdiagbreak parameter and its associated value. Set this property to the argument value required.
- property maxhours
Maximum time to run in hours
This controls the addition of the -maxhours parameter and its associated value. Set this property to the argument value required.
- property maxiters
Maximum number of iterations
This controls the addition of the -maxiters parameter and its associated value. Set this property to the argument value required.
- property maxtrees
Maximum number of trees to build in iteration 2
This controls the addition of the -maxtrees parameter and its associated value. Set this property to the argument value required.
- property minbestcolscore
Minimum score a column must have to be an anchor
This controls the addition of the -minbestcolscore parameter and its associated value. Set this property to the argument value required.
- property minsmoothscore
Minimum smoothed score a column must have to be an anchor
This controls the addition of the -minsmoothscore parameter and its associated value. Set this property to the argument value required.
- property msf
Write output in MSF format
This property controls the addition of the -msf switch, treat this property as a boolean.
- property msfout
Write MSF format output to specified filename
This controls the addition of the -msfout parameter and its associated value. Set this property to the argument value required.
- property noanchors
Do not use anchor optimisation in tree dependent refinement iterations
This property controls the addition of the -noanchors switch, treat this property as a boolean.
- property nocore
Catch exceptions
This property controls the addition of the -nocore switch, treat this property as a boolean.
- property objscore
Objective score used by tree dependent refinement
This controls the addition of the -objscore parameter and its associated value. Set this property to the argument value required.
- property out
Output filename
This controls the addition of the -out parameter and its associated value. Set this property to the argument value required.
- property phyi
Write output in PHYLIP interleaved format
This property controls the addition of the -phyi switch, treat this property as a boolean.
- property phyiout
Write PHYLIP interleaved output to specified filename
This controls the addition of the -phyiout parameter and its associated value. Set this property to the argument value required.
- property phys
Write output in PHYLIP sequential format
This property controls the addition of the -phys switch, treat this property as a boolean.
- property physout
Write PHYLIP sequential format to specified filename
This controls the addition of the -physout parameter and its associated value. Set this property to the argument value required.
- property profile
Perform a profile alignment
This property controls the addition of the -profile switch, treat this property as a boolean.
- property quiet
Do not display progress messages
This property controls the addition of the -quiet switch, treat this property as a boolean.
- property refine
Only do tree dependent refinement
This property controls the addition of the -refine switch, treat this property as a boolean.
- property refinew
Only do tree dependent refinement using sliding window approach
This property controls the addition of the -refinew switch, treat this property as a boolean.
- property refinewindow
Length of window for -refinew
This controls the addition of the -refinewindow parameter and its associated value. Set this property to the argument value required.
- property root1
Method used to root tree in iteration 1
This controls the addition of the -root1 parameter and its associated value. Set this property to the argument value required.
- property root2
Method used to root tree in iteration 2
This controls the addition of the -root2 parameter and its associated value. Set this property to the argument value required.
- property scorefile
Score file name, contains one line for each column in the alignment with average BLOSUM62 score
This controls the addition of the -scorefile parameter and its associated value. Set this property to the argument value required.
- property seqtype
Sequence type
This controls the addition of the -seqtype parameter and its associated value. Set this property to the argument value required.
- property smoothscoreceil
Maximum value of column score for smoothing
This controls the addition of the -smoothscoreceil parameter and its associated value. Set this property to the argument value required.
- property smoothwindow
Window used for anchor column smoothing
This controls the addition of the -smoothwindow parameter and its associated value. Set this property to the argument value required.
- property sp
Use sum-of-pairs protein profile score (PAM200)
This property controls the addition of the -sp switch, treat this property as a boolean.
- property spn
Use sum-of-pairs nucleotide profile score
This property controls the addition of the -spn switch, treat this property as a boolean.
- property spscore
Compute SP objective score of multiple alignment
This controls the addition of the -spscore parameter and its associated value. Set this property to the argument value required.
- property stable
Do not group similar sequences in output (not supported in v3.8)
This property controls the addition of the -stable switch, treat this property as a boolean.
- property sueff
Constant used in UPGMB clustering
This controls the addition of the -sueff parameter and its associated value. Set this property to the argument value required.
- property sv
Use sum-of-pairs profile score (VTML240)
This property controls the addition of the -sv switch, treat this property as a boolean.
- property tree1
Save Newick tree from iteration 1
This controls the addition of the -tree1 parameter and its associated value. Set this property to the argument value required.
- property tree2
Save Newick tree from iteration 2
This controls the addition of the -tree2 parameter and its associated value. Set this property to the argument value required.
- property usetree
Use given Newick tree as guide tree
This controls the addition of the -usetree parameter and its associated value. Set this property to the argument value required.
- property verbose
Write parameter settings and progress
This property controls the addition of the -verbose switch, treat this property as a boolean.
- property version
Write version string to stdout and exit
This property controls the addition of the -version switch, treat this property as a boolean.
- property weight1
Weighting scheme used in iteration 1
This controls the addition of the -weight1 parameter and its associated value. Set this property to the argument value required.
- property weight2
Weighting scheme used in iteration 2
This controls the addition of the -weight2 parameter and its associated value. Set this property to the argument value required.
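The properties above come in two kinds: boolean switches, which add only the flag, and parameters, which add the flag followed by a value. The sketch below mirrors that convention in plain Python. The helper to_args is hypothetical and not part of Biopython, and the option values are placeholders.

```python
def to_args(exe, options):
    """Turn a mapping of option names to values into an argument list.

    True adds the bare switch, False or None omits the option entirely,
    and any other value adds the flag followed by the value as a string.
    """
    args = [exe]
    for name, value in options.items():
        if value is None or value is False:
            continue
        args.append(f"-{name}")
        if value is not True:
            args.append(str(value))
    return args


args = to_args(
    "muscle",
    {
        "in": "unaligned.fasta",   # parameter: flag plus value
        "out": "aligned.fasta",    # parameter: flag plus value
        "diags": True,             # switch: flag only
        "maxiters": 2,             # parameter with a numeric value
        "quiet": False,            # disabled switch: omitted
    },
)
print(args)
```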
- class Bio.Align.Applications.ClustalwCommandline(cmd='clustalw', **kwargs)
Bases: AbstractCommandline
Command line wrapper for clustalw (version one or two).
Notes
Last checked against versions: 1.83 and 2.1
References
Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R, Thompson JD, Gibson TJ, Higgins DG. (2007). Clustal W and Clustal X version 2.0. Bioinformatics, 23, 2947-2948.
Examples
>>> from Bio.Align.Applications import ClustalwCommandline
>>> in_file = "unaligned.fasta"
>>> clustalw_cline = ClustalwCommandline("clustalw2", infile=in_file)
>>> print(clustalw_cline)
clustalw2 -infile=unaligned.fasta
You would typically run the command line with clustalw_cline() or via the Python subprocess module, as described in the Biopython tutorial.
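ClustalW writes parameters as -name=value, as the printed command above shows. A hedged sketch of building such a command by hand follows. The helper clustalw_args is hypothetical, the file names are placeholders, and the -align, -outfile and -output options are the ones listed among the properties of this class.

```python
def clustalw_args(in_file, out_file, exe="clustalw2", output="FASTA"):
    """Build a ClustalW argument list using the -name=value convention."""
    return [
        exe,
        "-align",                 # switch: do a full multiple alignment
        f"-infile={in_file}",     # parameter joined to its value with '='
        f"-outfile={out_file}",
        f"-output={output}",      # e.g. CLUSTAL, PHYLIP, NEXUS or FASTA
    ]


args = clustalw_args("unaligned.fasta", "aligned.fasta")
print(args)
```

The resulting list can be passed to subprocess.run(args, check=True) once the clustalw2 executable is available on the PATH.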
- __init__(cmd='clustalw', **kwargs)
Initialize the class.
- property align
Do full multiple alignment.
This property controls the addition of the -align switch, treat this property as a boolean.
- property bootlabels
Node OR branch position of bootstrap values in tree display
This controls the addition of the -bootlabels parameter and its associated value. Set this property to the argument value required.
- property bootstrap
Bootstrap a NJ tree (n= number of bootstraps; def. = 1000).
This controls the addition of the -bootstrap parameter and its associated value. Set this property to the argument value required.
- property case
LOWER or UPPER (for GDE output only)
This controls the addition of the -case parameter and its associated value. Set this property to the argument value required.
- property check
Outline the command line params.
This property controls the addition of the -check switch, treat this property as a boolean.
- property clustering
NJ or UPGMA
This controls the addition of the -clustering parameter and its associated value. Set this property to the argument value required.
- property convert
Output the input sequences in a different file format.
This property controls the addition of the -convert switch, treat this property as a boolean.
- property dnamatrix
DNA weight matrix=IUB, CLUSTALW or filename
This controls the addition of the -dnamatrix parameter and its associated value. Set this property to the argument value required.
- property endgaps
No end gap separation pen.
This property controls the addition of the -endgaps switch, treat this property as a boolean.
- property fullhelp
Output full help content.
This property controls the addition of the -fullhelp switch, treat this property as a boolean.
- property gapdist
Gap separation pen. range
This controls the addition of the -gapdist parameter and its associated value. Set this property to the argument value required.
- property gapext
Gap extension penalty
This controls the addition of the -gapext parameter and its associated value. Set this property to the argument value required.
- property gapopen
Gap opening penalty
This controls the addition of the -gapopen parameter and its associated value. Set this property to the argument value required.
- property helixendin
Number of residues inside helix to be treated as terminal
This controls the addition of the -helixendin parameter and its associated value. Set this property to the argument value required.
- property helixendout
Number of residues outside helix to be treated as terminal
This controls the addition of the -helixendout parameter and its associated value. Set this property to the argument value required.
- property helixgap
Gap penalty for helix core residues
This controls the addition of the -helixgap parameter and its associated value. Set this property to the argument value required.
- property help
Outline the command line params.
This property controls the addition of the -help switch, treat this property as a boolean.
- property hgapresidues
List hydrophilic res.
This property controls the addition of the -hgapresidues switch, treat this property as a boolean.
- property infile
Input sequences.
This controls the addition of the -infile parameter and its associated value. Set this property to the argument value required.
- property iteration
NONE or TREE or ALIGNMENT
This controls the addition of the -iteration parameter and its associated value. Set this property to the argument value required.
- property kimura
Use Kimura’s correction.
This property controls the addition of the -kimura switch, treat this property as a boolean.
- property ktuple
Word size
This controls the addition of the -ktuple parameter and its associated value. Set this property to the argument value required.
- property loopgap
Gap penalty for loop regions
This controls the addition of the -loopgap parameter and its associated value. Set this property to the argument value required.
- property matrix
Protein weight matrix=BLOSUM, PAM, GONNET, ID or filename
This controls the addition of the -matrix parameter and its associated value. Set this property to the argument value required.
- property maxdiv
% ident. for delay
This controls the addition of the -maxdiv parameter and its associated value. Set this property to the argument value required.
- property maxseqlen
Maximum allowed input sequence length
This controls the addition of the -maxseqlen parameter and its associated value. Set this property to the argument value required.
- property negative
Protein alignment with negative values in matrix
This property controls the addition of the -negative switch, treat this property as a boolean.
- property newtree
Output file name for newly created guide tree
This controls the addition of the -newtree parameter and its associated value. Set this property to the argument value required.
- property newtree1
Output file name for new guide tree of profile1
This controls the addition of the -newtree1 parameter and its associated value. Set this property to the argument value required.
- property newtree2
Output file for new guide tree of profile2
This controls the addition of the -newtree2 parameter and its associated value. Set this property to the argument value required.
- property nohgap
Hydrophilic gaps off
This property controls the addition of the -nohgap switch, treat this property as a boolean.
- property nopgap
Residue-specific gaps off
This property controls the addition of the -nopgap switch, treat this property as a boolean.
- property nosecstr1
Do not use secondary structure-gap penalty mask for profile 1
This property controls the addition of the -nosecstr1 switch, treat this property as a boolean.
- property nosecstr2
Do not use secondary structure-gap penalty mask for profile 2
This property controls the addition of the -nosecstr2 switch, treat this property as a boolean.
- property noweights
Disable sequence weighting
This property controls the addition of the -noweights switch, treat this property as a boolean.
- property numiter
Maximum number of iterations to perform
This controls the addition of the -numiter parameter and its associated value. Set this property to the argument value required.
- property options
List the command line parameters
This property controls the addition of the -options switch, treat this property as a boolean.
- property outfile
Output sequence alignment file name
This controls the addition of the -outfile parameter and its associated value. Set this property to the argument value required.
- property outorder
Output taxon order: INPUT or ALIGNED
This controls the addition of the -outorder parameter and its associated value. Set this property to the argument value required.
- property output
Output format: CLUSTAL(default), GCG, GDE, PHYLIP, PIR, NEXUS and FASTA
This controls the addition of the -output parameter and its associated value. Set this property to the argument value required.
- property outputtree
nj OR phylip OR dist OR nexus
This controls the addition of the -outputtree parameter and its associated value. Set this property to the argument value required.
- property pairgap
Gap penalty
This controls the addition of the -pairgap parameter and its associated value. Set this property to the argument value required.
- property pim
Output percent identity matrix (while calculating the tree).
This property controls the addition of the -pim switch, treat this property as a boolean.
- property profile
Merge two alignments by profile alignment
This property controls the addition of the -profile switch, treat this property as a boolean.
- property profile1
Profiles (old alignment).
This controls the addition of the -profile1 parameter and its associated value. Set this property to the argument value required.
- property profile2
Profiles (old alignment).
This controls the addition of the -profile2 parameter and its associated value. Set this property to the argument value required.
- property pwdnamatrix
DNA weight matrix=IUB, CLUSTALW or filename
This controls the addition of the -pwdnamatrix parameter and its associated value. Set this property to the argument value required.
- property pwgapext
Gap extension penalty
This controls the addition of the -pwgapext parameter and its associated value. Set this property to the argument value required.
- property pwgapopen
Gap opening penalty
This controls the addition of the -pwgapopen parameter and its associated value. Set this property to the argument value required.
- property pwmatrix
Protein weight matrix=BLOSUM, PAM, GONNET, ID or filename
This controls the addition of the -pwmatrix parameter and its associated value. Set this property to the argument value required.
- property quicktree
Use FAST algorithm for the alignment guide tree
This property controls the addition of the -quicktree switch, treat this property as a boolean.
- property quiet
Reduce console output to minimum
This property controls the addition of the -quiet switch, treat this property as a boolean.
- property range
Sequence range to write, starting at m and ending at m+n. Input as a string, e.g. ‘24,200’
This controls the addition of the -range parameter and its associated value. Set this property to the argument value required.
- property score
Either: PERCENT or ABSOLUTE
This controls the addition of the -score parameter and its associated value. Set this property to the argument value required.
- property secstrout
STRUCTURE or MASK or BOTH or NONE output in alignment file
This controls the addition of the -secstrout parameter and its associated value. Set this property to the argument value required.
- property seed
Seed number for bootstraps.
This controls the addition of the -seed parameter and its associated value. Set this property to the argument value required.
- property seqno_range
OFF or ON (NEW- for all output formats)
This controls the addition of the -seqno_range parameter and its associated value. Set this property to the argument value required.
- property seqnos
OFF or ON (for Clustal output only)
This controls the addition of the -seqnos parameter and its associated value. Set this property to the argument value required.
- property sequences
Sequentially add profile2 sequences to profile1 alignment
This property controls the addition of the -sequences switch, treat this property as a boolean.
- property stats
Log some alignment statistics to file
This controls the addition of the -stats parameter and its associated value. Set this property to the argument value required.
- property strandendin
Number of residues inside strand to be treated as terminal
This controls the addition of the -strandendin parameter and its associated value. Set this property to the argument value required.
- property strandendout
Number of residues outside strand to be treated as terminal
This controls the addition of the -strandendout parameter and its associated value. Set this property to the argument value required.
- property strandgap
Gap penalty for strand core residues
This controls the addition of the -strandgap parameter and its associated value. Set this property to the argument value required.
- property terminalgap
Gap penalty for structure termini
This controls the addition of the -terminalgap parameter and its associated value. Set this property to the argument value required.
- property topdiags
Number of best diags.
This controls the addition of the -topdiags parameter and its associated value. Set this property to the argument value required.
- property tossgaps
Ignore positions with gaps.
This property controls the addition of the -tossgaps switch, treat this property as a boolean.
- property transweight
Transitions weighting
This controls the addition of the -transweight parameter and its associated value. Set this property to the argument value required.
- property tree
Calculate NJ tree.
This property controls the addition of the -tree switch, treat this property as a boolean.
- property type
PROTEIN or DNA sequences
This controls the addition of the -type parameter and its associated value. Set this property to the argument value required.
- property usetree
File name of guide tree
This controls the addition of the -usetree parameter and its associated value. Set this property to the argument value required.
- property usetree1
File name of guide tree for profile1
This controls the addition of the -usetree1 parameter and its associated value. Set this property to the argument value required.
- property usetree2
File name of guide tree for profile2
This controls the addition of the -usetree2 parameter and its associated value. Set this property to the argument value required.
- property window
Window around best diags.
This controls the addition of the -window parameter and its associated value. Set this property to the argument value required.
- class Bio.Align.Applications.ClustalOmegaCommandline(cmd='clustalo', **kwargs)
Bases: AbstractCommandline
Command line wrapper for Clustal Omega.
Notes
Last checked against version: 1.2.0
References
Sievers F, Wilm A, Dineen DG, Gibson TJ, Karplus K, Li W, Lopez R, McWilliam H, Remmert M, Söding J, Thompson JD, Higgins DG (2011). Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Molecular Systems Biology 7:539 https://doi.org/10.1038/msb.2011.75
Examples
>>> from Bio.Align.Applications import ClustalOmegaCommandline
>>> in_file = "unaligned.fasta"
>>> out_file = "aligned.fasta"
>>> clustalomega_cline = ClustalOmegaCommandline(infile=in_file, outfile=out_file, verbose=True, auto=True)
>>> print(clustalomega_cline)
clustalo -i unaligned.fasta -o aligned.fasta --auto -v
You would typically run the command line with clustalomega_cline() or via the Python subprocess module, as described in the Biopython tutorial.
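Without the wrapper, the Clustal Omega command above can be built as a list and rendered for logging with shlex.join (Python 3.8 or later). This is a sketch under assumptions: clustalo_args is a hypothetical helper and the file names are placeholders.

```python
import shlex


def clustalo_args(in_file, out_file, exe="clustalo", auto=False, verbose=False):
    """Build a Clustal Omega argument list with optional --auto and -v switches."""
    args = [exe, "-i", in_file, "-o", out_file]
    if auto:
        args.append("--auto")
    if verbose:
        args.append("-v")
    return args


args = clustalo_args("unaligned.fasta", "aligned.fasta", auto=True, verbose=True)
command_text = shlex.join(args)  # shell-quoted string, for display only
print(command_text)  # clustalo -i unaligned.fasta -o aligned.fasta --auto -v
```

The string form is only for display; subprocess.run(args, check=True) should receive the list itself.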
- __init__(cmd='clustalo', **kwargs)
Initialize the class.
- property auto
Set options automatically (might overwrite some of your options)
This property controls the addition of the --auto switch, treat this property as a boolean.
- property clusteringout
Clustering output file
This controls the addition of the --clustering-out parameter and its associated value. Set this property to the argument value required.
- property clustersize
Soft maximum of sequences in sub-clusters
This controls the addition of the --cluster-size parameter and its associated value. Set this property to the argument value required.
- property dealign
Dealign input sequences
This property controls the addition of the --dealign switch, treat this property as a boolean.
- property distmat_full
Use full distance matrix for guide-tree calculation (slow; mBed is default)
This property controls the addition of the --full switch, treat this property as a boolean.
- property distmat_full_iter
Use full distance matrix for guide-tree calculation during iteration (mBed is default)
This property controls the addition of the --full-iter switch, treat this property as a boolean.
- property distmat_in
Pairwise distance matrix input file (skips distance computation).
This controls the addition of the --distmat-in parameter and its associated value. Set this property to the argument value required.
- property distmat_out
Pairwise distance matrix output file.
This controls the addition of the --distmat-out parameter and its associated value. Set this property to the argument value required.
- property force
Force file overwriting.
This property controls the addition of the --force switch, treat this property as a boolean.
- property guidetree_in
Guide tree input file (skips distance computation and guide-tree clustering step).
This controls the addition of the --guidetree-in parameter and its associated value. Set this property to the argument value required.
- property guidetree_out
Guide tree output file.
This controls the addition of the --guidetree-out parameter and its associated value. Set this property to the argument value required.
- property help
Print help and exit.
This property controls the addition of the -h switch, treat this property as a boolean.
- property hmm_input
HMM input files
This controls the addition of the --hmm-in parameter and its associated value. Set this property to the argument value required.
- property infile
Multiple sequence input file
This controls the addition of the -i parameter and its associated value. Set this property to the argument value required.
- property infmt
Forced sequence input file format (default: auto)
Allowed values: a2m, fa[sta], clu[stal], msf, phy[lip], selex, st[ockholm], vie[nna]
This controls the addition of the --infmt parameter and its associated value. Set this property to the argument value required.
- property isprofile
disable check if profile, force profile (default no)
This property controls the addition of the --is-profile switch, treat this property as a boolean.
- property iterations
Number of (combined guide-tree/HMM) iterations
This controls the addition of the --iterations parameter and its associated value. Set this property to the argument value required.
- property log
Log all non-essential output to this file.
This controls the addition of the -l parameter and its associated value. Set this property to the argument value required.
- property long_version
Print long version information and exit
This property controls the addition of the --long-version switch, treat this property as a boolean.
- property max_guidetree_iterations
Maximum number of guidetree iterations
This controls the addition of the --max-guidetree-iterations parameter and its associated value. Set this property to the argument value required.
- property max_hmm_iterations
Maximum number of HMM iterations
This controls the addition of the --max-hmm-iterations parameter and its associated value. Set this property to the argument value required.
- property maxnumseq
Maximum allowed number of sequences
This controls the addition of the --maxnumseq parameter and its associated value. Set this property to the argument value required.
- property maxseqlen
Maximum allowed sequence length
This controls the addition of the --maxseqlen parameter and its associated value. Set this property to the argument value required.
- property outfile
Multiple sequence alignment output file (default: stdout).
This controls the addition of the -o parameter and its associated value. Set this property to the argument value required.
- property outfmt
MSA output file format: a2m=fa[sta],clu[stal],msf,phy[lip],selex,st[ockholm],vie[nna] (default: fasta).
This controls the addition of the --outfmt parameter and its associated value. Set this property to the argument value required.
- property outputorder
MSA output order like in input/guide-tree
This controls the addition of the --output-order parameter and its associated value. Set this property to the argument value required.
- property percentid
convert distances into percent identities (default no)
This property controls the addition of the --percent-id switch, treat this property as a boolean.
- property profile1
Pre-aligned multiple sequence file (aligned columns will be kept fixed).
This controls the addition of the --profile1 parameter and its associated value. Set this property to the argument value required.
- property profile2
Pre-aligned multiple sequence file (aligned columns will be kept fixed).
This controls the addition of the --profile2 parameter and its associated value. Set this property to the argument value required.
- property residuenumber
in Clustal format print residue numbers (default no)
This property controls the addition of the --residuenumber switch, treat this property as a boolean.
- property seqtype
{Protein, RNA, DNA} Force a sequence type (default: auto).
This controls the addition of the -t parameter and its associated value. Set this property to the argument value required.
- property threads
Number of processors to use
This controls the addition of the --threads parameter and its associated value. Set this property to the argument value required.
- property usekimura
use Kimura distance correction for aligned sequences (default no)
This property controls the addition of the --use-kimura switch, treat this property as a boolean.
- property verbose
Verbose output
This property controls the addition of the -v switch, treat this property as a boolean.
- property version
Print version information and exit
This property controls the addition of the --version switch, treat this property as a boolean.
- property wrap
number of residues before line-wrap in output
This controls the addition of the --wrap parameter and its associated value. Set this property to the argument value required.
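Because this wrapper module is marked obsolete in favour of the subprocess module, the options listed above can also be assembled by hand. The following is a minimal sketch and not part of Biopython: the helper names build_clustalo_command and run_clustalo and the file names are hypothetical, and it assumes a clustalo executable that accepts the flags documented above with each value passed as a separate argument.

```python
import subprocess


def build_clustalo_command(infile, outfile, outfmt=None, threads=None,
                           force=False, verbose=False, exe="clustalo"):
    """Return an argument list for Clustal Omega (hypothetical helper)."""
    cmd = [exe, "-i", infile, "-o", outfile]
    # Options that take a value are added only when a value is supplied.
    if outfmt is not None:
        cmd += ["--outfmt", outfmt]
    if threads is not None:
        cmd += ["--threads", str(threads)]
    # Switches are either present or absent, like the boolean properties.
    if force:
        cmd.append("--force")
    if verbose:
        cmd.append("-v")
    return cmd


def run_clustalo(cmd):
    """Run the command; requires clustalo to be installed and on the PATH."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True)


if __name__ == "__main__":
    # Only prints the command; call run_clustalo(cmd) to execute it.
    cmd = build_clustalo_command("unaligned.fasta", "aligned.aln",
                                 outfmt="clu", threads=2, force=True)
    print(" ".join(cmd))
```

Passing the command as a list avoids shell quoting problems with file names that contain spaces.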
- class Bio.Align.Applications.PrankCommandline(cmd='prank', **kwargs)
Bases:
AbstractCommandline
Command line wrapper for the multiple alignment program PRANK.
http://www.ebi.ac.uk/goldman-srv/prank/prank/
Notes
Last checked against version: 081202
References
Loytynoja, A. and Goldman, N. 2005. An algorithm for progressive multiple alignment of sequences with insertions. Proceedings of the National Academy of Sciences, 102: 10557–10562.
Loytynoja, A. and Goldman, N. 2008. Phylogeny-aware gap placement prevents errors in sequence alignment and evolutionary analysis. Science, 320: 1632.
Examples
To align a FASTA file (unaligned.fasta) and write aligned FASTA output to a file whose name starts with “aligned” (you can only choose the prefix, not the full filename), with no tree output and no XML output, use:
>>> from Bio.Align.Applications import PrankCommandline
>>> prank_cline = PrankCommandline(d="unaligned.fasta",
...                                o="aligned",  # prefix only!
...                                f=8,  # FASTA output
...                                notree=True, noxml=True)
>>> print(prank_cline)
prank -d=unaligned.fasta -o=aligned -f=8 -noxml -notree
You would typically run the command line with prank_cline() or via the Python subprocess module, as described in the Biopython tutorial.
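As a rough illustration of the subprocess route, the sketch below builds the same PRANK command as an argument list. The helper names build_prank_command and run_prank are hypothetical (they are not part of Biopython), and the sketch assumes the -name=value argument form printed by the wrapper in the example above.

```python
import subprocess


def build_prank_command(infile, prefix, fmt=8, notree=True, noxml=True,
                        exe="prank"):
    """Return a PRANK argument list in the -name=value form (hypothetical helper)."""
    cmd = [exe, f"-d={infile}", f"-o={prefix}", f"-f={fmt}"]
    # Switches are appended only when requested.
    if noxml:
        cmd.append("-noxml")
    if notree:
        cmd.append("-notree")
    return cmd


def run_prank(cmd):
    """Run PRANK and return its standard output; requires prank on the PATH."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


# Only prints the command; call run_prank(...) to execute it.
print(" ".join(build_prank_command("unaligned.fasta", "aligned")))
```

With the default arguments the printed command is the same string that print(prank_cline) shows above.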
- __init__(cmd='prank', **kwargs)
Initialize the class.
- property F
Force insertions to be always skipped: same as +F
This property controls the addition of the -F switch, treat this property as a boolean.
- property codon
Codon aware alignment or not
This property controls the addition of the -codon switch, treat this property as a boolean.
- property convert
Convert input alignment to new format. Do not perform alignment
This property controls the addition of the -convert switch, treat this property as a boolean.
- property d
Input filename
This controls the addition of the -d parameter and its associated value. Set this property to the argument value required.
- property dnafreqs
DNA frequencies - ‘A,C,G,T’. eg ‘25,25,25,25’ as a quote surrounded string value. Default: empirical
This controls the addition of the -dnafreqs parameter and its associated value. Set this property to the argument value required.
- property dots
Show insertion gaps as dots
This property controls the addition of the -dots switch, treat this property as a boolean.
- property f
Output alignment format. Default: 8 (FASTA). Options are:
1. IG/Stanford
2. GenBank/GB
3. NBRF
4. EMBL
6. DNAStrider
7. Fitch
8. Pearson/Fasta
11. Phylip3.2
12. Phylip
14. PIR/CODATA
15. MSF
17. PAUP/NEXUS
This controls the addition of the -f parameter and its associated value. Set this property to the argument value required.
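Since the f property takes a numeric code, a small lookup table can make calling code easier to read. This is only a sketch: the names PRANK_FORMATS and prank_format_code are hypothetical, and the codes are copied from the description of the f property above.

```python
# Format codes as listed in the description of the f property.
PRANK_FORMATS = {
    1: "IG/Stanford",
    2: "GenBank/GB",
    3: "NBRF",
    4: "EMBL",
    6: "DNAStrider",
    7: "Fitch",
    8: "Pearson/Fasta",
    11: "Phylip3.2",
    12: "Phylip",
    14: "PIR/CODATA",
    15: "MSF",
    17: "PAUP/NEXUS",
}


def prank_format_code(name):
    """Look up the numeric code for a format name, ignoring case."""
    wanted = name.lower()
    for code, label in PRANK_FORMATS.items():
        # A label such as "Pearson/Fasta" matches either "pearson" or "fasta".
        if wanted in label.lower().split("/"):
            return code
    raise ValueError(f"Unknown PRANK output format: {name!r}")
```

For example, prank_format_code("fasta") could be passed as the f value instead of a bare 8.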
- property fixedbranches
Use fixed branch lengths of input value
This controls the addition of the -fixedbranches parameter and its associated value. Set this property to the argument value required.
- property gapext
Gap extension probability. Default: dna 0.5 / prot 0.5
This controls the addition of the -gapext parameter and its associated value. Set this property to the argument value required.
- property gaprate
Gap opening rate. Default: dna 0.025 prot 0.0025
This controls the addition of the -gaprate parameter and its associated value. Set this property to the argument value required.
- property kappa
Transition/transversion ratio. Default: 2
This controls the addition of the -kappa parameter and its associated value. Set this property to the argument value required.
- property longseq
Save space in pairwise alignments
This property controls the addition of the -longseq switch, treat this property as a boolean.
- property m
User-defined alignment model filename. Default: HKY2/WAG
This controls the addition of the -m parameter and its associated value. Set this property to the argument value required.
- property matinitsize
Matrix initial size multiplier
This controls the addition of the -matinitsize parameter and its associated value. Set this property to the argument value required.
- property matresize
Matrix resizing multiplier
This controls the addition of the -matresize parameter and its associated value. Set this property to the argument value required.
- property maxbranches
Use maximum branch lengths of input value
This controls the addition of the -maxbranches parameter and its associated value. Set this property to the argument value required.
- property mttranslate
Translate to protein using mt table
This property controls the addition of the -mttranslate switch, treat this property as a boolean.
- property nopost
Do not compute posterior support. Default: compute
This property controls the addition of the -nopost switch, treat this property as a boolean.
- property notree
Do not output dnd tree files (PRANK versions earlier than v.120626)
This property controls the addition of the -notree switch, treat this property as a boolean.
- property noxml
Do not output XML files (PRANK versions earlier than v.120626)
This property controls the addition of the -noxml switch, treat this property as a boolean.
- property o
Output filenames prefix. Default: ‘output’
Will write: output.?.fas (depending on requested format), output.?.xml and output.?.dnd
This controls the addition of the -o parameter and its associated value. Set this property to the argument value required.
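Because only the prefix can be chosen, the actual output file names have to be discovered after the run. The sketch below lists every file that starts with the prefix followed by a dot. The function name find_prank_outputs is hypothetical, and the exact middle part of the names (the ? above) is treated as unknown rather than assumed.

```python
from pathlib import Path


def find_prank_outputs(prefix, directory="."):
    """Return the files in directory named '<prefix>.<anything>', sorted."""
    return sorted(
        path
        for path in Path(directory).iterdir()
        if path.is_file() and path.name.startswith(prefix + ".")
    )


if __name__ == "__main__":
    # Lists whatever a previous PRANK run with o="aligned" left behind.
    for path in find_prank_outputs("aligned"):
        print(path.name)
```

Filtering the result on path.suffix (for example ".fas" or ".dnd") then picks out the alignment or the tree file.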
- property once
Run only once. Default: twice if no guidetree given
This property controls the addition of the -once switch, treat this property as a boolean.
- property printnodes
Output each node; mostly for debugging
This property controls the addition of the -printnodes switch, treat this property as a boolean.
- property pwdist
Expected pairwise distance for computing guidetree. Default: dna 0.25 / prot 0.5
This controls the addition of the -pwdist parameter and its associated value. Set this property to the argument value required.
- property pwgenomic
Do pairwise alignment, no guidetree
This property controls the addition of the -pwgenomic switch, treat this property as a boolean.
- property pwgenomicdist
Distance for pairwise alignment. Default: 0.3
This controls the addition of the -pwgenomicdist parameter and its associated value. Set this property to the argument value required.
- property quiet
Reduce verbosity
This property controls the addition of the -quiet switch, treat this property as a boolean.
- property realbranches
Disable branch length truncation
This property controls the addition of the -realbranches switch, treat this property as a boolean.
- property rho
Purine/pyrimidine ratio. Default: 1
This controls the addition of the -rho parameter and its associated value. Set this property to the argument value required.
- property scalebranches
Scale branch lengths. Default: dna 1 / prot 2
This controls the addition of the -scalebranches parameter and its associated value. Set this property to the argument value required.
- property shortnames
Truncate names at first space
This property controls the addition of the -shortnames switch, treat this property as a boolean.
- property showtree
Output dnd tree files (PRANK v.120626 and later)
This property controls the addition of the -showtree switch, treat this property as a boolean.
- property showxml
Output XML files (PRANK v.120626 and later)
This property controls the addition of the -showxml switch, treat this property as a boolean.
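The notree/noxml and showtree/showxml switches apply to different PRANK releases, so code that must work with both needs to choose between them. The following is a hedged sketch: the function name prank_tree_xml_flags is hypothetical, and it assumes that version labels such as 081202 and 120626 are dates that can be compared as integers.

```python
def prank_tree_xml_flags(version, want_tree=False, want_xml=False):
    """Choose tree/XML switches for a PRANK version given as a YYMMDD string."""
    flags = []
    if int(version) < 120626:
        # Earlier releases: suppress the files that are not wanted.
        if not want_tree:
            flags.append("-notree")
        if not want_xml:
            flags.append("-noxml")
    else:
        # v.120626 and later: request the files that are wanted.
        if want_tree:
            flags.append("-showtree")
        if want_xml:
            flags.append("-showxml")
    return flags


if __name__ == "__main__":
    print(prank_tree_xml_flags("081202"))
    print(prank_tree_xml_flags("120626", want_tree=True))
```

The returned list can be appended to a command built for subprocess, or translated into the matching boolean properties of the wrapper.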
- property skipins
Skip insertions in posterior support
This property controls the addition of the -skipins switch, treat this property as a boolean.
- property t
Input guide tree filename
This controls the addition of the -t parameter and its associated value. Set this property to the argument value required.
- property termgap
Penalise terminal gaps normally
This property controls the addition of the -termgap switch, treat this property as a boolean.
- property translate
Translate to protein
This property controls the addition of the -translate switch, treat this property as a boolean.
- property tree
Input guide tree as Newick string
This controls the addition of the -tree parameter and its associated value. Set this property to the argument value required.
- property twice
Always run twice
This property controls the addition of the -twice switch, treat this property as a boolean.
- property uselogs
Slower but should work for a greater number of sequences
This property controls the addition of the -uselogs switch, treat this property as a boolean.
- property writeanc
Output ancestral sequences
This property controls the addition of the -writeanc switch, treat this property as a boolean.
- class Bio.Align.Applications.MafftCommandline(cmd='mafft', **kwargs)
Bases:
AbstractCommandline
Command line wrapper for the multiple alignment program MAFFT.
http://align.bmr.kyushu-u.ac.jp/mafft/software/
Notes
Last checked against version: MAFFT v6.717b (2009/12/03)
References
Katoh, Toh (BMC Bioinformatics 9:212, 2008) Improved accuracy of multiple ncRNA alignment by incorporating structural information into a MAFFT-based framework (describes RNA structural alignment methods)
Katoh, Toh (Briefings in Bioinformatics 9:286-298, 2008) Recent developments in the MAFFT multiple sequence alignment program (outlines version 6)
Katoh, Toh (Bioinformatics 23:372-374, 2007) Errata PartTree: an algorithm to build an approximate tree from a large number of unaligned sequences (describes the PartTree algorithm)
Katoh, Kuma, Toh, Miyata (Nucleic Acids Res. 33:511-518, 2005) MAFFT version 5: improvement in accuracy of multiple sequence alignment (describes [ancestral versions of] the G-INS-i, L-INS-i and E-INS-i strategies)
Katoh, Misawa, Kuma, Miyata (Nucleic Acids Res. 30:3059-3066, 2002)
Examples
>>> from Bio.Align.Applications import MafftCommandline
>>> mafft_exe = "/opt/local/mafft"
>>> in_file = "../Doc/examples/opuntia.fasta"
>>> mafft_cline = MafftCommandline(mafft_exe, input=in_file)
>>> print(mafft_cline)
/opt/local/mafft ../Doc/examples/opuntia.fasta
If the mafft binary is on the path (typically the case on a Unix style operating system) then you don’t need to supply the executable location:
>>> from Bio.Align.Applications import MafftCommandline
>>> in_file = "../Doc/examples/opuntia.fasta"
>>> mafft_cline = MafftCommandline(input=in_file)
>>> print(mafft_cline)
mafft ../Doc/examples/opuntia.fasta
You would typically run the command line with mafft_cline() or via the Python subprocess module, as described in the Biopython tutorial.
Note that MAFFT will write the alignment to stdout, which you may want to save to a file and then parse, e.g.:
stdout, stderr = mafft_cline()
with open("aligned.fasta", "w") as handle:
    handle.write(stdout)
from Bio import AlignIO
align = AlignIO.read("aligned.fasta", "fasta")
Alternatively, to parse the output with AlignIO directly you can use StringIO to turn the string into a handle:
stdout, stderr = mafft_cline()
from io import StringIO
from Bio import AlignIO
align = AlignIO.read(StringIO(stdout), "fasta")
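The same capture-and-parse pattern can be written with subprocess directly, which is the route recommended now that this module is obsolete. This is a minimal sketch under stated assumptions: build_mafft_command and align_with_mafft are hypothetical helper names, only the --auto and --quiet switches documented below are handled, and running it requires both a mafft executable on the PATH and Biopython.

```python
import subprocess
from io import StringIO


def build_mafft_command(in_file, exe="mafft", auto=False, quiet=False):
    """Return a MAFFT argument list (hypothetical helper)."""
    cmd = [exe]
    if auto:
        cmd.append("--auto")
    if quiet:
        cmd.append("--quiet")
    # The input file is a positional argument and goes last.
    cmd.append(in_file)
    return cmd


def align_with_mafft(in_file, **options):
    """Run MAFFT and parse the FASTA alignment it writes to stdout."""
    # Imported here so build_mafft_command works without Biopython installed.
    from Bio import AlignIO

    result = subprocess.run(build_mafft_command(in_file, **options),
                            check=True, capture_output=True, text=True)
    return AlignIO.read(StringIO(result.stdout), "fasta")


if __name__ == "__main__":
    # Only prints the command; call align_with_mafft(...) to run MAFFT.
    print(" ".join(build_mafft_command("../Doc/examples/opuntia.fasta")))
```

Using check=True makes a failed MAFFT run raise subprocess.CalledProcessError instead of silently returning empty output.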
- __init__(cmd='mafft', **kwargs)
Initialize the class.
- property LEXP
Gap extension penalty to skip the alignment. Default: 0.00
This controls the addition of the --LEXP parameter and its associated value. Set this property to the argument value required.
- property LOP
Gap opening penalty to skip the alignment. Default: -6.00
This controls the addition of the --LOP parameter and its associated value. Set this property to the argument value required.
- property aamatrix
Use a user-defined AA scoring matrix. Default: BLOSUM62
This controls the addition of the --aamatrix parameter and its associated value. Set this property to the argument value required.
- property adjustdirection
Adjust direction according to the first sequence. Default off.
This property controls the addition of the --adjustdirection switch, treat this property as a boolean.
- property adjustdirectionaccurately
Adjust direction according to the first sequence, for highly diverged data; very slow. Default off.
This property controls the addition of the --adjustdirectionaccurately switch, treat this property as a boolean.
- property amino
Assume the sequences are amino acid (True/False). Default: auto
This property controls the addition of the --amino switch, treat this property as a boolean.
- property auto
Automatically select strategy. Default off.
This property controls the addition of the --auto switch, treat this property as a boolean.
- property bl
BLOSUM number matrix is used. Default: 62
This controls the addition of the --bl parameter and its associated value. Set this property to the argument value required.
- property clustalout
Output format: clustal (True) or fasta (False, default)
This property controls the addition of the --clustalout switch, treat this property as a boolean.
- property dpparttree
The PartTree algorithm is used with distances based on DP. Default: off
This property controls the addition of the --dpparttree switch, treat this property as a boolean.
- property ep
Offset value, which works like gap extension penalty, for group-to-group alignment. Default: 0.123
This controls the addition of the --ep parameter and its associated value. Set this property to the argument value required.
- property fastapair
All pairwise alignments are computed with FASTA (Pearson and Lipman 1988). Default: off
This property controls the addition of the --fastapair switch, treat this property as a boolean.
- property fastaparttree
The PartTree algorithm is used with distances based on FASTA. Default: off
This property controls the addition of the --fastaparttree switch, treat this property as a boolean.
- property fft
Use FFT approximation in group-to-group alignment. Default: on
This property controls the addition of the --fft switch, treat this property as a boolean.
- property fmodel
Incorporate the AA/nuc composition information into the scoring matrix (True) or not (False, default)
This property controls the addition of the --fmodel switch, treat this property as a boolean.
- property genafpair
All pairwise alignments are computed with a local algorithm with the generalized affine gap cost (Altschul 1998). Default: off
This property controls the addition of the --genafpair switch, treat this property as a boolean.
- property globalpair
All pairwise alignments are computed with the Needleman-Wunsch algorithm. Default: off
This property controls the addition of the --globalpair switch, treat this property as a boolean.
- property groupsize
Do not make alignments larger than this number of sequences. Default: the number of input sequences
This property controls the addition of the --groupsize switch, treat this property as a boolean.
- property input
Input file name
This controls the addition of the input parameter and its associated value. Set this property to the argument value required.
- property input1
Second input file name for the mafft-profile command
This controls the addition of the input1 parameter and its associated value. Set this property to the argument value required.
- property inputorder
Output order: same as input (True, default) or alignment based (False)
This property controls the addition of the --inputorder switch, treat this property as a boolean.
- property jtt
JTT PAM number (Jones et al. 1992) matrix is used. number>0. Default: BLOSUM62
This controls the addition of the --jtt parameter and its associated value. Set this property to the argument value required.
- property lep
Offset value at local pairwise alignment. Default: 0.1
This controls the addition of the --lep parameter and its associated value. Set this property to the argument value required.
- property lexp
Gap extension penalty at local pairwise alignment. Default: -0.1
This controls the addition of the --lexp parameter and its associated value. Set this property to the argument value required.
- property localpair
All pairwise alignments are computed with the Smith-Waterman algorithm. Default: off
This property controls the addition of the --localpair switch, treat this property as a boolean.
- property lop
Gap opening penalty at local pairwise alignment. Default: 0.123
This controls the addition of the --lop parameter and its associated value. Set this property to the argument value required.
- property maxiterate
Number of cycles of iterative refinement to perform. Default: 0
This controls the addition of the --maxiterate parameter and its associated value. Set this property to the argument value required.
- property memsave
Use the Myers-Miller (1988) algorithm. Default: automatically turned on when the alignment length exceeds 10,000 (aa/nt).
This property controls the addition of the --memsave switch, treat this property as a boolean.
- property namelength
Name length in CLUSTAL and PHYLIP output.
MAFFT v6.847 (2011) added --namelength for use with the --clustalout option for CLUSTAL output.
MAFFT v7.024 (2013) added support for this with the --phylipout option for PHYLIP output (default 10).
This controls the addition of the --namelength parameter and its associated value. Set this property to the argument value required.
- property nofft
Do not use FFT approximation in group-to-group alignment. Default: off
This property controls the addition of the --nofft switch, treat this property as a boolean.
- property noscore
Alignment score is not checked in the iterative refinement stage. Default: off (score is checked)
This property controls the addition of the --noscore switch, treat this property as a boolean.
- property nuc
Assume the sequences are nucleotide (True/False). Default: auto
This property controls the addition of the --nuc switch, treat this property as a boolean.
- property op
Gap opening penalty at group-to-group alignment. Default: 1.53
This controls the addition of the --op parameter and its associated value. Set this property to the argument value required.
- property partsize
The number of partitions in the PartTree algorithm. Default: 50
This controls the addition of the --partsize parameter and its associated value. Set this property to the argument value required.
- property parttree
Use a fast tree-building method with the 6mer distance. Default: off
This property controls the addition of the --parttree switch, treat this property as a boolean.
- property phylipout
Output format: phylip (True), or fasta (False, default)
This property controls the addition of the --phylipout switch, treat this property as a boolean.
- property quiet
Suppress progress reports (True) or show them (False, default).
This property controls the addition of the --quiet switch, treat this property as a boolean.
- property reorder
Output order: aligned (True) or in input order (False, default)
This property controls the addition of the --reorder switch, treat this property as a boolean.
- property retree
The guide tree is built this number of times in the progressive stage. Valid with 6mer distance. Default: 2
This controls the addition of the --retree parameter and its associated value. Set this property to the argument value required.
- property seed
Seed alignments given in alignment_n (fasta format) are aligned with sequences in input.
This controls the addition of the --seed parameter and its associated value. Set this property to the argument value required.
- property sixmerpair
Distance is calculated based on the number of shared 6mers. Default: on
This property controls the addition of the --6merpair switch, treat this property as a boolean.
- property thread
Number of threads to use. Default: 1
This controls the addition of the --thread parameter and its associated value. Set this property to the argument value required.
- property tm
Transmembrane PAM number (Jones et al. 1994) matrix is used. number>0. Default: BLOSUM62
This controls the addition of the --tm parameter and its associated value. Set this property to the argument value required.
- property treeout
Guide tree is output to the input.tree file (True) or not (False, default)
This property controls the addition of the --treeout switch, treat this property as a boolean.
- property weighti
Weighting factor for the consistency term calculated from pairwise alignments. Default: 2.7
This controls the addition of the --weighti parameter and its associated value. Set this property to the argument value required.
- class Bio.Align.Applications.DialignCommandline(cmd='dialign2-2', **kwargs)
Bases:
AbstractCommandline
Command line wrapper for the multiple alignment program DIALIGN2-2.
http://bibiserv.techfak.uni-bielefeld.de/dialign/welcome.html
Notes
Last checked against version: 2.2
References
B. Morgenstern (2004). DIALIGN: Multiple DNA and Protein Sequence Alignment at BiBiServ. Nucleic Acids Research 32, W33-W36.
Examples
To align a FASTA file (unaligned.fasta) with output files named aligned.*, including a FASTA output file (aligned.fa), use:
>>> from Bio.Align.Applications import DialignCommandline
>>> dialign_cline = DialignCommandline(input="unaligned.fasta",
...                                    fn="aligned", fa=True)
>>> print(dialign_cline)
dialign2-2 -fa -fn aligned unaligned.fasta
You would typically run the command line with dialign_cline() or via the Python subprocess module, as described in the Biopython tutorial.
- __init__(cmd='dialign2-2', **kwargs)
Initialize the class.
- property afc
Creates additional output file ‘*.afc’ containing data of all fragments considered for alignment WARNING: this file can be HUGE !
This property controls the addition of the -afc switch, treat this property as a boolean.
- property afc_v
Like ‘-afc’ but verbose: fragments are explicitly printed. WARNING: this file can be EVEN BIGGER !
This property controls the addition of the -afc_v switch, treat this property as a boolean.
- property anc
Anchored alignment. Requires a file <seq_file>.anc containing anchor points.
This property controls the addition of the -anc switch, treat this property as a boolean.
- property cs
If segments are translated, not only the ‘Watson strand’ but also the ‘Crick strand’ is looked at.
This property controls the addition of the -cs switch, treat this property as a boolean.
- property cw
Additional output file in CLUSTAL W format.
This property controls the addition of the -cw switch, treat this property as a boolean.
- property ds
‘dna alignment speed up’ - non-translated nucleic acid fragments are taken into account only if they start with at least two matches. Speeds up DNA alignment at the expense of sensitivity.
This property controls the addition of the -ds switch, treat this property as a boolean.
- property fa
Additional output file in FASTA format.
This property controls the addition of the -fa switch, treat this property as a boolean.
- property ff
Creates file *.frg containing information about all fragments that are part of the respective optimal pairwise alignments plus information about consistency in the multiple alignment
This property controls the addition of the -ff switch, treat this property as a boolean.
- property fn
Output files are named <out_file>.<extension>.
This controls the addition of the -fn parameter and its associated value. Set this property to the argument value required.
- property fop
Creates file *.fop containing coordinates of all fragments that are part of the respective pairwise alignments.
This property controls the addition of the -fop switch, treat this property as a boolean.
- property fsm
Creates file *.fsm containing coordinates of all fragments that are part of the final alignment
This property controls the addition of the -fsm switch, treat this property as a boolean.
- property input
Input file name. Must be FASTA format
This controls the addition of the input parameter and its associated value. Set this property to the argument value required.
- property iw
Overlap weights switched off (by default, overlap weights are used if up to 35 sequences are aligned). This option speeds up the alignment but may lead to reduced alignment quality.
This property controls the addition of the -iw switch, treat this property as a boolean.
- property lgs
‘long genomic sequences’ - combines the following options: -ma, -thr 2, -lmax 30, -smin 8, -nta, -ff, -fop, -ff, -cs, -ds, -pst
This property controls the addition of the -lgs switch, treat this property as a boolean.
- property lgs_t
Like ‘-lgs’ but with all segment pairs assessed at the peptide level (rather than ‘mixed alignments’ as with the ‘-lgs’ option). Therefore faster than -lgs but not very sensitive for non-coding regions.
This property controls the addition of the -lgs_t switch, treat this property as a boolean.
- property lmax
Maximum fragment length = x (default: x = 40 or x = 120 for ‘translated’ fragments). Shorter x speeds up the program but may affect alignment quality.
This controls the addition of the -lmax parameter and its associated value. Set this property to the argument value required.
- property lo
(Long Output) Additional file *.log with information about fragments selected for pairwise alignment and about consistency in multi-alignment procedure.
This property controls the addition of the -lo switch, treat this property as a boolean.
- property ma
‘mixed alignments’ consisting of P-fragments and N-fragments if nucleic acid sequences are aligned.
This property controls the addition of the -ma switch, treat this property as a boolean.
- property mask
Residues not belonging to selected fragments are replaced by ‘*’ characters in output alignment (rather than being printed in lower-case characters)
This property controls the addition of the -mask switch, treat this property as a boolean.
- property mat
Creates file *mat with substitution counts derived from the fragments that have been selected for alignment.
This property controls the addition of the -mat switch, treat this property as a boolean.
- property mat_thr
Like ‘-mat’ but only fragments with weight score > t are considered
This property controls the addition of the -mat_thr switch, treat this property as a boolean.
- property max_link
‘maximum linkage’ clustering used to construct sequence tree (instead of UPGMA).
This property controls the addition of the -max_link switch, treat this property as a boolean.
- property min_link
‘minimum linkage’ clustering used.
This property controls the addition of the -min_link switch, treat this property as a boolean.
- property mot
‘motif’ option.
This controls the addition of the -mot parameter and its associated value. Set this property to the argument value required.
- property msf
Separate output file in MSF format.
This property controls the addition of the -msf switch, treat this property as a boolean.
- property n
Input sequences are nucleic acid sequences. No translation of fragments.
This property controls the addition of the -n switch, treat this property as a boolean.
- property nt
Input sequences are nucleic acid sequences and ‘nucleic acid segments’ are translated to ‘peptide segments’.
This property controls the addition of the -nt switch, treat this property as a boolean.
- property nta
‘no textual alignment’ - textual alignment suppressed. This option makes sense if other output files are of interest – e.g. the fragment files created with -ff, -fop, -fsm or -lo.
This property controls the addition of the -nta switch, treat this property as a boolean.
- property o
Fast version, resulting alignments may be slightly different.
This property controls the addition of the -o switch, treat this property as a boolean.
- property ow
Overlap weights enforced (By default, overlap weights are used only if up to 35 sequences are aligned since calculating overlap weights is time consuming).
This property controls the addition of the -ow switch, treat this property as a boolean.
- property pst
‘print status’. Creates and updates a file *.sta with information about the current status of the program run. This option is recommended if large data sets are aligned since it allows the user to estimate the remaining running time.
This property controls the addition of the -pst switch, treat this property as a boolean.
- property smin
Minimum similarity value for first residue pair (or codon pair) in fragments. Speeds up protein alignment or alignment of translated DNA fragments at the expense of sensitivity.
This property controls the addition of the -smin switch, treat this property as a boolean.
- property stars
Maximum number of ‘*’ characters indicating degree of local similarity among sequences. By default, no stars are used but numbers between 0 and 9, instead.
This controls the addition of the -stars parameter and its associated value. Set this property to the argument value required.
- property stdo
Results written to standard output.
This property controls the addition of the -stdo switch, treat this property as a boolean.
- property ta
Standard textual alignment printed (overrides suppression of textual alignments in special options, e.g. -lgs)
This property controls the addition of the -ta switch, treat this property as a boolean.
- property thr
Threshold T = x.
This controls the addition of the -thr parameter and its associated value. Set this property to the argument value required.
- property xfr
‘exclude fragments’ - a list of fragments can be specified that are NOT considered for pairwise alignment.
This property controls the addition of the -xfr switch, treat this property as a boolean.
- class Bio.Align.Applications.ProbconsCommandline(cmd='probcons', **kwargs)
Bases:
AbstractCommandline
Command line wrapper for the multiple alignment program PROBCONS.
Notes
Last checked against version: 1.12
References
Do, C.B., Mahabhashyam, M.S.P., Brudno, M., and Batzoglou, S. 2005. PROBCONS: Probabilistic Consistency-based Multiple Sequence Alignment. Genome Research 15: 330-340.
Examples
To align a FASTA file (unaligned.fasta) with the output in ClustalW format, and otherwise default settings, use:
>>> from Bio.Align.Applications import ProbconsCommandline
>>> probcons_cline = ProbconsCommandline(input="unaligned.fasta",
...                                      clustalw=True)
>>> print(probcons_cline)
probcons -clustalw unaligned.fasta
You would typically run the command line with probcons_cline() or via the Python subprocess module, as described in the Biopython tutorial.
Note that PROBCONS will write the alignment to stdout, which you may want to save to a file and then parse, e.g.:
stdout, stderr = probcons_cline()
with open("aligned.aln", "w") as handle:
    handle.write(stdout)
from Bio import AlignIO
align = AlignIO.read("aligned.aln", "clustalw")
Alternatively, to parse the output with AlignIO directly you can use StringIO to turn the string into a handle:
stdout, stderr = probcons_cline()
from io import StringIO
from Bio import AlignIO
align = AlignIO.read(StringIO(stdout), "clustalw")
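Because this module is obsolete, the same run can be sketched with the subprocess module directly. The following is a minimal sketch, not part of Biopython: the helper names build_probcons_cmd and run_probcons are hypothetical, the flags are taken from the printed command line above, and a probcons executable is assumed to be on the PATH.

```python
import subprocess


def build_probcons_cmd(input_file, clustalw=False, exe="probcons"):
    """Return the argument list for a PROBCONS run (hypothetical helper)."""
    cmd = [exe]
    if clustalw:
        cmd.append("-clustalw")  # ClustalW output instead of MFA
    cmd.append(input_file)  # the input file is a positional argument
    return cmd


def run_probcons(input_file, out_file, exe="probcons"):
    """Run PROBCONS and save its stdout (the alignment) to out_file.

    Assumes the executable is installed; raises CalledProcessError
    if the program exits with a non-zero status.
    """
    cmd = build_probcons_cmd(input_file, clustalw=True, exe=exe)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    with open(out_file, "w") as handle:
        handle.write(result.stdout)
    return out_file


# Building the command does not require PROBCONS to be installed.
print(" ".join(build_probcons_cmd("unaligned.fasta", clustalw=True)))
```

Calling run_probcons("unaligned.fasta", "aligned.aln") would then write the alignment to aligned.aln, which can be parsed with AlignIO.read("aligned.aln", "clustalw").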
- __init__(cmd='probcons', **kwargs)
Initialize the class.
- property a
Print sequences in alignment order rather than input order (default: off)
This property controls the addition of the -a switch, treat this property as a boolean.
- property annot
Write annotation for multiple alignment to FILENAME
This controls the addition of the -annot parameter and its associated value. Set this property to the argument value required.
- property clustalw
Use CLUSTALW output format instead of MFA
This property controls the addition of the -clustalw switch, treat this property as a boolean.
- property consistency
Use 0 <= REPS <= 5 (default: 2) passes of consistency transformation
This controls the addition of the -c parameter and its associated value. Set this property to the argument value required.
- property emissions
Also reestimate emission probabilities (default: off)
This property controls the addition of the -e switch, treat this property as a boolean.
- property input
Input file name. Must be multiple FASTA alignment (MFA) format
This controls the addition of the input parameter and its associated value. Set this property to the argument value required.
- property ir
Use 0 <= REPS <= 1000 (default: 100) passes of iterative-refinement
This controls the addition of the -ir parameter and its associated value. Set this property to the argument value required.
- property pairs
Generate all-pairs pairwise alignments
This property controls the addition of the -pairs switch, treat this property as a boolean.
- property paramfile
Read parameters from FILENAME
This controls the addition of the -p parameter and its associated value. Set this property to the argument value required.
- property pre
Use 0 <= REPS <= 20 (default: 0) rounds of pretraining
This controls the addition of the -pre parameter and its associated value. Set this property to the argument value required.
- property train
Compute EM transition probabilities, store in FILENAME (default: no training)
This controls the addition of the -t parameter and its associated value. Set this property to the argument value required.
- property verbose
Report progress while aligning (default: off)
This property controls the addition of the -verbose switch, treat this property as a boolean.
- property viterbi
Use Viterbi algorithm to generate all pairs (automatically enables -pairs)
This property controls the addition of the -viterbi switch, treat this property as a boolean.
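When building a PROBCONS command by hand rather than through this wrapper, the documented REPS ranges for -c, -ir and -pre can be checked before the program is launched. This is a minimal sketch under that assumption; PROBCONS_RANGES and probcons_int_args are hypothetical names, and the ranges are copied from the property descriptions above.

```python
# Documented REPS ranges for the PROBCONS options listed above.
PROBCONS_RANGES = {
    "-c": (0, 5),      # consistency transformation passes (default: 2)
    "-ir": (0, 1000),  # iterative-refinement passes (default: 100)
    "-pre": (0, 20),   # pretraining rounds (default: 0)
}


def probcons_int_args(**values):
    """Validate values against the documented ranges and return argv items.

    Keyword names are the flags without the leading dash, e.g. c=3, ir=50.
    """
    args = []
    for name, value in values.items():
        flag = "-" + name
        if flag not in PROBCONS_RANGES:
            raise KeyError(f"unknown option: {flag}")
        low, high = PROBCONS_RANGES[flag]
        if not isinstance(value, int) or not low <= value <= high:
            raise ValueError(f"{flag} must be an integer in [{low}, {high}]")
        args += [flag, str(value)]
    return args


print(probcons_int_args(c=3, ir=50))
```

The returned list can be spliced into a command list ahead of the positional input file.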
- class Bio.Align.Applications.TCoffeeCommandline(cmd='t_coffee', **kwargs)
Bases:
AbstractCommandline
Commandline object for the TCoffee alignment program.
http://www.tcoffee.org/Projects_home_page/t_coffee_home_page.html
The T-Coffee command line tool has a lot of switches and options. This wrapper implements a VERY limited number of options - if you would like to help improve it please get in touch.
Notes
Last checked against: Version_6.92
References
T-Coffee: A novel method for multiple sequence alignments. Notredame, Higgins, Heringa, JMB 302: 205-217 (2000).
Examples
To align a FASTA file (unaligned.fasta) with the output in ClustalW format (file aligned.aln), and otherwise default settings, use:
>>> from Bio.Align.Applications import TCoffeeCommandline
>>> tcoffee_cline = TCoffeeCommandline(infile="unaligned.fasta",
...                                    output="clustalw",
...                                    outfile="aligned.aln")
>>> print(tcoffee_cline)
t_coffee -output clustalw -infile unaligned.fasta -outfile aligned.aln
You would typically run the command line with tcoffee_cline() or via the Python subprocess module, as described in the Biopython tutorial.
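As the wrapper is obsolete, an equivalent invocation via subprocess might look like the sketch below. It assumes a t_coffee executable on the PATH and uses only the flags from the printed command line above; build_tcoffee_cmd and run_tcoffee are hypothetical helper names.

```python
import shlex
import subprocess


def build_tcoffee_cmd(infile, outfile, output="clustalw", exe="t_coffee"):
    """Return the T-Coffee argument list (hypothetical helper)."""
    return [exe, "-output", output, "-infile", infile, "-outfile", outfile]


def run_tcoffee(infile, outfile, output="clustalw", exe="t_coffee"):
    """Run T-Coffee; the alignment is written to outfile by the program."""
    cmd = build_tcoffee_cmd(infile, outfile, output=output, exe=exe)
    # check=True raises CalledProcessError on a non-zero exit status.
    return subprocess.run(cmd, capture_output=True, text=True, check=True)


# shlex.join gives a shell-safe string for logging; the list itself is
# what gets passed to subprocess.run.
print(shlex.join(build_tcoffee_cmd("unaligned.fasta", "aligned.aln")))
```

Passing the command as a list avoids shell quoting problems with file names that contain spaces.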
- SEQ_TYPES = ['dna', 'protein', 'dna_protein']
- __init__(cmd='t_coffee', **kwargs)
Initialize the class.
- property convert
Specify you want to perform a file conversion
This property controls the addition of the -convert switch, treat this property as a boolean.
- property gapext
Indicates the penalty applied for extending a gap (negative integer)
This controls the addition of the -gapext parameter and its associated value. Set this property to the argument value required.
- property gapopen
Indicates the penalty applied for opening a gap (negative integer)
This controls the addition of the -gapopen parameter and its associated value. Set this property to the argument value required.
- property infile
Specify the input file.
This controls the addition of the -infile parameter and its associated value. Set this property to the argument value required.
- property matrix
Specify the filename of the substitution matrix to use. Default: blosum62mt
This controls the addition of the -matrix parameter and its associated value. Set this property to the argument value required.
- property mode
Specifies a special mode: genome, quickaln, dali, 3dcoffee
This controls the addition of the -mode parameter and its associated value. Set this property to the argument value required.
- property outfile
Specify the output file. Default: <your sequences>.aln
This controls the addition of the -outfile parameter and its associated value. Set this property to the argument value required.
- property outorder
Specify the order of sequences in the output. Either ‘input’, ‘aligned’ or <filename> of a FASTA file with the sequence order
This controls the addition of the -outorder parameter and its associated value. Set this property to the argument value required.
- property output
Specify the output type.
One (or more separated by a comma) of: ‘clustalw_aln’, ‘clustalw’, ‘gcg’, ‘msf_aln’, ‘pir_aln’, ‘fasta_aln’, ‘phylip’, ‘pir_seq’, ‘fasta_seq’
This controls the addition of the -output parameter and its associated value. Set this property to the argument value required.
- property quiet
Turn off log output
This property controls the addition of the -quiet switch, treat this property as a boolean.
- property type
Specify the type of sequence being aligned
This controls the addition of the -type parameter and its associated value. Set this property to the argument value required.
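The -output option accepts one or more comma-separated format names, and -type is restricted to the values in SEQ_TYPES. A hand-built command could check both before running T-Coffee. This is a minimal sketch: tcoffee_format_args is a hypothetical helper, and the two lists are copied from the descriptions above.

```python
SEQ_TYPES = ["dna", "protein", "dna_protein"]
OUTPUT_FORMATS = [
    "clustalw_aln", "clustalw", "gcg", "msf_aln", "pir_aln",
    "fasta_aln", "phylip", "pir_seq", "fasta_seq",
]


def tcoffee_format_args(outputs, seq_type=None):
    """Return the -output (and optional -type) arguments after validation."""
    unknown = [name for name in outputs if name not in OUTPUT_FORMATS]
    if unknown:
        raise ValueError(f"unsupported output format(s): {unknown}")
    # Several formats are passed as a single comma-separated value.
    args = ["-output", ",".join(outputs)]
    if seq_type is not None:
        if seq_type not in SEQ_TYPES:
            raise ValueError(f"type must be one of {SEQ_TYPES}")
        args += ["-type", seq_type]
    return args


print(tcoffee_format_args(["clustalw", "fasta_aln"], seq_type="protein"))
```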
- class Bio.Align.Applications.MSAProbsCommandline(cmd='msaprobs', **kwargs)
Bases:
AbstractCommandline
Command line wrapper for MSAProbs.
http://msaprobs.sourceforge.net
Notes
Last checked against version: 0.9.7
References
Yongchao Liu, Bertil Schmidt, Douglas L. Maskell: “MSAProbs: multiple sequence alignment based on pair hidden Markov models and partition function posterior probabilities”. Bioinformatics, 2010, 26(16): 1958-1964
Examples
>>> from Bio.Align.Applications import MSAProbsCommandline
>>> in_file = "unaligned.fasta"
>>> out_file = "aligned.cla"
>>> cline = MSAProbsCommandline(infile=in_file, outfile=out_file, clustalw=True)
>>> print(cline)
msaprobs -o aligned.cla -clustalw unaligned.fasta
You would typically run the command line with cline() or via the Python subprocess module, as described in the Biopython tutorial.
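Without the wrapper, the same MSAProbs command can be assembled as a plain argument list. The sketch below assumes the flags documented for this class (-o, -clustalw, -num_threads) and places the input file last, as in the printed command above; build_msaprobs_cmd is a hypothetical helper name.

```python
def build_msaprobs_cmd(infile, outfile=None, clustalw=False,
                       num_threads=None, exe="msaprobs"):
    """Return the MSAProbs argument list (hypothetical helper)."""
    cmd = [exe]
    if outfile is not None:
        cmd += ["-o", outfile]  # otherwise MSAProbs writes to STDOUT
    if clustalw:
        cmd.append("-clustalw")  # ClustalW output instead of FASTA
    if num_threads is not None:
        cmd += ["-num_threads", str(num_threads)]
    cmd.append(infile)  # positional input file goes last
    return cmd


print(" ".join(build_msaprobs_cmd("unaligned.fasta", "aligned.cla", clustalw=True)))
```

The resulting list can be passed to subprocess.run(cmd, check=True) once the msaprobs executable is installed.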
- __init__(cmd='msaprobs', **kwargs)
Initialize the class.
- property alignment_order
print sequences in alignment order rather than input order (default: off)
This property controls the addition of the -a switch, treat this property as a boolean.
- property annot
write annotation for multiple alignment to FILENAME
This controls the addition of the -annot parameter and its associated value. Set this property to the argument value required.
- property clustalw
use CLUSTALW output format instead of FASTA format
This property controls the addition of the -clustalw switch, treat this property as a boolean.
- property consistency
use 0 <= REPS <= 5 (default: 2) passes of consistency transformation
This controls the addition of the -c parameter and its associated value. Set this property to the argument value required.
- property infile
Multiple sequence input file
This controls the addition of the infile parameter and its associated value. Set this property to the argument value required.
- property iterative_refinement
use 0 <= REPS <= 1000 (default: 10) passes of iterative-refinement
This controls the addition of the -ir parameter and its associated value. Set this property to the argument value required.
- property numthreads
specify the number of threads used (detected automatically if not set)
This controls the addition of the -num_threads parameter and its associated value. Set this property to the argument value required.
- property outfile
specify the output file name (STDOUT by default)
This controls the addition of the -o parameter and its associated value. Set this property to the argument value required.
- property verbose
report progress while aligning (default: off)
This property controls the addition of the -v switch, treat this property as a boolean.
- property version
print out version of MSAPROBS
This controls the addition of the -version parameter and its associated value. Set this property to the argument value required.