Bio.Align.Applications package
Module contents
Alignment command line tool wrappers.
-
class
Bio.Align.Applications.
MuscleCommandline
(cmd='muscle', **kwargs)¶ Bases:
Bio.Application.AbstractCommandline
Command line wrapper for the multiple alignment program MUSCLE.
Notes
Last checked against version: 3.7, briefly against 3.8
References
Edgar, Robert C. (2004), MUSCLE: multiple sequence alignment with high accuracy and high throughput, Nucleic Acids Research 32(5), 1792-97.
Edgar, R.C. (2004) MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics 5(1): 113.
Examples
>>> from Bio.Align.Applications import MuscleCommandline
>>> muscle_exe = r"C:\Program Files\Aligments\muscle3.8.31_i86win32.exe"
>>> in_file = r"C:\My Documents\unaligned.fasta"
>>> out_file = r"C:\My Documents\aligned.fasta"
>>> muscle_cline = MuscleCommandline(muscle_exe, input=in_file, out=out_file)
>>> print(muscle_cline)
"C:\Program Files\Aligments\muscle3.8.31_i86win32.exe" -in "C:\My Documents\unaligned.fasta" -out "C:\My Documents\aligned.fasta"
You would typically run the command line with muscle_cline() or via the Python subprocess module, as described in the Biopython tutorial.
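As a minimal sketch of the subprocess route, the same command can be expressed as an argument list, which sidesteps shell quoting for paths that contain spaces. The helper names `muscle_args` and `run_if_available` are hypothetical and not part of Biopython, and the sketch assumes an executable called `muscle` may or may not be on the search path.

```python
import shutil
import subprocess


def muscle_args(exe, in_file, out_file):
    """Return an argument list equivalent to the printed MUSCLE command line."""
    return [exe, "-in", in_file, "-out", out_file]


def run_if_available(args):
    """Run the command if its executable can be found, otherwise return None.

    Hypothetical helper: it only guards against a missing program so that
    the sketch stays runnable on machines without MUSCLE installed.
    """
    if shutil.which(args[0]) is None:
        return None
    # A list of arguments is passed directly, so no shell quoting is needed.
    return subprocess.run(args, capture_output=True, text=True, check=True)


args = muscle_args("muscle", "unaligned.fasta", "aligned.fasta")
print(args)
```

Calling `run_if_available(args)` would then start the alignment only where the program is installed; with `check=True` a non-zero exit status raises `subprocess.CalledProcessError`.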
-
__init__
(self, cmd='muscle', **kwargs)¶ Initialize the class.
-
property
anchors
¶ Use anchor optimisation in tree dependent refinement iterations
This property controls the addition of the -anchors switch, treat this property as a boolean.
-
property
anchorspacing
¶ Minimum spacing between anchor columns
This controls the addition of the -anchorspacing parameter and its associated value. Set this property to the argument value required.
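The entries in this listing come in two kinds: boolean switch properties (such as anchors) and valued parameter properties (such as anchorspacing). The following stand-in, which is not Biopython's own implementation, sketches how such settings could turn into command-line tokens. The function name `render_options` and the value 32 are illustrative assumptions.

```python
def render_options(options):
    """Turn a {name: value} mapping into command-line tokens.

    True adds a bare switch, False or None adds nothing, and any other
    value adds the option name followed by the value as a string.
    """
    tokens = []
    for name, value in options.items():
        if value is None or value is False:
            continue  # switch left off, or parameter not set
        if value is True:
            tokens.append(f"-{name}")  # boolean switch
        else:
            tokens.extend([f"-{name}", str(value)])  # parameter plus value
    return tokens


print(render_options({"anchors": True, "anchorspacing": 32, "brenner": False}))
# → ['-anchors', '-anchorspacing', '32']
```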
-
property
brenner
¶ Use Steve Brenner’s root alignment method
This property controls the addition of the -brenner switch, treat this property as a boolean.
-
property
center
¶ Center parameter - should be negative
This controls the addition of the -center parameter and its associated value. Set this property to the argument value required.
-
property
cluster
¶ Perform fast clustering of input sequences, use -tree1 to save tree
This property controls the addition of the -cluster switch, treat this property as a boolean.
-
property
cluster1
¶ Clustering method used in iteration 1
This controls the addition of the -cluster1 parameter and its associated value. Set this property to the argument value required.
-
property
cluster2
¶ Clustering method used in iteration 2
This controls the addition of the -cluster2 parameter and its associated value. Set this property to the argument value required.
-
property
clw
¶ Write output in CLUSTALW format (with a MUSCLE header)
This property controls the addition of the -clw switch, treat this property as a boolean.
-
property
clwout
¶ Write CLUSTALW output (with MUSCLE header) to specified filename
This controls the addition of the -clwout parameter and its associated value. Set this property to the argument value required.
-
property
clwstrict
¶ Write output in CLUSTALW format with version 1.81 header
This property controls the addition of the -clwstrict switch, treat this property as a boolean.
-
property
clwstrictout
¶ Write CLUSTALW output (with version 1.81 header) to specified filename
This controls the addition of the -clwstrictout parameter and its associated value. Set this property to the argument value required.
-
property
core
¶ Do not catch exceptions
This property controls the addition of the -core switch, treat this property as a boolean.
-
property
diagbreak
¶ Maximum distance between two diagonals that allows them to merge into one diagonal
This controls the addition of the -diagbreak parameter and its associated value. Set this property to the argument value required.
-
property
diaglength
¶ Minimum length of diagonal
This controls the addition of the -diaglength parameter and its associated value. Set this property to the argument value required.
-
property
diagmargin
¶ Discard this many positions at ends of diagonal
This controls the addition of the -diagmargin parameter and its associated value. Set this property to the argument value required.
-
property
diags
¶ Find diagonals (faster for similar sequences)
This property controls the addition of the -diags switch, treat this property as a boolean.
-
property
dimer
¶ Use faster (slightly less accurate) dimer approximation for the SP score
This property controls the addition of the -dimer switch, treat this property as a boolean.
-
property
distance1
¶ Distance measure for iteration 1
This controls the addition of the -distance1 parameter and its associated value. Set this property to the argument value required.
-
property
distance2
¶ Distance measure for iteration 2
This controls the addition of the -distance2 parameter and its associated value. Set this property to the argument value required.
-
property
fasta
¶ Write output in FASTA format
This property controls the addition of the -fasta switch, treat this property as a boolean.
-
property
fastaout
¶ Write FASTA format output to specified filename
This controls the addition of the -fastaout parameter and its associated value. Set this property to the argument value required.
-
property
gapextend
¶ Gap extension penalty
This controls the addition of the -gapextend parameter and its associated value. Set this property to the argument value required.
-
property
gapopen
¶ Gap open score - negative number
This controls the addition of the -gapopen parameter and its associated value. Set this property to the argument value required.
-
property
group
¶ Group similar sequences in output
This property controls the addition of the -group switch, treat this property as a boolean.
-
property
html
¶ Write output in HTML format
This property controls the addition of the -html switch, treat this property as a boolean.
-
property
htmlout
¶ Write HTML output to specified filename
This controls the addition of the -htmlout parameter and its associated value. Set this property to the argument value required.
-
property
hydro
¶ Window size for hydrophobic region
This controls the addition of the -hydro parameter and its associated value. Set this property to the argument value required.
-
property
hydrofactor
¶ Multiplier for gap penalties in hydrophobic regions
This controls the addition of the -hydrofactor parameter and its associated value. Set this property to the argument value required.
-
property
in1
¶ First input filename for profile alignment
This controls the addition of the -in1 parameter and its associated value. Set this property to the argument value required.
-
property
in2
¶ Second input filename for a profile alignment
This controls the addition of the -in2 parameter and its associated value. Set this property to the argument value required.
-
property
input
¶ Input filename
This controls the addition of the -in parameter and its associated value. Set this property to the argument value required.
-
property
le
¶ Use log-expectation profile score (VTML240)
This property controls the addition of the -le switch, treat this property as a boolean.
-
property
log
¶ Log file name
This controls the addition of the -log parameter and its associated value. Set this property to the argument value required.
-
property
loga
¶ Log file name (append to existing file)
This controls the addition of the -loga parameter and its associated value. Set this property to the argument value required.
-
property
matrix
¶ Path to NCBI or WU-BLAST format protein substitution matrix; also set -gapopen, -gapextend and -center
This controls the addition of the -matrix parameter and its associated value. Set this property to the argument value required.
-
property
maxdiagbreak
¶ Deprecated in v3.8, use -diagbreak instead.
This controls the addition of the -maxdiagbreak parameter and its associated value. Set this property to the argument value required.
-
property
maxhours
¶ Maximum time to run in hours
This controls the addition of the -maxhours parameter and its associated value. Set this property to the argument value required.
-
property
maxiters
¶ Maximum number of iterations
This controls the addition of the -maxiters parameter and its associated value. Set this property to the argument value required.
-
property
maxtrees
¶ Maximum number of trees to build in iteration 2
This controls the addition of the -maxtrees parameter and its associated value. Set this property to the argument value required.
-
property
minbestcolscore
¶ Minimum score a column must have to be an anchor
This controls the addition of the -minbestcolscore parameter and its associated value. Set this property to the argument value required.
-
property
minsmoothscore
¶ Minimum smoothed score a column must have to be an anchor
This controls the addition of the -minsmoothscore parameter and its associated value. Set this property to the argument value required.
-
property
msf
¶ Write output in MSF format
This property controls the addition of the -msf switch, treat this property as a boolean.
-
property
msfout
¶ Write MSF format output to specified filename
This controls the addition of the -msfout parameter and its associated value. Set this property to the argument value required.
-
property
noanchors
¶ Do not use anchor optimisation in tree dependent refinement iterations
This property controls the addition of the -noanchors switch, treat this property as a boolean.
-
property
nocore
¶ Catch exceptions
This property controls the addition of the -nocore switch, treat this property as a boolean.
-
property
objscore
¶ Objective score used by tree dependent refinement
This controls the addition of the -objscore parameter and its associated value. Set this property to the argument value required.
-
property
out
¶ Output filename
This controls the addition of the -out parameter and its associated value. Set this property to the argument value required.
-
property
phyi
¶ Write output in PHYLIP interleaved format
This property controls the addition of the -phyi switch, treat this property as a boolean.
-
property
phyiout
¶ Write PHYLIP interleaved output to specified filename
This controls the addition of the -phyiout parameter and its associated value. Set this property to the argument value required.
-
property
phys
¶ Write output in PHYLIP sequential format
This property controls the addition of the -phys switch, treat this property as a boolean.
-
property
physout
¶ Write PHYLIP sequential format to specified filename
This controls the addition of the -physout parameter and its associated value. Set this property to the argument value required.
-
property
profile
¶ Perform a profile alignment
This property controls the addition of the -profile switch, treat this property as a boolean.
-
property
quiet
¶ Do not display progress messages
This property controls the addition of the -quiet switch, treat this property as a boolean.
-
property
refine
¶ Only do tree dependent refinement
This property controls the addition of the -refine switch, treat this property as a boolean.
-
property
refinew
¶ Only do tree dependent refinement using sliding window approach
This property controls the addition of the -refinew switch, treat this property as a boolean.
-
property
refinewindow
¶ Length of window for -refinew
This controls the addition of the -refinewindow parameter and its associated value. Set this property to the argument value required.
-
property
root1
¶ Method used to root tree in iteration 1
This controls the addition of the -root1 parameter and its associated value. Set this property to the argument value required.
-
property
root2
¶ Method used to root tree in iteration 2
This controls the addition of the -root2 parameter and its associated value. Set this property to the argument value required.
-
property
scorefile
¶ Score file name, contains one line for each column in the alignment with average BLOSUM62 score
This controls the addition of the -scorefile parameter and its associated value. Set this property to the argument value required.
-
property
seqtype
¶ Sequence type
This controls the addition of the -seqtype parameter and its associated value. Set this property to the argument value required.
-
property
smoothscoreceil
¶ Maximum value of column score for smoothing
This controls the addition of the -smoothscoreceil parameter and its associated value. Set this property to the argument value required.
-
property
smoothwindow
¶ Window used for anchor column smoothing
This controls the addition of the -smoothwindow parameter and its associated value. Set this property to the argument value required.
-
property
sp
¶ Use sum-of-pairs protein profile score (PAM200)
This property controls the addition of the -sp switch, treat this property as a boolean.
-
property
spn
¶ Use sum-of-pairs nucleotide profile score
This property controls the addition of the -spn switch, treat this property as a boolean.
-
property
spscore
¶ Compute SP objective score of multiple alignment
This controls the addition of the -spscore parameter and its associated value. Set this property to the argument value required.
-
property
stable
¶ Do not group similar sequences in output (not supported in v3.8)
This property controls the addition of the -stable switch, treat this property as a boolean.
-
property
sueff
¶ Constant used in UPGMB clustering
This controls the addition of the -sueff parameter and its associated value. Set this property to the argument value required.
-
property
sv
¶ Use sum-of-pairs profile score (VTML240)
This property controls the addition of the -sv switch, treat this property as a boolean.
-
property
tree1
¶ Save Newick tree from iteration 1
This controls the addition of the -tree1 parameter and its associated value. Set this property to the argument value required.
-
property
tree2
¶ Save Newick tree from iteration 2
This controls the addition of the -tree2 parameter and its associated value. Set this property to the argument value required.
-
property
usetree
¶ Use given Newick tree as guide tree
This controls the addition of the -usetree parameter and its associated value. Set this property to the argument value required.
-
property
verbose
¶ Write parameter settings and progress
This property controls the addition of the -verbose switch, treat this property as a boolean.
-
property
version
¶ Write version string to stdout and exit
This property controls the addition of the -version switch, treat this property as a boolean.
-
property
weight1
¶ Weighting scheme used in iteration 1
This controls the addition of the -weight1 parameter and its associated value. Set this property to the argument value required.
-
property
weight2
¶ Weighting scheme used in iteration 2
This controls the addition of the -weight2 parameter and its associated value. Set this property to the argument value required.
-
-
class
Bio.Align.Applications.
ClustalwCommandline
(cmd='clustalw', **kwargs)¶ Bases:
Bio.Application.AbstractCommandline
Command line wrapper for clustalw (version one or two).
Notes
Last checked against versions: 1.83 and 2.1
References
Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R, Thompson JD, Gibson TJ, Higgins DG. (2007). Clustal W and Clustal X version 2.0. Bioinformatics, 23, 2947-2948.
Examples
>>> from Bio.Align.Applications import ClustalwCommandline
>>> in_file = "unaligned.fasta"
>>> clustalw_cline = ClustalwCommandline("clustalw2", infile=in_file)
>>> print(clustalw_cline)
clustalw2 -infile=unaligned.fasta
You would typically run the command line with clustalw_cline() or via the Python subprocess module, as described in the Biopython tutorial.
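Unlike the MUSCLE wrapper, the printed ClustalW command joins each parameter to its value with an equals sign (`-infile=unaligned.fasta`). A minimal stand-in, assuming that convention holds for every valued parameter and that switches are written as a bare `-name`, might look like this. The function `clustalw_tokens` is hypothetical and not part of Biopython.

```python
def clustalw_tokens(cmd="clustalw", **options):
    """Build ClustalW-style tokens: bare -name switches and -name=value pairs."""
    tokens = [cmd]
    for name, value in options.items():
        if value is None or value is False:
            continue  # option not requested
        if value is True:
            tokens.append(f"-{name}")  # boolean switch, e.g. -align
        else:
            tokens.append(f"-{name}={value}")  # parameter joined with '='
    return tokens


print(" ".join(clustalw_tokens("clustalw2", infile="unaligned.fasta")))
# → clustalw2 -infile=unaligned.fasta
```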
-
__init__
(self, cmd='clustalw', **kwargs)¶ Initialize the class.
-
property
align
¶ Do full multiple alignment.
This property controls the addition of the -align switch, treat this property as a boolean.
-
property
bootlabels
¶ Node OR branch position of bootstrap values in tree display
This controls the addition of the -bootlabels parameter and its associated value. Set this property to the argument value required.
-
property
bootstrap
¶ Bootstrap a NJ tree (n= number of bootstraps; def. = 1000).
This controls the addition of the -bootstrap parameter and its associated value. Set this property to the argument value required.
-
property
case
¶ LOWER or UPPER (for GDE output only)
This controls the addition of the -case parameter and its associated value. Set this property to the argument value required.
-
property
check
¶ Outline the command line params.
This property controls the addition of the -check switch, treat this property as a boolean.
-
property
clustering
¶ NJ or UPGMA
This controls the addition of the -clustering parameter and its associated value. Set this property to the argument value required.
-
property
convert
¶ Output the input sequences in a different file format.
This property controls the addition of the -convert switch, treat this property as a boolean.
-
property
dnamatrix
¶ DNA weight matrix=IUB, CLUSTALW or filename
This controls the addition of the -dnamatrix parameter and its associated value. Set this property to the argument value required.
-
property
endgaps
¶ No end gap separation penalty.
This property controls the addition of the -endgaps switch, treat this property as a boolean.
-
property
fullhelp
¶ Output full help content.
This property controls the addition of the -fullhelp switch, treat this property as a boolean.
-
property
gapdist
¶ Gap separation penalty range
This controls the addition of the -gapdist parameter and its associated value. Set this property to the argument value required.
-
property
gapext
¶ Gap extension penalty
This controls the addition of the -gapext parameter and its associated value. Set this property to the argument value required.
-
property
gapopen
¶ Gap opening penalty
This controls the addition of the -gapopen parameter and its associated value. Set this property to the argument value required.
-
property
helixendin
¶ Number of residues inside helix to be treated as terminal
This controls the addition of the -helixendin parameter and its associated value. Set this property to the argument value required.
-
property
helixendout
¶ Number of residues outside helix to be treated as terminal
This controls the addition of the -helixendout parameter and its associated value. Set this property to the argument value required.
-
property
helixgap
¶ Gap penalty for helix core residues
This controls the addition of the -helixgap parameter and its associated value. Set this property to the argument value required.
-
property
help
¶ Outline the command line params.
This property controls the addition of the -help switch, treat this property as a boolean.
-
property
hgapresidues
¶ List hydrophilic residues.
This property controls the addition of the -hgapresidues switch, treat this property as a boolean.
-
property
infile
¶ Input sequences.
This controls the addition of the -infile parameter and its associated value. Set this property to the argument value required.
-
property
iteration
¶ NONE or TREE or ALIGNMENT
This controls the addition of the -iteration parameter and its associated value. Set this property to the argument value required.
-
property
kimura
¶ Use Kimura’s correction.
This property controls the addition of the -kimura switch, treat this property as a boolean.
-
property
ktuple
¶ Word size
This controls the addition of the -ktuple parameter and its associated value. Set this property to the argument value required.
-
property
loopgap
¶ Gap penalty for loop regions
This controls the addition of the -loopgap parameter and its associated value. Set this property to the argument value required.
-
property
matrix
¶ Protein weight matrix=BLOSUM, PAM, GONNET, ID or filename
This controls the addition of the -matrix parameter and its associated value. Set this property to the argument value required.
-
property
maxdiv
¶ Percent identity for delay
This controls the addition of the -maxdiv parameter and its associated value. Set this property to the argument value required.
-
property
maxseqlen
¶ Maximum allowed input sequence length
This controls the addition of the -maxseqlen parameter and its associated value. Set this property to the argument value required.
-
property
negative
¶ Protein alignment with negative values in matrix
This property controls the addition of the -negative switch, treat this property as a boolean.
-
property
newtree
¶ Output file name for newly created guide tree
This controls the addition of the -newtree parameter and its associated value. Set this property to the argument value required.
-
property
newtree1
¶ Output file name for new guide tree of profile1
This controls the addition of the -newtree1 parameter and its associated value. Set this property to the argument value required.
-
property
newtree2
¶ Output file for new guide tree of profile2
This controls the addition of the -newtree2 parameter and its associated value. Set this property to the argument value required.
-
property
nohgap
¶ Hydrophilic gaps off
This property controls the addition of the -nohgap switch, treat this property as a boolean.
-
property
nopgap
¶ Residue-specific gaps off
This property controls the addition of the -nopgap switch, treat this property as a boolean.
-
property
nosecstr1
¶ Do not use secondary structure-gap penalty mask for profile 1
This property controls the addition of the -nosecstr1 switch, treat this property as a boolean.
-
property
nosecstr2
¶ Do not use secondary structure-gap penalty mask for profile 2
This property controls the addition of the -nosecstr2 switch, treat this property as a boolean.
-
property
noweights
¶ Disable sequence weighting
This property controls the addition of the -noweights switch, treat this property as a boolean.
-
property
numiter
¶ Maximum number of iterations to perform
This controls the addition of the -numiter parameter and its associated value. Set this property to the argument value required.
-
property
options
¶ List the command line parameters
This property controls the addition of the -options switch, treat this property as a boolean.
-
property
outfile
¶ Output sequence alignment file name
This controls the addition of the -outfile parameter and its associated value. Set this property to the argument value required.
-
property
outorder
¶ Output taxon order: INPUT or ALIGNED
This controls the addition of the -outorder parameter and its associated value. Set this property to the argument value required.
-
property
output
¶ Output format: CLUSTAL(default), GCG, GDE, PHYLIP, PIR, NEXUS and FASTA
This controls the addition of the -output parameter and its associated value. Set this property to the argument value required.
-
property
outputtree
¶ nj OR phylip OR dist OR nexus
This controls the addition of the -outputtree parameter and its associated value. Set this property to the argument value required.
-
property
pairgap
¶ Gap penalty
This controls the addition of the -pairgap parameter and its associated value. Set this property to the argument value required.
-
property
pim
¶ Output percent identity matrix (while calculating the tree).
This property controls the addition of the -pim switch, treat this property as a boolean.
-
property
profile
¶ Merge two alignments by profile alignment
This property controls the addition of the -profile switch, treat this property as a boolean.
-
property
profile1
¶ Profiles (old alignment).
This controls the addition of the -profile1 parameter and its associated value. Set this property to the argument value required.
-
property
profile2
¶ Profiles (old alignment).
This controls the addition of the -profile2 parameter and its associated value. Set this property to the argument value required.
-
property
pwdnamatrix
¶ DNA weight matrix=IUB, CLUSTALW or filename
This controls the addition of the -pwdnamatrix parameter and its associated value. Set this property to the argument value required.
-
property
pwgapext
¶ Gap extension penalty
This controls the addition of the -pwgapext parameter and its associated value. Set this property to the argument value required.
-
property
pwgapopen
¶ Gap opening penalty
This controls the addition of the -pwgapopen parameter and its associated value. Set this property to the argument value required.
-
property
pwmatrix
¶ Protein weight matrix=BLOSUM, PAM, GONNET, ID or filename
This controls the addition of the -pwmatrix parameter and its associated value. Set this property to the argument value required.
-
property
quicktree
¶ Use FAST algorithm for the alignment guide tree
This property controls the addition of the -quicktree switch, treat this property as a boolean.
-
property
quiet
¶ Reduce console output to minimum
This property controls the addition of the -quiet switch, treat this property as a boolean.
-
property
range
¶ Sequence range to write, starting at m and ending at m+n. Input as a string, e.g. ‘24,200’
This controls the addition of the -range parameter and its associated value. Set this property to the argument value required.
-
property
score
¶ Either: PERCENT or ABSOLUTE
This controls the addition of the -score parameter and its associated value. Set this property to the argument value required.
-
property
secstrout
¶ STRUCTURE or MASK or BOTH or NONE output in alignment file
This controls the addition of the -secstrout parameter and its associated value. Set this property to the argument value required.
-
property
seed
¶ Seed number for bootstraps.
This controls the addition of the -seed parameter and its associated value. Set this property to the argument value required.
-
property
seqno_range
¶ OFF or ON (NEW- for all output formats)
This controls the addition of the -seqno_range parameter and its associated value. Set this property to the argument value required.
-
property
seqnos
¶ OFF or ON (for Clustal output only)
This controls the addition of the -seqnos parameter and its associated value. Set this property to the argument value required.
-
property
sequences
¶ Sequentially add profile2 sequences to profile1 alignment
This property controls the addition of the -sequences switch, treat this property as a boolean.
-
property
stats
¶ Log some alignment statistics to file
This controls the addition of the -stats parameter and its associated value. Set this property to the argument value required.
-
property
strandendin
¶ Number of residues inside strand to be treated as terminal
This controls the addition of the -strandendin parameter and its associated value. Set this property to the argument value required.
-
property
strandendout
¶ Number of residues outside strand to be treated as terminal
This controls the addition of the -strandendout parameter and its associated value. Set this property to the argument value required.
-
property
strandgap
¶ Gap penalty for strand core residues
This controls the addition of the -strandgap parameter and its associated value. Set this property to the argument value required.
-
property
terminalgap
¶ Gap penalty for structure termini
This controls the addition of the -terminalgap parameter and its associated value. Set this property to the argument value required.
-
property
topdiags
¶ Number of best diagonals.
This controls the addition of the -topdiags parameter and its associated value. Set this property to the argument value required.
-
property
tossgaps
¶ Ignore positions with gaps.
This property controls the addition of the -tossgaps switch, treat this property as a boolean.
-
property
transweight
¶ Transitions weighting
This controls the addition of the -transweight parameter and its associated value. Set this property to the argument value required.
-
property
tree
¶ Calculate NJ tree.
This property controls the addition of the -tree switch, treat this property as a boolean.
-
property
type
¶ PROTEIN or DNA sequences
This controls the addition of the -type parameter and its associated value. Set this property to the argument value required.
-
property
usetree
¶ File name of guide tree
This controls the addition of the -usetree parameter and its associated value. Set this property to the argument value required.
-
property
usetree1
¶ File name of guide tree for profile1
This controls the addition of the -usetree1 parameter and its associated value. Set this property to the argument value required.
-
property
usetree2
¶ File name of guide tree for profile2
This controls the addition of the -usetree2 parameter and its associated value. Set this property to the argument value required.
-
property
window
¶ Window around best diagonals.
This controls the addition of the -window parameter and its associated value. Set this property to the argument value required.
-
-
class
Bio.Align.Applications.
ClustalOmegaCommandline
(cmd='clustalo', **kwargs)¶ Bases:
Bio.Application.AbstractCommandline
Command line wrapper for clustal omega.
Notes
Last checked against version: 1.2.0
References
Sievers F, Wilm A, Dineen DG, Gibson TJ, Karplus K, Li W, Lopez R, McWilliam H, Remmert M, Söding J, Thompson JD, Higgins DG (2011). Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Molecular Systems Biology 7:539 https://doi.org/10.1038/msb.2011.75
Examples
>>> from Bio.Align.Applications import ClustalOmegaCommandline
>>> in_file = "unaligned.fasta"
>>> out_file = "aligned.fasta"
>>> clustalomega_cline = ClustalOmegaCommandline(infile=in_file, outfile=out_file, verbose=True, auto=True)
>>> print(clustalomega_cline)
clustalo -i unaligned.fasta -o aligned.fasta --auto -v
You would typically run the command line with clustalomega_cline() or via the Python subprocess module, as described in the Biopython tutorial.
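For Clustal Omega the Python property names differ from the flags they emit (for example, infile becomes `-i` and verbose becomes `-v` in the printed command). The following stand-in sketches that mapping for a handful of options taken from this page. The table `FLAGS`, the function `clustalo_tokens`, and the fixed output order are assumptions of the sketch, not Biopython's implementation.

```python
# Property name -> command-line flag, limited to a few options for illustration.
FLAGS = {
    "infile": "-i",
    "outfile": "-o",
    "auto": "--auto",
    "force": "--force",
    "verbose": "-v",
}


def clustalo_tokens(cmd="clustalo", **options):
    """Build Clustal Omega tokens from property-style keyword arguments."""
    unknown = set(options) - set(FLAGS)
    if unknown:
        raise ValueError(f"not covered by this sketch: {sorted(unknown)}")
    tokens = [cmd]
    # Emit flags in table order so the output is predictable.
    for name, flag in FLAGS.items():
        value = options.get(name)
        if value is None or value is False:
            continue
        if value is True:
            tokens.append(flag)  # boolean switch
        else:
            tokens.extend([flag, str(value)])  # flag followed by its value
    return tokens


print(" ".join(clustalo_tokens(infile="unaligned.fasta", outfile="aligned.fasta",
                               verbose=True, auto=True)))
# → clustalo -i unaligned.fasta -o aligned.fasta --auto -v
```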
-
__init__
(self, cmd='clustalo', **kwargs)¶ Initialize the class.
-
property
auto
¶ Set options automatically (might overwrite some of your options)
This property controls the addition of the --auto switch, treat this property as a boolean.
-
property
clusteringout
¶ Clustering output file
This controls the addition of the --clustering-out parameter and its associated value. Set this property to the argument value required.
-
property
clustersize
¶ Soft maximum of sequences in sub-clusters
This controls the addition of the --cluster-size parameter and its associated value. Set this property to the argument value required.
-
property
dealign
¶ Dealign input sequences
This property controls the addition of the --dealign switch, treat this property as a boolean.
-
property
distmat_full
¶ Use full distance matrix for guide-tree calculation (slow; mBed is default)
This property controls the addition of the --full switch, treat this property as a boolean.
-
property
distmat_full_iter
¶ Use full distance matrix for guide-tree calculation during iteration (mBed is default)
This property controls the addition of the --full-iter switch, treat this property as a boolean.
-
property
distmat_in
¶ Pairwise distance matrix input file (skips distance computation).
This controls the addition of the --distmat-in parameter and its associated value. Set this property to the argument value required.
-
property
distmat_out
¶ Pairwise distance matrix output file.
This controls the addition of the --distmat-out parameter and its associated value. Set this property to the argument value required.
-
property
force
¶ Force file overwriting.
This property controls the addition of the --force switch, treat this property as a boolean.
-
property
guidetree_in
¶ Guide tree input file (skips distance computation and guide-tree clustering step).
This controls the addition of the --guidetree-in parameter and its associated value. Set this property to the argument value required.
-
property
guidetree_out
¶ Guide tree output file.
This controls the addition of the --guidetree-out parameter and its associated value. Set this property to the argument value required.
-
property
help
¶ Print help and exit.
This property controls the addition of the -h switch, treat this property as a boolean.
-
property
hmm_input
¶ HMM input files
This controls the addition of the --hmm-in parameter and its associated value. Set this property to the argument value required.
-
property
infile
¶ Multiple sequence input file
This controls the addition of the -i parameter and its associated value. Set this property to the argument value required.
-
property
infmt
¶ Forced sequence input file format (default: auto)
Allowed values: a2m, fa[sta], clu[stal], msf, phy[lip], selex, st[ockholm], vie[nna]
This controls the addition of the --infmt parameter and its associated value. Set this property to the argument value required.
-
property
isprofile
¶ disable check if profile, force profile (default no)
This property controls the addition of the --is-profile switch, treat this property as a boolean.
-
property
iterations
¶ Number of (combined guide-tree/HMM) iterations
This controls the addition of the --iterations parameter and its associated value. Set this property to the argument value required.
-
property
log
¶ Log all non-essential output to this file.
This controls the addition of the -l parameter and its associated value. Set this property to the argument value required.
-
property
long_version
¶ Print long version information and exit
This property controls the addition of the --long-version switch, treat this property as a boolean.
-
property
max_guidetree_iterations
¶ Maximum number of guidetree iterations
This controls the addition of the --max-guidetree-iterations parameter and its associated value. Set this property to the argument value required.
-
property
max_hmm_iterations
¶ Maximum number of HMM iterations
This controls the addition of the --max-hmm-iterations parameter and its associated value. Set this property to the argument value required.
-
property
maxnumseq
¶ Maximum allowed number of sequences
This controls the addition of the --maxnumseq parameter and its associated value. Set this property to the argument value required.
-
property
maxseqlen
¶ Maximum allowed sequence length
This controls the addition of the --maxseqlen parameter and its associated value. Set this property to the argument value required.
-
property
outfile
¶ Multiple sequence alignment output file (default: stdout).
This controls the addition of the -o parameter and its associated value. Set this property to the argument value required.
-
property
outfmt
¶ MSA output file format: a2m=fa[sta],clu[stal],msf,phy[lip],selex,st[ockholm],vie[nna] (default: fasta).
This controls the addition of the --outfmt parameter and its associated value. Set this property to the argument value required.
-
property
outputorder
¶ MSA output order like in input/guide-tree
This controls the addition of the --output-order parameter and its associated value. Set this property to the argument value required.
-
property
percentid
¶ convert distances into percent identities (default no)
This property controls the addition of the --percent-id switch, treat this property as a boolean.
-
property
profile1
¶ Pre-aligned multiple sequence file (aligned columns will be kept fixed).
This controls the addition of the --profile1 parameter and its associated value. Set this property to the argument value required.
-
property
profile2
¶ Pre-aligned multiple sequence file (aligned columns will be kept fixed).
This controls the addition of the --profile2 parameter and its associated value. Set this property to the argument value required.
-
property
residuenumber
¶ in Clustal format print residue numbers (default no)
This property controls the addition of the --residuenumber switch, treat this property as a boolean.
-
property
seqtype
¶ {Protein, RNA, DNA} Force a sequence type (default: auto).
This controls the addition of the -t parameter and its associated value. Set this property to the argument value required.
-
property
threads
¶ Number of processors to use
This controls the addition of the --threads parameter and its associated value. Set this property to the argument value required.
-
property
usekimura
¶ use Kimura distance correction for aligned sequences (default no)
This property controls the addition of the --use-kimura switch, treat this property as a boolean.
-
property
verbose
¶ Verbose output
This property controls the addition of the -v switch, treat this property as a boolean.
-
property
version
¶ Print version information and exit
This property controls the addition of the --version switch, treat this property as a boolean.
-
property
wrap
¶ number of residues before line-wrap in output
This controls the addition of the --wrap parameter and its associated value. Set this property to the argument value required.
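The output file and force properties above interact: the force switch exists to allow overwriting an existing output file. The following is a minimal sketch of building an argument list with that in mind, using only the standard library. The helper name clustalo_args and its overwrite and threads arguments are hypothetical, and it assumes the executable is called clustalo and accepts option values as separate arguments.

```python
import os
import tempfile


def clustalo_args(infile, outfile, overwrite=False, threads=None):
    """Build an argument list for clustalo (hypothetical helper, not Biopython API)."""
    args = ["clustalo", "-i", infile, "-o", outfile]
    if os.path.exists(outfile):
        if not overwrite:
            # Fail early in Python rather than letting the external tool fail.
            raise FileExistsError(f"{outfile} exists; pass overwrite=True")
        args.append("--force")
    if threads is not None:
        args += ["--threads", str(threads)]
    return args


# A temporary file stands in for an alignment left over from an earlier run.
with tempfile.TemporaryDirectory() as tmp:
    existing = os.path.join(tmp, "aligned.fasta")
    open(existing, "w").close()
    print(clustalo_args("unaligned.fasta", existing, overwrite=True, threads=4))
```

The resulting list can be passed to subprocess.run; --force is added only when an existing file would otherwise be in the way.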
-
-
class
Bio.Align.Applications.
PrankCommandline
(cmd='prank', **kwargs)¶ Bases:
Bio.Application.AbstractCommandline
Command line wrapper for the multiple alignment program PRANK.
http://www.ebi.ac.uk/goldman-srv/prank/prank/
Notes
Last checked against version: 081202
References
Loytynoja, A. and Goldman, N. 2005. An algorithm for progressive multiple alignment of sequences with insertions. Proceedings of the National Academy of Sciences, 102: 10557–10562.
Loytynoja, A. and Goldman, N. 2008. Phylogeny-aware gap placement prevents errors in sequence alignment and evolutionary analysis. Science, 320: 1632.
Examples
To align a FASTA file (unaligned.fasta), writing aligned FASTA output to files whose names start with “aligned” (you can’t pick the filename explicitly), with no tree output and no XML output, use:
>>> from Bio.Align.Applications import PrankCommandline
>>> prank_cline = PrankCommandline(d="unaligned.fasta",
...                                o="aligned",  # prefix only!
...                                f=8,  # FASTA output
...                                notree=True, noxml=True)
>>> print(prank_cline)
prank -d=unaligned.fasta -o=aligned -f=8 -noxml -notree
You would typically run the command line with prank_cline() or via the Python subprocess module, as described in the Biopython tutorial.
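Because PRANK takes only an output prefix, the actual output filenames have to be discovered after the run. A minimal sketch of doing so with the standard library follows. The helper find_prank_outputs is hypothetical, and the filenames created here are placeholders that simulate a finished run; real names depend on the PRANK version and the requested format.

```python
import glob
import os
import tempfile


def find_prank_outputs(prefix, directory="."):
    """Return sorted paths in `directory` whose names start with `prefix` + '.'."""
    pattern = os.path.join(glob.escape(directory), glob.escape(prefix) + ".*")
    return sorted(glob.glob(pattern))


# Simulate a finished run with placeholder filenames (not real PRANK output).
with tempfile.TemporaryDirectory() as tmp:
    for name in ("aligned.2.fas", "aligned.2.dnd", "other.txt"):
        open(os.path.join(tmp, name), "w").close()
    found = [os.path.basename(p) for p in find_prank_outputs("aligned", tmp)]
    print(found)
```

The alignment file can then be picked from the returned list by its extension before parsing.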
-
__init__
(self, cmd='prank', **kwargs)¶ Initialize the class.
-
property
F
¶ Force insertions to be always skipped: same as +F
This property controls the addition of the -F switch, treat this property as a boolean.
-
property
codon
¶ Codon aware alignment or not
This property controls the addition of the -codon switch, treat this property as a boolean.
-
property
convert
¶ Convert input alignment to new format. Do not perform alignment
This property controls the addition of the -convert switch, treat this property as a boolean.
-
property
d
¶ Input filename
This controls the addition of the -d parameter and its associated value. Set this property to the argument value required.
-
property
dnafreqs
¶ DNA frequencies - ‘A,C,G,T’. eg ‘25,25,25,25’ as a quote surrounded string value. Default: empirical
This controls the addition of the -dnafreqs parameter and its associated value. Set this property to the argument value required.
-
property
dots
¶ Show insertion gaps as dots
This property controls the addition of the -dots switch, treat this property as a boolean.
-
property
f
¶ Output alignment format. Default: 8 (FASTA). Options are:
1. IG/Stanford
2. GenBank/GB
3. NBRF
4. EMBL
6. DNAStrider
7. Fitch
8. Pearson/Fasta
11. Phylip3.2
12. Phylip
14. PIR/CODATA
15. MSF
17. PAUP/NEXUS
This controls the addition of the -f parameter and its associated value. Set this property to the argument value required.
-
property
fixedbranches
¶ Use fixed branch lengths of input value
This controls the addition of the -fixedbranches parameter and its associated value. Set this property to the argument value required.
-
property
gapext
¶ Gap extension probability. Default: dna 0.5 / prot 0.5
This controls the addition of the -gapext parameter and its associated value. Set this property to the argument value required.
-
property
gaprate
¶ Gap opening rate. Default: dna 0.025 prot 0.0025
This controls the addition of the -gaprate parameter and its associated value. Set this property to the argument value required.
-
property
kappa
¶ Transition/transversion ratio. Default: 2
This controls the addition of the -kappa parameter and its associated value. Set this property to the argument value required.
-
property
longseq
¶ Save space in pairwise alignments
This property controls the addition of the -longseq switch, treat this property as a boolean.
-
property
m
¶ User-defined alignment model filename. Default: HKY2/WAG
This controls the addition of the -m parameter and its associated value. Set this property to the argument value required.
-
property
matinitsize
¶ Matrix initial size multiplier
This controls the addition of the -matinitsize parameter and its associated value. Set this property to the argument value required.
-
property
matresize
¶ Matrix resizing multiplier
This controls the addition of the -matresize parameter and its associated value. Set this property to the argument value required.
-
property
maxbranches
¶ Use maximum branch lengths of input value
This controls the addition of the -maxbranches parameter and its associated value. Set this property to the argument value required.
-
property
mttranslate
¶ Translate to protein using mt table
This property controls the addition of the -mttranslate switch, treat this property as a boolean.
-
property
nopost
¶ Do not compute posterior support. Default: compute
This property controls the addition of the -nopost switch, treat this property as a boolean.
-
property
notree
¶ Do not output dnd tree files (PRANK versions earlier than v.120626)
This property controls the addition of the -notree switch, treat this property as a boolean.
-
property
noxml
¶ Do not output XML files (PRANK versions earlier than v.120626)
This property controls the addition of the -noxml switch, treat this property as a boolean.
-
property
o
¶ Output filenames prefix. Default: ‘output’
Will write: output.?.fas (depending on requested format), output.?.xml and output.?.dnd
This controls the addition of the -o parameter and its associated value. Set this property to the argument value required.
-
property
once
¶ Run only once. Default: twice if no guidetree given
This property controls the addition of the -once switch, treat this property as a boolean.
-
property
printnodes
¶ Output each node; mostly for debugging
This property controls the addition of the -printnodes switch, treat this property as a boolean.
-
property
pwdist
¶ Expected pairwise distance for computing guidetree. Default: dna 0.25 / prot 0.5
This controls the addition of the -pwdist parameter and its associated value. Set this property to the argument value required.
-
property
pwgenomic
¶ Do pairwise alignment, no guidetree
This property controls the addition of the -pwgenomic switch, treat this property as a boolean.
-
property
pwgenomicdist
¶ Distance for pairwise alignment. Default: 0.3
This controls the addition of the -pwgenomicdist parameter and its associated value. Set this property to the argument value required.
-
property
quiet
¶ Reduce verbosity
This property controls the addition of the -quiet switch, treat this property as a boolean.
-
property
realbranches
¶ Disable branch length truncation
This property controls the addition of the -realbranches switch, treat this property as a boolean.
-
property
rho
¶ Purine/pyrimidine ratio. Default: 1
This controls the addition of the -rho parameter and its associated value. Set this property to the argument value required.
-
property
scalebranches
¶ Scale branch lengths. Default: dna 1 / prot 2
This controls the addition of the -scalebranches parameter and its associated value. Set this property to the argument value required.
-
property
shortnames
¶ Truncate names at first space
This property controls the addition of the -shortnames switch, treat this property as a boolean.
-
property
showtree
¶ Output dnd tree files (PRANK v.120626 and later)
This property controls the addition of the -showtree switch, treat this property as a boolean.
-
property
showxml
¶ Output XML files (PRANK v.120626 and later)
This property controls the addition of the -showxml switch, treat this property as a boolean.
-
property
skipins
¶ Skip insertions in posterior support
This property controls the addition of the -skipins switch, treat this property as a boolean.
-
property
t
¶ Input guide tree filename
This controls the addition of the -t parameter and its associated value. Set this property to the argument value required.
-
property
termgap
¶ Penalise terminal gaps normally
This property controls the addition of the -termgap switch, treat this property as a boolean.
-
property
translate
¶ Translate to protein
This property controls the addition of the -translate switch, treat this property as a boolean.
-
property
tree
¶ Input guide tree as Newick string
This controls the addition of the -tree parameter and its associated value. Set this property to the argument value required.
-
property
twice
¶ Always run twice
This property controls the addition of the -twice switch, treat this property as a boolean.
-
property
uselogs
¶ Slower but should work for a greater number of sequences
This property controls the addition of the -uselogs switch, treat this property as a boolean.
-
property
writeanc
¶ Output ancestral sequences
This property controls the addition of the -writeanc switch, treat this property as a boolean.
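The notree/noxml and showtree/showxml properties above apply to different PRANK releases, split at v.120626. The following sketch shows one way to choose between them. The function tree_xml_switches is hypothetical. It assumes version stamps such as 081202 and 120626 can be compared as integers, and that older releases write tree and XML files unless told not to, which is inferred from the option descriptions rather than stated in them.

```python
def tree_xml_switches(version, want_tree=False, want_xml=False):
    """Pick PRANK switches for tree/XML output given a numeric version stamp."""
    switches = []
    if int(version) < 120626:
        # Assumed: older releases write these files unless suppressed.
        if not want_tree:
            switches.append("-notree")
        if not want_xml:
            switches.append("-noxml")
    else:
        # Assumed: newer releases write these files only on request.
        if want_tree:
            switches.append("-showtree")
        if want_xml:
            switches.append("-showxml")
    return switches


print(tree_xml_switches("081202"))                  # version from the Notes above
print(tree_xml_switches("140603", want_tree=True))  # example later version stamp
```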
-
-
class
Bio.Align.Applications.
MafftCommandline
(cmd='mafft', **kwargs)¶ Bases:
Bio.Application.AbstractCommandline
Command line wrapper for the multiple alignment program MAFFT.
http://align.bmr.kyushu-u.ac.jp/mafft/software/
Notes
Last checked against version: MAFFT v6.717b (2009/12/03)
References
Katoh, Toh (BMC Bioinformatics 9:212, 2008) Improved accuracy of multiple ncRNA alignment by incorporating structural information into a MAFFT-based framework (describes RNA structural alignment methods)
Katoh, Toh (Briefings in Bioinformatics 9:286-298, 2008) Recent developments in the MAFFT multiple sequence alignment program (outlines version 6)
Katoh, Toh (Bioinformatics 23:372-374, 2007) Errata PartTree: an algorithm to build an approximate tree from a large number of unaligned sequences (describes the PartTree algorithm)
Katoh, Kuma, Toh, Miyata (Nucleic Acids Res. 33:511-518, 2005) MAFFT version 5: improvement in accuracy of multiple sequence alignment (describes [ancestral versions of] the G-INS-i, L-INS-i and E-INS-i strategies)
Katoh, Misawa, Kuma, Miyata (Nucleic Acids Res. 30:3059-3066, 2002) MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform
Examples
>>> from Bio.Align.Applications import MafftCommandline
>>> mafft_exe = "/opt/local/mafft"
>>> in_file = "../Doc/examples/opuntia.fasta"
>>> mafft_cline = MafftCommandline(mafft_exe, input=in_file)
>>> print(mafft_cline)
/opt/local/mafft ../Doc/examples/opuntia.fasta
If the mafft binary is on the path (typically the case on a Unix style operating system) then you don’t need to supply the executable location:
>>> from Bio.Align.Applications import MafftCommandline
>>> in_file = "../Doc/examples/opuntia.fasta"
>>> mafft_cline = MafftCommandline(input=in_file)
>>> print(mafft_cline)
mafft ../Doc/examples/opuntia.fasta
You would typically run the command line with mafft_cline() or via the Python subprocess module, as described in the Biopython tutorial.
Note that MAFFT will write the alignment to stdout, which you may want to save to a file and then parse, e.g.:
stdout, stderr = mafft_cline()
with open("aligned.fasta", "w") as handle:
    handle.write(stdout)
from Bio import AlignIO
align = AlignIO.read("aligned.fasta", "fasta")
Alternatively, to parse the output with AlignIO directly you can use StringIO to turn the string into a handle:
stdout, stderr = mafft_cline()
from io import StringIO  # Python 3 location of StringIO
from Bio import AlignIO
align = AlignIO.read(StringIO(stdout), "fasta")
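The same capture-and-wrap pattern also works with the Python subprocess module mentioned above. This is a minimal sketch, not part of the wrapper: run_to_handle is a hypothetical helper, and a small Python one-liner stands in for MAFFT so the example runs without MAFFT installed.

```python
import subprocess
import sys
from io import StringIO


def run_to_handle(args):
    """Run a command and return its stdout wrapped in a text handle."""
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return StringIO(result.stdout)


# Stand-in command; with MAFFT on the path this could instead be
# ["mafft", "../Doc/examples/opuntia.fasta"].
fake_mafft = [sys.executable, "-c", "print('>seq1'); print('ACGT-')"]
handle = run_to_handle(fake_mafft)
print(handle.read())
```

The returned handle can be passed to AlignIO.read(handle, "fasta") in place of a file on disk. With check=True, a non-zero exit status raises subprocess.CalledProcessError.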
-
__init__
(self, cmd='mafft', **kwargs)¶ Initialize the class.
-
property
LEXP
¶ Gap extension penalty to skip the alignment. Default: 0.00
This controls the addition of the --LEXP parameter and its associated value. Set this property to the argument value required.
-
property
LOP
¶ Gap opening penalty to skip the alignment. Default: -6.00
This controls the addition of the --LOP parameter and its associated value. Set this property to the argument value required.
-
property
aamatrix
¶ Use a user-defined AA scoring matrix. Default: BLOSUM62
This controls the addition of the --aamatrix parameter and its associated value. Set this property to the argument value required.
-
property
adjustdirection
¶ Adjust direction according to the first sequence. Default off.
This property controls the addition of the --adjustdirection switch, treat this property as a boolean.
-
property
adjustdirectionaccurately
¶ Adjust direction according to the first sequence, for highly diverged data; very slow. Default off.
This property controls the addition of the --adjustdirectionaccurately switch, treat this property as a boolean.
-
property
amino
¶ Assume the sequences are amino acid (True/False). Default: auto
This property controls the addition of the --amino switch, treat this property as a boolean.
-
property
auto
¶ Automatically select strategy. Default off.
This property controls the addition of the --auto switch, treat this property as a boolean.
-
property
bl
¶ BLOSUM number matrix is used. Default: 62
This controls the addition of the --bl parameter and its associated value. Set this property to the argument value required.
-
property
clustalout
¶ Output format: clustal (True) or fasta (False, default)
This property controls the addition of the --clustalout switch, treat this property as a boolean.
-
property
dpparttree
¶ The PartTree algorithm is used with distances based on DP. Default: off
This property controls the addition of the --dpparttree switch, treat this property as a boolean.
-
property
ep
¶ Offset value, which works like gap extension penalty, for group-to-group alignment. Default: 0.123
This controls the addition of the --ep parameter and its associated value. Set this property to the argument value required.
-
property
fastapair
¶ All pairwise alignments are computed with FASTA (Pearson and Lipman 1988). Default: off
This property controls the addition of the --fastapair switch, treat this property as a boolean.
-
property
fastaparttree
¶ The PartTree algorithm is used with distances based on FASTA. Default: off
This property controls the addition of the --fastaparttree switch, treat this property as a boolean.
-
property
fft
¶ Use FFT approximation in group-to-group alignment. Default: on
This property controls the addition of the --fft switch, treat this property as a boolean.
-
property
fmodel
¶ Incorporate the AA/nuc composition information into the scoring matrix (True) or not (False, default)
This property controls the addition of the --fmodel switch, treat this property as a boolean.
-
property
genafpair
¶ All pairwise alignments are computed with a local algorithm with the generalized affine gap cost (Altschul 1998). Default: off
This property controls the addition of the --genafpair switch, treat this property as a boolean.
-
property
globalpair
¶ All pairwise alignments are computed with the Needleman-Wunsch algorithm. Default: off
This property controls the addition of the --globalpair switch, treat this property as a boolean.
-
property
groupsize
¶ Do not make alignments larger than this number of sequences. Default: the number of input sequences
This property controls the addition of the --groupsize switch, treat this property as a boolean.
-
property
input
¶ Input file name
This controls the addition of the input parameter and its associated value. Set this property to the argument value required.
-
property
input1
¶ Second input file name for the mafft-profile command
This controls the addition of the input1 parameter and its associated value. Set this property to the argument value required.
-
property
inputorder
¶ Output order: same as input (True, default) or alignment based (False)
This property controls the addition of the --inputorder switch, treat this property as a boolean.
-
property
jtt
¶ JTT PAM number (Jones et al. 1992) matrix is used. number>0. Default: BLOSUM62
This controls the addition of the --jtt parameter and its associated value. Set this property to the argument value required.
-
property
lep
¶ Offset value at local pairwise alignment. Default: 0.1
This controls the addition of the --lep parameter and its associated value. Set this property to the argument value required.
-
property
lexp
¶ Gap extension penalty at local pairwise alignment. Default: -0.1
This controls the addition of the --lexp parameter and its associated value. Set this property to the argument value required.
-
property
localpair
¶ All pairwise alignments are computed with the Smith-Waterman algorithm. Default: off
This property controls the addition of the --localpair switch, treat this property as a boolean.
-
property
lop
¶ Gap opening penalty at local pairwise alignment. Default: -2.00
This controls the addition of the --lop parameter and its associated value. Set this property to the argument value required.
-
property
maxiterate
¶ Number of cycles of iterative refinement to perform. Default: 0
This controls the addition of the --maxiterate parameter and its associated value. Set this property to the argument value required.
-
property
memsave
¶ Use the Myers-Miller (1988) algorithm. Default: automatically turned on when the alignment length exceeds 10,000 (aa/nt).
This property controls the addition of the --memsave switch, treat this property as a boolean.
-
property
namelength
¶ Name length in CLUSTAL and PHYLIP output.
MAFFT v6.847 (2011) added --namelength for use with the --clustalout option for CLUSTAL output.
MAFFT v7.024 (2013) added support for this with the --phylipout option for PHYLIP output (default 10).
This controls the addition of the --namelength parameter and its associated value. Set this property to the argument value required.
-
property
nofft
¶ Do not use FFT approximation in group-to-group alignment. Default: off
This property controls the addition of the --nofft switch, treat this property as a boolean.
-
property
noscore
¶ Alignment score is not checked in the iterative refinement stage. Default: off (score is checked)
This property controls the addition of the --noscore switch, treat this property as a boolean.
-
property
nuc
¶ Assume the sequences are nucleotide (True/False). Default: auto
This property controls the addition of the --nuc switch, treat this property as a boolean.
-
property
op
¶ Gap opening penalty at group-to-group alignment. Default: 1.53
This controls the addition of the --op parameter and its associated value. Set this property to the argument value required.
-
property
partsize
¶ The number of partitions in the PartTree algorithm. Default: 50
This controls the addition of the --partsize parameter and its associated value. Set this property to the argument value required.
-
property
parttree
¶ Use a fast tree-building method with the 6mer distance. Default: off
This property controls the addition of the --parttree switch, treat this property as a boolean.
-
property
phylipout
¶ Output format: phylip (True), or fasta (False, default)
This property controls the addition of the --phylipout switch, treat this property as a boolean.
-
property
quiet
¶ Suppress progress reporting (True) or report progress (False, default).
This property controls the addition of the --quiet switch, treat this property as a boolean.
-
property
reorder
¶ Output order: aligned (True) or in input order (False, default)
This property controls the addition of the --reorder switch, treat this property as a boolean.
-
property
retree
¶ Number of times the guide tree is built in the progressive stage. Valid with 6mer distance. Default: 2
This controls the addition of the --retree parameter and its associated value. Set this property to the argument value required.
-
property
seed
¶ Seed alignments given in alignment_n (fasta format) are aligned with sequences in input.
This controls the addition of the --seed parameter and its associated value. Set this property to the argument value required.
-
property
sixmerpair
¶ Distance is calculated based on the number of shared 6mers. Default: on
This property controls the addition of the --6merpair switch, treat this property as a boolean.
-
property
thread
¶ Number of threads to use. Default: 1
This controls the addition of the --thread parameter and its associated value. Set this property to the argument value required.
-
property
tm
¶ Transmembrane PAM number (Jones et al. 1994) matrix is used. number>0. Default: BLOSUM62
This controls the addition of the --tm parameter and its associated value. Set this property to the argument value required.
-
property
treeout
¶ Guide tree is output to the input.tree file (True) or not (False, default)
This property controls the addition of the --treeout switch, treat this property as a boolean.
-
property
weighti
¶ Weighting factor for the consistency term calculated from pairwise alignments. Default: 2.7
This controls the addition of the --weighti parameter and its associated value. Set this property to the argument value required.
-
-
class
Bio.Align.Applications.
DialignCommandline
(cmd='dialign2-2', **kwargs)¶ Bases:
Bio.Application.AbstractCommandline
Command line wrapper for the multiple alignment program DIALIGN2-2.
http://bibiserv.techfak.uni-bielefeld.de/dialign/welcome.html
Notes
Last checked against version: 2.2
References
B. Morgenstern (2004). DIALIGN: Multiple DNA and Protein Sequence Alignment at BiBiServ. Nucleic Acids Research 32, W33-W36.
Examples
To align a FASTA file (unaligned.fasta) with output files named aligned.*, including a FASTA output file (aligned.fa), use:
>>> from Bio.Align.Applications import DialignCommandline
>>> dialign_cline = DialignCommandline(input="unaligned.fasta",
...                                    fn="aligned", fa=True)
>>> print(dialign_cline)
dialign2-2 -fa -fn aligned unaligned.fasta
You would typically run the command line with dialign_cline() or via the Python subprocess module, as described in the Biopython tutorial.
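The printed command above shows the two kinds of property at work: boolean switches (fa) add a bare flag, while valued parameters (fn) add a flag followed by its value, and the input file comes last. The following sketch reproduces that mapping with plain Python. The build_command helper is hypothetical and is not how the wrapper is implemented.

```python
def build_command(cmd, switches=(), params=(), positional=()):
    """Assemble an argument list from switches, valued parameters and positionals."""
    parts = [cmd]
    # A switch contributes its flag only when enabled.
    parts += [flag for flag, enabled in switches if enabled]
    # A parameter contributes its flag and value only when a value is set.
    for flag, value in params:
        if value is not None:
            parts += [flag, str(value)]
    parts += list(positional)
    return parts


args = build_command(
    "dialign2-2",
    switches=[("-fa", True), ("-cw", False)],
    params=[("-fn", "aligned")],
    positional=["unaligned.fasta"],
)
print(" ".join(args))  # dialign2-2 -fa -fn aligned unaligned.fasta
```

Keeping the arguments as a list, rather than a single string, allows them to be passed to subprocess.run without shell quoting.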
-
__init__
(self, cmd='dialign2-2', **kwargs)¶ Initialize the class.
-
property
afc
¶ Creates additional output file ‘*.afc’ containing data of all fragments considered for alignment WARNING: this file can be HUGE !
This property controls the addition of the -afc switch, treat this property as a boolean.
-
property
afc_v
¶ Like ‘-afc’ but verbose: fragments are explicitly printed. WARNING: this file can be EVEN BIGGER !
This property controls the addition of the -afc_v switch, treat this property as a boolean.
-
property
anc
¶ Anchored alignment. Requires a file <seq_file>.anc containing anchor points.
This property controls the addition of the -anc switch, treat this property as a boolean.
-
property
cs
¶ If segments are translated, not only the ‘Watson strand’ but also the ‘Crick strand’ is looked at.
This property controls the addition of the -cs switch, treat this property as a boolean.
-
property
cw
¶ Additional output file in CLUSTAL W format.
This property controls the addition of the -cw switch, treat this property as a boolean.
-
property
ds
¶ ‘dna alignment speed up’ - non-translated nucleic acid fragments are taken into account only if they start with at least two matches. Speeds up DNA alignment at the expense of sensitivity.
This property controls the addition of the -ds switch, treat this property as a boolean.
-
property
fa
¶ Additional output file in FASTA format.
This property controls the addition of the -fa switch, treat this property as a boolean.
-
property
ff
¶ Creates file *.frg containing information about all fragments that are part of the respective optimal pairwise alignments plus information about consistency in the multiple alignment
This property controls the addition of the -ff switch, treat this property as a boolean.
-
property
fn
¶ Output files are named <out_file>.<extension>.
This controls the addition of the -fn parameter and its associated value. Set this property to the argument value required.
-
property
fop
¶ Creates file *.fop containing coordinates of all fragments that are part of the respective pairwise alignments.
This property controls the addition of the -fop switch, treat this property as a boolean.
-
property
fsm
¶ Creates file *.fsm containing coordinates of all fragments that are part of the final alignment
This property controls the addition of the -fsm switch, treat this property as a boolean.
-
property
input
¶ Input file name. Must be FASTA format
This controls the addition of the input parameter and its associated value. Set this property to the argument value required.
-
property
iw
¶ Overlap weights switched off (by default, overlap weights are used if up to 35 sequences are aligned). This option speeds up the alignment but may lead to reduced alignment quality.
This property controls the addition of the -iw switch, treat this property as a boolean.
-
property
lgs
¶ ‘long genomic sequences’ - combines the following options: -ma, -thr 2, -lmax 30, -smin 8, -nta, -ff, -fop, -ff, -cs, -ds, -pst
This property controls the addition of the -lgs switch, treat this property as a boolean.
-
property
lgs_t
¶ Like ‘-lgs’ but with all segment pairs assessed at the peptide level (rather than ‘mixed alignments’ as with the ‘-lgs’ option). Therefore faster than -lgs but not very sensitive for non-coding regions.
This property controls the addition of the -lgs_t switch, treat this property as a boolean.
-
property
lmax
¶ Maximum fragment length = x (default: x = 40 or x = 120 for ‘translated’ fragments). Shorter x speeds up the program but may affect alignment quality.
This controls the addition of the -lmax parameter and its associated value. Set this property to the argument value required.
-
property
lo
¶ (Long Output) Additional file *.log with information about fragments selected for pairwise alignment and about consistency in multi-alignment procedure.
This property controls the addition of the -lo switch, treat this property as a boolean.
-
property
ma
¶ ‘mixed alignments’ consisting of P-fragments and N-fragments if nucleic acid sequences are aligned.
This property controls the addition of the -ma switch, treat this property as a boolean.
-
property
mask
¶ Residues not belonging to selected fragments are replaced by ‘*’ characters in output alignment (rather than being printed in lower-case characters)
This property controls the addition of the -mask switch, treat this property as a boolean.
-
property
mat
¶ Creates file *.mat with substitution counts derived from the fragments that have been selected for alignment.
This property controls the addition of the -mat switch, treat this property as a boolean.
-
property
mat_thr
¶ Like ‘-mat’ but only fragments with weight score > t are considered
This property controls the addition of the -mat_thr switch, treat this property as a boolean.
-
property
max_link
¶ ‘maximum linkage’ clustering used to construct sequence tree (instead of UPGMA).
This property controls the addition of the -max_link switch, treat this property as a boolean.
-
property
min_link
¶ ‘minimum linkage’ clustering used.
This property controls the addition of the -min_link switch, treat this property as a boolean.
-
property
mot
¶ ‘motif’ option.
This controls the addition of the -mot parameter and its associated value. Set this property to the argument value required.
-
property
msf
¶ Separate output file in MSF format.
This property controls the addition of the -msf switch, treat this property as a boolean.
-
property
n
¶ Input sequences are nucleic acid sequences. No translation of fragments.
This property controls the addition of the -n switch, treat this property as a boolean.
-
property
nt
¶ Input sequences are nucleic acid sequences and ‘nucleic acid segments’ are translated to ‘peptide segments’.
This property controls the addition of the -nt switch, treat this property as a boolean.
-
property
nta
¶ ‘no textual alignment’ - textual alignment suppressed. This option makes sense if other output files are of interest – e.g. the fragment files created with -ff, -fop, -fsm or -lo.
This property controls the addition of the -nta switch, treat this property as a boolean.
-
property
o
¶ Fast version, resulting alignments may be slightly different.
This property controls the addition of the -o switch, treat this property as a boolean.
-
property
ow
¶ Overlap weights enforced (By default, overlap weights are used only if up to 35 sequences are aligned since calculating overlap weights is time consuming).
This property controls the addition of the -ow switch, treat this property as a boolean.
-
property
pst
¶ ‘print status’. Creates and updates a file *.sta with information about the current status of the program run. This option is recommended if large data sets are aligned since it allows the user to estimate the remaining running time.
This property controls the addition of the -pst switch, treat this property as a boolean.
-
property
smin
¶ Minimum similarity value for first residue pair (or codon pair) in fragments. Speeds up protein alignment or alignment of translated DNA fragments at the expense of sensitivity.
This property controls the addition of the -smin switch, treat this property as a boolean.
-
property
stars
¶ Maximum number of ‘*’ characters indicating degree of local similarity among sequences. By default, no stars are used; numbers between 0 and 9 are printed instead.
This controls the addition of the -stars parameter and its associated value. Set this property to the argument value required.
-
property
stdo
¶ Results written to standard output.
This property controls the addition of the -stdo switch, treat this property as a boolean.
-
property
ta
¶ Standard textual alignment printed (overrides suppression of textual alignments in special options, e.g. -lgs)
This property controls the addition of the -ta switch, treat this property as a boolean.
-
property
thr
¶ Threshold T = x.
This controls the addition of the -thr parameter and its associated value. Set this property to the argument value required.
-
property
xfr
¶ ‘exclude fragments’ - list of fragments can be specified that are NOT considered for pairwise alignment
This property controls the addition of the -xfr switch, treat this property as a boolean.
-
-
class
Bio.Align.Applications.
ProbconsCommandline
(cmd='probcons', **kwargs)¶ Bases:
Bio.Application.AbstractCommandline
Command line wrapper for the multiple alignment program PROBCONS.
Notes
Last checked against version: 1.12
References
Do, C.B., Mahabhashyam, M.S.P., Brudno, M., and Batzoglou, S. 2005. PROBCONS: Probabilistic Consistency-based Multiple Sequence Alignment. Genome Research 15: 330-340.
Examples
To align a FASTA file (unaligned.fasta) with the output in ClustalW format, and otherwise default settings, use:
>>> from Bio.Align.Applications import ProbconsCommandline
>>> probcons_cline = ProbconsCommandline(input="unaligned.fasta",
...                                      clustalw=True)
>>> print(probcons_cline)
probcons -clustalw unaligned.fasta
You would typically run the command line with probcons_cline() or via the Python subprocess module, as described in the Biopython tutorial.
Note that PROBCONS will write the alignment to stdout, which you may want to save to a file and then parse, e.g.:
stdout, stderr = probcons_cline()
# Save the ClustalW-formatted alignment, then parse the same file
with open("aligned.aln", "w") as handle:
    handle.write(stdout)
from Bio import AlignIO
align = AlignIO.read("aligned.aln", "clustalw")
Alternatively, to parse the output with AlignIO directly you can use StringIO to turn the string into a handle:
stdout, stderr = probcons_cline()
# On Python 3, StringIO lives in the io module
from io import StringIO
from Bio import AlignIO
align = AlignIO.read(StringIO(stdout), "clustalw")
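The same stdout capture can be done with the Python subprocess module mentioned above. The following is only a minimal sketch, not part of Biopython: the helper names build_probcons_args and run_probcons_to_file are hypothetical, a probcons executable is assumed to be on the PATH, and only flags documented for this class (-clustalw, -c, -ir, -a) are used.

```python
import subprocess


def build_probcons_args(input_path, clustalw=False, consistency=None,
                        iterative_refinement=None, alignment_order=False,
                        cmd="probcons"):
    """Return an argument list for PROBCONS (hypothetical helper)."""
    args = [cmd]
    if clustalw:
        args.append("-clustalw")
    if consistency is not None:
        # Range taken from the 'consistency' property description
        if not 0 <= consistency <= 5:
            raise ValueError("consistency must be between 0 and 5")
        args += ["-c", str(consistency)]
    if iterative_refinement is not None:
        # Range taken from the 'ir' property description
        if not 0 <= iterative_refinement <= 1000:
            raise ValueError("iterative refinement must be between 0 and 1000")
        args += ["-ir", str(iterative_refinement)]
    if alignment_order:
        args.append("-a")
    # The input file is a positional argument and goes last
    args.append(input_path)
    return args


def run_probcons_to_file(args, out_path):
    """Run PROBCONS and stream its stdout straight into out_path."""
    with open(out_path, "w") as handle:
        subprocess.run(args, stdout=handle, stderr=subprocess.PIPE,
                       text=True, check=True)


args = build_probcons_args("unaligned.fasta", clustalw=True, consistency=3)
print(" ".join(args))  # probcons -clustalw -c 3 unaligned.fasta
```

Calling run_probcons_to_file(args, "aligned.aln") would then write the alignment to disk without holding it in memory as a string, which may be preferable for large alignments.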
-
__init__
(self, cmd='probcons', **kwargs)¶ Initialize the class.
-
property
a
¶ Print sequences in alignment order rather than input order (default: off)
This property controls the addition of the -a switch, treat this property as a boolean.
-
property
annot
¶ Write annotation for multiple alignment to FILENAME
This controls the addition of the -annot parameter and its associated value. Set this property to the argument value required.
-
property
clustalw
¶ Use CLUSTALW output format instead of MFA
This property controls the addition of the -clustalw switch, treat this property as a boolean.
-
property
consistency
¶ Use 0 <= REPS <= 5 (default: 2) passes of consistency transformation
This controls the addition of the -c parameter and its associated value. Set this property to the argument value required.
-
property
emissions
¶ Also reestimate emission probabilities (default: off)
This property controls the addition of the -e switch, treat this property as a boolean.
-
property
input
¶ Input file name. Must be multiple FASTA alignment (MFA) format
This controls the addition of the input parameter and its associated value. Set this property to the argument value required.
-
property
ir
¶ Use 0 <= REPS <= 1000 (default: 100) passes of iterative-refinement
This controls the addition of the -ir parameter and its associated value. Set this property to the argument value required.
-
property
pairs
¶ Generate all-pairs pairwise alignments
This property controls the addition of the -pairs switch, treat this property as a boolean.
-
property
paramfile
¶ Read parameters from FILENAME
This controls the addition of the -p parameter and its associated value. Set this property to the argument value required.
-
property
pre
¶ Use 0 <= REPS <= 20 (default: 0) rounds of pretraining
This controls the addition of the -pre parameter and its associated value. Set this property to the argument value required.
-
property
train
¶ Compute EM transition probabilities, store in FILENAME (default: no training)
This controls the addition of the -t parameter and its associated value. Set this property to the argument value required.
-
property
verbose
¶ Report progress while aligning (default: off)
This property controls the addition of the -verbose switch, treat this property as a boolean.
-
property
viterbi
¶ Use Viterbi algorithm to generate all pairs (automatically enables -pairs)
This property controls the addition of the -viterbi switch, treat this property as a boolean.
-
-
class
Bio.Align.Applications.
TCoffeeCommandline
(cmd='t_coffee', **kwargs)¶ Bases:
Bio.Application.AbstractCommandline
Commandline object for the TCoffee alignment program.
http://www.tcoffee.org/Projects_home_page/t_coffee_home_page.html
The T-Coffee command line tool has a lot of switches and options. This wrapper implements a VERY limited number of options - if you would like to help improve it please get in touch.
Notes
Last checked against: Version_6.92
References
Notredame, Higgins, Heringa (2000). T-Coffee: A novel method for multiple sequence alignments. JMB 302: 205-217.
Examples
To align a FASTA file (unaligned.fasta) with the output in ClustalW format (file aligned.aln), and otherwise default settings, use:
>>> from Bio.Align.Applications import TCoffeeCommandline
>>> tcoffee_cline = TCoffeeCommandline(infile="unaligned.fasta",
...                                    output="clustalw",
...                                    outfile="aligned.aln")
>>> print(tcoffee_cline)
t_coffee -output clustalw -infile unaligned.fasta -outfile aligned.aln
You would typically run the command line with tcoffee_cline() or via the Python subprocess module, as described in the Biopython tutorial.
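Several T-Coffee options accept only particular values: the sequence type is one of SEQ_TYPES, the output type is drawn from a fixed list, and the gap penalties are described as negative integers. The sketch below checks such values before a command line is built. It is an illustration only: check_tcoffee_options is a hypothetical helper, OUTPUT_TYPES is copied from the description of the output property, and the checks are not necessarily the ones the wrapper itself applies.

```python
# Values copied from this class's documentation
SEQ_TYPES = ["dna", "protein", "dna_protein"]
OUTPUT_TYPES = {
    "clustalw_aln", "clustalw", "gcg", "msf_aln", "pir_aln",
    "fasta_aln", "phylip", "pir_seq", "fasta_seq",
}


def check_tcoffee_options(output=None, seq_type=None, gapopen=None, gapext=None):
    """Return a list of problems found in the given option values.

    Hypothetical helper: an empty list means nothing suspicious was found.
    """
    problems = []
    if output is not None:
        # 'output' may name several formats separated by commas
        for name in output.split(","):
            if name not in OUTPUT_TYPES:
                problems.append(f"unknown output type: {name}")
    if seq_type is not None and seq_type not in SEQ_TYPES:
        problems.append(f"unknown sequence type: {seq_type}")
    for label, value in (("gapopen", gapopen), ("gapext", gapext)):
        # Penalties are documented as negative integers; flag positive values
        if value is not None and (not isinstance(value, int) or value > 0):
            problems.append(f"{label} should be a negative integer: {value!r}")
    return problems


print(check_tcoffee_options(output="clustalw,fasta_aln", seq_type="dna",
                            gapopen=-50, gapext=-1))  # []
```

Running such a check first gives a readable message for each bad value, rather than a failure from the external program after it has started.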
-
SEQ_TYPES
= ['dna', 'protein', 'dna_protein']¶
-
__init__
(self, cmd='t_coffee', **kwargs)¶ Initialize the class.
-
property
convert
¶ Specify you want to perform a file conversion
This property controls the addition of the -convert switch, treat this property as a boolean.
-
property
gapext
¶ Indicates the penalty applied for extending a gap (negative integer)
This controls the addition of the -gapext parameter and its associated value. Set this property to the argument value required.
-
property
gapopen
¶ Indicates the penalty applied for opening a gap (negative integer)
This controls the addition of the -gapopen parameter and its associated value. Set this property to the argument value required.
-
property
infile
¶ Specify the input file.
This controls the addition of the -infile parameter and its associated value. Set this property to the argument value required.
-
property
matrix
¶ Specify the filename of the substitution matrix to use. Default: blosum62mt
This controls the addition of the -matrix parameter and its associated value. Set this property to the argument value required.
-
property
mode
¶ Specifies a special mode: genome, quickaln, dali, 3dcoffee
This controls the addition of the -mode parameter and its associated value. Set this property to the argument value required.
-
property
outfile
¶ Specify the output file. Default: <your sequences>.aln
This controls the addition of the -outfile parameter and its associated value. Set this property to the argument value required.
-
property
outorder
¶ Specify the order of sequences in the output: either ‘input’, ‘aligned’, or the <filename> of a FASTA file giving the sequence order
This controls the addition of the -outorder parameter and its associated value. Set this property to the argument value required.
-
property
output
¶ Specify the output type.
One (or more separated by a comma) of: ‘clustalw_aln’, ‘clustalw’, ‘gcg’, ‘msf_aln’, ‘pir_aln’, ‘fasta_aln’, ‘phylip’, ‘pir_seq’, ‘fasta_seq’
This controls the addition of the -output parameter and its associated value. Set this property to the argument value required.
-
property
quiet
¶ Turn off log output
This property controls the addition of the -quiet switch, treat this property as a boolean.
-
property
type
¶ Specify the type of sequence being aligned
This controls the addition of the -type parameter and its associated value. Set this property to the argument value required.
-
-
class
Bio.Align.Applications.
MSAProbsCommandline
(cmd='msaprobs', **kwargs)¶ Bases:
Bio.Application.AbstractCommandline
Command line wrapper for MSAProbs.
http://msaprobs.sourceforge.net
Notes
Last checked against version: 0.9.7
References
Yongchao Liu, Bertil Schmidt, Douglas L. Maskell: “MSAProbs: multiple sequence alignment based on pair hidden Markov models and partition function posterior probabilities”. Bioinformatics, 2010, 26(16): 1958-1964
Examples
>>> from Bio.Align.Applications import MSAProbsCommandline
>>> in_file = "unaligned.fasta"
>>> out_file = "aligned.cla"
>>> cline = MSAProbsCommandline(infile=in_file, outfile=out_file, clustalw=True)
>>> print(cline)
msaprobs -o aligned.cla -clustalw unaligned.fasta
You would typically run the command line with cline() or via the Python subprocess module, as described in the Biopython tutorial.
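When going through the subprocess module instead, the arguments are usually passed as a list, so file names containing spaces need no manual quoting. A minimal sketch follows, assuming Python 3.8 or later for shlex.join. The helper build_msaprobs_args is hypothetical and not part of Biopython; the flags -o, -clustalw and -num_threads are the ones documented for this class.

```python
import os
import shlex


def build_msaprobs_args(infile, outfile=None, clustalw=False,
                        numthreads=None, cmd="msaprobs"):
    """Return an argument list for MSAProbs (hypothetical helper)."""
    args = [cmd]
    if outfile is not None:
        args += ["-o", outfile]
    if clustalw:
        args.append("-clustalw")
    if numthreads is not None:
        if numthreads < 1:
            raise ValueError("numthreads must be at least 1")
        args += ["-num_threads", str(numthreads)]
    # The input file is a positional argument and goes last
    args.append(infile)
    return args


# Cap the thread count at 4, or fewer on machines with fewer cores
threads = min(4, os.cpu_count() or 1)
args = build_msaprobs_args("unaligned.fasta", outfile="my aligned.cla",
                           clustalw=True, numthreads=threads)

# shlex.join only produces a quoted string for display or logging;
# the list itself is what would be handed to subprocess.run(args)
print(shlex.join(args))
```

Leaving numthreads unset omits -num_threads altogether, in which case MSAProbs detects the thread count itself, as described for the numthreads property.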
-
__init__
(self, cmd='msaprobs', **kwargs)¶ Initialize the class.
-
property
alignment_order
¶ print sequences in alignment order rather than input order (default: off)
This property controls the addition of the -a switch, treat this property as a boolean.
-
property
annot
¶ write annotation for multiple alignment to FILENAME
This controls the addition of the -annot parameter and its associated value. Set this property to the argument value required.
-
property
clustalw
¶ use CLUSTALW output format instead of FASTA format
This property controls the addition of the -clustalw switch, treat this property as a boolean.
-
property
consistency
¶ use 0 <= REPS <= 5 (default: 2) passes of consistency transformation
This controls the addition of the -c parameter and its associated value. Set this property to the argument value required.
-
property
infile
¶ Multiple sequence input file
This controls the addition of the infile parameter and its associated value. Set this property to the argument value required.
-
property
iterative_refinement
¶ use 0 <= REPS <= 1000 (default: 10) passes of iterative-refinement
This controls the addition of the -ir parameter and its associated value. Set this property to the argument value required.
-
property
numthreads
¶ specify the number of threads to use (detected automatically if not set)
This controls the addition of the -num_threads parameter and its associated value. Set this property to the argument value required.
-
property
outfile
¶ specify the output file name (STDOUT by default)
This controls the addition of the -o parameter and its associated value. Set this property to the argument value required.
-
property
verbose
¶ report progress while aligning (default: off)
This property controls the addition of the -v switch, treat this property as a boolean.
-
property
version
¶ print out version of MSAPROBS
This controls the addition of the -version parameter and its associated value. Set this property to the argument value required.
-