Bio.SeqUtils.MeltingTemp module¶
Calculate the melting temperature of nucleotide sequences.
This module contains three different methods to calculate the melting temperature of oligonucleotides:
Tm_Wallace: ‘Rule of thumb’
Tm_GC: Empirical formulas based on GC content. Salt and mismatch corrections can be included.
Tm_NN: Calculation based on nearest neighbor thermodynamics. Several tables for DNA/DNA, DNA/RNA and RNA/RNA hybridizations are included. Correction for mismatches, dangling ends, salt concentration and other additives are available.
Tm_staluc is the ‘old’ NN calculation and is kept for compatibility. It is, however, recommended to use Tm_NN instead, since Tm_staluc may be depreceated in the future. Also, Tm_NN has much more options. Using Tm_staluc and Tm_NN with default parameters gives (essentially) the same results.
- General parameters for most Tm methods:
seq – A Biopython sequence object or a string.
check – Checks if the sequence is valid for the given method (default= True). In general, whitespaces and non-base characters are removed and characters are converted to uppercase. RNA will be backtranscribed.
strict – Do not allow base characters or neighbor duplex keys (e.g. ‘AT/NA’) that could not or not unambigiously be evaluated for the respective method (default=True). Note that W (= A or T) and S (= C or G) are not ambiguous for Tm_Wallace and Tm_GC. If ‘False’, average values (if applicable) will be used.
This module is not able to detect self-complementary and it will not use alignment tools to align an oligonucleotide sequence to its target sequence. Thus it can not detect dangling-ends and mismatches by itself (don’t even think about bulbs and loops). These parameters have to be handed over to the respective method.
- Other public methods of this module:
make_table : To create a table with thermodynamic data.
salt_correction: To adjust Tm to a given salt concentration by different formulas. This method is called from Tm_GC and Tm_NN but may also be accessed ‘manually’. It returns a correction term, not a corrected Tm!
chem_correction: To adjust Tm regarding the chemical additives DMSO and formaldehyde. The method returns a corrected Tm. Chemical correction is not an integral part of the Tm methods and must be called additionally.
For example:
>>> from Bio.SeqUtils import MeltingTemp as mt
>>> from Bio.Seq import Seq
>>> mystring = 'CGTTCCAAAGATGTGGGCATGAGCTTAC'
>>> myseq = Seq(mystring)
>>> print('%0.2f' % mt.Tm_Wallace(mystring))
84.00
>>> print('%0.2f' % mt.Tm_Wallace(myseq))
84.00
>>> print('%0.2f' % mt.Tm_GC(myseq))
58.73
>>> print('%0.2f' % mt.Tm_NN(myseq))
60.32
Tm_NN with default values gives same result as ‘old’ Tm_staluc. However, values differ for RNA, since Tm_staluc had some errors for RNA calculation. These errors have been fixed in this version.
New Tm_NN can do slightly more: Using different thermodynamic tables, e.g. from Breslauer ‘86 or Sugimoto ‘96:
>>> print('%0.2f' % mt.Tm_NN(myseq, nn_table=mt.DNA_NN1)) # Breslauer '86
72.19
>>> print('%0.2f' % mt.Tm_NN(myseq, nn_table=mt.DNA_NN2)) # Sugimoto '96
65.47
Tables for RNA and RNA/DNA hybrids are included:
>>> print('%0.2f' % mt.Tm_NN(myseq, nn_table=mt.RNA_NN1)) # Freier '86
73.35
>>> print('%0.2f' % mt.Tm_NN(myseq, nn_table=mt.R_DNA_NN1)) # Sugimoto '95
58.45
Several types of salc correction (for Tm_NN and Tm_GC):
>>> for i in range(1, 8):
... print("Type: %d, Tm: %0.2f" % (i, Tm_NN(myseq, saltcorr=i)))
...
Type: 1, Tm: 54.27
Type: 2, Tm: 54.02
Type: 3, Tm: 59.60
Type: 4, Tm: 60.64
Type: 5, Tm: 60.32
Type: 6, Tm: 59.78
Type: 7, Tm: 59.78
Correction for other monovalent cations (K+, Tris), Mg2+ and dNTPs according to von Ahsen et al. (2001) or Owczarzy et al. (2008) (for Tm_NN and Tm_GC):
>>> print('%0.2f' % mt.Tm_NN(myseq, Na=50, Tris=10))
60.79
>>> print('%0.2f' % mt.Tm_NN(myseq, Na=50, Tris=10, Mg=1.5))
67.39
>>> print('%0.2f' % mt.Tm_NN(myseq, Na=50, Tris=10, Mg=1.5, saltcorr=7))
66.81
>>> print('%0.2f' % mt.Tm_NN(myseq, Na=50, Tris=10, Mg=1.5, dNTPs=0.6,
... saltcorr=7))
66.04
Dangling ends and mismatches, e.g.:
Oligo: CGTTCCaAAGATGTGGGCATGAGCTTAC CGTTCCaAAGATGTGGGCATGAGCTTAC
::::::X::::::::::::::::::::: or ::::::X:::::::::::::::::::::
Template: GCAAGGcTTCTACACCCGTACTCGAATG TGCAAGGcTTCTACACCCGTACTCGAATGC
Here:
>>> print('%0.2f' % mt.Tm_NN('CGTTCCAAAGATGTGGGCATGAGCTTAC'))
60.32
>>> print('%0.2f' % mt.Tm_NN('CGTTCCAAAGATGTGGGCATGAGCTTAC',
... c_seq='GCAAGGcTTCTACACCCGTACTCGAATG'))
55.39
>>> print('%0.2f' % mt.Tm_NN('CGTTCCAAAGATGTGGGCATGAGCTTAC', shift=1,
... c_seq='TGCAAGGcTTCTACACCCGTACTCGAATGC'))
55.69
The same for RNA:
>>> print('%0.2f' % mt.Tm_NN('CGUUCCAAAGAUGUGGGCAUGAGCUUAC',
... c_seq='UGCAAGGcUUCUACACCCGUACUCGAAUGC',
... shift=1, nn_table=mt.RNA_NN3,
... de_table=mt.RNA_DE1))
73.00
Note, that thermodynamic data are not available for all kind of mismatches, e.g. most double mismatches or terminal mismatches combined with dangling ends:
>>> print('%0.2f' % mt.Tm_NN('CGTTCCAAAGATGTGGGCATGAGCTTAC',
... c_seq='TtCAAGGcTTCTACACCCGTACTCGAATGC',
... shift=1))
Traceback (most recent call last):
ValueError: no thermodynamic data for neighbors '.C/TT' available
Make your own tables, or update/extend existing tables. E.g., add values for locked nucleotides. Here, ‘locked A’ (and its complement) should be represented by ‘1’:
>>> mytable = mt.make_table(oldtable=mt.DNA_NN3,
... values={'A1/T1':(-6.608, -17.235),
... '1A/1T':(-6.893, -15.923)})
>>> print('%0.2f' % mt.Tm_NN('CGTTCCAAAGATGTGGGCATGAGCTTAC'))
60.32
>>> print('%0.2f' % mt.Tm_NN('CGTTCCA1AGATGTGGGCATGAGCTTAC',
... nn_table=mytable, check=False))
... # 'check' must be False, otherwise '1' would be discarded
62.53
- Bio.SeqUtils.MeltingTemp.make_table(oldtable=None, values=None)¶
Return a table with thermodynamic parameters (as dictionary).
- Arguments:
oldtable: An existing dictionary with thermodynamic parameters.
values: A dictionary with new or updated values.
E.g., to replace the initiation parameters in the Sugimoto ‘96 dataset with the initiation parameters from Allawi & SantaLucia ‘97:
>>> from Bio.SeqUtils.MeltingTemp import make_table, DNA_NN2 >>> table = DNA_NN2 # Sugimoto '96 >>> table['init_A/T'] (0, 0) >>> newtable = make_table(oldtable=DNA_NN2, values={'init': (0, 0), ... 'init_A/T': (2.3, 4.1), ... 'init_G/C': (0.1, -2.8)}) >>> print("%0.1f, %0.1f" % newtable['init_A/T']) 2.3, 4.1
- Bio.SeqUtils.MeltingTemp.salt_correction(Na=0, K=0, Tris=0, Mg=0, dNTPs=0, method=1, seq=None)¶
Calculate a term to correct Tm for salt ions.
Depending on the Tm calculation, the term will correct Tm or entropy. To calculate corrected Tm values, different operations need to be applied:
methods 1-4: Tm(new) = Tm(old) + corr
method 5: deltaS(new) = deltaS(old) + corr
methods 6+7: Tm(new) = 1/(1/Tm(old) + corr)
- Arguments:
Na, K, Tris, Mg, dNTPS: Millimolar concentration of respective ion. To have a simple ‘salt correction’, just pass Na. If any of K, Tris, Mg and dNTPS is non-zero, a ‘sodium-equivalent’ concentration is calculated according to von Ahsen et al. (2001, Clin Chem 47: 1956-1961): [Na_eq] = [Na+] + [K+] + [Tris]/2 + 120*([Mg2+] - [dNTPs])^0.5 If [dNTPs] >= [Mg2+]: [Na_eq] = [Na+] + [K+] + [Tris]/2
method: Which method to be applied. Methods 1-4 correct Tm, method 5 corrects deltaS, methods 6 and 7 correct 1/Tm. The methods are:
16.6 x log[Na+] (Schildkraut & Lifson (1965), Biopolymers 3: 195-208)
16.6 x log([Na+]/(1.0 + 0.7*[Na+])) (Wetmur (1991), Crit Rev Biochem Mol Biol 126: 227-259)
12.5 x log(Na+] (SantaLucia et al. (1996), Biochemistry 35: 3555-3562
11.7 x log[Na+] (SantaLucia (1998), Proc Natl Acad Sci USA 95: 1460-1465
Correction for deltaS: 0.368 x (N-1) x ln[Na+] (SantaLucia (1998), Proc Natl Acad Sci USA 95: 1460-1465)
(4.29(%GC)-3.95)x1e-5 x ln[Na+] + 9.40e-6 x ln[Na+]^2 (Owczarzy et al. (2004), Biochemistry 43: 3537-3554)
Complex formula with decision tree and 7 empirical constants. Mg2+ is corrected for dNTPs binding (if present) (Owczarzy et al. (2008), Biochemistry 47: 5336-5353)
Examples
>>> from Bio.SeqUtils import MeltingTemp as mt >>> print('%0.2f' % mt.salt_correction(Na=50, method=1)) -21.60 >>> print('%0.2f' % mt.salt_correction(Na=50, method=2)) -21.85 >>> print('%0.2f' % mt.salt_correction(Na=100, Tris=20, method=2)) -16.45 >>> print('%0.2f' % mt.salt_correction(Na=100, Tris=20, Mg=1.5, method=2)) -10.99
- Bio.SeqUtils.MeltingTemp.chem_correction(melting_temp, DMSO=0, fmd=0, DMSOfactor=0.75, fmdfactor=0.65, fmdmethod=1, GC=None)¶
Correct a given Tm for DMSO and formamide.
Please note that these corrections are +/- rough approximations.
- Arguments:
melting_temp: Melting temperature.
DMSO: Percent DMSO.
fmd: Formamide concentration in %(fmdmethod=1) or molar (fmdmethod=2).
DMSOfactor: How much should Tm decreases per percent DMSO. Default=0.65 (von Ahsen et al. 2001). Other published values are 0.5, 0.6 and 0.675.
fmdfactor: How much should Tm decrease per percent formamide. Default=0.65. Several papers report factors between 0.6 and 0.72.
fmdmethod:
Tm = Tm - factor(%formamide) (Default)
Tm = Tm + (0.453(f(GC)) - 2.88) x [formamide]
Here f(GC) is fraction of GC. Note (again) that in fmdmethod=1 formamide concentration is given in %, while in fmdmethod=2 it is given in molar.
GC: GC content in percent.
- Examples:
>>> from Bio.SeqUtils import MeltingTemp as mt >>> mt.chem_correction(70) 70 >>> print('%0.2f' % mt.chem_correction(70, DMSO=3)) 67.75 >>> print('%0.2f' % mt.chem_correction(70, fmd=5)) 66.75 >>> print('%0.2f' % mt.chem_correction(70, fmdmethod=2, fmd=1.25, ... GC=50)) 66.68
- Bio.SeqUtils.MeltingTemp.Tm_Wallace(seq, check=True, strict=True)¶
Calculate and return the Tm using the ‘Wallace rule’.
Tm = 4 degC * (G + C) + 2 degC * (A+T)
The Wallace rule (Thein & Wallace 1986, in Human genetic diseases: a practical approach, 33-50) is often used as rule of thumb for approximate Tm calculations for primers of 14 to 20 nt length.
Non-DNA characters (e.g., E, F, J, !, 1, etc) are ignored by this method.
- Examples:
>>> from Bio.SeqUtils import MeltingTemp as mt >>> mt.Tm_Wallace('ACGTTGCAATGCCGTA') 48.0 >>> mt.Tm_Wallace('ACGT TGCA ATGC CGTA') 48.0 >>> mt.Tm_Wallace('1ACGT2TGCA3ATGC4CGTA') 48.0
- Bio.SeqUtils.MeltingTemp.Tm_GC(seq, check=True, strict=True, valueset=7, userset=None, Na=50, K=0, Tris=0, Mg=0, dNTPs=0, saltcorr=0, mismatch=True)¶
Return the Tm using empirical formulas based on GC content.
General format: Tm = A + B(%GC) - C/N + salt correction - D(%mismatch)
A, B, C, D: empirical constants, N: primer length D (amount of decrease in Tm per % mismatch) is often 1, but sometimes other values have been used (0.6-1.5). Use ‘X’ to indicate the mismatch position in the sequence. Note that this mismatch correction is a rough estimate.
>>> from Bio.SeqUtils import MeltingTemp as mt >>> print("%0.2f" % mt.Tm_GC('CTGCTGATXGCACGAGGTTATGG', valueset=2)) 69.20
- Arguments:
valueset: A few often cited variants are included:
Tm = 69.3 + 0.41(%GC) - 650/N (Marmur & Doty 1962, J Mol Biol 5: 109-118; Chester & Marshak 1993), Anal Biochem 209: 284-290)
Tm = 81.5 + 0.41(%GC) - 675/N - %mismatch ‘QuikChange’ formula. Recommended (by the manufacturer) for the design of primers for QuikChange mutagenesis.
Tm = 81.5 + 0.41(%GC) - 675/N + 16.6 x log[Na+] (Marmur & Doty 1962, J Mol Biol 5: 109-118; Schildkraut & Lifson 1965, Biopolymers 3: 195-208)
Tm = 81.5 + 0.41(%GC) - 500/N + 16.6 x log([Na+]/(1.0 + 0.7 x [Na+])) - %mismatch (Wetmur 1991, Crit Rev Biochem Mol Biol 126: 227-259). This is the standard formula in approximative mode of MELTING 4.3.
Tm = 78 + 0.7(%GC) - 500/N + 16.6 x log([Na+]/(1.0 + 0.7 x [Na+])) - %mismatch (Wetmur 1991, Crit Rev Biochem Mol Biol 126: 227-259). For RNA.
Tm = 67 + 0.8(%GC) - 500/N + 16.6 x log([Na+]/(1.0 + 0.7 x [Na+])) - %mismatch (Wetmur 1991, Crit Rev Biochem Mol Biol 126: 227-259). For RNA/DNA hybrids.
Tm = 81.5 + 0.41(%GC) - 600/N + 16.6 x log[Na+] Used by Primer3Plus to calculate the product Tm. Default set.
Tm = 77.1 + 0.41(%GC) - 528/N + 11.7 x log[Na+] (von Ahsen et al. 2001, Clin Chem 47: 1956-1961). Recommended ‘as a tradeoff between accuracy and ease of use’.
userset: Tuple of four values for A, B, C, and D. Usersets override valuesets.
Na, K, Tris, Mg, dNTPs: Concentration of the respective ions [mM]. If any of K, Tris, Mg and dNTPS is non-zero, a ‘sodium-equivalent’ concentration is calculated and used for salt correction (von Ahsen et al., 2001).
saltcorr: Type of salt correction (see method salt_correction). Default=5. 0 or None means no salt correction.
mismatch: If ‘True’ (default) every ‘X’ in the sequence is counted as mismatch.
- Bio.SeqUtils.MeltingTemp.Tm_NN(seq, check=True, strict=True, c_seq=None, shift=0, nn_table=None, tmm_table=None, imm_table=None, de_table=None, dnac1=25, dnac2=25, selfcomp=False, Na=50, K=0, Tris=0, Mg=0, dNTPs=0, saltcorr=5)¶
Return the Tm using nearest neighbor thermodynamics.
- Arguments:
seq: The primer/probe sequence as string or Biopython sequence object. For RNA/DNA hybridizations seq must be the RNA sequence.
c_seq: Complementary sequence. The sequence of the template/target in 3’->5’ direction. c_seq is necessary for mismatch correction and dangling-ends correction. Both corrections will automatically be applied if mismatches or dangling ends are present. Default=None.
shift: Shift of the primer/probe sequence on the template/target sequence, e.g.:
shift=0 shift=1 shift= -1 Primer (seq): 5' ATGC... 5' ATGC... 5' ATGC... Template (c_seq): 3' TACG... 3' CTACG... 3' ACG...
The shift parameter is necessary to align seq and c_seq if they have different lengths or if they should have dangling ends. Default=0
table: Thermodynamic NN values, eight tables are implemented: For DNA/DNA hybridizations:
DNA_NN1: values from Breslauer et al. (1986)
DNA_NN2: values from Sugimoto et al. (1996)
DNA_NN3: values from Allawi & SantaLucia (1997) (default)
DNA_NN4: values from SantaLucia & Hicks (2004)
For RNA/RNA hybridizations:
RNA_NN1: values from Freier et al. (1986)
RNA_NN2: values from Xia et al. (1998)
RNA_NN3: valuse from Chen et al. (2012)
For RNA/DNA hybridizations:
R_DNA_NN1: values from Sugimoto et al. (1995) Note that
seq
must be the RNA sequence.
Use the module’s maketable method to make a new table or to update one one of the implemented tables.
tmm_table: Thermodynamic values for terminal mismatches. Default: DNA_TMM1 (SantaLucia & Peyret, 2001)
imm_table: Thermodynamic values for internal mismatches, may include insosine mismatches. Default: DNA_IMM1 (Allawi & SantaLucia, 1997-1998; Peyret et al., 1999; Watkins & SantaLucia, 2005)
de_table: Thermodynamic values for dangling ends:
DNA_DE1: for DNA. Values from Bommarito et al. (2000) (default)
RNA_DE1: for RNA. Values from Turner & Mathews (2010)
dnac1: Concentration of the higher concentrated strand [nM]. Typically this will be the primer (for PCR) or the probe. Default=25.
dnac2: Concentration of the lower concentrated strand [nM]. In PCR this is the template strand which concentration is typically very low and may be ignored (dnac2=0). In oligo/oligo hybridization experiments, dnac1 equals dnac1. Default=25. MELTING and Primer3Plus use k = [Oligo(Total)]/4 by default. To mimic this behaviour, you have to divide [Oligo(Total)] by 2 and assign this concentration to dnac1 and dnac2. E.g., Total oligo concentration of 50 nM in Primer3Plus means dnac1=25, dnac2=25.
selfcomp: Is the sequence self-complementary? Default=False. If ‘True’ the primer is thought binding to itself, thus dnac2 is not considered.
Na, K, Tris, Mg, dNTPs: See method ‘Tm_GC’ for details. Defaults: Na=50, K=0, Tris=0, Mg=0, dNTPs=0.
saltcorr: See method ‘Tm_GC’. Default=5. 0 means no salt correction.
- Bio.SeqUtils.MeltingTemp.Tm_staluc(s, dnac=50, saltc=50, rna=0)¶
Return DNA/DNA Tm using nearest neighbor thermodynamics (OBSOLETE).
This method may be depreceated in the future. Use Tm_NN instead. Tm_NN with default values gives the same result as Tm_staluc.
s is the sequence as string or Seq object dnac is DNA concentration [nM] saltc is salt concentration [mM]. rna=0 is for DNA/DNA (default), use 1 for RNA/RNA hybridisation.
For DNA/DNA, see Allawi & SantaLucia (1997), Biochemistry 36: 10581-10594 For RNA/RNA, see Xia et al (1998), Biochemistry 37: 14719-14735
Examples
>>> print("%0.2f" % Tm_staluc('CAGTCAGTACGTACGTGTACTGCCGTA')) 59.87 >>> print("%0.2f" % Tm_staluc('CAGTCAGTACGTACGTGTACTGCCGTA', rna=True)) 77.90
You can also use a Seq object instead of a string,
>>> from Bio.Seq import Seq >>> s = Seq('CAGTCAGTACGTACGTGTACTGCCGTA') >>> print("%0.2f" % Tm_staluc(s)) 59.87 >>> print("%0.2f" % Tm_staluc(s, rna=True)) 77.90