Bio.SeqUtils.MeltingTemp module

Calculate the melting temperature of nucleotide sequences.

This module contains three different methods to calculate the melting temperature of oligonucleotides:

  1. Tm_Wallace: ‘Rule of thumb’

  2. Tm_GC: Empirical formulas based on GC content. Salt and mismatch corrections can be included.

  3. Tm_NN: Calculation based on nearest neighbor thermodynamics. Several tables for DNA/DNA, DNA/RNA and RNA/RNA hybridizations are included. Correction for mismatches, dangling ends, salt concentration and other additives are available.

Tm_staluc is the ‘old’ NN calculation and is kept for compatibility. It is, however, recommended to use Tm_NN instead, since Tm_staluc may be depreceated in the future. Also, Tm_NN has much more options. Using Tm_staluc and Tm_NN with default parameters gives (essentially) the same results.

General parameters for most Tm methods:
  • seq – A Biopython sequence object or a string.

  • check – Checks if the sequence is valid for the given method (default= True). In general, whitespaces and non-base characters are removed and characters are converted to uppercase. RNA will be backtranscribed.

  • strict – Do not allow base characters or neighbor duplex keys (e.g. ‘AT/NA’) that could not or not unambigiously be evaluated for the respective method (default=True). Note that W (= A or T) and S (= C or G) are not ambiguous for Tm_Wallace and Tm_GC. If ‘False’, average values (if applicable) will be used.

This module is not able to detect self-complementary and it will not use alignment tools to align an oligonucleotide sequence to its target sequence. Thus it can not detect dangling-ends and mismatches by itself (don’t even think about bulbs and loops). These parameters have to be handed over to the respective method.

Other public methods of this module:
  • make_table : To create a table with thermodynamic data.

  • salt_correction: To adjust Tm to a given salt concentration by different formulas. This method is called from Tm_GC and Tm_NN but may also be accessed ‘manually’. It returns a correction term, not a corrected Tm!

  • chem_correction: To adjust Tm regarding the chemical additives DMSO and formaldehyde. The method returns a corrected Tm. Chemical correction is not an integral part of the Tm methods and must be called additionally.

For example:

>>> from Bio.SeqUtils import MeltingTemp as mt
>>> from Bio.Seq import Seq
>>> mystring = 'CGTTCCAAAGATGTGGGCATGAGCTTAC'
>>> myseq = Seq(mystring)
>>> print('%0.2f' % mt.Tm_Wallace(mystring))
84.00
>>> print('%0.2f' % mt.Tm_Wallace(myseq))
84.00
>>> print('%0.2f' % mt.Tm_GC(myseq))
58.73
>>> print('%0.2f' % mt.Tm_NN(myseq))
60.32

Tm_NN with default values gives same result as ‘old’ Tm_staluc. However, values differ for RNA, since Tm_staluc had some errors for RNA calculation. These errors have been fixed in this version.

New Tm_NN can do slightly more: Using different thermodynamic tables, e.g. from Breslauer ‘86 or Sugimoto ‘96:

>>> print('%0.2f' % mt.Tm_NN(myseq, nn_table=mt.DNA_NN1))  # Breslauer '86
72.19
>>> print('%0.2f' % mt.Tm_NN(myseq, nn_table=mt.DNA_NN2))  # Sugimoto '96
65.47

Tables for RNA and RNA/DNA hybrids are included:

>>> print('%0.2f' % mt.Tm_NN(myseq, nn_table=mt.RNA_NN1))  # Freier '86
73.35
>>> print('%0.2f' % mt.Tm_NN(myseq, nn_table=mt.R_DNA_NN1))  # Sugimoto '95
57.17

Several types of salc correction (for Tm_NN and Tm_GC):

>>> for i in range(1, 8):
...     print("Type: %d, Tm: %0.2f" % (i, Tm_NN(myseq, saltcorr=i)))
...
Type: 1, Tm: 54.27
Type: 2, Tm: 54.02
Type: 3, Tm: 59.60
Type: 4, Tm: 60.64
Type: 5, Tm: 60.32
Type: 6, Tm: 59.78
Type: 7, Tm: 59.78

Correction for other monovalent cations (K+, Tris), Mg2+ and dNTPs according to von Ahsen et al. (2001) or Owczarzy et al. (2008) (for Tm_NN and Tm_GC):

>>> print('%0.2f' % mt.Tm_NN(myseq, Na=50, Tris=10))
60.79
>>> print('%0.2f' % mt.Tm_NN(myseq, Na=50, Tris=10, Mg=1.5))
67.39
>>> print('%0.2f' % mt.Tm_NN(myseq, Na=50, Tris=10, Mg=1.5, saltcorr=7))
66.81
>>> print('%0.2f' % mt.Tm_NN(myseq, Na=50, Tris=10, Mg=1.5, dNTPs=0.6,
...                          saltcorr=7))
66.04

Dangling ends and mismatches, e.g.:

Oligo:     CGTTCCaAAGATGTGGGCATGAGCTTAC       CGTTCCaAAGATGTGGGCATGAGCTTAC
           ::::::X:::::::::::::::::::::  or   ::::::X:::::::::::::::::::::
Template:  GCAAGGcTTCTACACCCGTACTCGAATG      TGCAAGGcTTCTACACCCGTACTCGAATGC

Here:

>>> print('%0.2f' % mt.Tm_NN('CGTTCCAAAGATGTGGGCATGAGCTTAC'))
60.32
>>> print('%0.2f' % mt.Tm_NN('CGTTCCAAAGATGTGGGCATGAGCTTAC',
...                    c_seq='GCAAGGcTTCTACACCCGTACTCGAATG'))
55.39
>>> print('%0.2f' % mt.Tm_NN('CGTTCCAAAGATGTGGGCATGAGCTTAC', shift=1,
...                   c_seq='TGCAAGGcTTCTACACCCGTACTCGAATGC'))
55.69

The same for RNA:

>>> print('%0.2f' % mt.Tm_NN('CGUUCCAAAGAUGUGGGCAUGAGCUUAC',
...                   c_seq='UGCAAGGcUUCUACACCCGUACUCGAAUGC',
...                   shift=1, nn_table=mt.RNA_NN3,
...                   de_table=mt.RNA_DE1))
73.00

Note, that thermodynamic data are not available for all kind of mismatches, e.g. most double mismatches or terminal mismaches combined with danglind ends:

>>> print('%0.2f' % mt.Tm_NN('CGTTCCAAAGATGTGGGCATGAGCTTAC',
...                   c_seq='TtCAAGGcTTCTACACCCGTACTCGAATGC',
...                   shift=1))
Traceback (most recent call last):
ValueError: no thermodynamic data for neighbors '.C/TT' available

Make your own tables, or update/extend existing tables. E.g., add values for locked nucleotides. Here, ‘locked A’ (and its complement) should be represented by ‘1’:

>>> mytable = mt.make_table(oldtable=mt.DNA_NN3,
...                         values={'A1/T1':(-6.608, -17.235),
...                         '1A/1T':(-6.893, -15.923)})
>>> print('%0.2f' % mt.Tm_NN('CGTTCCAAAGATGTGGGCATGAGCTTAC'))
60.32
>>> print('%0.2f' % mt.Tm_NN('CGTTCCA1AGATGTGGGCATGAGCTTAC',
...                           nn_table=mytable, check=False))
... # 'check' must be False, otherwise '1' would be discarded
62.53
Bio.SeqUtils.MeltingTemp.make_table(oldtable=None, values=None)

Return a table with thermodynamic parameters (as dictionary).

Parameters: oldtable: An existing dictionary with thermodynamic parameters. values: A dictionary with new or updated values.

E.g., to replace the initiation parameters in the Sugimoto ‘96 dataset with the initiation parameters from Allawi & SantaLucia ‘97:

>>> from Bio.SeqUtils.MeltingTemp import make_table, DNA_NN2
>>> table = DNA_NN2                               # Sugimoto '96
>>> table['init_A/T']
(0, 0)
>>> newtable = make_table(oldtable=DNA_NN2, values={'init': (0, 0),
...                       'init_A/T': (2.3, 4.1),
...                       'init_G/C': (0.1, -2.8)})
>>> print("%0.1f, %0.1f" % newtable['init_A/T'])
2.3, 4.1
Bio.SeqUtils.MeltingTemp.salt_correction(Na=0, K=0, Tris=0, Mg=0, dNTPs=0, method=1, seq=None)

Calculate a term to correct Tm for salt ions.

Depending on the Tm calculation, the term will correct Tm or entropy. To calculate corrected Tm values, different operations need to be applied:

  • methods 1-4: Tm(new) = Tm(old) + corr

  • method 5: deltaS(new) = deltaS(old) + corr

  • methods 6+7: Tm(new) = 1/(1/Tm(old) + corr)

Parameters:
  • Na, K, Tris, Mg, dNTPS: Millimolar concentration of respective ion. To have a simple ‘salt correction’, just pass Na. If any of K, Tris, Mg and dNTPS is non-zero, a ‘sodium-equivalent’ concentration is calculated according to von Ahsen et al. (2001, Clin Chem 47: 1956-1961): [Na_eq] = [Na+] + [K+] + [Tris]/2 + 120*([Mg2+] - [dNTPs])^0.5 If [dNTPs] >= [Mg2+]: [Na_eq] = [Na+] + [K+] + [Tris]/2

  • method: Which method to be applied. Methods 1-4 correct Tm, method 5 corrects deltaS, methods 6 and 7 correct 1/Tm. The methods are:

    1. 16.6 x log[Na+] (Schildkraut & Lifson (1965), Biopolymers 3: 195-208)

    2. 16.6 x log([Na+]/(1.0 + 0.7*[Na+])) (Wetmur (1991), Crit Rev Biochem Mol Biol 126: 227-259)

    3. 12.5 x log(Na+] (SantaLucia et al. (1996), Biochemistry 35: 3555-3562

    4. 11.7 x log[Na+] (SantaLucia (1998), Proc Natl Acad Sci USA 95: 1460-1465

    5. Correction for deltaS: 0.368 x (N-1) x ln[Na+] (SantaLucia (1998), Proc Natl Acad Sci USA 95: 1460-1465)

    6. (4.29(%GC)-3.95)x1e-5 x ln[Na+] + 9.40e-6 x ln[Na+]^2 (Owczarzy et al. (2004), Biochemistry 43: 3537-3554)

    7. Complex formula with decision tree and 7 empirical constants. Mg2+ is corrected for dNTPs binding (if present) (Owczarzy et al. (2008), Biochemistry 47: 5336-5353)

Examples

>>> from Bio.SeqUtils import MeltingTemp as mt
>>> print('%0.2f' % mt.salt_correction(Na=50, method=1))
-21.60
>>> print('%0.2f' % mt.salt_correction(Na=50, method=2))
-21.85
>>> print('%0.2f' % mt.salt_correction(Na=100, Tris=20, method=2))
-16.45
>>> print('%0.2f' % mt.salt_correction(Na=100, Tris=20, Mg=1.5, method=2))
-10.99
Bio.SeqUtils.MeltingTemp.chem_correction(melting_temp, DMSO=0, fmd=0, DMSOfactor=0.75, fmdfactor=0.65, fmdmethod=1, GC=None)

Correct a given Tm for DMSO and formamide.

Please note that these corrections are +/- rough approximations.

Arguments:
  • melting_temp: Melting temperature.

  • DMSO: Percent DMSO.

  • fmd: Formamide concentration in %(fmdmethod=1) or molar (fmdmethod=2).

  • DMSOfactor: How much should Tm decreases per percent DMSO. Default=0.65 (von Ahsen et al. 2001). Other published values are 0.5, 0.6 and 0.675.

  • fmdfactor: How much should Tm decrease per percent formamide. Default=0.65. Several papers report factors between 0.6 and 0.72.

  • fmdmethod:

    1. Tm = Tm - factor(%formamide) (Default)

    2. Tm = Tm + (0.453(f(GC)) - 2.88) x [formamide]

    Here f(GC) is fraction of GC. Note (again) that in fmdmethod=1 formamide concentration is given in %, while in fmdmethod=2 it is given in molar.

  • GC: GC content in percent.

Examples:
>>> from Bio.SeqUtils import MeltingTemp as mt
>>> mt.chem_correction(70)
70
>>> print('%0.2f' % mt.chem_correction(70, DMSO=3))
67.75
>>> print('%0.2f' % mt.chem_correction(70, fmd=5))
66.75
>>> print('%0.2f' % mt.chem_correction(70, fmdmethod=2, fmd=1.25,
...                                    GC=50))
66.68
Bio.SeqUtils.MeltingTemp.Tm_Wallace(seq, check=True, strict=True)

Calculate and return the Tm using the ‘Wallace rule’.

Tm = 4 degC * (G + C) + 2 degC * (A+T)

The Wallace rule (Thein & Wallace 1986, in Human genetic diseases: a practical approach, 33-50) is often used as rule of thumb for approximate Tm calculations for primers of 14 to 20 nt length.

Non-DNA characters (e.g., E, F, J, !, 1, etc) are ignored by this method.

Examples:
>>> from Bio.SeqUtils import MeltingTemp as mt
>>> mt.Tm_Wallace('ACGTTGCAATGCCGTA')
48.0
>>> mt.Tm_Wallace('ACGT TGCA ATGC CGTA')
48.0
>>> mt.Tm_Wallace('1ACGT2TGCA3ATGC4CGTA')
48.0
Bio.SeqUtils.MeltingTemp.Tm_GC(seq, check=True, strict=True, valueset=7, userset=None, Na=50, K=0, Tris=0, Mg=0, dNTPs=0, saltcorr=0, mismatch=True)

Return the Tm using empirical formulas based on GC content.

General format: Tm = A + B(%GC) - C/N + salt correction - D(%mismatch)

A, B, C, D: empirical constants, N: primer length D (amount of decrease in Tm per % mismatch) is often 1, but sometimes other values have been used (0.6-1.5). Use ‘X’ to indicate the mismatch position in the sequence. Note that this mismatch correction is a rough estimate.

>>> from Bio.SeqUtils import MeltingTemp as mt
>>> print("%0.2f" % mt.Tm_GC('CTGCTGATXGCACGAGGTTATGG', valueset=2))
69.20
Arguments:
  • valueset: A few often cited variants are included:

    1. Tm = 69.3 + 0.41(%GC) - 650/N (Marmur & Doty 1962, J Mol Biol 5: 109-118; Chester & Marshak 1993), Anal Biochem 209: 284-290)

    2. Tm = 81.5 + 0.41(%GC) - 675/N - %mismatch ‘QuikChange’ formula. Recommended (by the manufacturer) for the design of primers for QuikChange mutagenesis.

    3. Tm = 81.5 + 0.41(%GC) - 675/N + 16.6 x log[Na+] (Marmur & Doty 1962, J Mol Biol 5: 109-118; Schildkraut & Lifson 1965, Biopolymers 3: 195-208)

    4. Tm = 81.5 + 0.41(%GC) - 500/N + 16.6 x log([Na+]/(1.0 + 0.7 x [Na+])) - %mismatch (Wetmur 1991, Crit Rev Biochem Mol Biol 126: 227-259). This is the standard formula in approximative mode of MELTING 4.3.

    5. Tm = 78 + 0.7(%GC) - 500/N + 16.6 x log([Na+]/(1.0 + 0.7 x [Na+])) - %mismatch (Wetmur 1991, Crit Rev Biochem Mol Biol 126: 227-259). For RNA.

    6. Tm = 67 + 0.8(%GC) - 500/N + 16.6 x log([Na+]/(1.0 + 0.7 x [Na+])) - %mismatch (Wetmur 1991, Crit Rev Biochem Mol Biol 126: 227-259). For RNA/DNA hybrids.

    7. Tm = 81.5 + 0.41(%GC) - 600/N + 16.6 x log[Na+] Used by Primer3Plus to calculate the product Tm. Default set.

    8. Tm = 77.1 + 0.41(%GC) - 528/N + 11.7 x log[Na+] (von Ahsen et al. 2001, Clin Chem 47: 1956-1961). Recommended ‘as a tradeoff between accuracy and ease of use’.

  • userset: Tuple of four values for A, B, C, and D. Usersets override valuesets.

  • Na, K, Tris, Mg, dNTPs: Concentration of the respective ions [mM]. If any of K, Tris, Mg and dNTPS is non-zero, a ‘sodium-equivalent’ concentration is calculated and used for salt correction (von Ahsen et al., 2001).

  • saltcorr: Type of salt correction (see method salt_correction). Default=5. 0 or None means no salt correction.

  • mismatch: If ‘True’ (default) every ‘X’ in the sequence is counted as mismatch.

Bio.SeqUtils.MeltingTemp.Tm_NN(seq, check=True, strict=True, c_seq=None, shift=0, nn_table=None, tmm_table=None, imm_table=None, de_table=None, dnac1=25, dnac2=25, selfcomp=False, Na=50, K=0, Tris=0, Mg=0, dNTPs=0, saltcorr=5)

Return the Tm using nearest neighbor thermodynamics.

Arguments:
  • seq: The primer/probe sequence as string or Biopython sequence object. For RNA/DNA hybridizations seq must be the RNA sequence.

  • c_seq: Complementary sequence. The sequence of the template/target in 3’->5’ direction. c_seq is necessary for mismatch correction and dangling-ends correction. Both corrections will automatically be applied if mismatches or dangling ends are present. Default=None.

  • shift: Shift of the primer/probe sequence on the template/target sequence, e.g.:

                       shift=0       shift=1        shift= -1
    Primer (seq):      5' ATGC...    5'  ATGC...    5' ATGC...
    Template (c_seq):  3' TACG...    3' CTACG...    3'  ACG...
    

    The shift parameter is necessary to align seq and c_seq if they have different lengths or if they should have dangling ends. Default=0

  • table: Thermodynamic NN values, eight tables are implemented: For DNA/DNA hybridizations:

    • DNA_NN1: values from Breslauer et al. (1986)

    • DNA_NN2: values from Sugimoto et al. (1996)

    • DNA_NN3: values from Allawi & SantaLucia (1997) (default)

    • DNA_NN4: values from SantaLucia & Hicks (2004)

    For RNA/RNA hybridizations:

    • RNA_NN1: values from Freier et al. (1986)

    • RNA_NN2: values from Xia et al. (1998)

    • RNA_NN3: valuse from Chen et al. (2012)

    For RNA/DNA hybridizations:

    • R_DNA_NN1: values from Sugimoto et al. (1995) Note that seq must be the RNA sequence.

    Use the module’s maketable method to make a new table or to update one one of the implemented tables.

  • tmm_table: Thermodynamic values for terminal mismatches. Default: DNA_TMM1 (SantaLucia & Peyret, 2001)

  • imm_table: Thermodynamic values for internal mismatches, may include insosine mismatches. Default: DNA_IMM1 (Allawi & SantaLucia, 1997-1998; Peyret et al., 1999; Watkins & SantaLucia, 2005)

  • de_table: Thermodynamic values for dangling ends:

    • DNA_DE1: for DNA. Values from Bommarito et al. (2000) (default)

    • RNA_DE1: for RNA. Values from Turner & Mathews (2010)

  • dnac1: Concentration of the higher concentrated strand [nM]. Typically this will be the primer (for PCR) or the probe. Default=25.

  • dnac2: Concentration of the lower concentrated strand [nM]. In PCR this is the template strand which concentration is typically very low and may be ignored (dnac2=0). In oligo/oligo hybridization experiments, dnac1 equals dnac1. Default=25. MELTING and Primer3Plus use k = [Oligo(Total)]/4 by default. To mimic this behaviour, you have to divide [Oligo(Total)] by 2 and assign this concentration to dnac1 and dnac2. E.g., Total oligo concentration of 50 nM in Primer3Plus means dnac1=25, dnac2=25.

  • selfcomp: Is the sequence self-complementary? Default=False. If ‘True’ the primer is thought binding to itself, thus dnac2 is not considered.

  • Na, K, Tris, Mg, dNTPs: See method ‘Tm_GC’ for details. Defaults: Na=50, K=0, Tris=0, Mg=0, dNTPs=0.

  • saltcorr: See method ‘Tm_GC’. Default=5. 0 means no salt correction.

Bio.SeqUtils.MeltingTemp.Tm_staluc(s, dnac=50, saltc=50, rna=0)

Return DNA/DNA Tm using nearest neighbor thermodynamics (OBSOLETE).

This method may be depreceated in the future. Use Tm_NN instead. Tm_NN with default values gives the same result as Tm_staluc.

s is the sequence as string or Seq object dnac is DNA concentration [nM] saltc is salt concentration [mM]. rna=0 is for DNA/DNA (default), use 1 for RNA/RNA hybridisation.

For DNA/DNA, see Allawi & SantaLucia (1997), Biochemistry 36: 10581-10594 For RNA/RNA, see Xia et al (1998), Biochemistry 37: 14719-14735

Examples

>>> print("%0.2f" % Tm_staluc('CAGTCAGTACGTACGTGTACTGCCGTA'))
59.87
>>> print("%0.2f" % Tm_staluc('CAGTCAGTACGTACGTGTACTGCCGTA', rna=True))
77.90

You can also use a Seq object instead of a string,

>>> from Bio.Seq import Seq
>>> from Bio.Alphabet import generic_nucleotide
>>> s = Seq('CAGTCAGTACGTACGTGTACTGCCGTA', generic_nucleotide)
>>> print("%0.2f" % Tm_staluc(s))
59.87
>>> print("%0.2f" % Tm_staluc(s, rna=True))
77.90