Bio.SeqUtils.CheckSum module

Functions to calculate assorted sequence checksums.

Bio.SeqUtils.CheckSum.crc32(seq)

Return the crc32 checksum for a sequence (string or Seq object).

Note that the checksum is case-sensitive:

>>> crc32("ACGTACGTACGT")
20049947
>>> crc32("acgtACGTacgt")
1688586483
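
The behavior above can be reproduced with the standard library alone. This is a minimal sketch, assuming the checksum is the standard CRC-32 of the ASCII-encoded sequence; the helper name `crc32_sketch` is hypothetical and is not part of Bio.SeqUtils.CheckSum.

```python
import zlib


def crc32_sketch(seq):
    """CRC-32 of the sequence text (hypothetical stand-in, not the Biopython function)."""
    # str() lets the helper accept a plain string or any object whose
    # string form is the sequence itself.
    return zlib.crc32(str(seq).encode("ascii"))


upper = crc32_sketch("ACGTACGTACGT")
mixed = crc32_sketch("acgtACGTacgt")

print(upper)           # 20049947, the value shown in the example above
print(upper == mixed)  # False: changing the case changes the checksum

# Normalizing the case first gives a case-insensitive comparison.
print(crc32_sketch("acgtACGTacgt".upper()) == upper)  # True
```

If sequences from different sources need to compare equal regardless of case, normalizing with `.upper()` before checksumming, as in the last line, is one option.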

Bio.SeqUtils.CheckSum.crc64(s)

Return the crc64 checksum for a sequence (string or Seq object).

Note that the checksum is case-sensitive:

>>> crc64("ACGTACGTACGT")
'CRC-C4FBB762C4A87EBD'
>>> crc64("acgtACGTacgt")
'CRC-DA4509DC64A87EBD'

Bio.SeqUtils.CheckSum.gcg(seq)

Return the GCG checksum (int) for a sequence (string or Seq object).

The sequence may be a nucleotide sequence, an amino-acid sequence, or any other string. This is the checksum used by the GCG program.

Based on BioPerl GCG_checksum. Adapted by Sebastian Bassi with the help of John Lenton, Pablo Ziliani, and Gabriel Genellina.

All sequences are converted to uppercase, so the checksum is case-insensitive:

>>> gcg("ACGTACGTACGT")
5688
>>> gcg("acgtACGTacgt")
5688
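
The arithmetic behind this value can be sketched in a few lines. This is an illustrative reimplementation, assuming the checksum is a position-weighted sum of uppercase character codes with weights cycling from 1 to 57, reduced modulo 10000; the helper name `gcg_sketch` is hypothetical.

```python
def gcg_sketch(seq):
    """Position-weighted checksum in the style of GCG (hypothetical helper)."""
    checksum = 0
    for position, char in enumerate(str(seq).upper()):
        # Weights run 1, 2, ..., 57 and then start again at 1 (assumed cycle).
        weight = position % 57 + 1
        checksum += weight * ord(char)
    return checksum % 10000


print(gcg_sketch("ACGTACGTACGT"))  # 5688
print(gcg_sketch("acgtACGTacgt"))  # 5688, because the input is uppercased first
```

For the 12-letter example, the weighted sum is 1*65 + 2*67 + 3*71 + 4*84 + ... + 12*84 = 5688, which is already below 10000, so the modulo leaves it unchanged.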

Bio.SeqUtils.CheckSum.seguid(seq)

Return the SEGUID (string) for a sequence (string or Seq object).

The sequence may be a nucleotide sequence, an amino-acid sequence, or any other string. SEGUID stands for SEquence Globally Unique IDentifier.

The SEGUID is case-insensitive, because the sequence is converted to uppercase before hashing:

>>> seguid("ACGTACGTACGT")
'If6HIvcnRSQDVNiAoefAzySc6i4'
>>> seguid("acgtACGTacgt")
'If6HIvcnRSQDVNiAoefAzySc6i4'
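
A SEGUID of this form can be sketched with the standard library. The sketch assumes the identifier is the SHA-1 digest of the uppercased sequence, Base64-encoded with the trailing `=` padding removed; the helper name `seguid_sketch` is hypothetical and not part of Bio.SeqUtils.CheckSum.

```python
import base64
import hashlib


def seguid_sketch(seq):
    """SHA-1 of the uppercased sequence, Base64 without padding (hypothetical helper)."""
    digest = hashlib.sha1(str(seq).upper().encode("ascii")).digest()
    # A 20-byte digest encodes to 28 Base64 characters, the last being '='.
    return base64.b64encode(digest).decode("ascii").rstrip("=")


upper = seguid_sketch("ACGTACGTACGT")
mixed = seguid_sketch("acgtACGTacgt")

print(upper)           # If6HIvcnRSQDVNiAoefAzySc6i4
print(upper == mixed)  # True: case is normalized before hashing
print(len(upper))      # 27
```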

For more information about SEGUID, see http://bioinformatics.anl.gov/seguid/ and https://doi.org/10.1002/pmic.200600032