Package Bio :: Package Motif :: Module _Motif :: Class Motif

Class Motif

object --+
         |
        Motif
Known Subclasses:

A class representing sequence motifs.
Instance Methods
 
__getitem__(self, index)
Returns the probability distribution over symbols at a given position, padding with background.
 
__init__(self, alphabet=IUPACUnambiguousDNA())
x.__init__(...) initializes x; see x.__class__.__doc__ for signature
 
__len__(self)
Returns the length of the motif.
 
__str__(self, masked=False)
String representation of the motif.
 
_check_alphabet(self, alphabet)
 
_check_length(self, len)
 
_from_horiz_matrix(self, stream, letters=None, make_instances=False)
Reads a horizontal count matrix from stream and fills in the counts.
 
_from_jaspar_pfm(self, stream, make_instances=False)
Reads the motif from a JASPAR .pfm file.
 
_from_jaspar_sites(self, stream)
Reads the motif from a JASPAR .sites file.
 
_from_vert_matrix(self, stream, letters=None, make_instances=False)
Reads a vertical count matrix from stream and fills in the counts.
 
_pwm_calculate(self, sequence)
 
_read(self, stream)
Reads the motif from the stream (in AlignAce format).
 
_to_fasta(self)
FASTA representation of the motif.
 
_to_horizontal_matrix(self, letters=None, normalized=True)
Return string representation of the motif as a matrix.
 
_to_jaspar_pfm(self)
Returns the PFM representation of the motif.
 
_to_transfac(self)
Write the representation of a motif in TRANSFAC format
 
_to_vertical_matrix(self, letters=None)
Return string representation of the motif as a matrix.
 
_write(self, stream)
Writes the motif to the stream.
 
add_instance(self, instance)
Adds a new instance to the motif.
 
anticonsensus(self)
Returns the least probable pattern to be generated from this motif.
 
consensus(self)
Returns the consensus sequence of a motif.
 
dist_dpq(self, other)
Calculates the DPQ distance measure between motifs.
 
dist_dpq_at(self, other, offset)
Calculates the dist_dpq measure with a given offset.
 
dist_pearson(self, motif, masked=0)
Returns the similarity score based on Pearson correlation for the given motif against self.
 
dist_pearson_at(self, motif, offset)
 
dist_product(self, other)
A similarity measure taking into account the product probability of generating overlapping instances of two motifs.
 
dist_product_at(self, other, offset)
 
exp_score(self, st_dev=False)
Computes the expected score of a motif instance and its standard deviation.
 
format(self, format)
Returns a string representation of the Motif in a given format
 
ic(self)
Method returning the information content of a motif.
 
log_odds(self, laplace=True)
Returns the log-odds matrix computed for the set of instances.
 
make_counts_from_instances(self)
Creates the count matrix for a motif with instances.
 
make_instances_from_counts(self)
Creates "fake" instances for a motif created from a count matrix.
 
max_score(self)
Maximal possible score for this motif.
 
min_score(self)
Minimal possible score for this motif.
 
pwm(self, laplace=True)
Returns the PWM computed for the set of instances.
 
reverse_complement(self)
Gives the reverse complement of the motif
 
scanPWM(self, seq)
Matrix of log-odds scores for a nucleotide sequence.
 
score_hit(self, sequence, position, normalized=0, masked=0)
Gives the PWM score for a given position.
 
search_instances(self, sequence)
A generator function returning the positions of instances of the motif found in a given sequence.
 
search_pwm(self, sequence, normalized=0, masked=0, threshold=0.0, both=True)
A generator function returning the hits in a given sequence whose PWM score is higher than the threshold.
 
set_mask(self, mask)
Sets the mask for the motif.
 
weblogo(self, fname, format='PNG', **kwds)
Uses the Berkeley weblogo service to download and save a weblogo of itself.

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __sizeof__, __subclasshook__

Properties

Inherited from object: __class__

Method Details

__getitem__(self, index)
(Indexing operator)

Returns the probability distribution over symbols at a given position, padding with background.

If the requested index is out of bounds, the returned distribution comes from background.
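
The padding behaviour can be pictured with a small standalone sketch. It is not the Biopython implementation: `column_or_background`, `columns`, and `background` are hypothetical names, and treating negative indices as out of bounds is an assumption.

```python
def column_or_background(columns, background, index):
    """Return the distribution at `index`, or the background if out of bounds."""
    # Assumption: negative indices count as out of bounds.
    if 0 <= index < len(columns):
        return columns[index]
    return dict(background)


# Hypothetical two-column DNA motif and a uniform background.
columns = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
]
background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

inside = column_or_background(columns, background, 1)   # a real motif column
outside = column_or_background(columns, background, 5)  # padded with background
```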

__init__(self, alphabet=IUPACUnambiguousDNA())
(Constructor)

x.__init__(...) initializes x; see x.__class__.__doc__ for signature

Overrides: object.__init__
(inherited documentation)

__len__(self)
(Length operator)

Returns the length of the motif.

Please use this method (i.e. call len(m)) instead of referring to m.length directly.

__str__(self, masked=False)
(Informal representation operator)

String representation of the motif.
Overrides: object.__str__

_from_jaspar_pfm(self, stream, make_instances=False)

Reads the motif from a JASPAR .pfm file.

The instances are fake, but the pwm is accurate.

_from_jaspar_sites(self, stream)

Reads the motif from a JASPAR .sites file.

Both the instances and the pwm are accurate.

_read(self, stream)

Reads the motif from the stream (in AlignAce format).

self.alphabet must be set beforehand. If the last line contains asterisks, it is used to set the mask.

dist_dpq(self, other)

Calculates the DPQ distance measure between motifs.

It is calculated as the maximal value of the DPQ formula (shown in LaTeX markup):

\sqrt{\sum_{i=1}^{alignment.len()} \sum_{k=1}^{alphabet.len()}
\{ m1[i].freq(alphabet[k]) * \log_2(m1[i].freq(alphabet[k]) / m2[i].freq(alphabet[k])) +
   m2[i].freq(alphabet[k]) * \log_2(m2[i].freq(alphabet[k]) / m1[i].freq(alphabet[k])) \}
}

over the possible non-spaced alignments of the two motifs. See this reference:

D. M Endres and J. E Schindelin, "A new metric for probability distributions", IEEE transactions on Information Theory 49, no. 7 (July 2003): 1858-1860.
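
For a single fixed alignment, the formula above can be evaluated roughly as follows. This is a standalone sketch, not the library code: `dpq_aligned` is a hypothetical helper, both motifs are assumed to be already aligned and of equal length, and every frequency is assumed to be strictly positive. The search over alignments is not shown.

```python
import math


def dpq_aligned(m1, m2):
    """Evaluate the DPQ formula for two aligned motifs of equal length.

    m1 and m2 are lists of dicts mapping each letter to a strictly
    positive frequency (a zero would make the logarithm undefined).
    """
    total = 0.0
    for col1, col2 in zip(m1, m2):
        for letter in col1:
            p, q = col1[letter], col2[letter]
            total += p * math.log2(p / q) + q * math.log2(q / p)
    # Guard against a tiny negative sum caused by rounding.
    return math.sqrt(max(total, 0.0))


# Hypothetical one-column motifs over a two-letter alphabet.
a = [{"A": 0.5, "C": 0.5}]
b = [{"A": 0.25, "C": 0.75}]

d_ab = dpq_aligned(a, b)
d_aa = dpq_aligned(a, a)
```

Because each term is symmetric in the two motifs, swapping the arguments gives the same value.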

dist_dpq_at(self, other, offset)

Calculates the dist_dpq measure with a given offset.

The offset should satisfy 0 <= offset <= len(self).

dist_pearson(self, motif, masked=0)

Returns the similarity score based on Pearson correlation for the given motif against self.

We use Pearson's correlation of the respective probabilities.
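
As an illustration of correlating "respective probabilities", the sketch below flattens two aligned motifs into vectors and computes Pearson's r. The helpers `pearson` and `flatten` are hypothetical. How the library chooses the alignment, applies the mask, or turns the correlation into the returned score is not stated here and is not modelled.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def flatten(columns, letters="ACGT"):
    """List the probabilities column by column in a fixed letter order."""
    return [col[letter] for col in columns for letter in letters]


# Hypothetical one-column motifs.
a = [{"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1}]
b = [{"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7}]

same = pearson(flatten(a), flatten(a))
different = pearson(flatten(a), flatten(b))
```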

format(self, format)

Returns a string representation of the Motif in a given format.

Currently supported formats:
  • jaspar-pfm : JASPAR Position Frequency Matrix
  • transfac : TRANSFAC like files
  • fasta : FASTA file with instances

make_instances_from_counts(self)

Creates "fake" instances for a motif created from a count matrix.

If the sums of counts differ between columns, the shorter columns are padded with background.

max_score(self)

Maximal possible score for this motif.

Returns the score computed for the consensus sequence.

min_score(self)

Minimal possible score for this motif.

Returns the score computed for the anticonsensus sequence.

pwm(self, laplace=True)

Returns the PWM computed for the set of instances.

If laplace=True (the default), pseudocounts equal to self.background multiplied by self.beta are added to all positions.
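
The effect of the pseudocounts on one column can be sketched as below. This is a standalone illustration under assumptions: `column_probabilities` is a hypothetical helper, the default `beta=1.0` is chosen only for the example, and dividing by the column total is assumed to be how counts become probabilities.

```python
def column_probabilities(counts, background, beta=1.0, laplace=True):
    """Turn one column of counts into probabilities, optionally smoothed."""
    letters = sorted(counts)
    # Pseudocount per letter: background frequency scaled by beta.
    pseudo = {l: (background[l] * beta if laplace else 0.0) for l in letters}
    total = sum(counts[l] + pseudo[l] for l in letters)
    return {l: (counts[l] + pseudo[l]) / total for l in letters}


# Hypothetical column built from four instances.
counts = {"A": 3, "C": 0, "G": 1, "T": 0}
background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

smoothed = column_probabilities(counts, background)
raw = column_probabilities(counts, background, laplace=False)
```

With smoothing, letters that never occur in the instances still receive a small non-zero probability, which keeps later log-odds calculations finite.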

scanPWM(self, seq)

Matrix of log-odds scores for a nucleotide sequence.

Scans a nucleotide sequence and returns the matrix of log-odds scores for all positions.

  • the result is a one-dimensional list or numpy array
  • the sequence can only be a DNA sequence
  • the search is performed only on one strand
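
A single-strand scan of this kind can be sketched in plain Python. The function `scan_log_odds` and the toy matrix are hypothetical, and only full-length windows are scored. How the library treats positions near the end of the sequence is not stated here.

```python
def scan_log_odds(log_odds, seq):
    """Score every full-length window of `seq` on the given strand.

    log_odds is a list with one dict per motif column, mapping each
    nucleotide to its log-odds score.
    """
    width = len(log_odds)
    return [
        sum(log_odds[i][seq[start + i]] for i in range(width))
        for start in range(len(seq) - width + 1)
    ]


# Hypothetical two-column log-odds matrix favouring "AG".
log_odds = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
]

scores = scan_log_odds(log_odds, "AGAG")
```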

set_mask(self, mask)

Sets the mask for the motif.

The mask should be a string containing asterisks at the positions of significant columns and spaces in the other columns.
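
The mask string format can be read as one flag per column. The sketch below uses a hypothetical helper, `parse_mask`. Rejecting other characters is an assumption, not documented behaviour.

```python
def parse_mask(mask):
    """Convert a mask string into one boolean per motif column."""
    for ch in mask:
        # Assumption: anything other than '*' or ' ' is an error.
        if ch not in "* ":
            raise ValueError("mask may contain only '*' and ' ': %r" % mask)
    return [ch == "*" for ch in mask]


# Columns 0, 1 and 3 are significant; column 2 is masked out.
significant = parse_mask("** *")
```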

weblogo(self, fname, format='PNG', **kwds)

Uses the Berkeley weblogo service to download and save a weblogo of itself.

Requires an internet connection. The parameters in **kwds are passed directly to the weblogo server.