Package Bio :: Package Motif :: Module _Motif :: Class Motif

Class Motif

object --+
         |
        Motif
Known Subclasses:


A class representing sequence motifs.

Instance Methods

__getitem__(self, index)
Returns the probability distribution over symbols at a given position, padding with background.

__init__(self, alphabet=IUPACUnambiguousDNA())
x.__init__(...) initializes x; see help(type(x)) for signature

__len__(self)
Returns the length of a motif.

__str__(self, masked=False)
String representation of a motif.
 
_check_alphabet(self, alphabet)

_check_length(self, len)

_from_horiz_matrix(self, stream, letters=None, make_instances=False)
Reads a horizontal count matrix from stream and fills in the counts.

_from_jaspar_pfm(self, stream, make_instances=False)
Reads the motif from a JASPAR .pfm file.

_from_jaspar_sites(self, stream)
Reads the motif from a JASPAR .sites file.

_from_vert_matrix(self, stream, letters=None, make_instances=False)
Reads a vertical count matrix from stream and fills in the counts.

_pwm_calculate(self, sequence)

_read(self, stream)
Reads the motif from the stream (in AlignAce format).
 
_to_fasta(self)
FASTA representation of the motif.

_to_horizontal_matrix(self, letters=None, normalized=True)
Return a string representation of the motif as a matrix.

_to_jaspar_pfm(self)
Returns the pfm representation of the motif.

_to_transfac(self)
Write the representation of a motif in TRANSFAC format.

_to_vertical_matrix(self, letters=None)
Return a string representation of the motif as a matrix.

_write(self, stream)
Writes the motif to the stream.
 
add_instance(self, instance)
Adds a new instance to the motif.

anticonsensus(self)
Returns the pattern least likely to be generated from this motif.

consensus(self)
Returns the consensus sequence of a motif.

dist_dpq(self, other)
Calculates the DPQ distance measure between motifs.

dist_dpq_at(self, other, offset)
Calculates the dist_dpq measure with a given offset.

dist_pearson(self, motif, masked=0)
Returns the similarity score based on Pearson correlation for the given motif against self.
 
dist_pearson_at(self, motif, offset)

dist_product(self, other)
A similarity measure taking into account the product probability of generating overlapping instances of two motifs.

dist_product_at(self, other, offset)

exp_score(self, st_dev=False)
Computes the expected score of a motif's instance and its standard deviation.

format(self, format)
Returns a string representation of the Motif in a given format.

ic(self)
Method returning the information content of a motif.

log_odds(self, laplace=True)
Returns the log-odds matrix computed for the set of instances.
 
make_counts_from_instances(self)
Creates the count matrix for a motif with instances.

make_instances_from_counts(self)
Creates "fake" instances for a motif created from a count matrix.

max_score(self)
Maximal possible score for this motif.

min_score(self)
Minimal possible score for this motif.

pwm(self, laplace=True)
Returns the PWM computed for the set of instances.

reverse_complement(self)
Gives the reverse complement of the motif.
 
scanPWM(self, seq)
Matrix of log-odds scores for a nucleotide sequence.

score_hit(self, sequence, position, normalized=0, masked=0)
Gives the pwm score for a given position.

search_instances(self, sequence)
A generator function that yields the positions at which instances of the motif are found in a given sequence.

search_pwm(self, sequence, normalized=0, masked=0, threshold=0.0, both=True)
A generator function that yields hits in a given sequence whose pwm score is higher than the threshold.

weblogo(self, fname, format='PNG', **kwds)
Uses the Berkeley weblogo service to download and save a weblogo of itself.

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __sizeof__, __subclasshook__

Properties

Inherited from object: __class__

Method Details

__getitem__(self, index)
(Indexing operator)

Returns the probability distribution over symbols at a given position, padding with background.

If the requested index is out of bounds, the returned distribution comes from background.
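
The out-of-bounds behaviour can be pictured with a small standalone sketch. It does not use Bio.Motif: the function name column_at, the list-of-dicts layout and the uniform background are assumptions made purely for illustration.

```python
# Hypothetical stand-in for Motif.__getitem__; not the Biopython implementation.
def column_at(columns, background, index):
    """Return the distribution at `index`, or the background if out of bounds."""
    if 0 <= index < len(columns):
        return columns[index]
    return background


# Assumed uniform DNA background and a two-column motif.
background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
columns = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
]

inside = column_at(columns, background, 1)   # a real motif column
outside = column_at(columns, background, 5)  # past the end: background
before = column_at(columns, background, -1)  # before the start: background
```

Whether the real method treats negative indices as out of bounds is not stated here; the sketch assumes it does.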

__init__(self, alphabet=IUPACUnambiguousDNA())
(Constructor)

x.__init__(...) initializes x; see help(type(x)) for signature

Overrides: object.__init__
(inherited documentation)

__len__(self)
(Length operator)

Returns the length of a motif.

Please use this method (i.e. invoke len(m)) instead of referring to m.length directly.

__str__(self, masked=False)
(Informal representation operator)

String representation of a motif.

Overrides: object.__str__

_from_jaspar_pfm(self, stream, make_instances=False)

Reads the motif from a JASPAR .pfm file.

The instances are fake, but the pwm is accurate.

_from_jaspar_sites(self, stream)

Reads the motif from a JASPAR .sites file.

Both the instances and the pwm are accurate.

_read(self, stream)

Reads the motif from the stream (in AlignAce format).

The self.alphabet variable must be set beforehand.
If the last line contains asterisks, it is used to set the mask.

_to_jaspar_pfm(self)

Returns the pfm representation of the motif.

_to_transfac(self)

Write the representation of a motif in TRANSFAC format.

dist_dpq(self, other)

Calculates the DPQ distance measure between motifs.

It is calculated as a maximal value of DPQ formula (shown using LaTeX
markup, familiar to mathematicians):

\sqrt{ \sum_{i=1}^{alignment.len()} \sum_{k=1}^{alphabet.len()}
  \{ m1[i].freq(alphabet[k]) * \log_2( m1[i].freq(alphabet[k]) / m2[i].freq(alphabet[k]) ) +
     m2[i].freq(alphabet[k]) * \log_2( m2[i].freq(alphabet[k]) / m1[i].freq(alphabet[k]) ) \} }

over possible non-spaced alignments of two motifs.  See this reference:

D. M. Endres and J. E. Schindelin, "A new metric for probability
distributions", IEEE Transactions on Information Theory 49, no. 7
(July 2003): 1858-1860.
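
As a rough illustration, the double sum in the formula can be evaluated for one fixed alignment of two equal-length motifs. This is a sketch of the formula as written in this entry, not the Biopython source; the helper name dpq_at_alignment and the list-of-dicts motif layout are assumptions, and every probability is assumed to be strictly positive (for example after pseudocounts) so that the logarithms are defined.

```python
import math


# Hypothetical helper following the formula as written above; not Biopython code.
def dpq_at_alignment(m1, m2, alphabet="ACGT"):
    """Square root of the summed symmetrised log-ratio terms over aligned columns.

    m1 and m2 are equal-length lists of dicts mapping symbol -> probability.
    All probabilities are assumed strictly positive.
    """
    total = 0.0
    for col1, col2 in zip(m1, m2):
        for symbol in alphabet:
            p, q = col1[symbol], col2[symbol]
            total += p * math.log2(p / q) + q * math.log2(q / p)
    return math.sqrt(total)


m1 = [{"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1}]
m2 = [{"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1}]

same = dpq_at_alignment(m1, m1)       # identical motifs give 0.0
different = dpq_at_alignment(m1, m2)  # differing motifs give a positive value
```

The measure described in this entry would then repeat such a calculation over the possible non-spaced alignments of the two motifs.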

dist_dpq_at(self, other, offset)

Calculates the dist_dpq measure with a given offset.

The offset should satisfy 0 <= offset <= len(self).

dist_pearson(self, motif, masked=0)

Returns the similarity score based on Pearson correlation for the given motif against self.

We use Pearson's correlation of the respective probabilities.
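
A minimal sketch of a Pearson correlation over the respective probabilities, assuming two already-aligned motifs of equal length stored as lists of dicts. The helper name pearson_of_motifs is hypothetical, and how the method turns the coefficient into its return value (or handles offsets and masks) is not shown.

```python
import math


# Hypothetical helper; flattens aligned columns into two probability vectors.
def pearson_of_motifs(m1, m2, alphabet="ACGT"):
    """Pearson correlation coefficient of the column probabilities of m1 and m2.

    Assumes neither motif is entirely uniform, so both variances are non-zero.
    """
    xs = [col[s] for col in m1 for s in alphabet]
    ys = [col[s] for col in m2 for s in alphabet]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


m1 = [{"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1}]
m2 = [{"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1}]

r_same = pearson_of_motifs(m1, m1)  # perfectly correlated with itself
r_diff = pearson_of_motifs(m1, m2)  # lower for a motif preferring another base
```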

format(self, format)

Returns a string representation of the Motif in a given format.

Currently supported formats:
 - jaspar-pfm : JASPAR Position Frequency Matrix
 - transfac : TRANSFAC like files
 - fasta : FASTA file with instances

make_instances_from_counts(self)

Creates "fake" instances for a motif created from a count matrix.

If the sums of counts differ between columns, the
shorter columns are padded with background.

max_score(self)

Maximal possible score for this motif.

Returns the score computed for the consensus sequence.

min_score(self)

Minimal possible score for this motif.

Returns the score computed for the anticonsensus sequence.
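
The relationship between max_score, min_score and the consensus and anticonsensus sequences can be sketched as follows. The helpers are hypothetical and a log-odds score against a uniform background is assumed; with a non-uniform background the most probable symbol in a column need not be the highest-scoring one.

```python
import math


# Hypothetical helpers; a log-odds score against a uniform background is assumed.
def extreme_sequences(pwm):
    """Return (consensus, anticonsensus): most and least probable symbol per column."""
    consensus = "".join(max(col, key=col.get) for col in pwm)
    anticonsensus = "".join(min(col, key=col.get) for col in pwm)
    return consensus, anticonsensus


def log_odds_score(pwm, background, seq):
    """Sum of per-column log2(probability / background) for `seq`."""
    return sum(math.log2(col[b] / background[b]) for col, b in zip(pwm, seq))


background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
pwm = [
    {"A": 0.7, "C": 0.15, "G": 0.1, "T": 0.05},
    {"A": 0.05, "C": 0.1, "G": 0.7, "T": 0.15},
]

consensus, anticonsensus = extreme_sequences(pwm)
max_score = log_odds_score(pwm, background, consensus)
min_score = log_odds_score(pwm, background, anticonsensus)
```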

pwm(self, laplace=True)

Returns the PWM computed for the set of instances.

If laplace=True (default), pseudocounts equal to self.background multiplied by self.beta are added to all positions.
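
The pseudocount rule can be sketched for a single column of counts. The helper name column_probabilities is hypothetical, the default beta=1.0 is an assumption, and the background is assumed to sum to 1.

```python
# Hypothetical helper; `beta` and `background` mirror the attribute names in the text.
def column_probabilities(counts, background, beta=1.0, laplace=True):
    """Turn one column of symbol counts into probabilities.

    With laplace=True, background[s] * beta is added to each count first.
    """
    pseudo = {s: (background[s] * beta if laplace else 0.0) for s in counts}
    total = sum(counts.values()) + sum(pseudo.values())
    return {s: (counts[s] + pseudo[s]) / total for s in counts}


background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
counts = {"A": 8, "C": 0, "G": 0, "T": 0}

smoothed = column_probabilities(counts, background)            # no zero entries
raw = column_probabilities(counts, background, laplace=False)  # plain frequencies
```

Smoothing of this kind keeps every probability strictly positive, which matters when log-odds scores are later computed from the matrix.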

scanPWM(self, seq)

Matrix of log-odds scores for a nucleotide sequence.

Scans a nucleotide sequence and returns the matrix of log-odds
scores for all positions.

- the result is a one-dimensional list or numpy array
- the sequence can only be a DNA sequence
- the search is performed only on one strand
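
A pure-Python sketch of a single-strand log-odds scan is shown here for orientation. It is not the library routine: the helper name scan_log_odds, the base-2 logarithm, the uniform background and the choice to score only full-width windows are all assumptions.

```python
import math


# Hypothetical helper; scores every full-width window on the given strand only.
def scan_log_odds(pwm, background, seq):
    """Return one log-odds score per start position of `seq`."""
    width = len(pwm)
    scores = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        score = sum(
            math.log2(column[base] / background[base])
            for column, base in zip(pwm, window)
        )
        scores.append(score)
    return scores


background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
pwm = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
]

scores = scan_log_odds(pwm, background, "TAGC")
best = max(range(len(scores)), key=scores.__getitem__)  # index of the top window
```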

set_mask(self, mask)

Sets the mask for the motif.

The mask should be a string containing asterisks in the positions of significant columns and spaces in all other columns.
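
One way to read such a mask string is to convert it into per-column flags. The helper parse_mask is hypothetical, and the length check and error handling are assumptions rather than documented behaviour of set_mask.

```python
# Hypothetical helper; converts a mask string into per-column flags.
def parse_mask(mask, length):
    """Return a list with 1 for significant columns ('*') and 0 for others (' ')."""
    if len(mask) != length:
        raise ValueError("mask length must match motif length")
    flags = []
    for char in mask:
        if char == "*":
            flags.append(1)
        elif char == " ":
            flags.append(0)
        else:
            raise ValueError("mask may contain only '*' and ' '")
    return flags


# For a motif of length 4: columns 0, 1 and 3 are significant, column 2 is not.
flags = parse_mask("** *", 4)
```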

weblogo(self, fname, format='PNG', **kwds)

Uses the Berkeley weblogo service to download and save a weblogo of itself.

Requires an internet connection.
The parameters from **kwds are passed directly to the weblogo server.