Class _BitString

object --+        
         |        
basestring --+    
             |    
           str --+
                 |
                _BitString

Helper class for binary string data, used to store and count
compatible clades when searching for consensus trees. It provides
bitwise operators (&, |, ^).

_BitString is a subclass of ``str`` that accepts only the characters
'0' and '1' and adds methods for bitwise manipulation (&, |, ^). It is
used to count and store the clades of multiple trees during consensus
tree searching. During counting, two clades are considered the same if
their terminals (compared by ``name`` attribute) are the same.

For example, suppose the following two trees are provided and we want
to find their strict consensus tree:

    tree1: (((A, B), C),(D, E))
    tree2: ((A, (B, C)),(D, E))

For both trees, a _BitString object '11111' represents the root
clade. Each '1' stands for one terminal clade in the list
[A, B, C, D, E] (the order may differ; it is determined by the
``get_terminals`` method of the first tree provided). The clade
((A, B), C) in tree1 and the clade (A, (B, C)) in tree2 are both
represented by '11100'. Similarly, '11000' represents clade
(A, B) in tree1, '01100' represents clade (B, C) in tree2, and '00011'
represents clade (D, E) in both trees.

The ``_count_clades`` function in this module then yields the following
clade counts and _BitString representations (the root and terminals
are omitted):

    clade   _BitString   count
    ABC     '11100'      2
    DE      '00011'      2
    AB      '11000'      1
    BC      '01100'      1
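
The counts in the table can be reproduced without Biopython. The sketch below is only an illustration of the idea: it models each tree as nested tuples of terminal names, and the helpers `terminals`, `clade_bitstrings`, and `count_clades` are hypothetical names, not functions of Bio.Phylo.Consensus.

```python
from collections import Counter

# Toy trees as nested tuples of terminal names (an assumption for this
# sketch; the real module works on Bio.Phylo tree objects).
tree1 = ((("A", "B"), "C"), ("D", "E"))
tree2 = (("A", ("B", "C")), ("D", "E"))


def terminals(clade):
    """Return the terminal names below a clade, left to right."""
    if isinstance(clade, str):
        return [clade]
    names = []
    for child in clade:
        names.extend(terminals(child))
    return names


def clade_bitstrings(tree, term_names):
    """Yield a '0'/'1' string for every internal clade except the root."""
    stack = list(tree)  # start below the root so the root itself is skipped
    while stack:
        clade = stack.pop()
        if isinstance(clade, str):
            continue  # terminals are omitted
        members = set(terminals(clade))
        yield "".join("1" if name in members else "0" for name in term_names)
        stack.extend(clade)


def count_clades(trees):
    """Count how many trees contain each clade, keyed by bit string."""
    term_names = terminals(trees[0])  # order fixed by the first tree
    counts = Counter()
    for tree in trees:
        counts.update(clade_bitstrings(tree, term_names))
    return counts


counts = count_clades([tree1, tree2])
for bits, n in counts.most_common():
    print(bits, n)
```

A clade with a count equal to the number of trees (here '11100' and '00011') appears in every tree, which is the condition a strict consensus requires.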

To get the _BitString representation of a clade, we can use the following
code snippet:

    # suppose we are provided with a tree list, the first thing to do is
    # to get all the terminal names in the first tree
    term_names = [term.name for term in trees[0].get_terminals()]
    # for a specific clade in any of the trees, also get its terminal names
    clade_term_names = [term.name for term in clade.get_terminals()]
    # then create a boolean list
    boolvals = [name in clade_term_names for name in term_names]
    # create the string version and pass it to _BitString
    bitstr = _BitString(''.join(map(str, map(int, boolvals))))
    # or, equivalently:
    bitstr = _BitString.from_bool(boolvals)

To convert back:

    # get all the terminal clades of the first tree
    terms = trees[0].get_terminals()
    # get the index of terminal clades in bitstr
    index_list = bitstr.index_one()
    # get all terminal clades by index
    clade_terms = [terms[i] for i in index_list]
    # create a new clade and append all the terminal clades
    new_clade = BaseTree.Clade()
    new_clade.clades.extend(clade_terms)
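
As a self-contained illustration of the two conversions above, the following sketch runs the same round trip on plain terminal names instead of Bio.Phylo clades. The functions `to_bits` and `from_bits` are hypothetical stand-ins for the snippets above and operate on ordinary strings.

```python
# Terminal names in the order fixed by the first tree (assumed for this sketch).
term_names = ["A", "B", "C", "D", "E"]


def to_bits(clade_names, term_names):
    """Encode a collection of terminal names as a '0'/'1' string."""
    members = set(clade_names)
    boolvals = [name in members for name in term_names]
    return "".join(map(str, map(int, boolvals)))


def from_bits(bits, term_names):
    """Decode a '0'/'1' string back to the terminal names it marks."""
    index_list = [i for i, ch in enumerate(bits) if ch == "1"]
    return [term_names[i] for i in index_list]


bits = to_bits(["B", "C"], term_names)
names = from_bits(bits, term_names)
print(bits, names)
```

The round trip only works if both directions use the same terminal order, which is why the order is taken from the first tree in both snippets.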


Example
-------

>>> from Bio.Phylo.Consensus import _BitString
>>> bitstr1 = _BitString('11111')
>>> bitstr2 = _BitString('11100')
>>> bitstr3 = _BitString('01101')
>>> bitstr4 = _BitString('00011')
>>> bitstr1
_BitString('11111')
>>> bitstr2 & bitstr3
_BitString('01100')
>>> bitstr2 | bitstr3
_BitString('11101')
>>> bitstr2 ^ bitstr3
_BitString('10001')
>>> bitstr2.index_one()
[0, 1, 2]
>>> bitstr3.index_one()
[1, 2, 4]
>>> bitstr3.index_zero()
[0, 3]
>>> bitstr1.contains(bitstr2)
True
>>> bitstr2.contains(bitstr3)
False
>>> bitstr2.independent(bitstr3)
False
>>> bitstr2.independent(bitstr4)
True
>>> bitstr1.iscompatible(bitstr2)
True
>>> bitstr2.iscompatible(bitstr3)
False
>>> bitstr2.iscompatible(bitstr4)
True
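
One way to picture the `&`, `|`, and `^` results above is to treat each string as a base-2 integer, apply the integer operator, and pad the result back to the original width. The sketch below does this with plain `str` values. The helper `bitwise` is hypothetical, and this is not claimed to be how `_BitString` itself is implemented.

```python
import operator


def bitwise(op, left, right):
    """Apply an integer bitwise operator to two equal-length '0'/'1' strings."""
    if len(left) != len(right):
        raise ValueError("bit strings must have the same length")
    result = op(int(left, 2), int(right, 2))
    # Pad with leading zeros so the width matches the inputs.
    return format(result, "b").zfill(len(left))


b2, b3 = "11100", "01101"
and_bits = bitwise(operator.and_, b2, b3)
or_bits = bitwise(operator.or_, b2, b3)
xor_bits = bitwise(operator.xor, b2, b3)
print(and_bits, or_bits, xor_bits)
```

The zero padding matters: without it, a result such as '01100' would lose its leading zero and no longer line up with the terminal list.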
 

Instance Methods

__and__(self, other)

__or__(self, other)

__xor__(self, other)

__rand__(self, other)

__ror__(self, other)

__rxor__(self, other)

__repr__(self)
repr(x)

index_one(self)
Return a list of positions where the element is '1'.

index_zero(self)
Return a list of positions where the element is '0'.

contains(self, other)
Check whether this _BitString (bitstr1) contains another (bitstr2).

independent(self, other)
Check whether this _BitString (bitstr1) is independent of another (bitstr2).

iscompatible(self, other)
Check whether this _BitString (bitstr1) is compatible with another (bitstr2).

Inherited from str: __add__, __contains__, __eq__, __format__, __ge__, __getattribute__, __getitem__, __getnewargs__, __getslice__, __gt__, __hash__, __le__, __len__, __lt__, __mod__, __mul__, __ne__, __rmod__, __rmul__, __sizeof__, __str__, capitalize, center, count, decode, encode, endswith, expandtabs, find, format, index, isalnum, isalpha, isdigit, islower, isspace, istitle, isupper, join, ljust, lower, lstrip, partition, replace, rfind, rindex, rjust, rpartition, rsplit, rstrip, split, splitlines, startswith, strip, swapcase, title, translate, upper, zfill

Inherited from str (private): _formatter_field_name_split, _formatter_parser

Inherited from object: __delattr__, __init__, __reduce__, __reduce_ex__, __setattr__, __subclasshook__

Class Methods

from_bool(cls, bools)

Static Methods

__new__(cls, strdata)
Initialize from binary string data.
Returns: a new object with type S, a subtype of T

Properties

Inherited from object: __class__

Method Details

__new__(cls, strdata)
Static Method

Initialize from binary string data.

Returns:
a new object with type S, a subtype of T

Overrides: object.__new__

__repr__(self)
(Representation operator)

repr(x)

Overrides: object.__repr__
(inherited documentation)

contains(self, other)

Check whether this _BitString (bitstr1) contains another (bitstr2).

That is, bitstr2.index_one() is a subset of bitstr1.index_one().

Examples:
    "011011" contains "011000", "011001", "000011"

Note that "011011" also contains "000000". In fact, every _BitString
object contains the all-zero _BitString of the same length.
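
The subset definition can be checked directly with Python sets. The sketch below restates the definition on plain strings and is not the library's implementation. The functions `index_one` and `contains` are standalone, hypothetical helpers that mirror the method names.

```python
def index_one(bits):
    """Return the positions holding '1'."""
    return [i for i, ch in enumerate(bits) if ch == "1"]


def contains(bits1, bits2):
    """True if every '1' position of bits2 is also a '1' position of bits1."""
    return set(index_one(bits2)) <= set(index_one(bits1))


results = {
    other: contains("011011", other)
    for other in ("011000", "011001", "000011", "000000", "100000")
}
print(results)
```

The all-zero string has an empty set of '1' positions, and the empty set is a subset of every set, which is why it is always contained.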

independent(self, other)

Check whether this _BitString (bitstr1) is independent of another (bitstr2).

That is, bitstr1.index_one() and bitstr2.index_one() have no
intersection.

Note that every _BitString object is independent of the all-zero
_BitString of the same length.
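
A minimal sketch of the "no intersection" test on plain strings follows. The function `independent` here is a standalone, hypothetical helper and is not claimed to match the library's implementation.

```python
def independent(bits1, bits2):
    """True if no position holds '1' in both strings."""
    return not any(a == "1" and b == "1" for a, b in zip(bits1, bits2))


disjoint = independent("11100", "00011")     # (A, B, C) vs (D, E)
overlapping = independent("11100", "01101")  # both mark positions 1 and 2
with_zero = independent("11100", "00000")    # all-zero string shares nothing
print(disjoint, overlapping, with_zero)
```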

iscompatible(self, other)

Check whether this _BitString (bitstr1) is compatible with another (bitstr2).

Two bit strings are compatible if either condition holds:

1. bitstr1.contains(bitstr2), or vice versa;
2. bitstr1.independent(bitstr2).
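
Applied to the two example trees in the class description, this rule explains why (A, B) and (B, C) cannot both appear in one tree: they share B, but neither contains the other. The sketch below restates the two conditions on plain strings. The names `ones` and `iscompatible` are standalone, hypothetical helpers, not the library's code.

```python
def ones(bits):
    """Return the set of positions holding '1'."""
    return {i for i, ch in enumerate(bits) if ch == "1"}


def iscompatible(bits1, bits2):
    """True if one set of '1' positions contains the other, or they are disjoint."""
    a, b = ones(bits1), ones(bits2)
    return a >= b or b >= a or a.isdisjoint(b)


abc, de, ab, bc = "11100", "00011", "11000", "01100"
pairs = {
    ("ABC", "AB"): iscompatible(abc, ab),  # nested: ABC contains AB
    ("ABC", "DE"): iscompatible(abc, de),  # disjoint: no shared terminal
    ("AB", "BC"): iscompatible(ab, bc),    # overlap without nesting
}
print(pairs)
```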