Module Nexus

Nexus class. Parse the contents of a NEXUS file.

Based on Maddison, Swofford, and Maddison (1997), 'NEXUS: An extensible file format for systematic information', Syst. Biol. 46(4):590-621.

Classes
  NexusError
  CharBuffer
      Helps read NEXUS words and characters from a buffer (semi-private).
  StepMatrix
      Calculate a stepmatrix for weighted parsimony.
  Commandline
      Represent a commandline as a command and its options.
  Block
      Represent a NEXUS block with a block name and a list of commandlines.
  Nexus
Functions
 
safename(name, mrbayes=False)
Return a taxon identifier according to NEXUS standard.

quotestrip(word)
Remove single and/or double quotes around identifiers.

get_start_end(sequence, skiplist=['-', '?'])
Return the positions of the first and last characters that are not in skiplist.

_sort_keys_by_values(p)
Return a list of the keys of p, sorted by the values of p.

_make_unique(l)
Check that all values in the list are unique and return a pruned and sorted list.

_unique_label(previous_labels, label)
Return a unique name if label is already in previous_labels.

_seqmatrix2strmatrix(matrix)
Convert a Seq-object matrix to a plain sequence-string matrix.

_compact4nexus(orig_list)
Transform [1 2 3 5 6 7 8 12 15 18 20] (baseindex 0, used in the Nexus class) into '2-4 6-9 13-19\3 21' (baseindex 1, used in programs like PAUP or MrBayes).

combine(matrices)
Combine matrices in [(name,nexus-instance),...] and return a new nexus instance.

_kill_comments_and_break_lines(text)
Delete []-delimited comments out of a file and break it into lines separated by ';'.

_adjust_lines(lines)
Adjust linebreaks to match ';' and strip leading/trailing whitespace.

_replace_parenthesized_ambigs(seq, rev_ambig_values)
Replace ambiguities in xxx(ACG)xxx format with the IUPAC ambiguity code.
 
_get_command_lines(file_contents)
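
The index compaction summarized for _compact4nexus can be illustrated with a short sketch. This is not the library's implementation: the function name compact_indices_sketch, the greedy run detection, and the choice to write a pair of consecutive indices as a range are all assumptions made for illustration. It converts zero-based indices into one-based ranges, adding a backslash-step suffix for regularly spaced runs.

```python
def compact_indices_sketch(indices):
    """Hypothetical helper: zero-based indices -> one-based range string."""
    values = sorted(set(indices))
    parts = []
    i = 0
    while i < len(values):
        j = i
        step = 1
        if i + 1 < len(values):
            # Greedily extend a run that keeps the same spacing.
            step = values[i + 1] - values[i]
            j = i + 1
            while j + 1 < len(values) and values[j + 1] - values[j] == step:
                j += 1
        run_length = j - i + 1
        # Assumption: stepped ranges need three members; a consecutive pair
        # is still written as a range.
        if run_length >= 3 or (run_length == 2 and step == 1):
            suffix = "" if step == 1 else "\\%d" % step
            parts.append("%d-%d%s" % (values[i] + 1, values[j] + 1, suffix))
            i = j + 1
        else:
            parts.append(str(values[i] + 1))
            i += 1
    return " ".join(parts)


print(compact_indices_sketch([1, 2, 3, 5, 6, 7, 8, 12, 15, 18, 20]))
```

When a run is too short to form a range, only its first member is emitted and scanning resumes at the next index, so a later, longer run is not missed.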
Variables
  INTERLEAVE = 70
  SPECIAL_COMMANDS = ['charstatelabels', 'charlabels', 'taxlabel...
  KNOWN_NEXUS_BLOCKS = ['trees', 'data', 'characters', 'taxa', '...
  PUNCTUATION = '()[]{}/\\,;:=*\'"`+-<>'
  MRBAYESSAFE = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTU...
  WHITESPACE = ' \t\n'
  SPECIALCOMMENTS = ['&']
  CHARSET = 'chars'
  TAXSET = 'taxa'
  CODONPOSITIONS = 'codonpositions'
  DEFAULTNEXUS = '#NEXUS\nbegin data; dimensions ntax=0 nchar=0;...
  __package__ = 'Bio.Nexus'
Function Details

safename(name, mrbayes=False)

Return a taxon identifier according to NEXUS standard.

Wrap quotes around names with punctuation or whitespace, and double single quotes.

mrbayes=True: write names without quotes, whitespace or punctuation for the mrbayes software package.
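
The quoting rule described above can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the module's source: the name safename_sketch is hypothetical, the constants are copied from the Variables listing on this page, and replacing spaces with underscores in MrBayes mode is an assumption.

```python
# Constants as listed under Variables on this page.
PUNCTUATION = "()[]{}/\\,;:=*'\"`+-<>"
WHITESPACE = " \t\n"
MRBAYESSAFE = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890_"


def safename_sketch(name, mrbayes=False):
    """Hypothetical sketch of the documented quoting behaviour."""
    if mrbayes:
        # Assumption: spaces become underscores before unsafe characters
        # are dropped.
        candidate = name.replace(" ", "_")
        return "".join(c for c in candidate if c in MRBAYESSAFE)
    # Double any single quotes, then wrap the name in quotes if it contains
    # punctuation or whitespace.
    doubled = name.replace("'", "''")
    if any(c in WHITESPACE + PUNCTUATION for c in doubled):
        return "'" + doubled + "'"
    return doubled


print(safename_sketch("Homo sapiens"))
print(safename_sketch("Homo sapiens (1)", mrbayes=True))
```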

get_start_end(sequence, skiplist=['-', '?'])

Return the positions of the first and last characters that are not in skiplist.

skiplist defaults to ['-', '?'].
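
A minimal sketch of this scan, assuming zero-based positions and assuming that (-1, -1) is returned when every character is in the skiplist (this page does not state the return value for that case). The name get_start_end_sketch is hypothetical.

```python
def get_start_end_sketch(sequence, skiplist=("-", "?")):
    """Hypothetical sketch: first and last positions not in skiplist."""
    positions = [i for i, ch in enumerate(sequence) if ch not in skiplist]
    if not positions:
        # Assumption: sentinel for a sequence made only of skipped characters.
        return (-1, -1)
    return (positions[0], positions[-1])


print(get_start_end_sketch("--ACG?T--"))
```

A skipped character inside the sequence, such as the '?' above, does not affect the result; only the leading and trailing runs are excluded.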

combine(matrices)

Combine matrices in [(name,nexus-instance),...] and return a new nexus instance.

combined_matrix=combine([(name1,nexus_instance1),(name2,nexus_instance2),...])

Character sets, character partitions and taxon sets are prefixed, readjusted and included in the combined matrix.
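
The idea of concatenating named matrices while tracking where each one lands can be sketched with plain dictionaries standing in for Nexus instances. Everything here is an assumption for illustration: the name combine_sketch, the use of dicts mapping taxon to sequence string, the padding of absent taxa with '?', and the zero-based character sets keyed by matrix name.

```python
def combine_sketch(matrices, missing="?"):
    """Hypothetical sketch: concatenate [(name, {taxon: sequence}), ...]."""
    combined = {}
    charsets = {}
    total = 0  # number of characters concatenated so far
    for name, matrix in matrices:
        lengths = {len(seq) for seq in matrix.values()}
        if len(lengths) != 1:
            raise ValueError("matrix %r needs sequences of equal length" % name)
        nchar = lengths.pop()
        # Assumption: taxa absent from one matrix are padded with `missing`.
        for taxon in set(combined) | set(matrix):
            previous = combined.get(taxon, missing * total)
            combined[taxon] = previous + matrix.get(taxon, missing * nchar)
        # Record which zero-based columns came from this matrix.
        charsets[name] = list(range(total, total + nchar))
        total += nchar
    return combined, charsets


alignment, sets = combine_sketch([
    ("geneA", {"t1": "ACGT", "t2": "AC-T"}),
    ("geneB", {"t2": "GG", "t3": "TT"}),
])
print(alignment)
print(sets)
```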

_kill_comments_and_break_lines(text)

Delete []-delimited comments out of a file and break it into lines separated by ';'.

stripped_text=_kill_comments_and_break_lines(text)

Nested and multiline comments are allowed. [ and ] symbols within single or double quotes are ignored, a newline ends a quote, and all symbols within quotes are treated the same (so quoting has no effect inside comments: in [this character ']' ends a comment], the quoted ] closes the comment). Special [&...] and [\...] comments remain untouched unless they are inside a standard comment. Quotes inside special [& and [\ comments are treated as normal characters, but nesting inside these special comments is not allowed (as in [& [ ]]). A ';' is deleted from the end of each line.

NOTE: this function is very slow for large files, and is obsolete when the C extension cnexus is used.
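
A simplified sketch of this kind of comment stripping is shown below. It is not the module's implementation and makes several assumptions: the name strip_comments_sketch is hypothetical, quote handling is omitted entirely, only [&...] is treated as a special comment, whitespace inside each command is collapsed, and empty commands are dropped.

```python
def strip_comments_sketch(text):
    """Hypothetical sketch: drop nested [] comments, split commands on ';'."""
    commands = []
    current = []
    depth = 0           # nesting depth of ordinary comments
    in_special = False  # inside a [&...] comment, which is kept verbatim

    def flush():
        # Collapse whitespace (a simplification) and skip empty commands.
        command = " ".join("".join(current).split())
        if command:
            commands.append(command)
        del current[:]

    for i, ch in enumerate(text):
        if in_special:
            current.append(ch)
            if ch == "]":
                in_special = False
        elif depth > 0:
            if ch == "[":
                depth += 1
            elif ch == "]":
                depth -= 1
        elif ch == "[":
            if text[i + 1:i + 2] == "&":
                in_special = True
                current.append(ch)
            else:
                depth = 1
        elif ch == ";":
            flush()
        else:
            current.append(ch)
    flush()
    return commands


sample = "begin trees; [a [nested] comment]\ntree t1 = [&R] (A,B); end;"
print(strip_comments_sketch(sample))
```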

_adjust_lines(lines)

Adjust linebreaks to match ';' and strip leading/trailing whitespace.

list_of_commandlines=_adjust_lines(input_text)

Lines are adjusted so that no linebreaks occur within a commandline (except within the matrix commandline).


Variables Details

SPECIAL_COMMANDS

Value:
['charstatelabels',
 'charlabels',
 'taxlabels',
 'taxset',
 'charset',
 'charpartition',
 'taxpartition',
 'matrix',
...

KNOWN_NEXUS_BLOCKS

Value:
['trees', 'data', 'characters', 'taxa', 'sets', 'codons']

MRBAYESSAFE

Value:
'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890_'

DEFAULTNEXUS

Value:
'''#NEXUS
begin data; dimensions ntax=0 nchar=0; format datatype=dna; end; '''