Package nltk_lite :: Package contrib :: Package toolbox :: Module data :: Class ToolboxData

Class ToolboxData

                    object --+        
                             |        
corpora.toolbox.StandardFormat --+    
                                 |    
       corpora.toolbox.ToolboxData --+
                                     |
                                    ToolboxData

Instance Methods

  • __init__(self)

  • _tree2etree(self, parent, no_blanks)

  • chunk_parse(self, grammar, no_blanks=True, incomplete='record', **kwargs)
    Returns an element tree structure corresponding to a toolbox data file parsed according to the chunk grammar.
    Returns: string

  • _make_parse_table(self, grammar)
    Return parsing state information used by tree_parser.

  • grammar_parse(self, startsym, grammar, no_blanks=True, **kwargs)
    Returns an element tree structure corresponding to a toolbox data file parsed according to the grammar.
    Returns: ElementTree._ElementInterface

Inherited from corpora.toolbox.ToolboxData: parse

Inherited from corpora.toolbox.ToolboxData (private): _record_parse

Inherited from corpora.toolbox.StandardFormat: close, fields, open, open_string, raw_fields

Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __str__

Properties

Inherited from object: __class__

Method Details

__init__(self)
(Constructor)

Overrides: corpora.toolbox.ToolboxData.__init__

chunk_parse(self, grammar, no_blanks=True, incomplete='record', **kwargs)

Returns an element tree structure corresponding to a toolbox data file parsed according to the chunk grammar.

Parameters:
  • grammar (string) - The chunking rules used to parse the database. See chunk.RegExp for documentation.
  • no_blanks (boolean) - If true, blank fields that are not important to the structure are deleted.
  • incomplete - Name of the element used when the parse does not result in a single top-level element.
  • kwargs (keyword arguments dictionary) - Keyword arguments passed to toolbox.StandardFormat.fields().
Returns: string
Contents of toolbox data parsed according to the rules in grammar

grammar_parse(self, startsym, grammar, no_blanks=True, **kwargs)

Returns an element tree structure corresponding to a toolbox data file parsed according to the grammar.

Parameters:
  • startsym (string) - Start symbol of the grammar.
  • grammar (dictionary of tuples of tuples) - The set of rewrite rules used to parse the database. See the description below.
  • no_blanks (boolean) - If true, blank fields that are not important to the structure are deleted.
  • kwargs (keyword arguments dictionary) - Keyword arguments passed to toolbox.StandardFormat.fields().
Returns: ElementTree._ElementInterface
Contents of toolbox data parsed according to rules in grammar

The rewrite rules in the grammar resemble those commonly used for computer languages, except that the usual ordering constraints are relaxed, because toolbox databases seldom order their fields consistently. The right side of each rule is therefore a tuple with two parts. The fields in the first part mark the start of a nonterminal. Each of them can occur only once, and all of them must occur before any of the fields in the second part of that nonterminal; otherwise they are interpreted as marking the start of another instance of the same nonterminal. If the first part lists more than one field, they do not all need to appear in a parse. The fields in the second part of the tuple can occur in any order.

Sample grammar:

   grammar = {
       'toolbox':  (('_sh',),      ('_DateStampHasFourDigitYear', 'entry')),
       'entry':    (('lx',),       ('hm', 'sense', 'dt')),
       'sense':    (('sn', 'ps'),  ('pn', 'gv', 'dv',
                                    'gn', 'gp', 'dn', 'rn',
                                    'ge', 'de', 're',
                                    'example', 'lexfunc')),
       'example':  (('rf', 'xv',), ('xn', 'xe')),
       'lexfunc':  (('lf',),       ('lexvalue',)),
       'lexvalue': (('lv',),       ('ln', 'le')),
   }
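
The following is a minimal, self-contained sketch of how the rule semantics described above could group a flat sequence of fields into a nested tree. It is not the implementation used by grammar_parse: the helper names (can_start, parse_symbol, sketch_parse), the 'root' wrapper element, and the sample field values are all hypothetical. It also assumes that the first part of every rule contains only field markers, as in the sample grammar, and it simply skips fields that no rule accounts for.

```python
from xml.etree import ElementTree as ET

# A reduced version of the sample grammar: nonterminal -> (start fields, other items).
grammar = {
    'toolbox': (('_sh',), ('_DateStampHasFourDigitYear', 'entry')),
    'entry':   (('lx',), ('hm', 'sense', 'dt')),
    'sense':   (('sn', 'ps'), ('ge', 'de', 're')),
}


def can_start(symbol, marker, grammar):
    """Return True if a field with this marker can begin `symbol`."""
    if symbol in grammar:
        # Assumption: the first part of a rule holds field markers only.
        return marker in grammar[symbol][0]
    return symbol == marker  # a terminal is just a field marker


def parse_symbol(symbol, fields, pos, grammar, parent):
    """Build one instance of `symbol` under `parent`; return the next position."""
    first, second = grammar[symbol]
    node = ET.SubElement(parent, symbol)
    seen_first = set()
    in_second = False  # becomes True once any second-part item is consumed
    while pos < len(fields):
        marker, value = fields[pos]
        if not in_second and marker in first and marker not in seen_first:
            # A start field, seen for the first time and before the second part.
            seen_first.add(marker)
            ET.SubElement(node, marker).text = value
            pos += 1
            continue
        if marker in first:
            # Repeated or late start field: another instance of `symbol` begins.
            break
        child = next((s for s in second if can_start(s, marker, grammar)), None)
        if child is None:
            break  # the field belongs to an enclosing nonterminal
        in_second = True
        if child in grammar:
            pos = parse_symbol(child, fields, pos, grammar, node)
        else:
            ET.SubElement(node, marker).text = value
            pos += 1
    return pos


def sketch_parse(fields, startsym, grammar):
    """Group (marker, value) pairs under a hypothetical 'root' wrapper element."""
    root = ET.Element('root')
    pos = 0
    while pos < len(fields):
        if can_start(startsym, fields[pos][0], grammar):
            pos = parse_symbol(startsym, fields, pos, grammar, root)
        else:
            pos += 1  # skip fields that no rule accounts for
    return root


# Made-up field values, in the (marker, value) shape of a toolbox record.
fields = [
    ('_sh', 'v3.0 400 MDF'),
    ('lx', 'kaa'), ('ps', 'N'), ('ge', 'gag'),
    ('lx', 'kaakaaro'), ('ps', 'N'), ('ge', 'mixture'),
    ('ps', 'V'), ('ge', 'mix'),
]
root = sketch_parse(fields, 'toolbox', grammar)
print(ET.tostring(root, encoding='unicode'))
```

In this sketch the second 'lx' closes the first entry and opens a new one, and the second 'ps' inside 'kaakaaro' opens a second sense, because a start field that reappears after second-part fields is read as the start of another instance of the same nonterminal.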