[Gd-chatter] r11726 - in trunk/libraries: registry/generic utilities/peg-parser

agent at gwydiondylan.org agent at gwydiondylan.org
Sun Mar 2 02:07:57 CET 2008


Author: agent
Date: Sun Mar  2 02:07:56 2008
New Revision: 11726

Added:
   trunk/libraries/registry/generic/peg-parser   (contents, props changed)
   trunk/libraries/utilities/peg-parser/
   trunk/libraries/utilities/peg-parser/Makefile   (contents, props changed)
   trunk/libraries/utilities/peg-parser/README.txt   (contents, props changed)
   trunk/libraries/utilities/peg-parser/library.dylan   (contents, props changed)
   trunk/libraries/utilities/peg-parser/parser-definers.dylan   (contents, props changed)
   trunk/libraries/utilities/peg-parser/parser-rules.dylan   (contents, props changed)
   trunk/libraries/utilities/peg-parser/parser-support.dylan   (contents, props changed)
   trunk/libraries/utilities/peg-parser/peg-parser.lid   (contents, props changed)
Log:
Job: minor
Added PEG recursive descent parser.


Added: trunk/libraries/registry/generic/peg-parser
==============================================================================
--- (empty file)
+++ trunk/libraries/registry/generic/peg-parser	Sun Mar  2 02:07:56 2008
@@ -0,0 +1 @@
+abstract://dylan/utilities/peg-parser/peg-parser.lid

Added: trunk/libraries/utilities/peg-parser/Makefile
==============================================================================
--- (empty file)
+++ trunk/libraries/utilities/peg-parser/Makefile	Sun Mar  2 02:07:56 2008
@@ -0,0 +1,7 @@
+peg-parser: peg-parser.lid library.dylan \
+            parser-definers.dylan parser-rules.dylan parser-support.dylan
+	d2c peg-parser.lid
+
+clean:
+	-rm -f *.o *.s *.a *.c *.mak *~
+	-rm -rf .libs

Added: trunk/libraries/utilities/peg-parser/README.txt
==============================================================================
--- (empty file)
+++ trunk/libraries/utilities/peg-parser/README.txt	Sun Mar  2 02:07:56 2008
@@ -0,0 +1,54 @@
+This is a recursive descent parser that handles parsing expression grammars
+(PEGs), as described at
+http://en.wikipedia.org/wiki/Parsing_expression_grammar.
+
+It isn't too tricky to use. Basically, a PEG can act as both a lexical and
+phrase grammar, and acts similarly to a forward-looking greedy regular
+expression processor, except that it acts on tokens instead of characters.
+
+This library allows for simple rules-based declarations and also custom parser
+functions (you'll need at least one of these to get characters and literals
+from the stream).
+
+The parser can simplify and consolidate syntax elements in an upwards
+direction along the syntax tree, and you can define a context class to do
+something similar to a true attributed grammar.
+
+-- Dustin Voss
+
+
+-----------------
+THINGS TO IMPROVE
+-----------------
+
+* Give it a more rigorous attributed grammar.
+
+* Implement caching. The cache could consist of a data structure similar to:
+    
+    Character
+    position   Cached parse results
+
+    130        for "verb" + context A: no match
+               for "verb" + context B: no match
+               for "article" + context C: matched, result "the"
+               for "article" + context B: no match
+               for "sentence" + context C: matched, result "the dog bit me"
+               
+    131        ...
+
+  Since the parser never looks backward, as the parser consumes tokens, old
+  data could be discarded. A look-ahead counter would have to be maintained,
+  as look-ahead *pretends* to consume tokens and old data shouldn't be
+  discarded while looking ahead. But what the look-ahead sees can stay in the
+  cache after the look-ahead finishes.
+
+  I expect a <deque> plus *cache-start-position* plus *lookahead-depth* would
+  do the trick. Each <deque> element would be a <table> where the keys are a
+  combination of a token name and a context and the values would be the
+  product of that token's parser (if it matches). The *lookahead-depth* would
+  be incremented then decremented by the req-next and not-next parsers.
+  Cache-hit testing would happen in the parse-xxx functions, and perhaps the
+  updating of *cache-start-position* and discarding of old data would too.
+
+* Soft error recovery. Right now, the parser just gives up at the first thing
+  it can't match. I'm not sure what it should do, but it doesn't do it.

Added: trunk/libraries/utilities/peg-parser/library.dylan
==============================================================================
--- (empty file)
+++ trunk/libraries/utilities/peg-parser/library.dylan	Sun Mar  2 02:07:56 2008
@@ -0,0 +1,24 @@
+module: dylan-user
+
+define library peg-parser
+   use common-dylan;
+   use io;
+   export peg-parser;
+end library;
+
+define module peg-parser
+   // from common-dylan
+   use dylan;
+   use common-extensions, exclude: {format-to-string};
+   // from io
+   use streams;
+   use format;
+   use standard-io;
+
+   export
+      parser-definer, parser-method-definer,
+      <parse-failure>, parse-expected, failure-position,
+      <token>, parse-start, parse-end,
+      seq, choice, many, opt, opt-seq, opt-choice, opt-many, req-next, not-next,
+      collect-subelements, *parser-trace*
+end module;

Added: trunk/libraries/utilities/peg-parser/parser-definers.dylan
==============================================================================
--- (empty file)
+++ trunk/libraries/utilities/peg-parser/parser-definers.dylan	Sun Mar  2 02:07:56 2008
@@ -0,0 +1,282 @@
+module: peg-parser
+synopsis: PEG parser macro definitions.
+
+
+/// SYNOPSIS: Defines an arbitrary 'rule parser'.
+/// DISCUSSION: This macro defines a rule parser that includes support for
+/// debugging and other features described for rule parsers.
+define macro parser-method-definer
+   {
+      define parser-method ?token:name
+      (?stream:name, ?context:name :: ?type:name)
+      => (?results:*)
+         ?:body
+      end
+   } => {
+      define function "parse-" ## ?token
+      (?stream :: <positionable-stream>, ?context :: ?type)
+      => (?results)
+         indent-trace();
+         format-trace(?"token" ## "...");
+         let pos = ?stream.stream-position;
+         block()
+            ?body
+         afterwards
+            format-trace(?"token" ## " matched chars %x-%x",
+                   pos, ?stream.stream-position);
+         cleanup
+            outdent-trace();
+         exception (err :: <parse-failure>)
+            ?stream.stream-position := pos;
+            indent-trace();
+            format-trace(?"token" ## " no match, exp. %s at char %x",
+                         err.parse-expected, err.failure-position);
+            outdent-trace();
+            error(err)
+         end;
+      end function;
+
+      *rule-names*["parse-" ## ?token] := ?"token";
+      *rule-name-parts*["parse-" ## ?token] := #( ?"token" );
+   }
+end macro;
+
+
+/// SYNOPSIS: Defines a 'rule parser' and perhaps a token class for a given
+/// token.
+///
+/// The macro takes three forms. A form like
+/// [code]
+/// define parser t (<c>)
+///   rule many(t2) => tokens;
+///   slot content = tokens[1];
+/// end parser;
+/// [end code]
+/// defines a rule parser named `parse-t` and a token class named `<t-token>`
+/// which inherits from `<c>` (optional) and `<token>` [code qv]. `<t-token>` will
+/// have a slot named `content` that gets initialized to the expression `tokens[1]`,
+/// where `tokens` is the product of the rule `many(t2)`.
+/// 
+/// A form like 
+/// [code]
+/// define parser t
+///   rule many(t2) => tokens;
+///   yield tokens[1];
+/// end parser;
+/// [end code]
+/// defines a rule parser that returns `tokens[1]` directly, without defining
+/// a `<t-token>` class.
+///
+/// A form like
+/// [code]
+/// define parser t
+///   rule many(t2)
+/// end parser;
+/// [end code]
+/// defines a rule parser that return `#"t"`.
+define macro parser-definer
+
+   //
+   // This form creates a parser that return an initialized <token> class.
+   {
+      define parser ?token-name:name (?supers)
+         rule ?rule => ?product-name:name;
+         ?class-slots
+      end
+   } => {
+      // Define the class.
+      class-specifier(?token-name; ?supers; ?class-slots);
+      
+      // Initialize the class's slots based on results of rule.
+      define method initialize (?token-name :: "<" ## ?token-name ## "-token>",
+            #next next-method, #key ?product-name = unsupplied())
+         next-method();
+         if (supplied?(?product-name))
+            slot-initializers(?token-name; ?class-slots)
+         end if;
+      end method;
+      
+      // Define the parser rule as the result of all the 'seq' etc. functions.
+      define constant ?token-name ## "-parser-rule" = ?rule;
+      
+      // Define the parser function including tracing and rollback. Result is
+      // <token> subclass, slots initialized by 'initialize' function above.
+      define function "parse-" ## ?token-name
+         (stream :: <positionable-stream>, context)
+      => (token :: "<" ## ?token-name ## "-token>")
+         indent-trace();
+         format-trace(?"token-name" ## "...");
+         let pos = stream.stream-position;
+         let production =
+               block()
+                  ?token-name ## "-parser-rule" (stream, context)
+               afterwards
+                  format-trace(?"token-name" ## " matched chars %x-%x",
+                               pos, stream.stream-position);
+               cleanup
+                  outdent-trace();
+               exception (err :: <parse-failure>)
+                  err.parse-expected :=
+                        concatenate!(err.parse-expected, " in " ## ?"token-name");
+                  indent-trace();
+                  format-trace(?"token-name" ## " no match, exp. %s at char %x",
+                               err.parse-expected, err.failure-position);
+                  outdent-trace();
+                  error(err)
+               end;
+         make("<" ## ?token-name ## "-token>", 
+              start: pos, end: stream.stream-position, ?product-name: production)
+      end function;
+      
+      // Names for the parser.
+      *rule-names*["parse-" ## ?token-name] := "?token-name";
+      *rule-name-parts*["parse-" ## ?token-name] := #( ?"token-name" );
+   }
+
+   //
+   // This form creates a parser that returns the result of an expression.
+   {
+      define parser ?token-name:name
+         rule ?rule => ?product-name:name;
+         yield ?:expression;
+      end
+   } => {
+      // Define the parser rule as the result of all the 'seq' etc. functions.
+      define constant ?token-name ## "-parser-rule" = ?rule;
+      
+      // Define the parser function including tracing and rollback. Result is
+      // yield expression.
+      define function "parse-" ## ?token-name
+            (stream :: <positionable-stream>, context) => (token)
+         indent-trace();
+         format-trace(?"token-name" ## "...");
+         let pos = stream.stream-position;
+         let ?product-name =
+               block()
+                  ?token-name ## "-parser-rule" (stream, context)
+               afterwards
+                  format-trace(?"token-name" ## " matched chars %x-%x",
+                               pos, stream.stream-position);
+               cleanup
+                  outdent-trace();
+               exception (err :: <parse-failure>)
+                  err.parse-expected :=
+                        concatenate!(err.parse-expected, " in " ## ?"token-name");
+                  indent-trace();
+                  format-trace(?"token-name" ## " no match, exp. %s at char %x",
+                         err.parse-expected, err.failure-position);
+                  outdent-trace();
+                  error(err)
+               end;
+         ?expression
+      end function;
+      
+      // Names for parser.
+      *rule-names*["parse-" ## ?token-name] := "?token-name";
+      *rule-name-parts*["parse-" ## ?token-name] := #( ?"token-name" );
+   }
+
+   //
+   // This form creates a parser that returns a symbol.
+   {
+      define parser ?token-name:name
+         rule ?rule;
+      end
+   } => {
+      // Define the parser rule as the result of all the 'seq' etc. functions.
+      define constant ?token-name ## "-parser-rule" = ?rule;
+      
+      // Define the parser function including tracing and rollback. Result is
+      // a symbol, same as token name.
+      define function "parse-" ## ?token-name
+            (stream :: <positionable-stream>, context) => (token :: <symbol>)
+         indent-trace();
+         format-trace(?"token-name" ## "...");
+         let pos = stream.stream-position;
+         block()
+            ?token-name ## "-parser-rule" (stream, context)
+         afterwards
+            format-trace(?"token-name" ## " matched chars %x-%x",
+                         pos, stream.stream-position);
+         cleanup
+            outdent-trace();
+         exception (err :: <parse-failure>)
+            err.parse-expected :=
+                  concatenate!(err.parse-expected, " in " ## ?"token-name");
+            format-trace("  " ## ?"token-name" ## " no match, exp. %s at %x",
+                         err.parse-expected, err.failure-position);
+            error(err)
+         end;
+         ?#"token-name"
+      end function;
+      
+      // Names for parser.
+      *rule-names*["parse-" ## ?token-name] := ?"token-name";
+      *rule-name-parts*["parse-" ## ?token-name] := #( ?"token-name" );
+   }
+   
+// Only allow names, the 'seq' etc. functions, or (as a fallback) any token.
+// Names are assumed to be tokens, and changed to the tokens' parser function.
+// 'seq' etc. takes those parser functions and generates a new function from
+// them.
+rule:
+   { seq(?nested-rule) ... } => { seq(?nested-rule) ... }
+   { choice(?nested-rule) ... } => { choice(?nested-rule) ... }
+   { many(?nested-rule) ... } => { many(?nested-rule) ... }
+   { opt(?nested-rule) ... } => { opt(?nested-rule) ... }
+   { opt-seq(?nested-rule) ... } => { opt-seq(?nested-rule) ... }
+   { opt-choice(?nested-rule) ... } => { opt-choice(?nested-rule) ... }
+   { opt-many(?nested-rule) ... } => { opt-many(?nested-rule) ... }
+   { req-next(?nested-rule) ... } => { req-next(?nested-rule) ... }
+   { not-next(?nested-rule) ... } => { not-next(?nested-rule) ... }
+   { ?:name ... } => { "parse-" ## ?name ... }
+   { ?:token ... } => { ?token ... }
+   { } => { }
+nested-rule:
+   { ?rule } => { ?rule }
+
+// Make sure token classes at least inherit from <token>.
+supers:
+   { ?:name, ... } => { ?name, ... }
+   { } => { <token> }
+
+// No transformation; simply describe/enforce syntax.
+class-slots:
+   { slot ?:variable = ?:expression; ... } => { slot ?variable = ?expression; ... }
+   { slot ?:variable; ... } => { slot ?variable; ... }
+   { } => { }
+end macro;
+
+
+// This auxiliary macro turns slot clauses into a class declaration. It can't
+// just do the slot part because `define class` can't have a macro call inside.
+define macro class-specifier
+   {
+      class-specifier (?:name; ?supers:*; ?class-slots)
+   } => {
+      define class "<" ## ?name ## "-token>" (?supers)
+         ?class-slots;
+      end class;
+   }
+class-slots:
+   { slot ?:variable = ?:expression; ... } => { slot ?variable; ... }
+   { slot ?:variable; ... } => { slot ?variable; ... }
+   { } => { }
+end macro;
+
+
+// This auxiliary macro turns slot clauses into field initializations. This is
+// done in the 'initialize' function for the token class so that we have access
+// to the parse product as declared.
+define macro slot-initializers
+   { slot-initializers (?token:name;
+                        slot ?slot-name:name :: ?slot-type:expression = ?:expression;
+                        ?more:*) }
+      => { ?token.?slot-name := ?expression; slot-initializers(?token; ?more) }
+   { slot-initializers (?token:name;
+                        slot ?slot-name:name :: ?slot-type:expression;
+                        ?more:*) }
+      => { slot-initializers(?token; ?more) }
+   { slot-initializers (?token:name) }
+      => { }
+end macro;

Added: trunk/libraries/utilities/peg-parser/parser-rules.dylan
==============================================================================
--- (empty file)
+++ trunk/libraries/utilities/peg-parser/parser-rules.dylan	Sun Mar  2 02:07:56 2008
@@ -0,0 +1,252 @@
+module: peg-parser
+synopsis: Discussion and implementation of PEG parser rules, as described at
+          http://en.wikipedia.org/wiki/Parsing_expression_grammar
+
+          
+/// FUNCTION: rule parser
+/// ---------------------
+/// SYNOPSIS: A function that partially parses a stream according to a rule.
+///
+/// Rule parsers are created by the 'seq', 'choice', 'many', 'opt', 'opt-seq',
+/// 'opt-choice', opt-many', 'req-next', and 'not-next' functions in the
+/// 'parser-definer' macro. They may also be created manually for efficiency
+/// or to support special behaviors via the 'parser-method-definer' macro.
+///
+/// Rule parsers must be named "parse-something" to work with the 'parser-definer'
+/// macro. If the parser fails to match, it must signal a <parse-failure> error
+/// after rolling back the position of the stream (so that another parser may 
+/// be tried).
+///
+/// These things are done automatically (except for signalling <parse-failure>)
+/// when using 'parser-method-definer' or 'parser-definer' macros or the 'seq'
+/// etc. functions.
+///
+/// ARGUMENTS:
+///   stream   - An instance of <positionable-stream>.
+///   context  - A context object.
+/// VALUES:
+///   product  - An instance of <sequence>, #f, or some other value (usually
+///              an instance of <token>), depending on the parser's rule(s).
+///              If the parser fails to match, it signals <parse-failure>.
+
+
+/// SYNOPSIS: Builds a 'rule parser' matching a sequence of elements.
+/// ARGUMENTS:
+///   "#rest sub-rules" - A series of 'rule parser's, all of which must succeed
+///                       for the returned parser to succeed.
+/// VALUES:
+///   rule-parser - A 'rule parser' returning a <sequence>. The sequence will
+///                 contain the sub-rules' products.
+define function seq (#rest sub-rules) => (rule-parser :: <function>)
+   local method seq-parser (stream :: <positionable-stream>, context)
+   => (product :: <sequence>)
+      let pos = stream.stream-position;
+      let product = make(<vector>, size: sub-rules.size);
+      block()
+         for (rule in sub-rules, i from 0)
+            product[i] := rule(stream, context);
+         end for;
+      exception (err :: <parse-failure>)
+         stream.stream-position := pos;
+         error(err);
+      end block;
+      product
+   end method;
+   
+   let format-string = apply(concatenate, "all of", map(always(" %s"), sub-rules));
+   *rule-name-parts*[seq-parser] := add(as(<list>, sub-rules), format-string);
+   seq-parser
+end function;
+
+
+/// SYNOPSIS: Builds a 'rule parser' matching one of several elements.
+/// ARGUMENTS:
+///   "#rest sub-rules" - A series of 'rule parser's, the first of which to
+///                       succeed supplies the parser's product.
+/// VALUES:
+///   rule-parser - A 'rule parser' returning one of the sub-rules' products.
+define function choice (#rest sub-rules) => (rule-parser :: <function>)
+   local method choice-parser (stream :: <positionable-stream>, context)
+   => (product)
+      let pos = stream.stream-position;
+      let product = #f;
+      let found = #f;
+      for (rule in sub-rules, until: product)
+         block()
+            product := rule(stream, context);
+            found := #t;
+         exception (err :: <parse-failure>)
+            stream.stream-position := pos;
+         end block;
+      end for;
+      if (~found)
+         error(make(<parse-failure>, position: pos, expected: rule-name(choice-parser)));
+      end if;
+      product
+   end method;
+   
+   let format-string = apply(concatenate, "one of", map(always(" %s"), sub-rules));
+   *rule-name-parts*[choice-parser] := add(as(<list>, sub-rules), format-string);
+   choice-parser
+end function;
+
+
+/// SYNOPSIS: Builds a 'rule parser' matching one or more elements.
+/// ARGUMENTS:
+///   sub-rule - A 'rule parser'.
+/// VALUES:
+///   rule-parser - A 'rule parser' returning a <sequence> containing the
+///                 sub-rule's products. 
+define function many (sub-rule :: <function>) => (rule-parser :: <function>)
+   local method many-parser (stream :: <positionable-stream>, context)
+   => (product :: <sequence>)
+      let pos = stream.stream-position;
+      let product = make(<deque>);
+      block()
+         while (#t)
+            push-last(product, sub-rule(stream, context));
+         end while;
+      exception (err :: <parse-failure>)
+         // Only consider it a failure if product is empty.
+         if (product.empty?)
+            stream.stream-position := pos;
+            error(err)
+         end
+      end block;
+      product
+   end method;
+   
+   *rule-name-parts*[many-parser] := list("one or more of %s", sub-rule);
+   many-parser
+end function;
+
+
+/// SYNOPSIS: Builds a 'rule parser' matching zero or one element.
+/// ARGUMENTS:
+///   sub-rule - A 'rule parser'.
+/// VALUES:
+///   rule-parser - A 'rule parser' returning the sub-rule's product, or #f if
+///                 the element is not present.
+define function opt (sub-rule :: <function>) => (rule-parser :: <function>)
+   local method opt-parser (stream :: <positionable-stream>, context)
+   => (product :: false-or(<object>))
+      block()
+         sub-rule(stream, context)
+      exception (err :: <parse-failure>)
+         #f    // Do nothing; indicates item not present.
+      end block
+   end method;
+   
+   *rule-name-parts*[opt-parser] := list("optional %s", sub-rule);
+   opt-parser
+end function;
+
+
+/// SYNOPSIS: Builds a 'rule parser' matching zero or more elements.
+/// ARGUMENTS:
+///   sub-rule - A 'rule parser'.
+/// VALUES:
+///   rule-parser - A 'rule parser' returning a <sequence> containing the
+///                 sub-rule's products, or #f if the elements are not present. 
+define function opt-many (sub-rule :: <function>) => (rule-parser :: <function>)
+   local method opt-many-parser (stream :: <positionable-stream>, context)
+   => (product :: false-or(<sequence>))
+      let product = make(<deque>);
+      block()
+         while (#t)
+            push-last(product, sub-rule(stream, context));
+         end while;
+      exception (err :: <parse-failure>)
+         // Do nothing; indicates series has ended.
+      end block;
+      if (product.empty?) #f else product end if
+   end method;
+   
+   *rule-name-parts*[opt-many-parser] := list("zero or more of %s", sub-rule);
+   opt-many-parser
+end function;
+
+
+/// SYNOPSIS: Builds a 'rule parser' matching all elements or none of them.
+/// ARGUMENTS:
+///   "#rest sub-rules" - A series of 'rule parser's, all of which must match
+///                       for this parser to match.
+/// VALUES:
+///   rule-parser - A 'rule parser' returning #f or a <sequence> containing
+///                 all sub-rules' products.
+define function opt-seq (#rest sub-rules) => (rule-parser :: <function>)
+   let parser = opt(apply(seq, sub-rules));
+   let format-string = apply(concatenate, "optionally all of",
+                             map(always(" %s"), sub-rules));
+   *rule-name-parts*[parser] := add(as(<list>, sub-rules), format-string);
+   parser
+end function;
+
+
+/// SYNOPSIS: Builds a 'rule parser' matching one of the specified elements or
+/// none of them.
+/// ARGUMENTS:
+///   "#rest sub-rules" - A series of 'rule parser's.
+/// VALUES:
+///   rule-parser - A 'rule parser' returning #f or the product of the
+///                 matching rule.
+define function opt-choice (#rest sub-rules) => (rule-parser :: <function>)
+   let parser = opt(apply(choice, sub-rules));
+   let format-string = apply(concatenate, "optionally one of",
+                             map(always(" %s"), sub-rules));
+   *rule-name-parts*[parser] := add(as(<list>, sub-rules), format-string);
+   parser
+end function;
+
+
+/// SYNOPSIS: Builds a 'rule parser' that looks ahead to match the sub-rule
+/// without consuming any elements.
+/// ARGUMENTS:
+///   sub-rule - A 'rule parser'.
+/// VALUES:
+///   rule-parser - A 'rule parser' returning #f.
+define function req-next (sub-rule :: <function>) => (rule-parser :: <function>)
+   local method req-next-parser (stream :: <positionable-stream>, context)
+   => (product :: <boolean>)
+      let pos = stream.stream-position;
+      block()
+         // Don't change context; don't keep results.
+         sub-rule(stream, context);
+         // If fails, condition continues on.
+      cleanup
+         stream.stream-position := pos;
+      end block;
+      #f // Doesn't consume anything.
+   end method;
+   
+   *rule-name-parts*[req-next-parser] := list("expect %s", sub-rule);
+   req-next-parser
+end function;
+
+
+/// SYNOPSIS: Builds a 'rule parser' that looks ahead to ensure the sub-rule
+/// does not match, but does not consume any elements in doing so.
+/// ARGUMENTS:
+///   sub-rule - A 'rule parser'.
+/// VALUES:
+///   rule-parser - A 'rule parser' returning #f.
+define function not-next (sub-rule :: <function>) => (rule-parser :: <function>)
+   local method not-next-parser (stream :: <positionable-stream>, context)
+   => (product :: <boolean>)
+      let pos = stream.stream-position;
+      let failure =
+            block()
+               sub-rule(stream, context);
+               make(<parse-failure>, position: pos, expected: rule-name(not-next-parser));
+            cleanup
+               stream.stream-position := pos;
+            exception (err :: <parse-failure>)
+               #f // Indicates success.
+            end block;
+      if (failure) error(failure) end;
+      #f // Doesn't consume anything.
+   end method;
+   
+   *rule-name-parts*[not-next-parser] := list("not %s", sub-rule);
+   not-next-parser
+end function;

Added: trunk/libraries/utilities/peg-parser/parser-support.dylan
==============================================================================
--- (empty file)
+++ trunk/libraries/utilities/peg-parser/parser-support.dylan	Sun Mar  2 02:07:56 2008
@@ -0,0 +1,107 @@
+module: peg-parser
+synopsis: Miscellaneous exports, tracing, and internal support.
+
+/// SYNOPSIS: Names of parsers.
+/// DISCUSSION: This is a table of <function> to <string>. Each 'rule parser'
+/// should have a corresponding <string>. This string will be used in exceptions
+/// and logging. If no string is provided for a rule parser, defaults to "?".
+define constant *rule-names* = make(<table>);
+
+
+/// A table keyed by rule parser function and containing a format string and
+/// arguments. Each argument is a parser function. 'initialize-parser' finds
+/// the name of each parser argument and substitutes it into the format string.
+/// The result is the name of the keying parser function.
+define constant *rule-name-parts* = make(<table>);
+
+
+/// SYNOPSIS: Signalled when parsing fails. Non-recoverable.
+define class <parse-failure> (<warning>)
+   // "expected:" is required because parent tokens add themselves to
+   // parse-expected.
+   slot parse-expected :: <string>, required-init-keyword: #"expected";
+   constant slot failure-position = #f, init-keyword: #"position";
+end class;
+
+
+/// SYNOPSIS: A token, a class containing information parsed from a stream.
+/// DISCUSSION: Subclasses of <token> are created by 'parser-definer', but
+/// tokens do not 'need' [em] to be of this class. Any object may be returned
+/// by a 'rule parser', but the rule parsers created by 'seq', 'choice', etc.
+/// use #f to indicate an optional item not present or a token that doesn't 
+/// consume any characters.
+define open abstract class <token> (<object>)
+   constant slot parse-start, required-init-keyword: #"start";
+   constant slot parse-end, required-init-keyword: #"end";
+end class;
+
+
+/// SYNOPSIS: Builds and caches a parser name, for debugging and exceptions.
+define function rule-name (rule-func :: <function>) => (name :: <string>)
+   if (~member?(rule-func, *rule-names*))
+      let parts = element(*rule-name-parts*, rule-func, default: #("?"));
+      let format-string = parts.first;
+      let rule-parameters = copy-sequence(parts, start: 1);
+      *rule-names*[rule-func] :=
+            apply(format-to-string, format-string, map(rule-name, rule-parameters));
+   end if;
+   *rule-names*[rule-func]
+end function;
+
+
+/// SYNOPSIS: Control of parser debugging output.
+/// DISCUSSION: Set to #f or an output stream. As each defined parser processes
+/// its rules, it will print a trace.
+define variable *parser-trace* :: false-or(<stream>) = #f;
+
+
+/// SYNOPSIS: Indents and prints a line, as a replacement for <indented-stream>
+/// which is not implemented in Open Dylan.
+define inline function format-trace(format-string :: <string>, #rest params) => ()
+   when (*parser-trace*)
+      write(*parser-trace*, make(<string>, size: *indent-level*));
+      apply(format, *parser-trace*, format-string, params);
+      write(*parser-trace*, "\n");
+   end when;
+end function;
+
+define variable *indent-level* = 0;
+
+define inline function indent-trace () => ()
+   when (*parser-trace*)
+      *indent-level* := *indent-level* + 2;
+   end when;
+end function;
+
+define inline function outdent-trace () => ()
+   when (*parser-trace*)
+      *indent-level* := *indent-level* - 2;
+      if (*indent-level* < 0) *indent-level* := 0 end;
+   end when;
+end function;
+
+
+/// SYNOPSIS: Combines common elements of a set of sequences into a new sequence.
+/// 
+/// : collect-subelements(#[#[0, 1, 2], #["red", "blue", "green"]], 1)
+/// returns
+/// : #[1, "blue"]
+///
+/// ARGUMENTS:
+///   sequences   - A collection of <sequence>.
+///   index       - An <integer>. The element of each of 'sequences' that should
+///                 be pulled out into a new resulting sequence.
+///   default:    - An <object>. If the sequence doesn't have an element at
+///                 'index', this value is used instead. Defaults to #f.
+/// VALUES:
+///   new-sequence - The resulting sequence.
+define function collect-subelements
+   (sequences :: <collection>, index :: <integer>, #key default = #f)
+=> (new-sequence :: <sequence>)
+   map-as(<deque>,
+          method (sequence :: <sequence>) => (item :: <object>)
+             element(sequence, index, default: default)
+          end,
+          sequences)
+end function;
+

Added: trunk/libraries/utilities/peg-parser/peg-parser.lid
==============================================================================
--- (empty file)
+++ trunk/libraries/utilities/peg-parser/peg-parser.lid	Sun Mar  2 02:07:56 2008
@@ -0,0 +1,5 @@
+library:    peg-parser
+files:	    library
+	    parser-definers
+	    parser-rules
+	    parser-support



More information about the chatter mailing list