jstrachan | 27 Jan 05:28 2005

Re: [groovy-dev] The opposite of builders

Thanks for these suggestions Mark.

On the first suggestion, is the idea to make the Groovy parser 
extensible with other grammars, or to provide a parser creation 
mechanism written in Groovy itself and used as a class library within 
Groovy? I wasn't quite sure which you meant.

On the second idea, pattern matching: how about using the Groovy 
switch statement?

for (element in doc) {
    switch (element) {
        case new QName("Library"):
            return doFoo(element)

        case new QName("Whatnot"):
            return doWhatnot(element)

        default:
            throw new Exception("Unknown element name ${element.name}")
    }
}

The only 'trick' to this is figuring out some kind of filter/predicate 
expression which matches on the XPath / QName of the element. I made up 
a dummy QName class above - I'm sure there are other, more expressive 
ways of doing this, e.g. the above switch could use regular expressions 
and so forth.
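
As a rough sketch of how such a matcher could plug into switch: Groovy's 
switch asks each case value whether it matches by calling its isCase() 
method, so a matcher class only needs to implement that. The Element and 
QNameMatcher classes below are hypothetical stand-ins invented for this 
illustration, not existing Groovy or JDK classes.

```groovy
// Hypothetical element type: just enough structure for the example.
class Element {
    String name
}

// Hypothetical stand-in for the dummy QName class used in the switch above.
class QNameMatcher {
    String localName

    QNameMatcher(String localName) { this.localName = localName }

    // Groovy's switch calls caseValue.isCase(switchValue) for each case label.
    boolean isCase(Object candidate) {
        candidate instanceof Element && candidate.name == localName
    }
}

String dispatch(Element element) {
    switch (element) {
        case new QNameMatcher("Library"):
            return "library"
        case new QNameMatcher("Whatnot"):
            return "whatnot"
        default:
            throw new IllegalArgumentException("Unknown element name ${element.name}")
    }
}

assert dispatch(new Element(name: "Library")) == "library"
```

Since the matching logic lives entirely in isCase(), a regex-based or 
XPath-based matcher could be swapped in without touching the switch.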

Maybe using dynamic properties on QName / Namespace might be nicer:

xsl = new Namespace("http://www.w3.org/whateverItIs")

// match an explicit QName in a namespace
case xsl.element: return "xsl:element"

// match anything in the XSL namespace
case xsl: return "xsl"

// match a local name
case QName.foo: return "<foo>"

// match a full XPath expression
case new XPath("//foo[@bar = 123]"): return "xpath"

The above are all XML specific; I'm sure we could add filter/matcher 
objects for other kinds of domains like annotations or SQL etc.
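
One possible way to get the xsl.element and bare xsl cases to work, 
sketched under some assumptions: the namespace object answers unknown 
property names through propertyMissing(), returning a matcher for that 
qualified name, and it also implements isCase() itself so that it 
matches anything in the namespace. Element, QualifiedNameMatcher, this 
Namespace class and the "urn:example:xsl" URI are all hypothetical, made 
up for the illustration.

```groovy
// Hypothetical element type carrying a namespace URI and a local name.
class Element {
    String namespaceURI
    String name
}

// Hypothetical matcher for one qualified name (namespace + local name).
class QualifiedNameMatcher {
    String namespaceURI
    String localName

    boolean isCase(Object candidate) {
        candidate instanceof Element &&
            candidate.namespaceURI == namespaceURI &&
            candidate.name == localName
    }
}

// Hypothetical namespace object, not an existing Groovy class.
class Namespace {
    final String uri

    Namespace(String uri) { this.uri = uri }

    // Called for any property not declared on the class, e.g. xsl.element.
    def propertyMissing(String localName) {
        new QualifiedNameMatcher(namespaceURI: uri, localName: localName)
    }

    // Used by "case xsl:" to match any element in this namespace.
    boolean isCase(Object candidate) {
        candidate instanceof Element && candidate.namespaceURI == uri
    }
}

String classify(Element element, Namespace xsl) {
    switch (element) {
        case xsl.element: return "xsl:element"  // explicit QName in the namespace
        case xsl:         return "xsl"          // anything else in the namespace
        default:          return "other"
    }
}

def xsl = new Namespace("urn:example:xsl")
assert classify(new Element(namespaceURI: "urn:example:xsl", name: "element"), xsl) == "xsl:element"
```

The order of the cases matters here: the more specific xsl.element has 
to come before the catch-all xsl, since switch takes the first match.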

Thoughts?

Mark Chu-Carroll wrote:

> Hi folks.
> 
> I've been doing some hacking lately looking at the opposite of
> builders in Groovy. That is, I think a lot of current code needs to
> deal with things like XML, which have a complex enough structure that
> having a good, generic language mechanism for dealing with it is
> really useful.
> 
> I've been working on writing libraries for doing this kind of complex
> object decomposition in pure Java for Stellation; and I'm wondering
> whether people would be interested in seeing it as a first class
> citizen in Groovy.
> 
> I've been experimenting with two different approaches.
> 
> The first is parser combinators. Parser combinators are a way of
> writing code that looks pretty much like the BNF grammar for some
> language, and which executes as a full backtracking parser for that
> grammar. For example, in my library, you could take a grammar:
> 
>    Symbol ::= alpha+
>    ListOrSymbol ::= List | Symbol
>    List ::= EmptyList | NonEmptyList
>    EmptyList ::= "(" ")"
>    NonEmptyList ::= "(" ListOrSymbol ( "," ListOrSymbol )* ")"
> 
> And a working parser for this would be:
> 
> 
>     ElementParser sym = new CharParser("abcdefghijk").some();
>     ElementParser lparen = new CharParser("(");
>     ElementParser rparen = new CharParser(")");
>     ElementParser comma = new CharParser(",");
>
>     ElementParser symOrList = new AltParser(new ElementParser[] {
>             sym,
>             ElementParser.byName("List") });
>     ElementParser nonEmptyList = ElementParser.seq(new ElementParser[] {
>             lparen,
>             symOrList,
>             ElementParser.seq(new ElementParser[] { comma, symOrList }).many(),
>             rparen });
>     ElementParser emptyList = ElementParser.seq(new ElementParser[] { lparen, rparen });
>     ElementParser list = ElementParser.alt(new ElementParser[] { nonEmptyList, emptyList });
>     ElementParser.bindParserToName("List", list);
> 
> Then, you could parse a list with "list.parse(thing to parse)". 
> 
> The real advantage of these things is that it's very easy to define
> primitives and new combinators to make it work. So, for example, you
> could add a new parser for an XML tag in just a few lines of code, and
> present it as if it were a primitive.
> 
> The main thing that's really ugly about the combinator library in Java
> is all of those "new ElementParser[] { ... }" things. In Groovy, we
> could actually implement parser combinators using the simple list
> syntax, so that the above would look more like:
> 
>     sym = new CharParser("abcdefghijk").some()
>     lparen = new CharParser("(")
>     rparen = new CharParser(")")
>     comma = new CharParser(",")
>
>     symOrList = new AltParser([sym, ElementParser.byName("List")])
>     nonEmptyList = ElementParser.seq([lparen, symOrList,
>             ElementParser.seq([comma, symOrList]).many(),
>             rparen])
>     emptyList = ElementParser.seq([lparen, rparen])
>     list = ElementParser.alt([nonEmptyList, emptyList])
>     ElementParser.bindParserToName("List", list)
> 
> To do it even better, once the builder syntax stabilizes, we could
> even use builders to specify parsers!
> 
> ====================================
> 
> The second approach that I've been looking at is pattern matching,
> like you find in functional programming languages. This is a very
> different approach from the combinator parser thing. Combinator
> parsers have the advantage that you can parse anything that you can
> write a grammar for - but they can be a lot of work to write. Pattern
> matching is a lot more limited, but a lot less effort to use.
> 
> The idea is that you write patterns that describe the abstract
> structure of something, with a number of choices, and when one choice
> matches, it implicitly decomposes the structure and binds some
> variables.
> 
> So, for example, say you wanted to use pattern matching for some
> simple XML, like the library list that shows up in so many documents:
> 
>    <Library>
>       <Book author="Mark Chu-Carroll" title="Having a big nose is fun"/>
>       <Series author="J.R.R. Tolkien" title="Lord of the Rings">
>         <Book title="Fellowship of the ring"/>
>          ..
>       </Series>
>    </Library>
> 
> You could match entries inside the library with something like:
> 
>    document.match(
>       Library( entries ) ->
>          entries.each { it.match(
>             Book(author, title) -> processBook(author, title)
>             Series(author, title) -> processSeries(author, title)) })
> 
> This is the approach that Scala uses for XML, and I've used it
> extensively in Objective Caml programming. It really does need some
> fancy syntax though - writing it as a Java library is positively
> stupid: insanely complex to write code that uses it, and unreadable
> once you have written it. But in Groovy, we could add a pattern match
> construct, and really make this work.
> 
> --------------------------------------
> 
> So - are people sufficiently interested in either or both of these to
> justify my writing up a more concrete proposal of what they would look
> like in Groovy?
> 
>     -Mark