27 Jan 2005 05:28
Re: [groovy-dev] The opposite of builders
Thanks for these suggestions Mark.
On the first suggestion, is the idea to make the Groovy parser
extensible with other grammars, or to provide a parser creation
mechanism written in Groovy itself and used as a class library within
Groovy? I wasn't quite sure which you meant.
On the second idea, pattern matching: how about using the Groovy
switch statement?
for (element in doc) {
    switch (element) {
        case new QName("Library"):
            return doFoo(element)
        case new QName("Whatnot"):
            return doWhatnot(element)
        default:
            throw new Exception("Unknown element name $element.name")
    }
}
The only 'trick' to this is figuring out some kind of filter/predicate
expression which matches on the XPath / QName of the element. I made up
a dummy QName class above; I'm sure there are other, more expressive
ways of doing this. For example, the above switch could use regular
expressions and so forth.
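To make that a bit more concrete, here is a minimal sketch of how such a
filter could plug into switch. Groovy's switch asks each case value
whether it matches by calling isCase() on it, so a matcher class only has
to supply that method. The Element and QName classes below are made-up
stand-ins for illustration (matching on local name only), not existing
library classes.

```groovy
// Hypothetical element type: just enough to have a name to match on.
class Element {
    String name
}

// Hypothetical matcher: matches any Element whose name equals localName.
class QName {
    String localName

    QName(String localName) { this.localName = localName }

    // Groovy's switch evaluates caseValue.isCase(switchValue) for each case.
    boolean isCase(Object candidate) {
        candidate instanceof Element && candidate.name == localName
    }
}

def describe(element) {
    switch (element) {
        case new QName("Library"):
            return "library"
        case new QName("Whatnot"):
            return "whatnot"
        default:
            throw new IllegalArgumentException("Unknown element name ${element.name}")
    }
}

println describe(new Element(name: "Library"))
```

Any other kind of predicate (a regex on the name, an XPath test) could be
wrapped the same way, as long as it exposes isCase().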
Maybe using dynamic properties on QName / Namespace might be nicer:
xsl = new Namespace("http://www.w3.org/whateverItIs")
// match an explicit QName in a namespace
case xsl.element: return "xsl:element"
// match anything in the XSL namespace
case xsl: return "xsl"
// match a local name
case QName.foo: return "<foo>"
// match a full XPath expression
case new XPath("//foo[@bar = 123]"): return "xpath"
The above are all XML specific; I'm sure we could add filter/matcher
objects for other kinds of domains like annotations or SQL etc.
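As a rough sketch of the dynamic-property idea: a Namespace object could
answer any unknown property with a QName in that namespace, and both
objects could implement isCase() so they work as switch cases. Everything
here is an assumption for illustration: the Element, QName and Namespace
classes are invented, the namespace URI is a placeholder, and
propertyMissing() is used as the hook for the dynamic lookup.

```groovy
// Hypothetical element type carrying a namespace URI and a local name.
class Element {
    String namespaceUri
    String name
}

// Hypothetical matcher for one qualified name.
class QName {
    String namespaceUri
    String localName

    boolean isCase(Object candidate) {
        candidate instanceof Element &&
            candidate.namespaceUri == namespaceUri &&
            candidate.name == localName
    }
}

// Hypothetical namespace: xsl.element yields a QName, and the namespace
// itself matches any element that lives in it.
class Namespace {
    String uri

    Namespace(String uri) { this.uri = uri }

    // Called for properties the class does not declare, e.g. xsl.element.
    def propertyMissing(String name) {
        new QName(namespaceUri: uri, localName: name)
    }

    boolean isCase(Object candidate) {
        candidate instanceof Element && candidate.namespaceUri == uri
    }
}

def classify(element, xsl) {
    switch (element) {
        // most specific case first: an explicit QName in the namespace
        case xsl.element: return "xsl:element"
        // then anything else in the namespace
        case xsl: return "xsl"
        default: return "other"
    }
}

def xsl = new Namespace("urn:example:xsl")   // placeholder URI
println classify(new Element(namespaceUri: "urn:example:xsl", name: "element"), xsl)
```

The order of the cases matters here, since the namespace-wide case would
otherwise swallow the more specific one.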
Thoughts?
Mark Chu-Carroll wrote:
> Hi folks.
>
> I've been doing some hacking lately looking at the opposite of
> builders in groovy. That is, I think a lot of current code needs to
> deal with things like XML, which have a complex enough structure that
> having a good, generic language mechanism for dealing with it is
> really useful.
>
> I've been working on writing libraries for doing this kind of complex
> object decomposition in pure Java for Stellation; and I'm wondering
> whether people would be interested in seeing it as a first class
> citizen in Groovy.
>
> I've been experimenting with two different approaches.
>
> The first is parser combinators. Parser combinators are a way of
> writing code that looks pretty much like the BNF grammar for some
> language, and which executes as a full backtracking parser for that
> grammar. For example, in my library, you could take a grammar:
>
> Symbol ::= alpha+
> ListOrSymbol ::= List | Symbol
> List ::= EmptyList | NonEmptyList
> EmptyList ::= "(" ")"
> NonEmptyList ::= "(" ListOrSymbol ( "," ListOrSymbol )* ")"
>
> And a working parser for this would be:
>
>
> ElementParser sym = new CharParser("abcdefghijk").some();
> ElementParser lparen = new CharParser("(");
> ElementParser rparen = new CharParser(")");
> ElementParser comma = new CharParser(",");
>
> ElementParser symOrList = new AltParser(new ElementParser[] {
>     sym,
>     ElementParser.byName("List") });
> ElementParser nonEmptyList = ElementParser.seq(new ElementParser[] {
>     lparen,
>     symOrList,
>     ElementParser.seq(new ElementParser[] { comma, symOrList }).many(),
>     rparen });
> ElementParser emptyList = ElementParser.seq(new ElementParser[] {
>     lparen, rparen });
> ElementParser list = ElementParser.alt(new ElementParser[] {
>     nonEmptyList, emptyList });
> ElementParser.bindParserToName("List", list);
>
> Then, you could parse a list with "list.parse(thing to parse)".
>
> The real advantage of these things is that it's very easy to define
> primitives and new combinators to make it work. So, for example, you
> could add a new parser for an XML tag in just a few lines of code, and
> present it as if it were a primitive.
>
> The main thing that's really ugly about the combinator library in Java
> is all of those "new ElementParser[] { ... }" things. In Groovy, we
> could actually implement parser combinators using the simple list
> syntax, so that the above would look more like:
>
> sym = new CharParser("abcdefghijk").some()
> lparen = new CharParser("(")
> rparen = new CharParser(")")
> comma = new CharParser(",")
>
> symOrList = new AltParser([sym, ElementParser.byName("List")])
> nonEmptyList = ElementParser.seq([lparen, symOrList,
>     ElementParser.seq([comma, symOrList]).many(),
>     rparen])
> emptyList = ElementParser.seq([lparen, rparen])
> list = ElementParser.alt([nonEmptyList, emptyList])
> ElementParser.bindParserToName("List", list)
>
> To do it even better, once the builder syntax stabilizes, we could
> even use builders to specify parsers!
>
> ====================================
>
> The second approach that I've been looking at is pattern matching,
> like you find in functional programming languages. This is a very
> different approach from the combinator parser thing. CPs have the
> advantage that you can parse anything that you can write a grammar
> for - but they can be a lot of work to write. Pattern matching is a
> lot more limited, but a lot less effort to use.
>
> The idea is that you write patterns that describe the abstract
> structure of something, with a number of choices, and when one choice
> matches, it implicitly decomposes the structure and binds some
> variables.
>
> So, for example, say you wanted to use pattern matching for some
> simple XML, like the library list that shows up in so many documents:
>
> <Library>
>   <Book author="Mark Chu-Carroll" title="Having a big nose is fun"/>
>   <Series author="J.R.R. Tolkien" title="Lord of the Rings">
>     <Book title="Fellowship of the ring"/>
>     ..
>   </Series>
> </Library>
>
> You could match entries inside the library with something like:
>
> document.match(
>   Library( entries ) ->
>     entries.each { it.match(
>       Book(author,title) ->
>         processBook(author, title)
>       Series(author,title) ->
>         processSeries(author, title)) })
>
> This is the approach that Scala uses for XML, and I've used it
> extensively in Objective CaML programming. It really does need some
> fancy syntax though: as a Java library it is positively stupid,
> insanely complex to write code against and unreadable once written.
> But in Groovy, we could add a pattern match construct, and really make
> this work.
>
> --------------------------------------
>
> So - are people sufficiently interested in either or both of these to
> justify my writing up a more concrete proposal of what they would look
> like in Groovy?
>
> -Mark