jastrachan | 6 Apr 11:10 2004

Re: Extensible parser.

Very interesting! I think some kind of macro system would be very 
useful. My biggest fear is macros breaking refactoring support (or at 
least making it very hard).

A few use cases I can imagine for macros...

* defining properties along with bound/unbound listener notifications
* defining lazily-constructed properties
* listener stuff
* logging

     Log.{ About to open file ${foo} }

which could hide all the plumbing details of using a particular logging 
framework. Though maybe a mixin is just as easy for logging?
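
A rough sketch of what such a Log macro might expand to, assuming
java.util.logging as the hidden framework. The class and method names
(LogMacroSketch, message, openFile) are made up for illustration and are
not part of any proposal in this thread:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical illustration of the plumbing a Log.{ ... } macro could hide.
class LogMacroSketch {
    private static final Logger LOG =
            Logger.getLogger(LogMacroSketch.class.getName());

    // Builds the text that the ${foo} placeholder is interpolated into.
    static String message(String foo) {
        return "About to open file " + foo;
    }

    static void openFile(String foo) {
        // As written with the macro:  Log.{ About to open file ${foo} }
        // One possible expansion: a guarded call, so the message string
        // is only built when INFO logging is enabled.
        if (LOG.isLoggable(Level.INFO)) {
            LOG.info(message(foo));
        }
    }

    public static void main(String[] args) {
        openFile("data.txt");
    }
}
```

The guard and the logger lookup are exactly the kind of detail the macro
would keep out of the calling code.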

Though apart from the above, I do struggle to imagine when I'd find 
myself using macros. FWIW I see using AOP to introduce advice into 
existing objects as far more common - e.g. adding security, 
transactions, persistence and so forth.

On 6 Apr 2004, at 03:25, Bruce Chapman wrote:
> Hi all,
>
> I am a real newbie here so forgive any faux pas.
>
> I have been thinking about this sort of thing for a while, and have
> experimented with extending the Java compiler by running user code as
> part of the compiler process.
>
> So here goes with a bit of a brain dump.
>
> With regard to macro processing
>
> In the wiki here http://wiki.codehaus.org/groovy/FoemmelsRequirements
> there is a reference to http://www.ai.mit.edu/~jrb/Projects/dexprs.pdf
> which has some useful background info and some ways of classifying macro
> systems etc.
>
> Especially see section 3, An overview of Syntax representations.
>
> What I would propose is a token based macro system (as opposed to a
> syntax tree based one as proposed by Neil), with the following form
> (which gets around the problems mentioned in 3.2 of the d-expressions
> paper).
>
> A macro "call" looks like this
>
> Classname ".{" anyTokensParenthesesMatch "}"
>
> anyTokensParenthesesMatch is a list of tokens with matching parentheses.
> I could write a formal definition but I won't, for simplicity.
>
> The parser would just parse up to the closing "}"
>
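
A minimal sketch of the "parse up to the closing }" step, assuming tokens
are plain strings and that only braces need to balance (the post leaves
the exact matching rules informal). MacroBodyScanner and body are
hypothetical names:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: collects the tokens of one macro call body.
class MacroBodyScanner {
    // Returns the tokens strictly between the ".{" at index `open`
    // and its matching "}". Nested braces are kept in the body.
    static List<String> body(List<String> tokens, int open) {
        if (!".{".equals(tokens.get(open))) {
            throw new IllegalArgumentException("expected \".{\" at " + open);
        }
        int depth = 1;
        List<String> body = new ArrayList<>();
        for (int i = open + 1; i < tokens.size(); i++) {
            String t = tokens.get(i);
            if (t.equals("{") || t.equals(".{")) {
                depth++;
            } else if (t.equals("}")) {
                depth--;
                if (depth == 0) {
                    return body; // the matching close; not part of the body
                }
            }
            body.add(t);
        }
        throw new IllegalArgumentException("unterminated macro call");
    }
}
```

The scanner never interprets the body; it only counts nesting, which is
why the standard lexer would be enough.
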
> A macro "call" can exist where a class member declaration, a statement,
> or (possibly) an expression is required.
>
> Once it has worked out the fully qualified name, the compiler
> instantiates classname (which must implement a specific interface - say
> CodeGenerator).
>
> - caution JSR14 syntax in following signatures -
>
> The CodeGenerator is passed the list of tokens and returns a list of
> errors (may be empty) - List<ParsingError> parse(List<Token>)
>
> -aside - The tokens would have type, value, and a source position.
>
> If there are errors the compiler displays them. ParsingError would
> include a description and the offending token, or at least its source
> position.
>
> If there are no errors, the CodeGenerator is called to generate the
> expanded macro.
>
> List<Token> generate(List<Token>)
>
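
The two-phase protocol described above could be sketched roughly as
follows. Only the parse and generate signatures come from the post; the
field layout of Token and ParsingError, plus PassThroughGenerator and
MacroExpander, are assumptions made up for this illustration:

```java
import java.util.ArrayList;
import java.util.List;

// A token carries a type, a value, and a source position (per the post).
final class Token {
    final String type;
    final String value;
    final int line;
    final int column;
    Token(String type, String value, int line, int column) {
        this.type = type; this.value = value;
        this.line = line; this.column = column;
    }
}

final class ParsingError {
    final String description;
    final Token offending; // may be null in this sketch
    ParsingError(String description, Token offending) {
        this.description = description; this.offending = offending;
    }
}

interface CodeGenerator {
    List<ParsingError> parse(List<Token> tokens);
    List<Token> generate(List<Token> tokens);
}

// Hypothetical generator: rejects an empty body, otherwise echoes it.
final class PassThroughGenerator implements CodeGenerator {
    public List<ParsingError> parse(List<Token> tokens) {
        List<ParsingError> errors = new ArrayList<>();
        if (tokens.isEmpty()) {
            errors.add(new ParsingError("empty macro body", null));
        }
        return errors;
    }
    public List<Token> generate(List<Token> tokens) {
        return new ArrayList<>(tokens);
    }
}

// Hypothetical stand-in for the compiler side of the protocol.
final class MacroExpander {
    static List<Token> expand(String generatorClassName, List<Token> body) {
        CodeGenerator generator;
        try {
            // Needs a no-argument constructor, as the post points out.
            generator = (CodeGenerator) Class.forName(generatorClassName)
                    .getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                    "cannot instantiate " + generatorClassName, e);
        }
        List<ParsingError> errors = generator.parse(body);
        if (!errors.isEmpty()) {
            // A real compiler would report every error with its position.
            throw new IllegalStateException(errors.get(0).description);
        }
        return generator.generate(body);
    }
}
```
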
> The compiler then parses the generated replacement list of tokens in
> place of the original macro call construct. If there are errors in there
> they are processed as normal. Each token will either
>  - have been copied from the original list of tokens, and so will have a
>    source position in the original source, or
>  - have been generated inside the generate method (using a token factory
>    passed as an argument but not shown in the signature above). In this
>    case the factory can use a stack trace for each token synthesised, to
>    give a source position inside the generate method.
> Of course good macro generators won't have syntax errors in their
> synthetic tokens :)
>
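
One way the stack-trace idea might look in practice. The post does not
define the factory's API, so SynthToken, StackTraceTokenFactory, and
GetterGeneratorSketch are all hypothetical names for this sketch:

```java
// A synthesised token that remembers where in the generator it was made.
final class SynthToken {
    final String value;
    final String origin; // "Class.method:line" of the creating code
    SynthToken(String value, String origin) {
        this.value = value;
        this.origin = origin;
    }
}

final class StackTraceTokenFactory {
    SynthToken create(String value) {
        // Frame 0 is this method; frame 1 is whoever asked for the token.
        StackTraceElement caller = new Throwable().getStackTrace()[1];
        String origin = caller.getClassName() + "." + caller.getMethodName()
                + ":" + caller.getLineNumber();
        return new SynthToken(value, origin);
    }
}

// Stand-in for a generate method that synthesises a token.
final class GetterGeneratorSketch {
    static SynthToken generateReturnKeyword(StackTraceTokenFactory factory) {
        return factory.create("return");
    }
}
```

An error on a synthesised token could then be reported against the
generator's own source rather than against the macro call site.
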
> Example Usage
> -------------
>
> class Person {
>     Properties.{
>         readonly String name
>         readonly Date dateOfBirth
>         Any address
>     }
>     // more vanilla groovy code here
> }
>
> public class Properties implements ClassMemberGenerator {
>     public Properties() {} // so the compiler can instantiate it !!
>     public List<ParsingError> parse(List<Token> args) { .... }
>     public List<Token> generate(List<Token> args, TokenFactory f, Context c) {
>         ...
>     }
> }
>
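
The post elides the body of generate, so the following is only a guess at
what a Properties expansion could produce: a field and getter per
property, plus a setter unless it is readonly. Tokens are plain strings
here, the output is Java-flavoured, and PropertiesExpansionSketch is a
hypothetical name:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical expansion of: [readonly] <type> <name> ...
class PropertiesExpansionSketch {
    static List<String> generate(List<String> args) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < args.size()) {
            boolean readOnly = args.get(i).equals("readonly");
            if (readOnly) {
                i++;
            }
            if (i + 1 >= args.size()) {
                throw new IllegalArgumentException("expected <type> <name>");
            }
            String type = args.get(i++);
            String name = args.get(i++);
            String cap = Character.toUpperCase(name.charAt(0))
                    + name.substring(1);
            // Backing field.
            Collections.addAll(out, "private", type, name, ";");
            // Getter.
            Collections.addAll(out, "public", type, "get" + cap,
                    "(", ")", "{", "return", name, ";", "}");
            // Setter, only for writable properties.
            if (!readOnly) {
                Collections.addAll(out, "public", "void", "set" + cap,
                        "(", type, "value", ")", "{",
                        "this", ".", name, "=", "value", ";", "}");
            }
        }
        return out;
    }
}
```

Because the result is just a flat token list, the compiler can parse it
in place of the macro call exactly as it would parse hand-written code.
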
>
>
> Some advantages of this approach
> --------------------------------
>
> All macro syntax must maintain some semblance of groovyness in that only
> groovy lexing is used; other than that anything goes (notwithstanding
> "{" matching requirements). Even XML!
>
> Can use standard lexer (without extension).
>
> Macro syntax is defined as much or as little as required inside the
> generator - no need to generate some formal grammar object (tree) to be
> used by the parser per se.
>
> The output of the generate() method is a list of tokens, which is closer
> to how mortal programmers think of their code. Compare this with a syntax
> tree, which requires far more specialist (and internal?) knowledge.
>
> The existence of the macro call/expansion, and its scope (".{" and
> matching "}"), are very EXPLICIT with this syntax. One problem with
> reading code that uses macros is that it can be hard to discern what is
> macro and what is formal language; this makes it really obvious, without
> being verbose. I would put both of these (obvious and NOT verbose) quite
> near the top of any requirements list for a macro system.
>
> The macro call scope is natural (obvious), even if you hadn't seen the
> syntax before.
>
> The "classname.{ tokens }" syntax hints that classname is processing the
> tokens in some way, which is not a method call or any other runtime
> operation. There are hints of method call syntax there, but it quite
> definitely isn't a method call.
>
> Avoids the use of some new formal syntax for defining macros. It's all
> just Java (or compiled groovy) code. (Neil's proposal does not exhibit
> this problem either - but many macro systems do, and it is a problem
> because it makes the functionality less accessible to mortals).
>
> An IDE could easily toggle between displaying the macro call and its
> expansion, by calling the parse and generate methods on the Generator.
> You could maybe even edit the generated code and (because the input
> tokens and output tokens are ===) reverse engineer the change back into
> the macro call (or not, if the edit touched, or tried to touch, a token
> synthesised by the token factory).
>
> The macro generation code can use the full power of the JVM and
> libraries, so you could for instance access a database and build a data
> access object.
>
> The places where a macro call may exist are quite clearly defined. This
> helps to keep things explicit (good for code maintainers).
>
>
>
>
> I have more ideas on this and some more details but I'll leave it there
> for now and see what others think.
>
> Bruce Chapman
>

James
-------
http://radio.weblogs.com/0112098/
