7 Oct 2004 09:35
Re: [groovy-dev] making the bytecode generation more understandable
On 7 Oct 2004, at 08:21, John Rose wrote:
> On Oct 6, 2004, at 10:08, John Wilson wrote:
>> I'm quite keen on generating Java and compiling that. It doesn't look
>> too hard to walk an AST and generate Java. It's not a mechanism that
>> you would use in production but I think it might be very helpful in
>> investigating optimisations as you can try out different strategies
>> by just editing the generated code (a decompiler is another option,
>> of course).
>
> I agree with this. Source generation generally makes development
> easier, since there are more ways to look at and use source code. The
> new backend should be designed with the intention of generating both
> source and binary. This means that the source code has to be kept
> fairly low-level, without too much effort invested in making it
> beautiful to human readers.
Agreed. However, it's pretty trivial to walk a Java AST and generate
source code, so I'd expect that to be done for Janino/Serp anyway.
We could probably reuse it or, worst case, just write a
simple Java-AST walker/visitor.
Most importantly, we should have just one mapping, Groovy AST -> Java
AST, and not have to maintain two mappings which could easily get out of
whack (Groovy AST -> bytecode and Groovy AST -> Java source).
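To make the "simple Java-AST walker/visitor" idea concrete, here is a minimal sketch. The node types (Literal, Name, Binary, Call) and the SourcePrinter class are hypothetical and invented for illustration; they are not Janino or Serp classes. The point is only that once a Java AST exists, printing source is one small visitor, and a bytecode emitter could be a second visitor over the same tree.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical miniature Java AST; none of these types come from Janino or Serp.
final class JavaAstSketch {

    // One visit method per node type; R is whatever the walker produces.
    interface Visitor<R> {
        R visitLiteral(Literal node);
        R visitName(Name node);
        R visitBinary(Binary node);
        R visitCall(Call node);
    }

    interface Expr {
        <R> R accept(Visitor<R> visitor);
    }

    static final class Literal implements Expr {
        final int value;
        Literal(int value) { this.value = value; }
        public <R> R accept(Visitor<R> v) { return v.visitLiteral(this); }
    }

    static final class Name implements Expr {
        final String id;
        Name(String id) { this.id = id; }
        public <R> R accept(Visitor<R> v) { return v.visitName(this); }
    }

    static final class Binary implements Expr {
        final String op;
        final Expr left;
        final Expr right;
        Binary(String op, Expr left, Expr right) {
            this.op = op;
            this.left = left;
            this.right = right;
        }
        public <R> R accept(Visitor<R> v) { return v.visitBinary(this); }
    }

    static final class Call implements Expr {
        final Expr target;
        final String method;
        final List<Expr> args;
        Call(Expr target, String method, List<Expr> args) {
            this.target = target;
            this.method = method;
            this.args = args;
        }
        public <R> R accept(Visitor<R> v) { return v.visitCall(this); }
    }

    // Source-generating walker. Binary expressions are always parenthesised:
    // the output is meant to be low-level and unambiguous, not pretty.
    static final class SourcePrinter implements Visitor<String> {
        public String visitLiteral(Literal n) { return Integer.toString(n.value); }
        public String visitName(Name n) { return n.id; }
        public String visitBinary(Binary n) {
            return "(" + n.left.accept(this) + " " + n.op + " " + n.right.accept(this) + ")";
        }
        public String visitCall(Call n) {
            StringBuilder sb = new StringBuilder();
            sb.append(n.target.accept(this)).append('.').append(n.method).append('(');
            for (int i = 0; i < n.args.size(); i++) {
                if (i > 0) sb.append(", ");
                sb.append(n.args.get(i).accept(this));
            }
            return sb.append(')').toString();
        }
    }

    static String toSource(Expr e) {
        return e.accept(new SourcePrinter());
    }

    public static void main(String[] args) {
        Expr sum = new Binary("+", new Name("x"), new Literal(1));
        Expr call = new Call(new Name("out"), "println", Arrays.<Expr>asList(sum));
        System.out.println(toSource(call)); // prints out.println((x + 1))
    }
}
```

With this shape the Groovy AST -> Java AST mapping is written once, and source output and binary output differ only in which visitor is run over the result.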
>> Going through a Java compiler (either by generating Java or by
>> generating an AST) imposes Java limitations on us.
Other than names, there are few limitations really. Pretty much
every language feature of Groovy maps to some pretty straightforward
Java code under the hood.
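As one illustration of that kind of mapping, a closure can be expressed in plain Java as an anonymous inner class. This is a sketch under assumptions: the Block interface and the collect helper below are hypothetical stand-ins, not Groovy's actual runtime classes, and the real generated code may look quite different.

```java
import java.util.ArrayList;
import java.util.List;

final class ClosureMappingSketch {

    // Hypothetical stand-in for a one-argument closure object.
    interface Block {
        Object call(Object it);
    }

    // Hypothetical helper playing the role of a collection method that takes a closure.
    static List<Object> collect(List<?> items, Block block) {
        List<Object> result = new ArrayList<Object>();
        for (Object item : items) {
            result.add(block.call(item));
        }
        return result;
    }

    // Roughly what the Groovy expression  input.collect { it * 2 }
    // could turn into when written as Java source.
    static List<Object> doubled(List<Integer> input) {
        return collect(input, new Block() {
            public Object call(Object it) {
                return ((Integer) it) * 2;
            }
        });
    }
}
```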
>> Java is far more restrictive in the allowed spelling of
>> identifiers than the JVM is. If I read him right John Rose is
>> considering extending the Groovy definition of identifier to a
>> superset of Java's. This would be a problem.
>
> The Groovy language can support Unicode names without significant harm
> to interoperability with Java, by providing a mapping to JVM names
> ("bytecode names") that respects Java practices. I mean name
> mangling, something akin to what's done for nested classes, but with
> hex numbers for code points. The Borneo language provides a sketch of
> this sort of technique in the case of operator names.
>
> I'm thinking of something pretty low-impact, which does not conflict
> with other kinds of Java identifiers in wide use. I also want it to
> be relatively readable: A mangling should be short and easy to
> recognize, and should not encode "normal" characters. I just put a
> detailed proposal into the wiki:
> http://docs.codehaus.org/display/GroovyJSR/extended+names .
Agreed.
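For readers who want a feel for what such a mangling could look like, here is a small sketch. The scheme is an assumption made up for illustration ('$' followed by at least four hex digits of the code point); it is not the encoding from the wiki proposal, and the class and method names are hypothetical. Characters that are already legal in a Java identifier are passed through untouched, in line with the goal of not encoding "normal" characters.

```java
final class NameManglingSketch {

    // Assumed scheme: any code point that is not legal at its position in a
    // Java identifier becomes '$' followed by its value in upper-case hex,
    // zero-padded to at least four digits. Not the wiki proposal's encoding.
    static String mangle(String name) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < name.length()) {
            int cp = name.codePointAt(i);
            boolean legal = (i == 0)
                    ? Character.isJavaIdentifierStart(cp)
                    : Character.isJavaIdentifierPart(cp);
            if (legal) {
                out.appendCodePoint(cp);
            } else {
                out.append('$').append(String.format("%04X", cp));
            }
            // Step by code point so surrogate pairs are handled as one character.
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(mangle("plus"));   // prints plus
        System.out.println(mangle("a+b"));    // prints a$002Bb
        System.out.println(mangle("<=>"));    // prints $003C$003D$003E
    }
}
```

This sketch leaves '$' itself alone, so it is not reversible for names that already contain '$' followed by hex digits. A real scheme would have to settle that, along with how it coexists with the '$' used in nested class names.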
Incidentally, one of the main drivers for making the bytecode generation
more understandable is to be able to really tune things. For example, if
you use static typing (or the compiler can easily deduce the type of an
expression) we really should be able to generate bytecode which is as
efficient as Java's. As well as being a completely dynamic scripting
language, I'd like Groovy to be usable as a drop-in replacement for Java
for high-performance coding.
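A rough sketch of the difference, written as the Java the two cases might correspond to. Plain reflection is used here only as a stand-in for a dynamic dispatch path; it is an assumption for illustration and not a description of Groovy's actual runtime, and the class and method names are hypothetical.

```java
import java.lang.reflect.Method;

final class DispatchSketch {

    // Untyped case: the receiver is only known as Object, so the method has
    // to be looked up by name at run time (reflection as a stand-in).
    static Object callDynamically(Object receiver, String methodName) {
        try {
            Method m = receiver.getClass().getMethod(methodName);
            return m.invoke(receiver);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot call " + methodName, e);
        }
    }

    // Typed case: the compiler knows the receiver is a String, so it can emit
    // a direct method call, exactly as javac would.
    static int callStatically(String receiver) {
        return receiver.length();
    }

    public static void main(String[] args) {
        System.out.println(callDynamically("hello", "length")); // prints 5
        System.out.println(callStatically("hello"));            // prints 5
    }
}
```

Both calls give the same answer; the tuning opportunity is that the second form needs no lookup and no boxing of the result.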
James
-------
http://radio.weblogs.com/0112098/