6 Feb 08:13
Writing lexers in Lua
Neil Hodgson <nyamatongwe <at> gmail.com>
2010-02-06 07:13:07 GMT
There is now some experimental support for writing lexers in Lua. The API is similar to the StyleContext class used in LexCPP, although the low-level calls from Accessor are also available. The API is very likely to change, and it may still be marked 'experimental' when it is included in a release of SciTE so that it can be fixed after more experience. The documentation is at http://www.scintilla.org/ScriptLexer.html

Iteration is by character rather than byte, and styler.Current(), styler.Next(), and styler.Previous() return strings containing all the bytes of a multi-byte character. If the document is in UTF-8 and its text is "«", then the initial value of styler.Current() is "«", which is the same as "\xc2\xab". If the document is in Latin-1 then "«" is "\xab". This makes it easy to write a lexer for a particular encoding in that encoding, as code can be written naturally, like styler.Match("«"). Lexers for languages that depend on characters outside ASCII for syntax and that have to deal with multiple encodings will be more complex. The API still uses byte positions for Position() and other calls, since it is costly to convert byte positions to character positions and vice versa.

Another change from previous lexers is that there is an imaginary extra NUL ('\0') character at the end of the document when using styler.More() .. styler.Forward(). This makes it easier to treat the end of the document as if it were the end of a line, which means the normal code for determining that an identifier is a keyword will trigger. This avoids the common lexer problem of keywords at the end of the document not being highlighted.

A script lexer is selected by using a lexer name that starts with "script_". Multiple script lexer languages can be mentioned in the properties files at once, although there is only one OnStyle. All script languages use the same numeric lexer ID, SCLEX_CONTAINER. The lexer name is available as a member of the styler object.
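As a rough illustration of the character iteration and the end-of-document NUL described above, here is a minimal sketch of an OnStyle function run against a stand-in styler. Only Current(), Match(), Position(), More() and Forward() are named in this message; State(), SetState(), ChangeState(), ForwardSetState() and Token() are assumed here by analogy with StyleContext, and MockStyler and Tokens() are purely hypothetical helpers so that the sketch runs outside SciTE. The style numbers and the keyword list are made up, and the document is assumed to be UTF-8 with no embedded NUL bytes.

```lua
-- Sketch only. MockStyler is a hypothetical stand-in for the styler that
-- SciTE passes to OnStyle; State, SetState, ChangeState, ForwardSetState
-- and Token are assumed names modelled on StyleContext.
local S_DEFAULT, S_IDENTIFIER, S_KEYWORD, S_COMMENT = 0, 1, 2, 3
local LAQUO, RAQUO = "\194\171", "\194\187"  -- "«" and "»" as UTF-8 bytes
local keywords = { ["if"] = true, ["end"] = true }

local function MockStyler(text)
  -- Split the UTF-8 text into characters, remembering 0-based byte offsets.
  local chars, starts, bytePos = {}, {}, 0
  for ch in text:gmatch("[\1-\127\194-\244][\128-\191]*") do
    chars[#chars + 1] = ch
    starts[#chars] = bytePos
    bytePos = bytePos + #ch
  end
  chars[#chars + 1] = "\0"  -- the imaginary NUL after the last character
  starts[#chars] = bytePos

  local idx, state, tokenStart, tokens = 1, S_DEFAULT, 1, {}
  local function closeToken()
    if idx > tokenStart then
      tokens[#tokens + 1] =
        { text = table.concat(chars, "", tokenStart, idx - 1), state = state }
    end
    tokenStart = idx
  end

  -- A table of closures, so calls use the dot form: styler.More().
  local styler = {}
  function styler.More() return idx <= #chars end
  function styler.Current() return chars[idx] or "" end
  function styler.Position() return starts[idx] end  -- bytes, not characters
  function styler.Forward() idx = idx + 1 end
  function styler.State() return state end
  function styler.ChangeState(s) state = s end
  function styler.SetState(s) closeToken(); state = s end
  function styler.ForwardSetState(s) styler.Forward(); styler.SetState(s) end
  function styler.Token() return table.concat(chars, "", tokenStart, idx - 1) end
  function styler.Match(s)
    return text:sub(starts[idx] + 1, starts[idx] + #s) == s
  end
  function styler.Tokens()
    idx = math.min(idx, #chars)  -- never include the imaginary NUL
    closeToken()
    return tokens
  end
  return styler
end

local function isIdentChar(ch)
  return ch:match("^[%a_]$") ~= nil
end

function OnStyle(styler)
  while styler.More() do
    -- Leave the current state if needed.
    if styler.State() == S_IDENTIFIER then
      if not isIdentChar(styler.Current()) then
        -- The trailing NUL ends an identifier at the end of the document,
        -- so the keyword check runs there too.
        if keywords[styler.Token()] then styler.ChangeState(S_KEYWORD) end
        styler.SetState(S_DEFAULT)
      end
    elseif styler.State() == S_COMMENT then
      if styler.Match(RAQUO) then styler.ForwardSetState(S_DEFAULT) end
    end
    -- Enter a new state if needed.
    if styler.State() == S_DEFAULT then
      if styler.Match(LAQUO) then
        styler.SetState(S_COMMENT)
      elseif isIdentChar(styler.Current()) then
        styler.SetState(S_IDENTIFIER)
      end
    end
    styler.Forward()
  end
end

-- Usage: the document ends with the keyword "if" and no trailing newline.
local styler = MockStyler("x " .. LAQUO .. "if" .. RAQUO .. " if")
OnStyle(styler)
local tokens = styler.Tokens()
for _, tok in ipairs(tokens) do
  print(tok.state, string.format("%q", tok.text))
end
```

In this sketch the final "if" is only recognised as a keyword because the loop makes one extra pass with Current() returning "\0"; without that pass the identifier state would never be left. The "if" inside the « » pair stays in the comment style, and Position() advances by two bytes across "«" even though Forward() moved by one character.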
The implementation is not great because I did not understand Chapter 28 of "Programming in Lua" (http://www.lua.org/pil/28.html). Rather than producing a new Lua type, I made the styler a table and then had the 'methods' use closures to access a C struct containing the state. This means that the call is styler.More() rather than styler:More(), which would be more expected. I hope someone who understands this aspect of Lua better than I do can fix the code.

The current code has *not* been committed to CVS. It is available from
http://www.scintilla.org/scite.zip Source
http://www.scintilla.org/wscite.zip Windows executable

Neil