Tuesday, May 08, 2007

New parser for RLisp - with string interpolation and regexps

Felipe by José Luna from flickr (CC-NC-SA)
A few hours ago I wrote about the syntax supported by RLisp and complained that it lacked decent String interpolation and Regexp syntax.

After writing that, I tried to add support for both to the existing ANTLR-based parser. Unfortunately, the code generated for parsing interpolated strings raised exceptions instead of backtracking. I think I understand why it was happening, but I didn't see any obvious way of fixing it without making the grammar exceedingly complicated. So I rewrote it all, with a bunch of regular expressions for the lexer and recursive descent for the parser. Lisp parsers are easy. Now it's possible to write arbitrary RLisp code inside interpolated strings, and to write regexps which aren't any uglier than usual:
rlisp> (let m ["40 + 2 = #{(+ 40 2)}" match /(\d+)$/])
#<MatchData:0xb7cacf94>
rlisp> [m get 1]
"42"
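
To give an idea of how little code the "regular expressions for the lexer, recursive descent for the parser" approach needs, here is a minimal sketch in Ruby. It is not RLisp's actual parser: the token table, the `tokenize` and `parse_expr` names, and the choice of output (nested Ruby arrays) are all assumptions made for illustration, and it handles neither interpolated strings, regexp literals, nor square brackets.

```ruby
require 'strscan'

# Hypothetical token patterns, tried in order. RLisp's real lexer differs.
TOKENS = [
  [:open,   /\(/],
  [:close,  /\)/],
  [:int,    /-?\d+/],
  [:string, /"(?:[^"\\]|\\.)*"/],
  [:symbol, /[^\s()"]+/],
]

# Lexer: walk the input with a StringScanner, emitting [kind, text] pairs.
def tokenize(src)
  ss = StringScanner.new(src)
  out = []
  until ss.eos?
    next if ss.skip(/\s+/)                       # ignore whitespace
    kind, _ = TOKENS.find { |_, re| ss.scan(re) } # first pattern that matches
    raise "Unexpected input at #{ss.pos}" unless kind
    out << [kind, ss.matched]
  end
  out
end

# Parser: one recursive function, consuming tokens from the front.
def parse_expr(tokens)
  kind, text = tokens.shift
  case kind
  when :open
    list = []
    until tokens.first && tokens.first[0] == :close
      raise "Unclosed (" if tokens.empty?
      list << parse_expr(tokens)                 # recurse for each element
    end
    tokens.shift                                 # drop the closing paren
    list
  when :close  then raise "Unexpected )"
  when :int    then text.to_i
  when :string then text[1..-2]                  # strip quotes, escapes left as-is
  when :symbol then text.to_sym
  else raise "Unexpected end of input"
  end
end
```

Calling `parse_expr(tokenize('(+ 40 2)'))` yields a nested array of symbols and integers. Because the grammar is just "atom or parenthesized list of expressions", the parser never needs to backtrack, which is the problem the generated ANTLR code ran into.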

Perl-style $1, $2, $~, $` etc. are not supported, only explicit MatchData. $~ support in Ruby is already a big ugly hack (in Perl too, but in Perl everything is a big ugly hack, so it doesn't stick out that much). I'd rather see some clever macros for that. If a macro solution turns out not to be possible, I'll try adding $~ and friends to RLisp somehow.
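
For comparison, this is roughly what the two styles look like in plain Ruby. The method names are made up for the example; the point is only that the explicit MatchData style returns everything through an ordinary value, while the Perl style relies on special variables that Ruby sets as a side effect of matching.

```ruby
# Explicit style: the match result is an ordinary object you hold on to.
def explicit_match(str)
  m = str.match(/(\d+)$/)
  m && m[1]
end

# Perl style: =~ sets $~ (and $1, $2, ...) behind the scenes.
def implicit_match(str)
  str =~ /(\d+)$/
  $1  # shorthand for $~[1]; nil if the match failed
end
```

Both return the same captured string for the same input; RLisp currently offers only the first style.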

And while we're talking about parsers, RLisp has half-decent backtraces (it had them with the old parser too, I just never got to write about it), with file names, line numbers and function names mostly right:

$ ./rlisp.rb
rlisp> (defun f (x) (g x))
#<Proc:0xb7c276dc@STDIN:1>
rlisp> (map f (list 1 2 3))
No such global variable: g
./rlisp.rb:376:in `default_globals'
STDIN:1:in `call'
STDIN:1:in `default'
STDIN:1:in `[]'
STDIN:1:in `f'
stdlib.rl:112:in `map'
stdlib.rl:112:in `send'
stdlib.rl:112:in `map'
STDIN:2:in `call'
STDIN:2:in `run'
./rlisp.rb:787:in `run'
./rlisp.rb:798:in `repl'
./rlisp_grammar.rb:30:in `each_expr'
./rlisp.rb:796:in `repl'
./rlisp.rb:884:in `main'
./rlisp.rb:888

The rewrite also reduced the size of the RLisp package from 1MB to some 30kB - earlier it included the ANTLR jars just in case someone wanted to play with the grammar.
