The best kittens, technology, and video games blog in the world.

Saturday, July 17, 2010

Another example of Ruby being awesome - %W

cuteness by sparkleice from flickr (CC-NC-ND)

And there I was thinking I knew everything about Ruby, at least as far as its syntax goes...

As you might have figured out from my previous posts, I'm totally obsessed with string escaping hygiene - I would never send "SELECT * FROM reasons_why_mysql_sucks WHERE reason_id = #{id}" to an SQL server even if I was absolutely totally certain that id is a valid integer and nothing can possibly go wrong here. Sure, I might be right 99% of the time, but it only takes a single such mistake to screw up the system. And not only with SQL - it's the same with generated HTML, generated shell commands and so on.

And speaking of shell commands - the system function accepts either a string, which it then evaluates according to shell rules (big red flag), or a list of arguments, which it uses to fork+exec right away. Of course we want the latter - except it's really goddamn ugly. Faced with a choice between this insecure but reasonable-looking way of starting MongoDB shard servers:

system "mongod --shardsvr --port '#{port}' --fork --dbpath '#{data_dir}' \
--logappend --logpath '#{logpath}' --directoryperdb"

And this secure but godawful one (to_s is necessary as port is an integer, and system won't take that):

system *["mongod", "--shardsvr", "--port", port, "--fork",
"--dbpath", data_dir, "--logappend",
"--logpath", logpath, "--directoryperdb"].map(&:to_s)

Even I have my doubts.
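To make the difference concrete, here is a minimal sketch. It uses echo instead of mongod, and Open3.capture2 instead of system so the output can be inspected; both take a command either as one string or as a list, the same way system does. The hostile value is made up for illustration. Note that it walks straight through the single quotes of the string version.

```ruby
require "open3"

# Hypothetical hostile value standing in for data_dir.
hostile = "x'; echo INJECTED; echo 'y"

# String form: the shell parses the whole line, so the embedded quote
# closes ours and ";" starts a second command.
shell_out, _status = Open3.capture2("echo '#{hostile}'")

# List form: each element reaches the program as exactly one argument.
list_out, _status = Open3.capture2("echo", hostile)

puts shell_out  # three lines: x, INJECTED, y
puts list_out   # one line: the hostile string, printed verbatim
```

The list form never involves a shell, so there is nothing to escape in the first place.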

And then I found something really cool in Ruby syntax that totally solves the problem. Now, I was well aware of the %w[foo bar] syntax Ruby copied from Perl's qw[foo bar]; while occasionally useful, it is really little more than constructing a string and then calling #split on it.

And I thought I was also aware of %W - which obviously would work just like %w except evaluating code inside. Except that's not what it does! %W[foo #{bar}] is not "foo #{bar}".split - it's ["foo", "#{bar}"]! And it uses a real parser of course, so you can use as many spaces inside that code block as you want.

system *%W[mongod --shardsvr --port #{port} --fork --dbpath #{data_dir}
--logappend --logpath #{logpath} --directoryperdb]

There's nothing in Perl able to do that. Not only is it totally secure, it looks even better than the original insecure version, as you don't need to insert all those 's around arguments (which only half-protected them anyway, but were better than nothing), and you can break it into multiple lines without \s.
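A quick way to see the three behaviours side by side. The port number and the path with a space in it are made-up values for illustration:

```ruby
port = 27018                    # an Integer, as in the post
data_dir = "/var/lib/my data"   # hypothetical path containing a space

lower = %w[--port #{port}]                             # no interpolation at all
upper = %W[--port #{port} --dbpath #{data_dir}]        # split first, interpolate after
split = "--port #{port} --dbpath #{data_dir}".split    # interpolate first, split after

p lower  # => ["--port", "\#{port}"]
p upper  # => ["--port", "27018", "--dbpath", "/var/lib/my data"]
p split  # => ["--port", "27018", "--dbpath", "/var/lib/my", "data"]
```

Note that %W also turns the integer port into a string for free, so the map(&:to_s) from the ugly version is not needed.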

%W always does the right thing - %W[scp #{local_path} #{user}@#{host}:#{remote_path}] will keep the whole remote address together - and if the code block returns an empty string or nil, you'll get an empty string there in the resulting array. I sort of wish there was some way of adding extra arguments with *args-like syntax like in other contexts, but %W[...] + args does exactly that, so it's not a big deal.
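The same points as a small sketch; every value below is a placeholder, not anything from a real setup:

```ruby
# Placeholder values for illustration only.
local_path, user, host, remote_path = "a.txt", "deploy", "example.com", "/srv/a.txt"

# Interpolations glued together without whitespace stay one element.
scp = %W[scp #{local_path} #{user}@#{host}:#{remote_path}]
p scp    # => ["scp", "a.txt", "deploy@example.com:/srv/a.txt"]

# nil and "" both leave an empty-string element behind.
blank = %W[cmd #{nil} #{""} end]
p blank  # => ["cmd", "", "", "end"]

# Extra arguments are appended with plain Array#+.
args = ["-v", "file with spaces.txt"]
full = %W[ls -l] + args
p full   # => ["ls", "-l", "-v", "file with spaces.txt"]
```

If an empty argument is not what you want, blank.reject(&:empty?) drops those elements before the array goes to system.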

By the way, it seems to me that all % constructors undeservedly get a really bad reputation in the Ruby community as some sort of ugly Perl leftover. This is so wrong - what's ugly is excessive escaping with \ which they help avoid. Which regexp for Ruby executables looks less bad, the one with way too many \/s - /\A(\/usr|\/usr\/local|\/opt|)\/bin\/j?ruby[\d.]*\z/, or one which avoids them all thanks to %r - %r[\A(/usr|/usr/local|/opt|)/bin/j?ruby[\d.]*\z]?

By the way - yes, I used []s inside even though they were the outer delimiter. That's another great beauty of % constructions - if you delimit with some sort of braces like [], (), <>, or {} - it will only close once every matched pair inside is closed - so unlike traditional single- and double-quoted strings % can be nested infinitely deep without a single escape character! (Perl could do that one as well)
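A small check that the two regexps really behave the same, plus a nested-bracket string. The sample paths are made up:

```ruby
slashes  = /\A(\/usr|\/usr\/local|\/opt|)\/bin\/j?ruby[\d.]*\z/
brackets = %r[\A(/usr|/usr/local|/opt|)/bin/j?ruby[\d.]*\z]

# Made-up sample paths; both regexps should agree on every one of them.
samples = ["/usr/local/bin/ruby1.9", "/bin/jruby", "/opt/bin/ruby", "/home/bin/ruby"]
results = samples.map { |path| [slashes.match?(path), brackets.match?(path)] }
p results  # => [[true, true], [true, true], [true, true], [false, false]]

# Balanced pairs of the delimiter nest without any escaping.
nested = %q[outer [inner [deepest]] done]
p nested   # => "outer [inner [deepest]] done"
```

Only balanced pairs come for free; a lone [ or ] inside %q[...] still needs a backslash.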

And speaking of things that Ruby copied from Perl, and then made them much more awesome, here's a one-liner to truncate a bunch of files after 10 lines, with optional backups. Which language gets even close to matching that? ($. in both Perl and Ruby will keep increasing from file to file, so you cannot use that)

ruby -i.bak -ple 'ARGF.skip if ARGF.file.lineno > 10' files*.txt


Anonymous said...

%Q is pretty goddamned sweet as well.

Alan said...

interpolated strings: go big with a side of fries and a chocolate shake. mmm.

Anonymous said...

Just for the record, Perl5 has the quotemeta function, or \Q operator, to help protect against code injection on a "system" call:

system "ls \Q$path";

But a %W Perl5 equivalent would be nice, such as qqw//, which is missing from the Perl5 language but is available in CPAN, and in Perl6.

taw said...

Anonymous: Is Perl getting #{} sometime too? Ruby %q{} %Q{} #{} are a lot like Lisp quote, quasiquote, and unquote.

Every time I need to write some Perl code this is the first thing I miss.

And once you have #{} you don't really need sigils, and can have objects aware of what they are, so just one ==/<=> that correctly works even for nested collections - and then you can have sane assert_equal like every language except Perl...

Perl5 was amazing in its time, but once you decide that backwards compatibility doesn't matter - Ruby is a much better successor to Perl5 than Perl6.

Anonymous said...

Is this my daughter's rabbit? If it is, what is his picture doing here???

Anonymous said...

Yes, he is :). A little finnish bunny :))).

sparkleice said...

And I took the picture, which has Creative Commons rights, so it's cool that you used it :). Thanks!

taw said...

Anonymous: If you mouseover you'll see credits and licensing information - cuteness by sparkleice from flickr (CC-NC-ND).

Cute bunny, definitely :-D

Anonymous said...

Thanks to both of you :)))