The best kittens, technology, and video games blog in the world.

Friday, February 05, 2010

What is all this Perl doing in my Ruby?

Spacecat. by kmndr from flickr (CC-BY)

First, some quick background. C is a very simple programming language and doesn't have exceptions - problems are indicated with return codes, which you're supposed to check but always forget to, resulting in all sorts of problems. C++ tried to retrofit exceptions on top of that, and it was a spectacular failure due to bad interactions between exceptions and manual memory management, but let's skip that.

Shell doesn't have exceptions either, but almost all problems result in some error message being printed to stderr, so at least you know that something went wrong.

Perl is trying to be higher level but is modeled after C and shell, so while it sort of supports exceptions for some high level packages, almost all of its basic OS-interacting functions like open will fail quietly and you need to manually check their return codes - at least it's easier than in C, and ... or die "Cheeseburger acquisition failed: $!"; usually suffices.

Ruby mostly copies Perl when it comes to OS interaction, but fixes this particular problem - OS interaction always raises an exception when something goes wrong. Or does it?


There is one really infuriating exception, where not only is Perl's error handling worse than both C and shell, but Ruby copies this design failure straight from it, and it's not even fixed in Ruby 1.9 yet!

The C function system is fairly straightforward - it executes whatever string you pass to it in a shell. So if there's an error and let's say the command fail doesn't exist - int main(){ system("fail"); return 0; } results in "sh: fail: command not found" printed on stderr, or somesuch depending on your variant of Unix. Just like shell would do it, and what would be sane.

Both Perl and Ruby copy this function - except they do it wrong! The system function is not terribly efficient - it first spawns a shell process, which only then executes the relevant command. So some smarty pants decided to optimize it a bit - if the string passed to Perl/Ruby system looks straightforward enough, Perl/Ruby will execute it directly (split on spaces, fork, pass to exec) without spawning the shell process.
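To make "looks straightforward enough" a bit more concrete, here is a rough sketch of that kind of check, written in Ruby purely for illustration. The real check lives in the interpreters' C source and surely differs in details. The names SHELL_META and looks_trivial? are made up for this example, and the character list is the one quoted in the fix section of this post (reading the spaces in that list as separators, not as part of it).

```ruby
# Hypothetical illustration only, not the interpreter's actual code.
# Characters that make a command string "non-trivial", so that it is
# handed to the shell instead of being exec'ed directly.
SHELL_META = /[*?{}\[\]<>()~&|\\$;'`"\n]/

# Returns true if the command contains none of the characters above,
# i.e. it would take the "optimized" path without a shell.
def looks_trivial?(command)
  command !~ SHELL_META
end

puts looks_trivial?("fail")             # plain command name
puts looks_trivial?("fail >/dev/null")  # contains a redirect
```

Under these assumptions the first call takes the direct-exec path and the second one goes through the shell.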

And in this micro-shell implementation inside system they both just forgot to check for error conditions altogether. system "fail >/dev/null" (in either of these languages) looks "non-trivial", so it spawns a shell process, and results in sh: fail: command not found. But system "fail" - as it's so simple - goes straight to the optimized micro-shell, and fails silently. No exception, no stderr warning, no error code, nothing.

Well yes, you could check the process return code - but a non-zero return code doesn't only mean an error, it's a generic way for Unix processes to communicate - for example diff will return non-zero if files differ, which is in no way an error condition.
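For what it's worth, current Ruby versions document Kernel#system as returning nil when the command could not be executed at all, and false when it ran but exited with a non-zero status, so the two cases can at least be told apart by hand. A minimal sketch, assuming that documented behavior and a Unix system with true and false on the PATH; run_status is a hypothetical helper, not anything Ruby ships:

```ruby
# Hypothetical helper: classify the outcome of Kernel#system.
# Assumes the documented return values: true, false, or nil.
def run_status(*cmd)
  result = system(*cmd)
  if result.nil?
    :not_executed   # the command could not be started at all
  elsif result
    :ok             # exit status 0
  else
    :nonzero_exit   # ran, but exited with a non-zero status (think diff)
  end
end

p run_status("true")
p run_status("false")
p run_status("surely-no-such-command-here")
```

This still relies on the caller remembering to check, which is exactly the C-style habit the rest of Ruby's OS interaction avoids.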

The fix

The optimization should be either fixed or turned off. As a trivial workaround - because the triviality check verifies that the string contains none of *?{}[]<>()~&| \ $;'`" or newline, prepending "" in front of the first non-empty character passed to system() seems to work well enough. "" evaluates to an empty string in shell.

$ ruby -e 'system "\"\"fail"'
sh: fail: command not found
$ ruby1.9 -e 'system "\"\"fail"'
sh: fail: command not found
$ perl -e 'system "\"\"fail"'
sh: fail: command not found
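The same workaround can be wrapped up so you don't have to remember the quoting. This is only a sketch under the assumptions above (the "" trick keeps forcing the shell path); force_shell and shell_system are hypothetical names, not part of Ruby:

```ruby
# Hypothetical helper: put "" in front of the first non-whitespace
# character, so the string no longer looks "trivial" to system.
# The shell itself reads "" as an empty string, so the command is unchanged.
def force_shell(command)
  command.sub(/\S/) { |first| "\"\"#{first}" }
end

# Hypothetical wrapper: run the command through the shell, so a missing
# command is reported on stderr by sh.
def shell_system(command)
  system(force_shell(command))
end

shell_system("fail")   # sh should now complain that fail was not found
```

A string that is empty or only whitespace is passed through untouched, since there is no first character to prefix.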

But seriously, please fix this, okay? Even Python gets it right already.


Keith Sader said...

Can't you just monkey-patch the offending Ruby libraries locally, then hope you never forget to re-patch them after a new version of Ruby comes out? :-)

taw said...

I submitted it here, maybe they'll fix it:

It's not a terribly complex problem once you get past the #ifdefs to handle weird operating systems - I think it would be enough if rb_proc_exec in process.c checked the return value from execl and printed an error message if it gets ENOENT, but obviously it would be nicer if someone did the patch for me ;-p

Daniel Berger said...

Too late. Way too late. There's always the 'shell' library. It's closer to what you want, and it's part of the Ruby stdlib, at least for 1.8.x. I haven't checked 1.9.x.

taw said...

Daniel Berger: Why is it really too late? I only want the heuristic to correctly print error message if things go wrong, to better match what shell does.

It shouldn't break anything - it's more an ancient bug than a change request.

Jesse said...

And as of 5.10.1, even Perl has a way to fix this sort of problem in core.

It's not a perfect solution, but it's a heck of a lot better than not having it.

Daniel Berger said...

Too late because of backwards compatibility. Use open3 and check for stderr.

But, at the end of the day, if you're using system calls in your code, you have almost certainly screwed up. Always use an API when possible. Always.
