The best kittens, technology, and video games blog in the world.

Wednesday, April 20, 2016

Patterns for testing command line scripts

Lab Mouse checkin out the camera by Rick Eh? from flickr (CC-NC-ND)
It's relatively easy to test code which lives in the same process as the test suite, but a lot of the time you'll be writing standalone scripts, and those are a bit more complicated to test. Let's talk about some patterns for testing them.

The examples use RSpec, but none of these patterns depend on the test framework.

Manual Testing Only

That's actually perfectly legitimate. If your script is a few lines of straightforward code, you can just check manually a few times that it works, and then completely forget about it. Automated tests probably won't be very useful in such a case.

I'd recommend not relying on just that for more complicated scripts.

STDOUT testing

A lot of scripts take arguments from the command line or STDIN, and output results to STDOUT, possibly to STDERR or via the exit code as well.

A bunch of expect(`script --input`).to eq("output\n") style tests are very easy to use and can go very far.

If you need to test somewhat more complicated interactions (setting environment variables, writing to STDIN, reading from both STDOUT and STDERR, checking the exit code etc.), IO.popen and the Open3 module offer reasonably convenient APIs.
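As a rough sketch of the Open3 route, something like the following could work. The run_script helper and its keyword names are made up for illustration, and an inline Ruby one-liner stands in for a real script so the example runs anywhere:

```ruby
require "open3"
require "rbconfig"

# Hypothetical helper (not from any library): runs a command with extra
# environment variables and STDIN data, and returns what a test may check.
def run_script(*command, env: {}, stdin: "")
  stdout, stderr, status = Open3.capture3(env, *command, stdin_data: stdin)
  { stdout: stdout, stderr: stderr, exit_code: status.exitstatus }
end

# Stand-in for a real script: it echoes an environment variable to STDERR,
# upcases STDIN to STDOUT, and exits with a non-zero code.
result = run_script(
  RbConfig.ruby, "-e", 'warn ENV["GREETING"]; print STDIN.read.upcase; exit 3',
  env: { "GREETING" => "hello" },
  stdin: "input\n"
)
```

In an RSpec test you would then assert on result[:stdout], result[:stderr] and result[:exit_code] separately, which is the part backticks alone can't give you.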

Of course only a certain category of scripts can reasonably be tested this way, but it's a fairly big category.

Testing as library code

A fairly common pattern is to move most of the code from the "script" file to a separate "library" file, which can be required by both the script and the tests. It's a bit awkward, as the script no longer lives in one file.

It's not always obvious where to divide the library from the script. If you put everything in the library, it's pretty much useless for anything except the program itself. If you keep things like parsing command line arguments separate, you get a possibly useful "library", but more of the "script" code is left untested.

if __FILE__ == $0

It used to be a very common pattern, which I don't see that often these days. What if we have a file which works as a library you can require, but acts as a script if it's run directly? Here's the typical code for such a script:

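class Script
  def initialize(*args)
    # store or parse command line arguments here
  end
  def run!
    # the actual work of the script goes here
  end
end

if __FILE__ == $0
  Script.new(*ARGV).run!
end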

Depending on your preference, you might do command line argument parsing either in the initializer or in the if __FILE__ == $0 block.

Code written in this style generally isn't intended to be used as a library, and this hook is there primarily for the sake of testing.
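Here's a minimal sketch of how a test might take advantage of that hook. The script body is invented for illustration and written to a temporary file so the example is self-contained; in a real project you'd load your actual script file instead. capture_stdout is a hypothetical helper, not part of any library:

```ruby
require "tempfile"
require "stringio"

# Invented script contents, standing in for a real bin/script file.
SCRIPT_SOURCE = <<~'RUBY'
  class Script
    def initialize(*args)
      @args = args
    end

    def run!
      puts @args.join(" ").upcase
    end
  end

  if __FILE__ == $0
    Script.new(*ARGV).run!
  end
RUBY

# Loading the file defines Script, but the guarded block does not run,
# because __FILE__ (the loaded file) differs from $0 (the test process).
Tempfile.create(["script", ".rb"]) do |file|
  file.write(SCRIPT_SOURCE)
  file.flush
  load file.path
end

# Hypothetical helper: temporarily swaps $stdout to collect printed output.
def capture_stdout
  original = $stdout
  $stdout = StringIO.new
  yield
  $stdout.string
ensure
  $stdout = original
end

captured = capture_stdout { Script.new("hello", "world").run! }
```

The test passes fake arguments straight to the initializer, so nothing needs to touch ARGV or spawn a new process.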

Temporary directory

Scripts frequently interact with files. That's more complicated to set up. Don't try anything silly like using the current directory or a single shared tmp directory, where leftovers from previous test runs might remain.

I'd recommend creating a new temporary directory and switching into it. Add code like this to your test helpers (it needs require "tmpdir" and require "pathname"):

def Pathname.in_temporary_directory(*args)
  Dir.mktmpdir(*args) do |dir|
    Dir.chdir(dir) do
      yield Pathname(dir)
    end
  end
end

You can then use Pathname.in_temporary_directory do |dir| ... end in your tests, and it will handle switching back to the previous directory and removing the temporary one automatically.

In every such block you can write whatever files you want, run the command, and check any generated files, without worrying about contaminating the filesystem anywhere.

There's just one minor complication here: you'll be changing your working directory, so you'll need to call your script using an absolute rather than a relative path. Simply do something like:

let(:script) { Pathname(__dir__) + "../bin/script"  }

This gives you the absolute path to your script, which you can then use in your tests.
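Put together, a test body using the temporary directory helper might look roughly like this. The file names and the inline sorting one-liner are assumptions standing in for a real script, which you would call by its absolute path instead:

```ruby
require "pathname"
require "tmpdir"
require "rbconfig"

# Same helper as described above, repeated so this sketch runs on its own.
def Pathname.in_temporary_directory(*args)
  Dir.mktmpdir(*args) do |dir|
    Dir.chdir(dir) do
      yield Pathname(dir)
    end
  end
end

temporary_path = nil
contents = Pathname.in_temporary_directory do |dir|
  temporary_path = dir
  # Set up input files inside the fresh directory.
  (dir + "input.txt").write("b\na\n")
  # Stand-in for the script under test: sorts input.txt into output.txt.
  system(RbConfig.ruby, "-e",
         'File.write("output.txt", File.readlines("input.txt").sort.join)')
  # Read the generated file before the directory is removed.
  (dir + "output.txt").read
end
```

Anything you want to assert on has to be read inside the block, since the directory is gone once the block returns.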

Mocking network

All that covers most possible scripts, but I recently figured out one really fun trick: how do you test scripts which read from the network?

Within our tests we have gems like webmock and vcr which can fake network communication, but what if we want to run a script? Well, just save this file as mock_network.rb:

require "pathname"
require "webmock"
require "vcr"

VCR.configure do |config|
  config.cassette_library_dir = Pathname(__dir__) + "vcr"
  config.hook_into :webmock
end

VCR.insert_cassette('network', :record => ENV["RECORD"] ? :new_episodes : :none)

END { VCR.eject_cassette }

And then run your script as system "ruby -r#{__dir__}/mock_network #{script} #{arguments}", possibly in conjunction with any of the other techniques presented here.

To record network traffic you can run your tests with RECORD=1 rspec, then once you're finished just run rspec normally and it will use recorded requests.

Mocking other programs

The previous pattern assumed the script was using some Ruby library like net/http or open-uri for network requests. But it's very common to use a program like curl or wget instead.

In such case:
  • write your mock curl, doing whatever you'd like it to do for such test
  • within the test, change ENV["PATH"] so that the directory containing your mock curl is its first element
  • run script under test
This works reasonably well, as almost all programs call each other via ENV["PATH"] search, not by absolute paths, and usually expect fairly simple interactions.
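The steps above can be sketched as follows. This assumes a Unix-like system with /bin/sh; run_with_mock_curl and the file names are invented for illustration, and a shell one-liner stands in for the script under test:

```ruby
require "tmpdir"
require "open3"

# Hypothetical helper: runs a command with a fake `curl` first in PATH.
# Returns the command's STDOUT and the arguments the fake curl received.
def run_with_mock_curl(*command, response:)
  Dir.mktmpdir do |mock_bin|
    fake_curl = File.join(mock_bin, "curl")
    # Fake curl: record its arguments, then print a canned response.
    File.write(fake_curl, <<~'SH')
      #!/bin/sh
      dir=$(dirname "$0")
      printf '%s\n' "$*" >> "$dir/curl.log"
      cat "$dir/response"
    SH
    File.chmod(0o755, fake_curl)
    File.write(File.join(mock_bin, "response"), response)

    # Put the mock directory in front of the existing PATH for the child only.
    env = { "PATH" => [mock_bin, ENV["PATH"]].join(File::PATH_SEPARATOR) }
    stdout, _status = Open3.capture2(env, *command)

    log = File.join(mock_bin, "curl.log")
    [stdout, File.exist?(log) ? File.read(log) : ""]
  end
end

# Stand-in for the script under test: fetches a URL and upcases the body.
stdout, curl_args = run_with_mock_curl(
  "sh", "-c", "curl -s https://example.com/data | tr a-z A-Z",
  response: "hello\n"
)
```

Passing the modified PATH only to the child process, rather than changing ENV["PATH"] in the test process itself, is a design choice made here to keep tests from affecting each other. Recording the arguments lets the test also check how curl was called.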

Like all heavy-handed mocking, this can fail miserably if the program decides to pass slightly different options to curl etc., and unlike webmock, this style of mocking doesn't block network access, so you can miss something.

All these patterns leak

None of these patterns is perfect. They make assumptions about how the script is going to interact, and they don't actually isolate the script from the network, the filesystem (outside the temporary directory you created), Unix utilities etc., so a buggy script can still rm -rf your home directory.

For testing very complicated interactions, you might need to use a virtual machine, or some OS-specific isolation mechanism like chroot. Fortunately, relatively few scripts really need such techniques.


Andrew Radev said...

Nice article, thanks.

A while back, I wrote a tool that would run the "rails server" command and play elevator music until it boots up (waiting-on-rails). Testing this was pretty new for me, but you might appreciate my attempt.

Basically, I created a bunch of stubs for all commands, like this one for rails and a "Command stub" object that wraps one of these stubs and lets you feed it fake output via unix domain sockets. It worked surprisingly well, I think. Here's an example of one of the specs.

taw said...

Andrew: Very interesting approach.