Saturday, July 28, 2012

Meaningful highlighting of failed tests for test/unit in Ruby

Keep Your Eyes on the String by Picture Zealot from flickr (CC-NC-ND)

People are often confused by what tests actually do - their role is not merely indicating that something is wrong.

If somehow the Flying Spaghetti Monster bestowed upon you the divine gift of a Perfect Test Server, which would literally read your mind and instantaneously test any of your programs, indicating with a green or red light whether the tested program Does What You Mean - such a gift would be pretty much worthless, except for some extremely narrow domains like crypto protocols (which actually tend to fail because your threat model was insufficiently imaginative...).

First, it would universally show the red fail light on any program, because with sufficiently strong testing some imperfection will eventually be found, and second - without some kind of indication of the nature of the problem there's pretty much nothing useful you can do other than stare at the code and hope for a sudden burst of enlightenment.

This by the way is another reason why static typing and proving properties of code are a total waste of time - try making a typo in any nontrivial C++ STL program, and see for yourself how much compiler's perfectly correct information that "something's wrong with your program" is going to help you.

The primary function of a test suite is helping you locate the nature of a problem, and which code is likely to be responsible for the problem. That's why tests produce meaningful messages on what differed between expectation and actual result, why we have different kinds of tests (with very inconsistently applied names like "unit", "functional", "integration", "regression" and so on), why continuous integration servers test every commit to correlate code change with test failure and so on.

Bright Eyes by o palsson from flickr (CC-BY)

Piles of microassertions anti-pattern

One fairly common testing anti-pattern which annoys me a lot is tests which have a ton of microassertions that tend to all pass together or all fail together, the moral equivalent of this:

  assert_equal "Hello", hw.message
  assert_equal ", ", hw.separator
  assert_equal "world", hw.target
  assert_equal "!", hw.punctuation

Now imagine that hello world package accidentally changed to German somehow, or UTF-16BE, or some other crazy thing - every single assertion would fail simultaneously. Unfortunately you will never get any information about what actually happened with any assertion other than the first, so out come the debug prints.

This can be improved somewhat to not terribly pretty but much more useful:

  assert_equal ["Hello", ", ", "world", "!"], [hw.message, hw.separator, hw.target, hw.punctuation]

If they fail together, you'll be given full information on what precisely happened.

This kind of structure-vs-structure comparison is much more useful - especially for regression testing where you test against saved known complex output, and integration testing where you test outputs of individual component against outputs of entire subsystem. Probably less so in low level unit tests where you'd actually have to type expected value manually.

Unfortunately we immediately run into a second problem - when such a comparison fails, we get a massive message in which it might be hard to localize which parts are the same and which differ.

Highlighting

This is where my small library comes into play. It overrides assert_equal, and if expected and actual value aren't equal, it calls #inspect on them, tokenizes them with a simple regular expression, uses diff/lcs library to compute diffs, and then outputs the same message as plain old test/unit's assert_equal except with added and deleted parts highlighted using ANSI color codes which should work on just about any kind of terminal.
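The pipeline described above can be sketched roughly like this. It's only an illustration, not the library's actual code: the token regexp, the method names (tokenize, lcs_diff, highlight_diff) and the red/green color choice are all my assumptions, and a tiny hand-rolled LCS stands in for the diff/lcs library so the sketch runs without any gems.

```ruby
# Wrap a token in an ANSI color code (31 = red, 32 = green).
def colorize(token, code)
  "\e[#{code}m#{token}\e[0m"
end

# Split #inspect output into words, whitespace runs, and single punctuation characters.
def tokenize(value)
  value.inspect.scan(/\w+|\s+|[^\w\s]/)
end

# Returns [[:same | :deleted | :added, token], ...] based on a
# longest-common-subsequence table (stand-in for diff/lcs).
def lcs_diff(a, b)
  lengths = Array.new(a.size + 1) { Array.new(b.size + 1, 0) }
  (a.size - 1).downto(0) do |i|
    (b.size - 1).downto(0) do |j|
      lengths[i][j] =
        if a[i] == b[j]
          lengths[i + 1][j + 1] + 1
        else
          [lengths[i + 1][j], lengths[i][j + 1]].max
        end
    end
  end

  ops = []
  i = j = 0
  while i < a.size && j < b.size
    if a[i] == b[j]
      ops << [:same, a[i]]
      i += 1
      j += 1
    elsif lengths[i + 1][j] >= lengths[i][j + 1]
      ops << [:deleted, a[i]]
      i += 1
    else
      ops << [:added, b[j]]
      j += 1
    end
  end
  a[i..-1].each { |token| ops << [:deleted, token] }
  b[j..-1].each { |token| ops << [:added, token] }
  ops
end

# Builds a message in the style of assert_equal's, with differing tokens colored.
def highlight_diff(expected, actual)
  ops = lcs_diff(tokenize(expected), tokenize(actual))
  expected_text = ops.map { |kind, token|
    case kind
    when :same    then token
    when :deleted then colorize(token, 31)
    else ""
    end
  }.join
  actual_text = ops.map { |kind, token|
    case kind
    when :same  then token
    when :added then colorize(token, 32)
    else ""
    end
  }.join
  "<#{expected_text}> expected but was\n<#{actual_text}>."
end
```

Something like puts highlight_diff(["Hello", "world"], ["Hallo", "world"]) would then print both arrays with only Hello and Hallo colored.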

It also includes a small hack to make TextMate's test runner window display this highlighting.

Now this library has just been extracted from an old-style Rails plugin, so it's a bit messy, but it doesn't depend on Rails anymore.

It shouldn't be too hard to adapt to other testing libraries if you need to do so.

jrpg packages for OSX

pixie, macbook guardian by atomicshark from flickr (CC-NC-SA)
jrpg packages for OSX are now available for download. Big thanks to Carlos Fontes for help with setting this up.

py2app's documentation is fairly unhelpful, but it turns out that in the end it's not all that different from making Windows packages. Python tries to provide common functionality with distutils / setuptools, but it doesn't quite work without a bit of extra help.

Oh, and it also turns out that setting up an OSX VM to test the installer was pretty damn simple. If you didn't do this because you were told that it's too hard, you were told wrong.

Due to some silliness save files are created within jrpg.app bundle, so if you try to move it to /Applications and share between multiple users on the same computer - I'm not actually sure if it will work. I guess I could fix it to use ~/Documents/jrpg or something if anybody cares...

Thursday, July 26, 2012

Youtube video recommendations for nerds and nerdettes

My dearest nerds and nerdettes, I want to share with you some awesome Youtube videos I found. For everyone mentioned if you like what you see you can go to their channels for more.

meekakitty

meekakitty makes a ton of great music videos for nerds.

Navi's Song. Zelda's Navi, do not confuse with Avatar's Na'vi.


Wizard Love - a pretty good wizard rock.

Don't unplug me.

LadyGameLyric

Tik Tok, probably best Majora's Mask song ever. Somehow it gets far less covers than Ocarina of Time.

And speaking of Ocarina of Time songs:

Here's a rare example of a good song about Great Recession (Skyrim-related):

lindseystomp

Lindsey doesn't do any singing, just highly choreographed violin performances which look really awesome.


Here's her Skyrim performance:


And here's Zelda:



malufenix

Age of Aggression (since I'm pro-Empire):


lindybeige

No music videos this time, just some awesome speculation about historical hardware. His channel is a gold mine for anyone who liked The Ancient Art of Stabbing People.

A few examples - battle axes:

Spears:


Ads

This Java movie ad is totally fake, but hilarious:

This Nando's ad is real, and just as hilarious:

Science

And here's some real science - economics of addiction:

It mostly explains why microfoundations are the cancer destroying economics.

Anyways, here are my recommendations, if you have any good videos to share, comment section is below.

Wednesday, July 25, 2012

Firebase at matchnhack's hackathon

Smoky by Jakes_World from flickr (CC-NC)

So on Saturday I took part in matchnhack's hackathon - they also have this startup speed dating session, which I'm not sure works all that well, but a hackathon is always a fun time.

Anyway, the big thing I discovered on the hackathon was Firebase - a pure-Javascript in-cloud schema-less NoSQL database for agile AJAX-based Web 2.0 development - if you can say this stream of buzzwords with a straight face that is.

It is actually really good for throwing something together real fast. It cuts just about all the bullshit - there's Javascript database in Cloud, a library which establishes connection with this database and streams update events to your event handlers in your app for both initial setup and any changes, and you use jQuery (or whatever you want) to do all the user interface. There is no server-side involved. Synchronization between clients on different browsers happens automagically.

They have a "build a sample app in 5 minutes" interactive tutorial, which is just as impressive as Rails screencasts were six years ago, and if you do any kind of web development, for work or for fun, you should definitely spend those 5 minutes on it.

Now contrary to what they say this is currently entirely unsuitable for any production use, they provide zero security whatsoever (open Firebug console and you can delete the database), and their claims to scalability should be treated very skeptically, but for throwing something together fast it's truly amazing.

We made a chat app for complaining about London 2012 Olympics, which currently lives here and for some mysterious reason people still use it even after the hackathon ended. It's not a complicated app, but just look how trivial the code is compared with any server-based solution even with a decent framework like Rails.

This app I made with @woodrat84, venkatesh, and GiftApp team even won hackathon's top prize, probably more for hilarity of the chats than for any technical achievement.

Other than Firebase, the second thing I'd like you to see is Style Nibble - it's a startup targeted for women which tries to do fashion recommendation by some combination of machine learning and in-house experts. I'm a huge fan of recommendation engines of all kinds and forms, and I'm really delighted someone is taking them in this direction.

They even have clear path towards monetization from day one by actually selling stuff, and it's high margin market, and one which so far Amazon and the rest of market leaders have been unsuccessful at. I fully expect something like this (buying clothes based on recommendation engines) to get really huge in a few years. Maybe it will be one of the startups, maybe it will be Amazon after all after who knows which attempt (possibly by buying one of these startups), but there's no way in hell something like this can stay unexploited for long.

Saturday, July 21, 2012

Collection of small Unix utilities written in Ruby

Warning!!!...Tiger in training...:O)) by law_keven from flickr (CC-SA)

Just about every Unix hacker writes hundreds of small Unix scripts for various tasks for personal use and they very rarely see the light of day - they're too small to turn into proper Open Source projects, and in any case it would take a lot of effort.

But these days publishing code on a code sharing site like github is just so easy, we should change this. I'm doing my part here - out of hundreds of scripts scattered all over my ~/all repository I took a sample which doesn't depend too much on my personal setup, doesn't break any website's ToS too hard, and generally might be of some use to other people.

Feel free to use these scripts any way you wish. Some of them can be simply used, others can show you how to solve common scripting problems.

All of them have been written by me, except ~/bin/rename by Larry Wall which I'm bundling with the rest for convenience since a lot of Unix boxes don't have it, and it's just ridiculously useful.

These are all meant to work on OSX. Most but not all will work on other Unix distributions. If you have patches to make them work elsewhere, send them via pull request or another convenient method.

All scripts can be accessed in my github repository.

I'll probably be adding more scripts later, but the existing batch is pretty sizable already.

A few tricks

A lot of tricks and good techniques are included in these scripts.

SIGPIPE

One such technique - which I don't remember seeing anywhere else - is starting the script with trap("PIPE", "EXIT").

What does it do? When you set up a pipe of processes like foo | bar | head -n 10, sometimes one of the consumer processes will quit first - in this case head after reading 10 lines. Once that happens, any producer that keeps writing to the now-closed pipe gets sent a SIGPIPE signal, which by default kills it, so the rest of the pipeline shuts down quietly.

Ruby decided to override this behaviour, and instead you get an exception. This makes sense for most programs, but for utilities meant to be used as pipe producers (randswap and tac here), it's better to restore default Unix behaviour. Which you can do very easily with this one line.
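A minimal sketch of such a pipe producer, assuming a tac-like script; the print_reversed helper is a made-up name for illustration, not code from the repository.

```ruby
# Restore default Unix behaviour: exit quietly on SIGPIPE
# instead of getting an Errno::EPIPE exception.
trap("PIPE", "EXIT")

# tac-like producer: writes the lines of input to output in reverse order.
def print_reversed(input, output)
  input.readlines.reverse_each { |line| output.print(line) }
end

# In a real script this would be called as: print_reversed(STDIN, STDOUT)
```

With the trap line in place, piping such a script into head -n 10 should end silently once head exits, rather than with a Ruby backtrace.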

Avoid shell escaping


An extremely bad scripting practice is generating a command string like "rm -rf #{directory}" and executing that via the system function.

There's rarely any reason to do so. system takes multiple arguments, so you can use the safer system "rm", "-rf", directory instead.

It's more complicated to do it "the right way" if you want to setup redirects, get command's output, or pipe multiple processes together, but 90% of the time "the right way" is also the simplest way so why not do it right?

system *%W[]

Ruby %W is an almost unknown feature - I wrote about it some time ago if you want more.

Very often it is the most convenient way to execute some command. system *%W[rm -rf #{directory}] looks like it does string interpolation, but it actually evaluates to an absolutely safe system("rm", "-rf", "#{directory}") call and you're totally safe regardless of special characters in directory.
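To see why this is safe, here's a small sketch that only builds the argument list without running anything; rm_rf_argv is a made-up helper name.

```ruby
# Builds the argument list for rm without going through a shell.
# Interpolated values stay single arguments, whatever characters they contain.
def rm_rf_argv(directory)
  %W[rm -rf #{directory}]
end

argv = rm_rf_argv("my dir; echo pwned")
# argv is ["rm", "-rf", "my dir; echo pwned"] - three arguments, no shell parsing.
# Running it would be: system(*argv)
```

The space and the semicolon never reach a shell, so there is nothing to escape.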

Use FileUtils

Usually you don't even have to call system - most common commands for filesystem interactions are available more conveniently via FileUtils module. Like FileUtils.mkdir_p "/some/path", FileUtils.rm_rf "/another/path" etc.
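A quick sketch of these calls working together, run inside a temporary directory so it cleans up after itself; fileutils_demo is just an illustrative name.

```ruby
require "fileutils"
require "tmpdir"

# Builds a nested directory, then removes the whole tree.
# Returns [created, removed] so the effect is easy to check.
def fileutils_demo
  Dir.mktmpdir do |root|
    nested = File.join(root, "some", "deep", "path")
    FileUtils.mkdir_p(nested)                        # like mkdir -p
    FileUtils.touch(File.join(nested, "file.txt"))   # like touch
    created = File.file?(File.join(nested, "file.txt"))
    FileUtils.rm_rf(File.join(root, "some"))         # like rm -rf
    removed = !File.exist?(File.join(root, "some"))
    [created, removed]
  end
end
```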

If you have to escape shell metacharacters

Sometimes system *%W[] interface is not enough. In such case, just copy and paste String#shell_escape from my scripts (not security-audited or anything):


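class String
  # Quotes the string so a shell treats it as one literal word.
  def shell_escape
    # An empty string must still show up as an argument
    return "''" if empty?
    # Only safe characters - no quoting needed
    return dup unless self =~ /[^0-9A-Za-z+,.\/:=@_-]/
    # Wrap runs of non-quote characters in '...', backslash-escape each '
    gsub(/(')|[^']+/) { $1 ? "\\'" : "'#{$&}'"}
  end
end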

Then use it like this (without extra ''s): `tar -tzf #{fn.shell_escape}`.

EDIT: Ruby since 1.8.7 added String#shellescape method in shellwords library in stdlib. So use that unless you need to support older systems.
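A short sketch of the stdlib version, using the same tar example; tar_list_command is a made-up helper name.

```ruby
require "shellwords"

# Builds a shell command string with the file name safely escaped.
def tar_list_command(fn)
  "tar -tzf #{fn.shellescape}"
end

command = tar_list_command("it's a backup.tar.gz")
# Shellwords.split(command) gives back the original file name as one word.
```

You'd then run it with backticks or system(command), same as with shell_escape above.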
Playing white tiger cub by Tambako the Jaguar from flickr (CC-ND)

Individual commands

annotate_sgf


It uses Gnu Go debug mode to annotate your go game in SGF.
It will find a lot of tactical mistakes for most games by kyu players.

Usage:

annotate_sgf game.sgf


Output saved to annotated-game.sgf in the same directory as game.sgf.

For more details, read this blog post.

convert_to_png


Converts various image formats to PNG.
Mostly useful for mass conversion, for example when you have a directory
with 100 svg files dir/file-001.svg to dir/file-100.svg:

    convert_to_png dir/*.svg


will convert them all.

dedup_files

Deletes duplicate files in huge directories by hash, with some optimization to avoid unnecessary hashing.

Usage:

    dedup_files   ...

   
For example:

    dedup_files my_little_pony_wallpapers/


which will work pretty well even if you have 100GB of My Little Pony wallpapers.
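One plausible way to avoid unnecessary hashing, sketched below, is to group files by size first and only hash files which share a size with something else. This is my assumption about the kind of optimization involved, not necessarily what the script does; duplicate_groups and the choice of SHA256 are made up for illustration.

```ruby
require "digest/sha2"
require "find"

# Returns groups of paths with identical content under dir.
# Files with a unique size can't have duplicates, so they are never hashed.
def duplicate_groups(dir)
  files = []
  Find.find(dir) do |path|
    files << path if File.file?(path) && !File.symlink?(path)
  end

  same_size_groups = files.group_by { |path| File.size(path) }
                          .values
                          .select { |group| group.size > 1 }

  same_size_groups.flat_map do |group|
    group.group_by { |path| Digest::SHA256.file(path).hexdigest }
         .values
         .select { |dups| dups.size > 1 }
         .map(&:sort)
  end
end
```

Deleting all but the first path in each group is then a one-liner, which is left out here on purpose.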

diffschemas


Gives diff of mysql schemas.

To dump mysql schema use:

    mysqldump -uuser -ppassword -h hostname --where 0=1 database >schema.sql


Then run:

    diffschemas schema_1.sql schema_2.sql

which will strip garbage like autoincrement counters and give you clean diff.
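The cleanup step could look something like this sketch. The exact rules are my guess (dropping -- comment lines, which carry things like dump dates, and table-level AUTO_INCREMENT=N counters); normalize_schema is a hypothetical name, not the script's actual code.

```ruby
# Removes parts of a mysqldump schema that change between dumps
# without the schema itself changing.
def normalize_schema(sql)
  sql.each_line
     .reject { |line| line.start_with?("--") }             # comment lines
     .map { |line| line.sub(/ AUTO_INCREMENT=\d+/, "") }   # table counters
     .join
end
```

Two dumps normalized this way can be fed to a regular diff.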

e


This utility has extremely short name since it's meant to be used as your primary
way to call text editor.

If you give it a path containing /, or file with such name exists in current directory,
it will call your editor on that file.

Otherwise - it will search your $PATH for this file, and execute your editor on it,
avoiding opening binaries, and other false positives.

This is extremely helpful if you have a ton of scripts you edit a lot.

These two commands achieve similar effect:

    mate `which foo`

    e foo

except e is shorter, doesn't force you to think about paths,
will expand all symlinks in name (avoiding issues like accidentally editing the
same file under different names in two editor windows), and won't accidentally open binaries.

Currently configured to call TextMate of course.
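The lookup logic described above might be sketched like this. resolve_editor_target and text_file? are hypothetical names, and treating any file with a NUL byte in its first 512 bytes as binary is my own assumption about how to avoid opening binaries.

```ruby
# Crude binary check: text files normally contain no NUL bytes.
def text_file?(path)
  !File.binread(path, 512).to_s.include?("\0")
end

# Returns the path the editor should open, or nil if nothing suitable was found.
def resolve_editor_target(name, path_dirs = ENV.fetch("PATH", "").split(File::PATH_SEPARATOR))
  # Explicit path, or a file in the current directory: use it directly
  if name.include?("/") || File.exist?(name)
    return File.exist?(name) ? File.realpath(name) : name
  end

  # Otherwise search the given directories, skipping binaries
  path_dirs.each do |dir|
    candidate = File.join(dir, name)
    return File.realpath(candidate) if File.file?(candidate) && text_file?(candidate)
  end
  nil
end
```

File.realpath is what expands symlinks, so the same script always opens under one name.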

gzip_stream


Pipe through it to gzip log without having infinitely long buffers.

Usage example:

    my_server | gzip_stream >log.gz

If you use regular gzip the last few hundred lines will be in memory indefinitely,
so you won't be able to see what's going on in log.gz without killing the server,
even if it happened yesterday. gzip_stream flushes every 5s (easily configurable),
sacrificing tiny amount of compression quality for huge amount of convenience.

Read more about it here.
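A simplified sketch of the idea using Zlib from stdlib. The method signature and the flush_interval parameter are my assumptions; note that this version only checks the clock when a new line arrives, so a proper implementation would also need a timer for the case where input goes quiet.

```ruby
require "zlib"

# Copies input to output as gzip, forcing buffered data out at least
# every flush_interval seconds (checked whenever a line arrives).
def gzip_stream(input, output, flush_interval: 5)
  gz = Zlib::GzipWriter.new(output)
  last_flush = Time.now
  input.each_line do |line|
    gz.write(line)
    if Time.now - last_flush >= flush_interval
      gz.flush
      last_flush = Time.now
    end
  end
ensure
  # finish writes the gzip trailer without closing the underlying IO
  gz.finish if gz
end
```

In a script it would be called as gzip_stream(STDIN, STDOUT).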

namenorm

Safely normalizes file names replacing upper case characters and spaces with
lower case characters and underlines.

Usage:
    namenorm ~/Downloads/*
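
A rough sketch of what "safely" can mean here, assuming it means never overwriting an existing file; normalized_name and this version of namenorm are illustrative, not the actual script.

```ruby
# Lower-cases the name and replaces spaces with underscores.
def normalized_name(name)
  name.downcase.tr(" ", "_")
end

# Renames path to its normalized form. Returns the new path,
# or nil if a different file already has that name.
def namenorm(path)
  dir, base = File.split(path)
  target = File.join(dir, normalized_name(base))
  return path if target == path
  # File.identical? covers case-insensitive filesystems, where the
  # "existing" target can be the very file we are renaming
  return nil if File.exist?(target) && !File.identical?(path, target)
  File.rename(path, target)
  target
end
```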

openmany

Runs open command on multiple files, either as command line arguments,
or one-per-line in STDIN.


Usage:
    openmany <urls.txt
    openmany *.pdf

It uses OSX open command. For Linux edit to use whatever was Linux equivalent.
(I keep forgetting since alias open=... is always in my .bashrc)

pomodoro

Counts down 25 minutes (or however many you specify as command line argument),
printing countdown on command line, and when it's over turning volume to maximum
and playing selected sound.

Usage:
    pomodoro   # 25 minutes

    pomodoro 5 # 5 minutes


Read more about Pomodoro Technique on Wikipedia.

Setting volume and playing sound assume OSX commands, but I'm sure you'll be able
to figure out Linux equivalents.

pub

Fixes directory tree by making it publicly readable and editable by you.

Very useful when fixing permissions on files you just unpacked from an archive,
since many archive formats store stupid permissions (like read only on directories) inside,
which is a bad idea for everything except backups.

Usage:

    pub file.txt

    pub directory/



randswap

Randomly swaps lines of STDIN.

Usage:

    randswap <urls.txt | head -n 10 >sample.txt

rbexe

Creates executable script path with proper #! line and permissions.

Defaults to Ruby executable but supports a few other #!s.

Usage:

    rbexe file.rb

    rbexe --9 file.rb

    rbexe --pl file.pl

 
If file exists, it will only change its permissions without overwriting it,
so it's safe to use.

rename

Larry Wall's rename script, included in Debian-derived distributions, but not on any other Unix
I know of - which is literally criminal, since it's one of the core Unix utilities.

If your distribution doesn't have it (or worse - has some total crap as rename script),
do yourself a service and install something more sensible, and in the meantime copy this
file to your ~/bin.

split_dir

Splits directories with excessively many files into multiple directories of
roughly equal size, about 200 files each.

Usage example:

    split_dir my_little_pony_wallpapers/


Mostly useful for directories containing images.
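The arithmetic for "about equal, about 200" can be sketched like this; plan_split and target_size are made-up names, and the rounding rule is my own guess at a reasonable one.

```ruby
# Splits a list of files into groups of about target_size each,
# keeping the groups as equal in size as possible.
def plan_split(files, target_size: 200)
  return [] if files.empty?
  group_count = [(files.size.to_f / target_size).round, 1].max
  per_group = (files.size.to_f / group_count).ceil
  files.each_slice(per_group).to_a
end
```

So 1050 files become 5 directories of 210 rather than 5 of 200 plus a leftover directory of 50.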

strip_9gag

Removes extremely annoying 9gag watermark they put on files they didn't make.

Usage examples:

    strip_9gag file.jpg

    strip_9gag http://some.site.example/file.jpg

tac

Reverses order of lines of whatever is on STDIN, prints to STDOUT.

Usage example:

    tac <pokemon_by_newest.txt >pokemon_by_oldest.txt

Some distributions already have tac command - for those that don't, like OSX, it's really easy to use this replacement.

terminal_title


Changes title of current terminal window. Extremely useful if you have too many terminal windows.

Usage example:

    terminal_title 'Production server (do not accidentally killall -9)'; ssh production.server.example
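
Setting the title is typically done with an xterm-style escape sequence, which I'm assuming is also what this script uses; terminal_title_sequence is a made-up name.

```ruby
# Builds the xterm-style OSC 0 escape sequence which sets
# the window title (and icon name) in compatible terminals.
def terminal_title_sequence(title)
  "\e]0;#{title}\a"
end

# A script would emit it with: print terminal_title_sequence(ARGV.join(" "))
```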

unall

Universal unarchiver. Possibly the most useful nontrivial utility in this repository (not counting Larry Wall's rename).

Command line interface to various archive formats is a total failure compared with convenience of desktop programs.

They have a huge number of incompatible interfaces, which one can get used to, but there's a much more severe failure - sometimes an archive contains files without a single directory to contain them all.
This problem is solved by most good desktop unarchivers, but in command line world any such archive will ruin your day.

unall fixes all these problems - it checks what's inside the archive, if it's a broken archive with multiple files not in the same directory it will create a directory for it, if the directory already exists it will rename it to something else etc.

If it was successful, it will then delete archive after unpacking (with trash command which puts it into OSX Trash, feel free to change it to whatever your system uses).

Usage:
    unall *.zip *.rar *.7z *.tar.bz2 *.tar.gz


unall assumes you have 7za, unrar, and sane version of tar installed.
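
The decision part of this can be sketched as pure functions over the list of paths inside an archive. All three names are hypothetical, and the -1, -2 suffix scheme for picking a free directory name is my own assumption.

```ruby
# Distinct first path components of the archive's entries.
def top_level_entries(paths)
  paths.map { |path| path.sub(%r{\A\./}, "").split("/").first }.compact.uniq
end

# True if unpacking in place would scatter multiple entries around.
def needs_wrapper_directory?(paths)
  top_level_entries(paths).size != 1
end

# Picks base, or base-1, base-2... if that name is already taken.
def free_directory_name(base, taken)
  return base unless taken.include?(base)
  n = 1
  n += 1 while taken.include?("#{base}-#{n}")
  "#{base}-#{n}"
end
```

Getting the list of paths out of each archive format is the annoying part, and it's not shown here.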

xmlview

Reindents XML and cuts it to 150 column limit for easy viewing.

Usage example:

    xmlview huge_machine_generated_xml_file.xml

xnorm

A version of namenorm script which also removes random garbage from file names like ".x264".
Useful mostly for TV episodes.

Usage:

    xnorm ~/Downloads/*


It's included more as an example than as an actually useful utility since the garbage they include in file names changes constantly.

xpstree

A much superior replacement for pstree.

Shows directory tree of processes with a lot of garbage cleaned up (like kernel processes removed, scripts displayed by their script name not their interpreter name etc.).

Regexps used to cleanup the tree might require some customization for your situation.

Usage examples:
    xpstree

    xpstree -u          # By current user

    xpstree -p          # Show pids

    xpstree -s          # Highlight current process's tree

    xpstree -h java     # Highlight anything with /java/ in process path

    xpstree -s Terminal # Ignore /Terminal/

    xpstree -x Terminal # Ignore /Terminal/ and all its children

    xpstree -f Terminal # Show only /Terminal/ and all its children

    xpstree -h Terminal # Highlight /Terminal/

Lower case options -sxfh are exact match (case insensitive).

Upper case options -SXFH are regexp match.

xrmdir

Works like rmdir for OSX. Since OSX creates garbage files like .DS_Store in every single
directory you ever open with Finder (or just because it can), many empty directories
are technically non-empty.

xrmdir deletes this worthless file, then calls rmdir on the directory.

Usage example:

    xrmdir ~/101/reasons/why/osx/sucks/*
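
A sketch of the same idea in a few lines. One difference which is my own choice: this version only deletes .DS_Store when it's the sole entry, so a directory that really is non-empty is left completely untouched.

```ruby
# Removes dir if it is empty apart from a .DS_Store file.
# Raises a SystemCallError (like rmdir would) if it has real content.
def xrmdir(dir)
  entries = Dir.entries(dir) - [".", ".."]
  File.delete(File.join(dir, ".DS_Store")) if entries == [".DS_Store"]
  Dir.rmdir(dir)
end
```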

Thursday, July 19, 2012

magic/xml for Ruby on github

Silver Profile at Window - Slightly Blurry 702 x 749 by ♥ Crystal Writer ♥ from flickr (CC-NC-SA)
Making all my old software publicly available on github continues. Now it's time for magic/xml - which can be now downloaded from github here.

I needed to do a few fixes for Ruby 1.9 compatibility - mostly Array#to_s is now Array#inspect, not Array#join, and it works just fine now with either 1.8 or 1.9.
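For anyone who hasn't run into this change, a tiny illustration (the variable names are arbitrary): code that relied on Array#to_s to concatenate elements needs an explicit join to behave the same on both versions.

```ruby
words = ["magic", "xml"]

# Works the same on 1.8 and 1.9: explicit concatenation
joined = words.join        # "magicxml"

# On 1.9 and later to_s gives the inspect form;
# on 1.8 it gave the same result as join
printed = words.to_s       # '["magic", "xml"]' on 1.9+
```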

Over all these years I haven't used magic/xml as much as I thought I would. Somehow the XML and XHTML dominance everybody was expecting never materialized, and instead JSON and HTML5 took over. I can't say I'm too unhappy about this development - but magic/xml remains the most convenient way to solve the (now less important) XML problem in Ruby.

I think I might have overdone monkey patching a bit - especially with regards to pattern matching, which in Ruby is pretty hard to extend cleanly due to lack of multimethods (and general massive hackery in Ruby's implementation of it). I'd love it if some future version of Ruby did pattern matching right, but since no other language ever did so (I might write more about it sometime later), it might be too much to hope for.

Monday, July 02, 2012

libgmp-ruby is on github now

Cat and Calculator - Top View by Felix Idan from flickr (CC-NC-ND)

As promised I've been moving my old code to github lately.

One of these libraries is libgmp-ruby, which I once resurrected from Ruby 1.6 era into early Ruby 1.8 era already. So I just put it on github, cleaned it up a bit, and made it work with 1.9 (except for some weird memory corruption crashes which I was still in the process of figuring out)... and it turns out it was all for naught since my library has been on github and actively maintained all that time by strawlins.

Now one of the benefits of Open Source is that when the original author or maintainer is too busy someone else can easily take over, but it would be nice if someone emailed me... I wonder which programs I wrote forever ago live independent lives now.

Anyway, all's well that ends well, and if you want to use libgmp-ruby, use strawlins' version, not mine.

Magic 2013 Sealed Simulator

ireallymeanit by renedepaula from flickr (CC-NC-ND)

There was a lot of interest in Sealed simulator I wrote for Avacyn Restored, so I made a version for Magic 2013.

You can play with it here: Magic 2013 Sealed Simulator.

It still has the same problem of not having print runs implemented (which is pretty much unavoidable without some huge effort), nor even any hacks like forcing color balance at common which would really help with realism.

If you have any code improvements, please message me in some way. I'm especially interested in a few things:
  • better CSS to indicate foil cards (red frame around card looks pretty lame)
  • some hacks to make colors more balanced in absence of full print run simulation
  • I don't think it works with IE, so if you can fix that, I'll upload fixed code, and people who use IE will be grateful.
Enjoy the simulator and enjoy the Prerelease Party!

EDIT: I added support for sorting by cmc. It doesn't work in alternative layout, since it has only 5 columns, and there could be a lot more cmcs (0 to 10 in M13). If it's not working for you, you may need to refresh your cache. I know most of the interest in M13 Sealed Simulator is probably gone by now, but the same code can be used for Return To Ravnica, just saying... (except for guild-specific boosters, no idea how they're made)