Wednesday, April 18, 2007

set_trace_func, smoke and mirrors

we built this city on smoke and mirrors by zen♫ from flickr (CC-NC-SA)Ruby interpreter provides event hooks, which can be used to trace execution of Ruby program. To setup an event hook, call set_trace_func(some_proc), to remove it, call set_trace_func(nil).

Tracers are probably the most useful debugging tool ever invented, far more useful than debuggers based on single-stepping and breakpoints.

Ruby traces 8 events:
  • line - file/line changed
  • class - module/class definition started
  • end - module/class definition ended
  • call - Ruby function was called
  • return - Ruby function returned
  • c-call - C function was called
  • c-return - C function returned
  • raise - exception was raised
Some events are not traced, like setting and getting variables (local, global, instance, or class).
set_trace_func proc { |event, file, line, id, binding, classname|
STDERR.printf "%8s %s:%-2d %10s %8s\n", event, file, line, id, classname
}

@welcome = "Hello"
separator = ", "
$object = "world"
@@excl = "!"
puts(@welcome + separator + $object + @@excl)
$ ./example-1.rb
line ./example-1.rb:7 false
line ./example-1.rb:8 false
line ./example-1.rb:9 false
line ./example-1.rb:10 false
line ./example-1.rb:11 false
c-call ./example-1.rb:11 + String
c-return ./example-1.rb:11 + String
c-call ./example-1.rb:11 + String
c-return ./example-1.rb:11 + String
c-call ./example-1.rb:11 + String
c-return ./example-1.rb:11 + String
c-call ./example-1.rb:11 puts Kernel
c-call ./example-1.rb:11 write IO
c-return ./example-1.rb:11 write IO
c-call ./example-1.rb:11 write IO
c-return ./example-1.rb:11 write IO
c-return ./example-1.rb:11 puts Kernel

In the example String#+ was run 3 times, and IO#write 2 times - the second one most likely to write out newline character.

Many things in Ruby are magical and are not converted to method calls. One such thing is string interpolation with "... #{ code } ... ", which is not converted to String#+, or Array#join, but handled specially by the interpreter.

set_trace_func proc { |event, file, line, id, binding, classname|
STDERR.printf "%8s %s:%-2d %10s %8s\n", event, file, line, id, classname
}

@welcome = "Hello"
$object = "world"
puts("#{@welcome}, #{$object}\n")
$ ./example-2.rb
line ./example-2.rb:7 false
line ./example-2.rb:8 false
line ./example-2.rb:9 false
c-call ./example-2.rb:9 puts Kernel
c-call ./example-2.rb:9 write IO
c-return ./example-2.rb:9 write IO
c-return ./example-2.rb:9 puts Kernel
In the same way, creating classes is not converted to Class#new, and adding methods is not converted to Module#define_method. Fortunately if we need to capture such events Ruby provides specialized hooks - when a class is created its parent's inherited method is called, and when a method is added, Ruby calls method_added on its class.

set_trace_func proc { |event, file, line, id, binding, classname|
STDERR.printf "%8s %s:%-2d %10s %8s\n", event, file, line, id, classname
}

class Geolocation
def initialize(lat, long)
@lat = lat
@long = long
end
def to_s
"<#{@lat}, #{@long}>"
end
end
$ ./example-3.rb
line ./example-3.rb:7 false
c-call ./example-3.rb:7 inherited Class
c-return ./example-3.rb:7 inherited Class
class ./example-3.rb:7 false
line ./example-3.rb:8 false
c-call ./example-3.rb:8 method_added Module
c-return ./example-3.rb:8 method_added Module
line ./example-3.rb:12 false
c-call ./example-3.rb:12 method_added Module
c-return ./example-3.rb:12 method_added Module
end ./example-3.rb:7 false
If we want to trace something useful it's probably time to change the format. If we're interested in tracing execution, file/line information is usually just noise, return/c-return/end would be far better represented by indentation, and we're really interested in receiver (self), not its class.
$indent = 0
set_trace_func proc { |event, file, line, id, binding, classname|
if event == "line"
# Ignore
elsif %w[return c-return end].include?(event)
$indent -= 2
else
if event == "class"
STDERR.printf "%*s%s %s\n", $indent, "", event, eval("self", binding)
else
STDERR.printf "%*s%s %s.%s\n", $indent, "", event, eval("self", binding), id
end
$indent += 2 if %w[call c-call class].include?(event)
end
}

class Geolocation
def initialize(lat, long)
@lat = lat
@long = long
end
def to_s
"<#{@lat}, #{@long}>"
end
end

a = Geolocation.new(51.12, 17.03)
puts a
p a
$ ./example-4.rb  >/dev/null
c-call Object.inherited
class Geolocation
c-call Geolocation.method_added
c-call Geolocation.method_added
c-call Geolocation.new
call <, >.initialize
c-call main.puts
call <51.12, 17.03>.to_s
c-call 51.12.to_s
c-call 17.03.to_s
c-call #<IO:0xb7c80fc0>.write
c-call #<IO:0xb7c80fc0>.write
c-call main.p
c-call <51.12, 17.03>.inspect
c-call 17.03.inspect
c-call 17.03.to_s
c-call 51.12.inspect
c-call 51.12.to_s
c-call #<IO:0xb7c80fc0>.write
c-call #<IO:0xb7c80fc0>.write
There's still a minor quirk in the output, as Geolocation#to_s is called by the tracer when Geolocation#initialize starts, and before it can be meaningfully printed. In this case we simply got garbage <, >, but some objects raise exceptions if printed before initialization.
require 'complex'

$indent = 0
set_trace_func proc { |event, file, line, id, binding, classname|
if event == "line"
# Ignore
elsif %w[return c-return end].include?(event)
$indent -= 2
else
obj = eval("self", binding)
if event == "class"
STDERR.printf "%*s%s %s\n", $indent, "", event, obj
else
obj = "<#{obj.class}##{obj.object_id}>" if id == :initialize
STDERR.printf "%*s%s %s.%s\n", $indent, "", event, obj, id
end
$indent += 2 if %w[call c-call class].include?(event)
end
}

a = Complex.new(11.0, -5.0)
b = Complex.new(2.0, 13.5)
c = a * b
$ ./example-5.rb
c-call Complex.new
call <Complex#-605829048>.initialize
c-call 11.0.kind_of?
c-call 11.0.kind_of?
c-call -5.0.kind_of?
c-call -5.0.kind_of?
c-call Complex.new
call <Complex#-605832038>.initialize
c-call 2.0.kind_of?
c-call 2.0.kind_of?
c-call 13.5.kind_of?
c-call 13.5.kind_of?
call 11.0-5.0i.*
c-call 2.0+13.5i.kind_of?
c-call 11.0.*
c-call -5.0.*
c-call 22.0.-
c-call 11.0.*
c-call -5.0.*
c-call 148.5.+
call 11.0-5.0i.Complex
c-call 138.5.==
call 89.5.real
call 138.5.imag
c-call 89.5.-
call 89.5.imag
call 138.5.real
c-call 0.+
c-call Complex.new
call <Complex#-605842188>.initialize
c-call 89.5.kind_of?
c-call 89.5.kind_of?
c-call 138.5.kind_of?
c-call 138.5.kind_of?

Many otherwise hard to find bugs become trivial with a custom tracer, grep, and some Ruby one-liners to massage the logs.

Smoke and mirrors

set_trace_func is useful not just for debugging, but also for all kinds of magic. Binding.of_caller, which unfortunately doesn't work any more, was implemented with set_trace_func. magic/help uses set_trace_func to provide convenient help access in irb:

irb> help { STDERR.write }
--------------------------------------------------------------- IO#write
ios.write(string) => integer
------------------------------------------------------------------------
Writes the given string to _ios_. The stream must be opened for
writing. If the argument is not a string, it will be converted to a
string using +to_s+. Returns the number of bytes written.

count = $stdout.write( "This is a test\n" )
puts "That was #{count} bytes of data"

_produces:_

This is a test
That was 15 bytes of data

magic/help works by setting a set_trace_func handler, executing the passed block, and then aborting execution by throw after first interesting event. This means that calling help { rm_rf "/" } should be perfectly safe.

One nasty case I found while developing magic/help was inconsistent traces when functions are called with wrong number of arguments.

For functions implemented in C, even with explicit number of arguments specified, Ruby always issues c-call before calling ArgumentError.new. On the other hand for Ruby functions, ArgumentError.new is issued, then return from the function, but call never happens.

To illustrate, tracing C function STDERR.write gives:

c-call #<IO:0xb7c98fa8>.write
c-call ArgumentError.new
c-call #<ArgumentError: ArgumentError>.initialize
c-return #<ArgumentError: wrong number of arguments (0 for 1)>.initialize
c-return ArgumentError.new
c-call #<ArgumentError: wrong number of arguments (0 for 1)>.backtrace
c-return #<ArgumentError: wrong number of arguments (0 for 1)>.backtrace
c-call #<ArgumentError: wrong number of arguments (0 for 1)>.set_backtrace
c-return #<ArgumentError: wrong number of arguments (0 for 1)>.set_backtrace
raise #<IO:0xb7c98fa8>.write
c-return #<IO:0xb7c98fa8>.write
On the other hand tracing Ruby function FileUtils.remove_entry gives:
# "call FileUtils.remove_entry" never happens !
c-call ArgumentError.new
c-call #<ArgumentError: ArgumentError>.initialize
c-return #<ArgumentError: wrong number of arguments (0 for 1)>.initialize
c-return ArgumentError.new
c-call #<ArgumentError: wrong number of arguments (0 for 1)>.backtrace
c-return #<ArgumentError: wrong number of arguments (0 for 1)>.backtrace
c-call #<ArgumentError: wrong number of arguments (0 for 1)>.set_backtrace
c-return #<ArgumentError: wrong number of arguments (0 for 1)>.set_backtrace
raise FileUtils.remove_entry
# "return FileUtils.remove_entry" does happen.
return FileUtils.remove_entry
In both cases checking number of arguments happens before the call. IO#write in defined as 1-argument function:
rb_define_method(rb_cIO, "write", io_write, 1);
New versions of magic/help handle this case too - if the first event is c-call to ArgumentError.new, magic/help waits for the first return event (if any) instead of instantly aborting execution of the passed block. Just grab the new version and enjoy the magical help.

3 comments:

  1. http://unroller.rubyforge.org/
    uses this method to print out code including variable values during execution. This makes an awesome debugging tool

    ReplyDelete
  2. Anonymous05:40

    great posting dude. opened another world of ruby up for me, thanks!

    ReplyDelete
  3. I'm glad you liked it.

    ReplyDelete