taw's blog: set_trace_func, smoke and mirrors

Wednesday, April 18, 2007

set_trace_func, smoke and mirrors

Ruby interpreter provides event hooks, which can be used to trace execution of Ruby program. To setup an event hook, call set_trace_func(some_proc), to remove it, call set_trace_func(nil).

Tracers are probably the most useful debugging tool ever invented, far more useful than debuggers based on single-stepping and breakpoints.

Ruby traces 8 events:

line - file/line changed
class - module/class definition started
end - module/class definition ended
call - Ruby function was called
return - Ruby function returned
c-call - C function was called
c-return - C function returned
raise - exception was raised

Some events are not traced, like setting and getting variables (local, global, instance, or class).

set_trace_func proc { |event, file, line, id, binding, classname|
   STDERR.printf "%8s %s:%-2d %10s %8s\n", event, file, line, id, classname
}

@welcome  = "Hello"
separator = ", "
$object   = "world"
@@excl    = "!"
puts(@welcome + separator + $object + @@excl)

$ ./example-1.rb
   line ./example-1.rb:7                false
   line ./example-1.rb:8                false
   line ./example-1.rb:9                false
   line ./example-1.rb:10               false
   line ./example-1.rb:11               false
 c-call ./example-1.rb:11          +   String
c-return ./example-1.rb:11          +   String
 c-call ./example-1.rb:11          +   String
c-return ./example-1.rb:11          +   String
 c-call ./example-1.rb:11          +   String
c-return ./example-1.rb:11          +   String
 c-call ./example-1.rb:11       puts   Kernel
 c-call ./example-1.rb:11      write       IO
c-return ./example-1.rb:11      write       IO
 c-call ./example-1.rb:11      write       IO
c-return ./example-1.rb:11      write       IO
c-return ./example-1.rb:11       puts   Kernel

In the example String#+ was run 3 times, and IO#write 2 times - the second one most likely to write out newline character.

Many things in Ruby are magical and are not converted to method calls. One such thing is string interpolation with "... #{ code } ... ", which is not converted to String#+, or Array#join, but handled specially by the interpreter.

set_trace_func proc { |event, file, line, id, binding, classname|
   STDERR.printf "%8s %s:%-2d %10s %8s\n", event, file, line, id, classname
}

@welcome  = "Hello"
$object   = "world"
puts("#{@welcome}, #{$object}\n")

$ ./example-2.rb
   line ./example-2.rb:7                false
   line ./example-2.rb:8                false
   line ./example-2.rb:9                false
 c-call ./example-2.rb:9        puts   Kernel
 c-call ./example-2.rb:9       write       IO
c-return ./example-2.rb:9       write       IO
c-return ./example-2.rb:9        puts   Kernel

In the same way, creating classes is not converted to Class#new, and adding methods is not converted to Module#define_method. Fortunately if we need to capture such events Ruby provides specialized hooks - when a class is created its parent's inherited method is called, and when a method is added, Ruby calls method_added on its class.

set_trace_func proc { |event, file, line, id, binding, classname|
   STDERR.printf "%8s %s:%-2d %10s %8s\n", event, file, line, id, classname
}

class Geolocation
   def initialize(lat, long)
       @lat  = lat
       @long = long
   end
   def to_s
       "<#{@lat}, #{@long}>"
   end
end

$ ./example-3.rb
   line ./example-3.rb:7                false
 c-call ./example-3.rb:7   inherited    Class
c-return ./example-3.rb:7   inherited    Class
  class ./example-3.rb:7                false
   line ./example-3.rb:8                false
 c-call ./example-3.rb:8  method_added   Module
c-return ./example-3.rb:8  method_added   Module
   line ./example-3.rb:12               false
 c-call ./example-3.rb:12 method_added   Module
c-return ./example-3.rb:12 method_added   Module
    end ./example-3.rb:7                false

If we want to trace something useful it's probably time to change the format. If we're interested in tracing execution, file/line information is usually just noise, return/c-return/end would be far better represented by indentation, and we're really interested in receiver (self), not its class.

$indent = 0
set_trace_func proc { |event, file, line, id, binding, classname|
   if event == "line"
       # Ignore
   elsif %w[return c-return end].include?(event)
       $indent -= 2
   else
       if event == "class"
           STDERR.printf "%*s%s %s\n", $indent, "", event, eval("self", binding)
       else
           STDERR.printf "%*s%s %s.%s\n", $indent, "", event, eval("self", binding), id
       end
       $indent += 2 if %w[call c-call class].include?(event)
   end
}

class Geolocation
   def initialize(lat, long)
       @lat  = lat
       @long = long
   end
   def to_s
       "<#{@lat}, #{@long}>"
   end
end

a = Geolocation.new(51.12, 17.03)
puts a
p a

$ ./example-4.rb  >/dev/null
c-call Object.inherited
class Geolocation
 c-call Geolocation.method_added
 c-call Geolocation.method_added
c-call Geolocation.new
 call <, >.initialize
c-call main.puts
 call <51.12, 17.03>.to_s
   c-call 51.12.to_s
   c-call 17.03.to_s
 c-call #<IO:0xb7c80fc0>.write
 c-call #<IO:0xb7c80fc0>.write
c-call main.p
 c-call <51.12, 17.03>.inspect
   c-call 17.03.inspect
     c-call 17.03.to_s
   c-call 51.12.inspect
     c-call 51.12.to_s
 c-call #<IO:0xb7c80fc0>.write
 c-call #<IO:0xb7c80fc0>.write

There's still a minor quirk in the output, as Geolocation#to_s is called by the tracer when Geolocation#initialize starts, and before it can be meaningfully printed. In this case we simply got garbage <, >, but some objects raise exceptions if printed before initialization.

require 'complex'

$indent = 0
set_trace_func proc { |event, file, line, id, binding, classname|
   if event == "line"
       # Ignore
   elsif %w[return c-return end].include?(event)
       $indent -= 2
   else
       obj = eval("self", binding)
       if event == "class"
           STDERR.printf "%*s%s %s\n", $indent, "", event, obj
       else
           obj = "<#{obj.class}##{obj.object_id}>" if id == :initialize
           STDERR.printf "%*s%s %s.%s\n", $indent, "", event, obj, id
       end
       $indent += 2 if %w[call c-call class].include?(event)
   end
}

a = Complex.new(11.0, -5.0)
b = Complex.new(2.0, 13.5)
c = a * b

$ ./example-5.rb
c-call Complex.new
 call <Complex#-605829048>.initialize
   c-call 11.0.kind_of?
   c-call 11.0.kind_of?
   c-call -5.0.kind_of?
   c-call -5.0.kind_of?
c-call Complex.new
 call <Complex#-605832038>.initialize
   c-call 2.0.kind_of?
   c-call 2.0.kind_of?
   c-call 13.5.kind_of?
   c-call 13.5.kind_of?
call 11.0-5.0i.*
 c-call 2.0+13.5i.kind_of?
 c-call 11.0.*
 c-call -5.0.*
 c-call 22.0.-
 c-call 11.0.*
 c-call -5.0.*
 c-call 148.5.+
 call 11.0-5.0i.Complex
   c-call 138.5.==
   call 89.5.real
   call 138.5.imag
   c-call 89.5.-
   call 89.5.imag
   call 138.5.real
   c-call 0.+
   c-call Complex.new
     call <Complex#-605842188>.initialize
       c-call 89.5.kind_of?
       c-call 89.5.kind_of?
       c-call 138.5.kind_of?
       c-call 138.5.kind_of?

Many otherwise hard to find bugs become trivial with a custom tracer, grep, and some Ruby one-liners to massage the logs.

Smoke and mirrors

set_trace_func is useful not just for debugging, but also for all kinds of magic. Binding.of_caller, which unfortunately doesn't work any more, was implemented with set_trace_func. magic/help uses set_trace_func to provide convenient help access in irb:

irb> help { STDERR.write }
--------------------------------------------------------------- IO#write
    ios.write(string)    => integer
------------------------------------------------------------------------
    Writes the given string to _ios_. The stream must be opened for
    writing. If the argument is not a string, it will be converted to a
    string using +to_s+. Returns the number of bytes written.

       count = $stdout.write( "This is a test\n" )
       puts "That was #{count} bytes of data"

    _produces:_

       This is a test
       That was 15 bytes of data

magic/help works by setting a set_trace_func handler, executing the passed block, and then aborting execution by throw after first interesting event. This means that calling help { rm_rf "/" } should be perfectly safe.

One nasty case I found while developing magic/help was inconsistent traces when functions are called with wrong number of arguments.

For functions implemented in C, even with explicit number of arguments specified, Ruby always issues c-call before calling ArgumentError.new. On the other hand for Ruby functions, ArgumentError.new is issued, then return from the function, but call never happens.

To illustrate, tracing C function STDERR.write gives:

c-call #<IO:0xb7c98fa8>.write
 c-call ArgumentError.new
   c-call #<ArgumentError: ArgumentError>.initialize
   c-return #<ArgumentError: wrong number of arguments (0 for 1)>.initialize
 c-return ArgumentError.new
 c-call #<ArgumentError: wrong number of arguments (0 for 1)>.backtrace
 c-return #<ArgumentError: wrong number of arguments (0 for 1)>.backtrace
 c-call #<ArgumentError: wrong number of arguments (0 for 1)>.set_backtrace
 c-return #<ArgumentError: wrong number of arguments (0 for 1)>.set_backtrace
 raise #<IO:0xb7c98fa8>.write
c-return #<IO:0xb7c98fa8>.write

On the other hand tracing Ruby function FileUtils.remove_entry gives:

# "call FileUtils.remove_entry" never happens !
 c-call ArgumentError.new
   c-call #<ArgumentError: ArgumentError>.initialize
   c-return #<ArgumentError: wrong number of arguments (0 for 1)>.initialize
 c-return ArgumentError.new
 c-call #<ArgumentError: wrong number of arguments (0 for 1)>.backtrace
 c-return #<ArgumentError: wrong number of arguments (0 for 1)>.backtrace
 c-call #<ArgumentError: wrong number of arguments (0 for 1)>.set_backtrace
 c-return #<ArgumentError: wrong number of arguments (0 for 1)>.set_backtrace
 raise FileUtils.remove_entry
# "return FileUtils.remove_entry" does happen.
return FileUtils.remove_entry

In both cases checking number of arguments happens before the call. IO#write in defined as 1-argument function:

rb_define_method(rb_cIO, "write", io_write, 1);

New versions of magic/help handle this case too - if the first event is c-call to ArgumentError.new, magic/help waits for the first return event (if any) instead of instantly aborting execution of the passed block. Just grab the new version and enjoy the magical help.

3 comments:

Unknown said...: http://unroller.rubyforge.org/
uses this method to print out code including variable values during execution. This makes an awesome debugging tool; 19:16
Anonymous said...: great posting dude. opened another world of ruby up for me, thanks!; 05:40
taw said...: I'm glad you liked it.; 09:54

taw's blog

Wednesday, April 18, 2007

set_trace_func, smoke and mirrors

Smoke and mirrors

3 comments:

About Me

Popular Posts

Blog Archive

Labels

My software

Creative Commons