The best kittens, technology, and video games blog in the world.

Sunday, August 20, 2006

Programming in Blub

Photo by Marc Crow from flickr (CC-BY-ND)Languages vary in power. Like Paul Graham, let's take a language of intermediate power and call it Blub. Now someone who only speaks Blub can look at less powerful languages and see that they're less powerful. But if they look at more powerful languages, they won't think of them as more powerful - such language are going to look simply weird, as it is often very hard to see the power without actually being fluent in them. Even if you are shown a few examples where some language beats your favourite Blub, you're more likely to think something along the lines of "who would actually need such feature" or "well, it's just some syntactic sugar", not "omg, I've been coding Blub all my life, I have been so damn wrong". I'm not going to criticize such thinking, in many cases weird is just weird, not really more powerful. Oh by the way, while Paul Graham noticed this paradox, he is also its victim - thinking that his favourite language (Common Lisp) is the most powerful language ever, so when it doesn't have some feature (object-oriented programming), then by definition such feature is "just weird", and not really powerful. Of course we all know better ;-) Now when you really get those features of a more powerful language, you can sometimes program better even in Blub. Why am I talking about that ? Because we can move one step forward and consider Ruby our new Blub. It is pretty much the most powerful general purpose language right now, but it still doesn't taze a few nice features natively :-) Fortunately we can try to understand and emulate them. Multiple dispatch - Think about addition. Addition can easily be a method (like in Ruby or Smalltalk). You send one object a message (+, other_object). So far so good. Now how about adding Complex class ? Complex's method + must know how to add Complex numbers and normal Float numbers. But how can we tell Float what to do about Complex arguments ? Float#+ is already taken and doesn't have a clue about Complex numbers. We need to do some ugly hacks like redefining +. Or some more complex example - pattern matching. Regexp matches String. We want to implement pattern Parser that matches String. And make Regexp match IOStream. It's ugly without multiple dispatch. Of course Ruby is not a Blub and we can easily make defgeneric ;-)

def defgeneric(name, class0, *classes, &block)
  old_method = begin class0.instance_method(name) rescue nil end

  class0.send(:define_method, name) {|*args|
      if classes.zip(args).all?{|c,a| a.is_a? c}
          block.call(self, *args)
      elsif old_method
          old_method.bind(self).call(*args)
      else
          method_missing(name, *args)
      end
  }
end

defgeneric(:+, Complex, Object) {|a,b|
  a + Complex.new(b)
}

defgeneric(:+, Complex, Complex) {|a,b|
  Complex.new(a.re + b.re, a.im + b.im)
}

defgeneric(:+, Float, Complex) {|a,b|
  Complex.new(a) + b
}
Now Ruby uses a trick with coerce for handling +, but if you ever need your own generics, here they are :-) Higher-order functions - everyone knows how to use first-order functions (those that operate only on normal non-functional objects). Most people know how to use second-order functions (those that operate on first-order functions) like map or each. But is it useful to go any higher ? It turns out that it is. Imagine a container. It contains second-order methods like each, each_with_index, each_pair, each_by_depth_first_search, and two dozens other each_whatever. It also contains other second-order functions like map, filter, all?, any?, sort_by etc. Ever wanted map_with_index ? How about all_pairs? ? I sure did. In Ruby 1.9 you can use Enumerator::Enumerable for that !
["a", "b", "c"].enum_with_index.map{|a,i| "#{i}: #{a}"} # ["0: a", "1: b", "2: c"]
Now if you look at this example long enough, you can see that it's an extremely elegant use of third-order functions :-) Pattern matching - languages like Objective Caml do a lot of pattern matching. Pattern matching is basically a thing that verifies that given value matches given pattern and at the same time it deconstructs the value into pieces according to that pattern. For example this function is Objective Caml matches its argument to one of two patterns - either [] pattern (empty list) or hd::tl pattern (a non-empty list - bind hd to head of the list, and tl to its tail).
let rec sum = function
| [] -> 0
| hd::tl -> hd + (sum tl)
Or take a look at Perl/Ruby's regular expressions - they deconstruct string into $1, $2 etc. Now it's the second feature - deconstruction - that gives pattern matching all its power. So why do most programming languages lack such feature ? My guess is that it's really difficult to make patterns that deconstruct your object extensible - and verification alone isn't really all that useful. Oh yeah, and we sometimes want to build such patterns at run-time. In Objective Caml we cannot do that - we need some way to tell the language whether hd means "match contents of hd" or "match anything and put into hd". Now what if we were able to say ?:
def sum(x)
  case x
  when []
      0
  when [_{hd}, *_{tl}]
    hd + sum(tl)
  end
end
Useful ? Somewhat. How about this one ?
if xml_node =~ xmlp(_{name}, {:color=>_{c}}, /Hello, (\S+)/)
  print "A #{c}-colored #{name} greets #{$1} !"
end
Now as far as I can tell nobody actually implemented that, but I get a feeling that using Binding.of_caller, Parse::Tree and something like defgeneric it should be possible. Macros - well, you can do that too :-)

8 comments:

Anonymous said...

I may be misreading you, but if you are in fact saying that Lisp doesn't have object-orientation, you are mistaken. It's called CLOS in Common Lisp and it's part of the CL ANSI standard. Of course, Lisp isn't totally OO, because you can in fact ignore the whole CLOS system as you please, but it is there.

Anonymous said...

I think you're missing the point- but you're inadvertantly illustrating it..

clos is lisp's object system, but the idea of multiple dispatch is not found in most OO systems -(clos has it though)
so the function he's given emulates the multiple dispatch feature
which
it might've been be a good idea to explain it first.

taw said...

Anonymous: First, please do not confuse Lisp with CL. CL is a crime against programming that tried to steal the name of "Lisp" and isn't even fully Lisp.
Actually if they didn't use deceptive name, CL would quickly be forgotten and everyone would be coding in real Lisp by now. Or maybe they do anyway.

But back to the point.
CLOS is not object oriented. CLOS is a system for defining generic functions, nothing more than that. Generic functions can emulate a small part of object-oriented programming, but this small part is totally insufficient. Steve Yegge wrote more about it.

Anonymous said...

"Languages vary in power."

Do they? As far as I know, the powers of languages are equivalent if they are turing-complete.

A feature more or less is not an indication of more or less power. An analogy is cars.

Languages vary in convenience, nothing else.

taw said...

Leo: The word "power" used in this context means something completely unrelated to Turing-completeness.

I'm puzzled. Who started using word "power" to mean "is it Turing-complete or not". Computational complexity theory definitely doesn't use it in this sense.

Anonymous said...

I find it discomforting that you use the term "power" as a synonym for "feature-richness". Graham in your linked article [3] does the same. (See there for equality of power in the sense of Turing equivalence.) He is very vague and imprecise about these terms.
Doing so means confusing _what_ can be done with _how_ it can be done. Attributing values to programming languages based on such confusion seems like a major fallacy to me.

taw said...

Leo: Concept of computer language "power" is imprecise and hard to define, because it really belongs to the human side of programming, not machine side. I find it very intuitive, but people have different intuitions of course.

"Power" is certainly not about what can be done, because even in language with very low "power" like assembly, you can write an interpretter or a compiler for a more powerful language, and then write what you originally wanted.

It is more about feasibility of making various kinds of programs. If you can practically solve more problems in one language than in another, this language is more powerful. "More problems" is imprecise again, because different languages are suited to different problems.
But certainly you can do more in Ruby than you could in assembler.

Other similar concept is expressiveness. There are various styles and "idioms" in programming. Some languages support more such styles and "idioms" than others. This isn't about having many features. A small number of core features like functions, closures, objects, reflection, callcc etc. means you can emulate majority of idioms in a few pages of code. Lisp programmers really love proving how completely different features can be implemented in a few lines of code, using such basic features.

Conceptual gonciseness is also part of "power". An example - Ruby/Smalltalk blocks are very concise - they do not contain any information that isn't relevant. Java anonymous inner classes have similar function, but using them requires a lot of conceptual overhead.

Textual conciseness is sometimes mentioned, but only because most languages have similar number of characters per concept and can actually be measured.

"Power" is something between these concepts. At least the way I understand it.

Jon Harrop said...

Yes. The fact that so many people try to program in languages like Lisp that lack proper pattern matching never ceases to amaze me.

The only thing I find more surprising is the never ending supply of Lispers who believe they can lash together ad-hoc reimplementations of part of OCaml or Haskell in Lisp, like the pattern matchers.