The best kittens, technology, and video games blog in the world.

Saturday, August 04, 2012

7 Languages in 7 Weeks

Maine coon kittens by eleda 1 from flickr (CC-NC)


I got Bruce Tate's "Seven Languages in Seven Weeks: A Pragmatic Guide to Learning Programming Languages" book ages ago, but like with all books which require significant active effort it gathered dust on my bookshelf for quite some time. Now that I've read it and did all the recommended exercises I must say it's absolutely awesome!

The book introduces 7 diverse programming languages, and for each of them it skips generic hello world nonsense, and dives directly into what makes each of them special.

Each chapter was a lot of fun even though I had some (or pretty massive) amount of experience with some of the languages in the book.

I put my solutions to the exercises on my github in case you're interested, but I recommend not looking at them until you try to solve them yourself.

1. Ruby

Ruby is the most questionable inclusion in the book, since of the kind of people who read programming books for fun, every one of them already knows Ruby. Fortunately the book mostly focuses on metaprogramming, so it won't be too boring for most people who know some basic Ruby.

Another good thing are all the "compare how solves " exercises scattered throughout the book - small but important things you don't normally notice about languages.

2. Io

This is by far my favourite chapter. Io has absolutely insane programming model vaguely resembling message passing.

In "normal" Smalltalk-style message passing the basic operation is:
send(receiver object, message identifier, list of evaluated arguments passed by value)
Ruby, Javascript and all other proper object-oriented languages all do more or less the same thing with some minor complications.

What Io does is completely reversing this:
send(receiver object, sender's lexical context, message s-expession)
If receiver wants to evaluate arguments in the s-expressions (OK, it's not literally s-expression, it's just the best analogy I could come up with, call them ASTs if you wish) - which it does 99% of the time, it sends them back as new messages to sender's lexical context. And there they're evaluated by lexical context object (which doesn't have particularly much to do with sender object).

For "normal" messages it's syntactic-sugared sufficiently that you don't have to think about it.
fib := method(n,
  if(n <= 2, return 1)
  return(fib(n-1) + fib(n-2))
)

Here, since method(n, body) has multiple arguments, first argument is evaluated (in sender's lexical context) and assigned to n, then body is evaluated.

But how do you implement high-order methods? Why, by subclassing sender's lexical context of course. I'll let the awesomeness of this sink in for a bit.

Here's Matrix foreach method I wrote for one of the exercises (it's not directly anything they're ask for, so it's technically not a spoiler):

Matrix foreach := method(
  args := call message arguments
  xi := args at(0) name
  yi := args at(1) name
  vi := args at(2) name
  msg :=  args at(3)
  ctx := Object clone
  ctx setProto(call sender)
  for(i,1,xsize,
    for(j,1,ysize,
      ctx setSlot(xi, i)
      ctx setSlot(yi, j)
      ctx setSlot(vi, get(i,j))
      msg doInContext(ctx)
    )
  )
)

First, since method( ) has only one argument (function body), there's no automatic evaluation. Instead we parse argument, extracting names for the first three, and msg for the last.

Then we subclass sender's lexical context into ctx, set some variables in it with setSlot message, then send msg (body which was passed to our foreach as 4th argument) to ctx. Why does it work? That's because evaluating anything is sending a message to some lexical context. Our entire function body is one big message consisting of a sequence of smaller messages like yi := args at(1) name and ctx := Object clone.

Unfortunately the book focuses mostly on prototype-based programming part of Io, while Javascript and even Ruby do that much just as well. The brilliant insanity hidden within Io is barely scratched. I recommend playing with it a lot more than the book recommends.
maine coon kitten by dirk huijssoon from flickr (CC-NC-ND)

3. Prolog

If the first thing which comes to your mind when someone mentions Prolog is sudoku solvers, you're right! If you've never done any Prolog, it's a fun few evenings to spend.

Sadly in my experience programming in Prolog is about as practical as "programming in SQL" or "programming in XSLT". It's not really a programming language, it only pretends to be one. If someone took good ideas of Prolog and implemented them as an extension for a real programming language, it might be getting somewhere.

4. Scala

It feels like modern version of Ocaml in so many ways. Not only it's statically-typed mostly-functional somewhat-object-oriented language, they've thrown everything including a kitchen sink into the language and added such massive amounts of syntactic sugar for each of its features it all looks really weird and gets you thinking that there's surely some simpler syntax which does most of these things just fine.

Scala seems to have significantly less awful standard library than Ocaml - I'm not really a big fan of Java libraries so many people fawn over so much, but just about anything beats the pile of mess Ocaml standard library is.

Scala's syntax, while it won't be winning many converts, is still the best I've ever seen in an Ocaml-like language. Its feature set also looks like mostly a nice improvement over what Ocaml has.

So I'd recommend giving it a serious look if you're an Ocaml programmer.

5. Erlang

I blogged about my first impressions of Erlang a few years ago, and giving it another go didn't really change that much.

It's the Ocaml Symptom all over again - it was developed in isolation from the mainstream programming world, so its standard library is awful mess, syntax is awkward as hell, and all the small things other languages get right (like having string type, or ^D to exit) it gets totally wrong. In this case it was isolation at Erlang not in the academia, but it's a very similar problem.

Now by throwing some significant effort and abandoning concern for backwards compatibility you could probably turn languages like Erlang and Ocaml into something actually good, but nobody seems to have any intention of doing so.

This shouldn't stop you from playing with it a bit, but using Erlang in production environment? I don't think so.
Ragnificent Ragdolls - Peekaboo by DirtBikeDBA (Mike) from flickr (CC-NC-ND)

6. Clojure

Lisp is my favourite language I never actually use. Clojure is trying to do something vaguely similar to RLisp - being a sensible Lisp integrated with an object-oriented environment in existing VM (Ruby's vs JVM). And just like RLisp Clojure also attempts to fix one of the worse aspects of Scheme and such traditional Lisp dialects - far too many completely worthless parentheses.

It makes some very different choices too, and that's part of the beauty of Lisp - you can stretch the concept of Lisp really far, and it's still Lisp in the end.

Anyway, If I accidentally win a few billion drachmas somehow, I'll hire a bunch of people to turn RLisp into a real programming language.

7. Haskell

Haskell is a wonderful programming language - I'm surprised it's not used more in psychology as it'd make for some really great case studies in Stockholm Syndrome.

Here's a stylized history of Haskell design, in which they've been digging themselves deeper and deeper with each decision:

  • Traditional static typing doesn't handle high level functions well, so let's add a really sophisticated type system.
  • Our new sophisticated type system requires too much typing, so let's add type inference.
  • Type inference cannot handle programs with mutable state, so let's remove all mutable state.
  • Without mutable state we cannot actually do much, so let's add monads.
  • Nobody understand monads, so let's make hundreds of tutorials "explaining" monads in terms of containers, astronauts, dragon liars, and high grade nuclear waste.
  • and so on

What's surprising is how many Haskell programmers don't understand that monads are simply a hack to make I/O in Haskell bearable, they seriously think monads are the greatest thing ever which should be ported to programming languages that don't really need them, and everything else Haskell did as a part of general digging itself deeper and deeper is the One True Way to Program.

Disregarding that rant, while monads (and comonads, and arrows, and the rest of such insanity) have no place in any sane language, they are fun to play with for a bit.

Summary

I really recommend this book. It's not meant as a way to learn any particular programming language (and except for Ruby and arguably Scala/Clojure, none of them are remotely suitable for production use), it's a celebration of diversity of programming languages.

Or at least it will give you better arguments for your rants.

16 comments:

Anonymous said...

"...using Erlang in production environment? I don't think so"

You mean like Facebook?

taw said...

They also use PHP in production, so forgive me if I don't have a terribly high opinion about their judgment.

Anonymous said...

Their judgement for the implementation of a service that went live with 90M users?

taw said...

Apple uses AppleScript in production. Microsoft uses Vista in production. Just because some big company does something doesn't mean it's sane.

Anonymous said...

So, in your opinion, what is the problem with using a language that has been designed specifically for building large scale, robust apps in production? The fact that ^D doesn't exit the interactive shell?

taw said...

Whatever it was "specifically built" for, pretty much nobody ever uses Erlang for " large scale, robust apps in production" because pretty much nobody ever uses Erlang period.

You can find a handful of users for pretty much any technology, no matter how narrow and/or how awful, so listing one isn't terribly convincing to anyone, unless they're already pre-convinced.

And even with your example facebook chat, it only implements a small part of its functionality in Erlang. It's mostly C++/PHP, and Erlang does only message queueing and delivery - it'd be an extremely simple problem other than for scale.

The few cases when somebody actually uses Erlang seem to be for the sake of VM (which has some unusual feature set and performance characteristics), not the language itself (which is universally awful). If someone kept Erlang VM but replaced the language with something saner, even a Lisp dialect, it would probably be much more usable. LFE doesn't seem to be very active as far as I can tell, but then I'm not following it too closely.

Anonymous said...

How about companies that use RabbitMQ or Riak?

I think you skipped through the Erlang at Facebook slides a little too quickly. See the Channel Application section for what Erlang is used for (message auth, presence and long-poll).

You've still not provided reasons for Erlang not to be used in production.

I think a more balanced view is represented here http://blog.couchbase.com/why-membase-uses-erlang

Daniel Lewis said...

Your "stylized history of Haskell" is half backwards and half plain wrong. Haskell's direct ancestors are, in chronological order:

- ISWIM, a dynamically-typed imperative language with first-class functions and algebraic data types.

- SASL, a dynamically-typed purely functional language based on combinatory logic.

- Miranda, a lazy, purely functional language with ML-style types and type inference.

There was never a stage involving "traditional" static typing (whatever that means--the typed lambda calculus is older than digital computers), nor a stage with static types but no type inference. The decision to get rid of mutable state was made long before a type system was being considered, in order to facilitate equational reasoning about code. You should well know that type inference can coexist with mutable state, as it does in both O'Caml and Scala.

You are correct, though, that having programs that could only compute without being able to do anything made Miranda fairly useless. What kept Haskell from going the ML route and simply adding side effects was not its type system (substantially identical to ML's) but rather its laziness. In a language where parts of values are only computed as needed, having those computations cause side effects would destroy all hope (as a human or a mathematician) of being able to figure out what a program will do.

The solution (after years of embarrassing head-scratching) was monads, which are nothing more than a way to represent side-effecting computations as data and combine them into larger computations using ordinary functions. Making computations first-class lets you implement not only boring things that every language has (IO, mutable state) but also coroutines, backtracking parsers, or Prolog-style logic programming. You can even apply the same pattern to JavaScript to write callback-using code in direct style, though the syntax leaves much to be desired. Saying monads are just a hack to make I/O work in Haskell is like saying blocks are just a hack to make `each` work in Ruby (don't laugh, actual Python programmers have spoken those very words).

My pet theory about monad tutorials is that they are a hilarious but unfortunate vicious cycle. People wander confused through the maze of crappy monad tutorials until at random two analogies recombine in their heads into something that appears to make sense. They decide they finally get monads and proceed to write their own crappy monad tutorial--after all, it took them so long to "get it"; clearly all existing tutorials are crappy! Occasionally someone who knows what they're doing bravely decides to bear the social stigma of having written a monad tutorial (all of which are crappy, recall) in order to attempt to turn the tide, but by their audiences have like as not been already exposed to enough misleading analogies to scar them for life. Much like continuations, monads are at once too simple and too convoluted to explain without some effort on the part of the listener.

(Incidentally, monads and delimited continuations have equivalent expressive power: you can represent "computations that use continuation effects" as a monad, and if you replace your side-effecting primitives with appropriate continuation-invoking routines you can turn code that causes side effects into code that returns a monadic representation of those side effects. But that is a story for another day.)

taw said...

Daniel: ISWIM never existed as an implementation, and if SASL did nobody has seen any code (Wikipedia hilariously describes it as "SASL appears to be..." - verifiable much?). If you consider more realistically widely used functional languages like Lisp, ML, and Miranda as precursors of Haskell my "stylized history" more or less matches.

If you ever tried to use type inference in any Ocaml program with heavy use of mutable state, you quickly approach the point when you need massive number of type annotations. Any reference type forces both covariance and contravariance, and as soon as you loop unification-based type inference engines give up. This is especially true if you use objects in Ocaml (the "O" part) - prepare for as many type annotations as if it was Java.

Type inference engines based on some other principles like constraint solving might possibly be able to handle this for all I know, but it's a very long time since I last looked into it.

Callcc can implement monads since callcc can implement everything - it really doesn't work the other way. I've seen someone implementing monads using Ruby's half-assed callcc, so in a language with proper callcc it would most likely be even easier.

Daniel Lewis said...

I found a version of the SASL manual earlier when trying to determine whether or not it had continuations. I may take a shot at updating the Wikipedia page with some information from that and from the History of Haskell paper. Even if you add Lisp and ML to the list, though, there's still no ancestor with a type system that doesn't handler higher order functions, a type system without type inference, or a type system that doesn't handle mutable state.

OCaml's type inference problems come from its extensions to the Hindley-Milner system, in particular subtyping. HM (ML's original type system) has the property that every expression has a most general type: the types "int -> int" or "string -> string" would both be valid for "map", but the type 'a -> 'a is better. Anywhere you could use a "map" with one of the overly-specific types, you could use the more general one too.

Introducing subtyping (of objects and of polymorphic variants) breaks this property. With covariance and contravariance it's no longer the case that a more general type is always acceptable where a specific one is requested, and you start having to write coercions (which are not inferrable in the general case). It happens that mutable state is the simplest way to get yourself into variance trouble in OCaml, but state itself is not the culprit; it works just fine in (subtyping-free) vanilla (S)ML.

You can have monads with call/cc in them in Haskell (though of course you can only use it within the monad). That <a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.43.8213>(composable) continuations can macro-express any monad</a> <i>without mutable state</i> should at least be a little bit surprising. Goto, for instance, can't implement much at all without state.

(In practice the main effect of this equivalence is that Scheme and Haskell programmers get more chances than they otherwise would to say "Man, those people do simple things in weird ways.")

Rodolfo Martínez said...

"Prolog [...] It's not really a programming language, it only pretends to be one"

Prolog is not only a "real" programming language but also a Turing complete one, thus its computing power is the same as of any other.

Rodolfo Martínez said...

"Prolog [...] It's not really a programming language, it only pretends to be one"

Prolog is not only a "real" programming language but also a Turing complete one, thus its computing power is the same as of any other.

taw said...

Rodolfo Martínez: No language is really Turing-complete since that would require infinite memory sizes.

In any case, being Turing complete has nothing to do with being a real language - brainfuck [ https://en.wikipedia.org/wiki/Brainfuck ] is Turing complete, and nobody thinks it's a real language.

eustachio said...

"except for Ruby and arguably Scala/Clojure, none of them are remotely suitable for production use"

Can't speak to the others, but for Haskell, see:

http://www.haskell.org/haskellwiki/Haskell_in_industry

Running a startup on Haskell:
www.youtube.com/watch?v=ZR3Jirqk6W8

http://www.scribd.com/doc/19502765/Engineering-Large-Projects-in-Haskell-A-Decade-of-FP-at-Galois

Joker_vD said...

Monads is not a "hack" to implement I/O. Monads were brought into CS as just yet another way to formalize what the hell I/O even is.

And for yet another crappy analogy, I/O monad is pretty much just Unices' pipes.

taw said...

Joker_vD: "I/O monad is pretty much just Unices' pipes" is about as accurate as "Javascript is pretty much just Scheme".