Saturday, July 14, 2007

Truth, falsehood and voidness in dynamic languages

claws by theogeo from flickr (CC-BY)
One of the things that dynamic languages do differently is the handling of truth, falsehood, and voidness. I checked how it's done in the 9 most popular dynamic languages - Common Lisp, JavaScript, Lua, Perl, PHP, Python, Ruby, Scheme, and Smalltalk.

The first question - does the language have dedicated booleans? That is, do comparisons like 2 > 1 return special boolean values or something else?
  • Ruby, Lua, Smalltalk, JavaScript - Yes (true and false)
  • Python - Yes (True and False)
  • Scheme - Yes (#t and #f)
  • Common Lisp - No, it returns symbol t for true and empty list (nil) for false.
  • Perl - No, it returns 1 for true, and a special empty value ("" as a string, 0 as a number) for false.
  • PHP - Kinda. Since PHP4 there are booleans true and false, but their behavior is full of hacks - print true prints 1, print false prints nothing, false == 0, false == NULL, true == 1, even true == 42.
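For Python, here is a minimal sketch of what "dedicated booleans" means in practice. The variable name result is just for illustration. Note that Python's bool is a subclass of int, so True and False still compare equal to 1 and 0, but not to arbitrary numbers:

```python
# Comparisons return instances of the dedicated bool type.
result = 2 > 1
print(type(result))           # <class 'bool'>
print(result is True)         # True

# bool is a subclass of int, so True and False also behave as 1 and 0.
print(isinstance(True, int))  # True
print(True == 1, False == 0)  # True True

# Unlike the PHP behavior listed above, True does not equal any nonzero number.
print(True == 42)             # False
```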
When booleans are used in boolean context their interpretation is obvious. Most other objects used in boolean context are treated the same way as true, with a few common exceptions. How are the empty list, integer 0, floating point 0.0, and the empty string treated in boolean context?
  • Ruby, Scheme, Lua - all are true
  • Perl, PHP, Python - all are false
  • JavaScript - empty list is true, others are false
  • Common Lisp - empty list is false, others are true
  • Smalltalk - NonBooleanReceiver exception is raised if anything but booleans is used in boolean context.
Is string "0" false ?
  • PHP, Perl - unfortunately "0" is false, and this is a huge source of nasty bugs
  • Ruby, Scheme, Lua, JavaScript, Python, Common Lisp - "0" is true
  • Smalltalk - NonBooleanReceiver exception is raised
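The Python answers to the last two questions can be checked directly with bool(), which gives the value an object takes in boolean context. This is only a sketch, and the names candidates and truthiness are mine:

```python
# The four values from the question above, plus the string "0".
candidates = [[], 0, 0.0, "", "0"]

# bool(x) is the value x takes in boolean context (if, while, and, or, not).
truthiness = {repr(value): bool(value) for value in candidates}

for name, is_true in truthiness.items():
    print(f"{name:>4} -> {is_true}")
```

Only the string "0" comes out true, since it is a non-empty string. The other four are false.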
Is there a special value denoting the absence of a value? What does accessing a nonexistent array element return?
  • Ruby, Lua - nil, accessing nonexistent elements returns it
  • JavaScript - undefined, accessing nonexistent elements returns it
  • Perl - undef, accessing nonexistent elements returns it
  • PHP - NULL, accessing nonexistent elements returns it
  • Python - None, accessing nonexistent elements throws an exception
  • Smalltalk - nil, accessing nonexistent elements throws an exception
  • Scheme - there isn't one, accessing nonexistent values is an error
  • Common Lisp - there isn't one, but empty list acts as one in most contexts, it is also returned when accessing nonexistent elements
Is the nonexistent value false in boolean context ?
  • Ruby, Lua, JavaScript, Perl, PHP, Python, Common Lisp - it is false
  • Scheme - there is no nonexistent value marker
  • Smalltalk - NonBooleanReceiver exception is raised
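A short Python sketch of the two questions above: out-of-range list access raises an exception, dictionaries let you choose between an exception and None, and None itself is false. The names items, ages, value, and missing are made up for the example:

```python
items = [1, 2, 3]

# Indexing past the end of a list raises IndexError rather than returning None.
try:
    value = items[100]
except IndexError:
    value = None

# Dictionaries offer both behaviors: ages["bob"] would raise KeyError,
# while ages.get("bob") returns None.
ages = {"alice": 30}
missing = ages.get("bob")

# None is false in boolean context.
print(value is None, missing is None, bool(None))  # True True False
```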
The most common answers are: there are dedicated booleans and a dedicated absence marker; it is possible to use normal objects in boolean context, most of which (including the string "0") are treated as true, while the absence marker is treated as false.

There is no clear consensus whether 0, 0.0, "", and the empty list should be treated as true or false. Personally I think it's better to make them all true. Otherwise either libraries can define other false objects (like decimal 0.00, various empty containers, and so on), which complicates the language, or they cannot, which makes it feel inconsistent.
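Python is an example of the first option, where libraries can define their own false objects. The sketch below uses the standard decimal, fractions, and collections modules. The Basket class is hypothetical and exists only to show the hook a user-defined container would use:

```python
from collections import deque
from decimal import Decimal
from fractions import Fraction

# Library numeric types define their own falsiness: zero is false.
print(bool(Decimal("0.00")), bool(Fraction(0, 1)))  # False False

# Library and built-in containers are false when empty.
print(bool(deque()), bool(set()), bool({}))         # False False False


class Basket:
    """Hypothetical user-defined container, used only for illustration."""

    def __init__(self, items=()):
        self.items = list(items)

    def __len__(self):
        # With no __bool__ defined, Python falls back to len(obj) != 0.
        return len(self.items)


print(bool(Basket()), bool(Basket(["apple"])))      # False True
```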

In most languages accessing nonexistent elements of an array returns an absence marker instead of throwing an exception. In my opinion that's the right way, and it makes the code look much more natural.

16 comments:

  1. Anonymous 06:00

    I can't speak for any other languages, but javascript's "empty list" evaluates to true because it's not really empty- it's an object that has properties.

  2. This is interesting. Thanks for collecting this information.

  3. This is one of those confusing situations where I'd rather have explicitness. Incidentally java gets this mostly right - there is no notion of 'truthy' or 'falsy' - booleans can be compared only to booleans, anything else is a syntax error.

    e.g.: 'if ( false == 0 ) /* do something */;' is not legal code, but 'if ( false == 2 > 5 )', while strange, is legal, as the result of '2 > 5' is a boolean.

  4. Reinier Zwitserloot: The situation in static languages is very different from that in dynamic languages. In dynamic languages you cannot make the use of a non-boolean in boolean context a syntax error, because the compiler doesn't know what will be returned. The most you can do is throw an exception.

    Java did what it did because it was the only politically acceptable way of fixing the C/C++ mess. They wanted real booleans and consistency (so int 0 and bignum 0 would be treated the same), but didn't want operator overloading (the only way to get bignum 0 to be false), and would get lynched if they introduced non-C/C++-compatible rules (0 being true), so they made the whole thing illegal instead.

    Personally I like the "nil/false are false, everything else is true" rule.

  5. The practical equivalent of 'syntax error' in a dynamic language is runtime error. See SmallTalk, it does virtually the same thing (almost all non-obvious conversions generate an exception, where most conversions are non-obvious, which is what this very post shows). Given the complete mess, with 'empty list' being true or false, your preference of having nil/false/0 being 'false' and the rest being 'true', isn't consistent.

    For one, most languages out there don't follow this rule, and for two, one man's definition of obviously falsy doesn't fit another's (see javascript's approach to empty arrays).

  6. Reinier Zwitserloot: I actually prefer 0 to be true just like in Ruby.

    I just prefer a situation where 0 is consistently false to one where it is sometimes false and sometimes true depending on the type of 0 (like a native int 0 being false vs a library bignum 0 being true).

  7. Anonymous 16:29

    Interesting compilation, thanks! Just one thing:

    "Personally I think it's better to make them all false."

    I think you meant "true" there, otherwise the next sentences don't make sense.

  8. Robin: Thanks, that was a late night thinko indeed :-)

  9. In dynamic languages I prefer it when everything except false is not false and everything except Null/Nil/None is not Null/Nil/None - Integer etc. are just another type of object.

    But isn't this the difference between strongly and weakly typed languages?

  10. "Is most languages accessing nonexistent elements of an array returns an absence marker instead of throwing an exception, and in my opinion that's the right way and it makes the code look much more natural."

    Hmm.. I'd say the opposite. Let's say I have an array a = [1,2,3,4,None,6,7]

    a[4] should return None. It's not an error condition, it's just the value of that element.

    a[100] should *not* return None IMHO. It's an error condition since the element does not exist and must be differentiated from the state above. So throwing an exception is the right thing to do.

    I'm generally wary of situations where the error value is one of the legal values of the operation.

  11. Anonymous 08:46

    @The first question - does the language has dedicated booleans ?

    Your result is not because of PHP's hackishness but because you are misusing it. If you want to compare booleans like that, use ===, which is stricter and does some kind of type checking. There has been a boolean type since PHP 4; print is for display only, and that is why it appears to work funny.

    (posting as ac for defending php)

  12. Anonymous 15:29

    Common Lisp signals an error condition when accessing a nonexistent array element.

  13. Anonymous: Yeah, but Common Lisp's moral equivalent of an array (that is, its main sequential data structure) is the linked list, and Common Lisp happily returns nil if you car or cdr nil.

    ReplyDelete
  14. To pick a nit: if ( false == 0 ) is not a syntax error in Java.

  15. Great overview! I've been bothered by the lack of 'nil' in Scheme. Many (most?) Scheme implementations define a 'void' type which amounts to the same thing, but it's not standardized and there's usually no reader syntax for it.

    If one were to add such reader syntax to Scheme (trivially done, of course), what are your thoughts on what the literal should be? Given that we have #t and #f, would #n make sense? Or the full monty, #nil? I'd be loath to define it as anything else but a sharp-sign macro (e.g. Common Lisp's 'nil' infringes on the user's namespace).

  16. Arto Bendiken: RLisp calls them true, false, and nil (#t and #f are also supported, but just for compatibility). I guess if we have #t and #f in Scheme, #n would be a logical choice.
