The best kittens, technology, and video games blog in the world.

Monday, July 31, 2006

XML-oriented programming with Ruby

Photo from Commons by Bfe, GFDL.
XML programming is important. There are even special languages that make it easy. But do we need a new language or can we have the same level of expressiveness in an existing one, like Ruby ? Let's try to program XML - but not starting from implementation and bending our mental model until it fits, but starting from a mental model and bending the language ! And let's care about the real thing first, DTD, entities, validation and whatnot are less common so they can wait. The basic structure should of course be an XML node object:
a_node = XML.new(:a, {:href => "http://www.google.com/"}, "Google")
What do we want the constructor to accept ? Well, the first argument must be a Symbol (or a String) with the XML tag name. Then the attributes in a hash table. Then the contents - an arbitrary collection of strings and XML nodes. Notice that we can easily make the attribute hash optional: if the second argument is not a hash, it's simply part of the content.
a = XML.new(:br)
b = XML.new(:br, {:style => "clear: right"})
c = XML.new(:h3, "My new blog")
d = XML.new(:body, c, "Hello, ", XML.new(:a, {:href=>"http://earth.google.com/"}, "world"), "!")
Of course our XML object will have the appropriate to_s method, so we can simply say print xml_node to get the correctly formatted XML. XML should also support XML.parse(xml_as_string), XML.from_file(file_name_or_handler) and XML.from_url(url) methods for fetching existing XML files. We probably also want to be able to easily generate subclasses of XML:
H3 = XML.make_subclass(:h3)
a = H3.new("Hello")
The subclass factory may get much more complex, but let's stay with the easy one for now. We would like our XML nodes to support the "natural" operations like:
a_node[:href]  = "http://en.wikipedia.org/"
body_node << H3.new("New section")
node.each{|c| print c}
node = node.map{|c| if c.name == :h3 then H2.new(c.attr, c.contents) else c end}
Because Ruby 1.8 doesn't support arbitrary *-expansion, we probably also want to automatically expand Arrays passed as constructor arguments - XML nodes cannot have arrays as elements anyway. Then we will be able to say (compare with the analogous program in CDuce):
require 'magic/xml'
include XML::XHTML # Import all XHTML subclasses

posts = XML.from_url "http://taw:password@del.icio.us/api/posts/all"

def format_posts(posts)
  posts.contents.map{|post| LI.new(A.new({:href=>post[:href],}, post[:description])) }
end

print HTML.new(
  HEAD.new(
    TITLE.new("del.icio.us summary magically generated by Ruby")
  ),
  BODY.new(
    H3.new("The list"),
    format_posts(posts)
  )
)
Yay, isn't it cute ? Uhm, now it only needs to be implemented :-D
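To make the wish list a bit more concrete, here is a minimal sketch of such a node class. It is only one possible reading of the description above, not the real magic/xml: the helper XML.escape and the exact to_s formatting are my assumptions, and parse, from_file and from_url are left out entirely.

```ruby
# Hypothetical sketch of the XML node class described above.
class XML
  include Enumerable
  attr_reader :name, :attr, :contents

  def initialize(name, *args)
    @name = name.to_sym
    # The attribute hash is optional: a leading Hash is taken as attributes,
    # anything else is already part of the content.
    @attr = args.first.is_a?(Hash) ? args.shift.dup : {}
    # Arrays are expanded, because a node cannot contain an array anyway.
    @contents = args.flatten
  end

  def [](key)
    @attr[key]
  end

  def []=(key, value)
    @attr[key] = value
  end

  # Append a child (string or node) and return self so calls can be chained.
  def <<(child)
    @contents << child
    self
  end

  # Iterate over children; Enumerable gives us map, select, etc. for free.
  def each(&block)
    @contents.each(&block)
  end

  def to_s
    attrs = @attr.map { |k, v| " #{k}=\"#{XML.escape(v)}\"" }.join
    return "<#{@name}#{attrs}/>" if @contents.empty?
    body = @contents.map { |c| c.is_a?(XML) ? c.to_s : XML.escape(c) }.join
    "<#{@name}#{attrs}>#{body}</#{@name}>"
  end

  # Assumed helper: escape the characters that are special in XML text.
  def self.escape(text)
    text.to_s.gsub("&", "&amp;").gsub("<", "&lt;").gsub(">", "&gt;").gsub('"', "&quot;")
  end

  # Build a subclass whose constructor fills in the tag name.
  def self.make_subclass(tag)
    Class.new(self) do
      define_method(:initialize) { |*args| super(tag, *args) }
    end
  end
end

H3 = XML.make_subclass(:h3)
page = XML.new(:body, [H3.new("My new blog"), "Hello, "],
               XML.new(:a, {:href => "http://earth.google.com/"}, "world"), "!")
puts page
```

Note that map here comes from Enumerable and returns a plain Array of children, so the node.map example above would still need an XML-returning version.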

Sunday, July 30, 2006

XML-oriented programming with CDuce

Image from flickr by Tom Harpel, CC-BY.
I don't think many people believed me when I was talking about language-level support for XML. But they actually do make languages for XML-oriented programming, and maybe we can borrow some of their ideas. One such language is CDuce, which is based mostly on OCaml. So the first thing we can do is load an XML file:
let posts = load_xml "http://taw:password@del.icio.us/api/posts/all";;
Wow, it wasn't that bad. Unfortunately if we do it like that, posts is going to have type AnyXml and CDuce's type system won't let us code without explicitly dealing with weird kinds of data. So unfortunately we need to specify types:
type Post=<post href=String description=String ..>[];;
type Posts=<posts ..>[Post*];;

let posts :? Posts = load_xml "http://taw:password@del.icio.us/api/posts/all";;
So we have a list of all my del.icio.us entries.
let format_posts (Posts -> Any)
 <posts ..>items ->
 <ul>(map items with
   <post description=dsc href=href ..>_ ->
   <li>[<a href=href>dsc]
 );;
The first line defines a function name and its type. Functions can have types Any -> Any, but then we'd have to catch "other" patterns. So the type system is only moderately annoying here. The rest should become intuitive if you look at it for 5 minutes ;-) And let's just apply the function and print the results:
let the_list = format_posts posts;;

dump_to_file "out.html" (print_xml
 <html>[
   <head>[
     <title>"del.icio.us summary magically generated by CDuce"
   ]
   <body>[
     <h3>"List"
     the_list
   ]
 ]
);;
And here we are, with nicely formatted XHTML output. Of course it has more fancy features, like queries, joins and whatnot. Now the question is - is it worth it, adding XML support at the language level ? I would say yes. XML processing is something that most programs either already do, or would do if it were easier, like when parsing configuration files. And easy things should be easy.

The programming language of 2010

Picture from CuteOverload!
The programming language of 2010 is of course going to be RLisp 2 or Perl 6 or Ruby 3, whichever you think is least of a vaporware ;-)

Now, let's think for a moment, what would that language look like. We cannot just Star Trek ourselves a few years forward, but we can try making an educated guess about it, and getting at least some things right. So my guesses are:
  • It's going to have support for all the common things people are programming, at language or standard library level. That means direct support for: HTTP, TCP/IP, XML, Unicode, i18n. I wouldn't be surprised if it even had AJAX. Some languages are already trying to explore this kind of integration. See for example CDuce. Oh yeah, and GUIs are probably not in this category - they are common, but nobody knows how to make a decent portable GUI library (see recent Java GUI issues).
  • It's going to be simple. C was much simpler than Ada and it won. Java was much simpler than C++ and it won. Ruby is much simpler than Perl and it's winning. Ruby on Rails is much simpler than PHP and J2EE and it's winning. If the language can still surprise you after a few months of using it, has ugly corner cases, and many things that "kinda work, but not always", it means huge loss of productivity.
  • It will cooperate with the outside world. You know what was great about Perl ? You never had to care about dealing with the outside, you could support even the weirdest interfaces with just a few lines of code and get back to coding the real thing. If the outside world changed completely, you needed a few minutes to make your program work again.
  • It will give a damn about security. Most programs run on insecure networks. Take a look at some existing programming languages - C is just great for crackers - it makes writing stack smashing friendly code so easy. And PHP is so SQL injection friendly. I wouldn't be very surprised if the authors of C and PHP were members of the CIA or some other GNAA trying to make people write insecure code so they can exploit it. But people are smarter now, and I think the language of 2010 will make it easy and natural to write secure code.
  • It won't give a damn about performance. Designing for performance is root of all evil. All languages designed for performance suck. You can get most of the performance later. For God's sake, even Java is faster than C++ these days. The language's only ahead-of-time compiler will probably be JIT compiler running in a hacked mode.
  • It will be dynamically typed and completely object oriented, kinda like Ruby. It will also have high-order functions and metaprogramming, kinda like Ruby.
  • It will use reasonably familiar syntax and be based on principle of least surprise. You can change syntax a little - after all syntax of Perl, Python and Ruby is not strictly C-based, but reasonably readable for people with C-syntax background. But if you go too far, people will reject your language. Every programmer's brain has a part that is responsible for parsing. If they used Java-style or Ruby/Python-style syntax all their lives, they will have trouble programming in Lisp or ML style languages. It would take years for them to switch their brains enough for the different syntax to feel natural. The same with semantics - if you changed traditional variables into Prolog style variables, or traditional control structures into monads people would probably reject your language, even if the new semantics was technically superior. That means it can take many iterations before the good idea gets accepted. Like, people went from C to C++ to Java to Ruby to start real object-oriented programming even though Smalltalk was available back then. It was probably too weird back then.
  • It will be implementation-defined and will evolve. Standards are a great way to kill a language, we just need a single good Open Source implementation. All the recent winners were implementation-defined: Perl, Java, Ruby. Compare with standards-based languages like C/C++ which are simply dying now and being replaced by Java and others instead of evolving. And portability of programs written in standard-based C/C++ is so horrible that everybody is using (implementation-defined) stuff like autoconf. So what was the point of standards again ? Oh, and design by committee can lead to horrible results - while a few things about Scheme suck (Scheme has a small standard covering just the basics), for Common Lisp (the paragon of design by committee) it would be more accurate to say that a few things do not suck. Should I even mention that most of the "standards" are not available online for legal reasons ?
  • It will be easy to code interactively and IDE-friendly. Java folks are recently doing some really great things with IDEs that the rest of the world doesn't even know about yet. And we need a way to code everything interactively or many things will be really painful to debug. A common problem with many of the today's systems is that you cannot interactively run client-server programs and you need to debug by ugly hacks. This has to be fixed.
  • It will probably have macro support and something callcc-like. Macros are the most obvious way we can extend power of the languages nowadays, so I guess the language will support those. callcc is used only occasionally in the end programs, but it makes extending infrastructure much easier.
I guess these are the basics. As far as I can tell, Ruby is the closest to the target, but still not quite there.

See also: Paul Graham's idea about language design.

Thursday, July 27, 2006

Polish localization of Linux sucks

The ducks from CuteOverload, as usual.
And unless we do something about it it's going to suck forever. And as long as localizations suck, we'll never be able to fix Bug #1.

In the last few years some things got fixed about i18n - all sane distros use UTF-8 only, have fonts that cover most of the world preinstalled (including CJK), and contain packages with input method switch (scim).

What is needed now is:
  • Instant feedback mechanism - when the translation sucks, the user should be able to correct it easily and submit it upstream as easily. This requires some serious changes in i18n infrastructure, but there is no way we'll be able to have quality localizations to hundreds of languages without instant feedback.
  • Consistency of the translation (like not using "Skopiuj" form half of the time in place of "Kopiuj").
  • Quality check - the translations are often plain wrong. Wrong is worse than unlocalized.
  • Translation of everything. Everything. If something is not translated, it should be considered a serious bug.

Wednesday, July 26, 2006

Using del.icio.us to deal with lack of tags on Blogger

I don't have much experience with blogging websites, but Blogger definitely lacks many useful features, like tags. Nothing is lost, however, we can easily work around it by using del.icio.us. First let's download the archives:

$ wget t-a-w.blogspot.com/2006_0{5,6,7}_01_t-a-w_archive.html
Now let's extract the posts. Because they're in reverse chronological order, we want to explicitly specify the order of files if we don't want to have a total mess. I use a simple regular expression, but the HTML is very nicely tagged, so you can also use a "real" HTML-aware parser:
$ cat 2006_0{7,6,5}* | ruby -e 'STDIN .read .scan( %r[<h3 class="post-title">\s*(.*?)\s*</h3>.*?<a href="(\S+)" title="permanent link">]m) .each{|title, url| print "#{title}\n#{url}\n\n\n"}' >POSTS
Now the POSTS file looks something like this:
$ head POSTS
Free beer for the first 10 cool RLisp programs
http://t-a-w.blogspot.com/2006/07/free-beer-for-first-10-cool-rlisp.html


List of things that suck in Scheme
http://t-a-w.blogspot.com/2006/07/list-of-things-that-suck-in-scheme.html


RLisp gets HTTP support
http://t-a-w.blogspot.com/2006/07/rlisp-gets-http-support.html
We should edit it, by inserting the relevant tags below the URLs:
$ head POSTS
Free beer for the first 10 cool RLisp programs
http://t-a-w.blogspot.com/2006/07/free-beer-for-first-10-cool-rlisp.html
beer ruby rlisp lisp

List of things that suck in Scheme
http://t-a-w.blogspot.com/2006/07/list-of-things-that-suck-in-scheme.html
scheme rant

RLisp gets HTTP support
http://t-a-w.blogspot.com/2006/07/rlisp-gets-http-support.html
OK, so everything's ready and we can send the posts to del.icio.us. Unfortunately the API documentation didn't seem quite right and I kept getting 404 errors. Well, 15 minutes is more or less my attention span, so I just googled for cpan del.icio.us and found Aaron Straup Cope's package.
#!/usr/bin/perl -w
$|=1; # Automatic STDOUT flushing

use Net::Delicious;

@_=<>;
my @p=();
while(@_){
 my $title = shift@_; chomp $title;
 my $url = shift@_; chomp $url;
 my $tags = shift@_; chomp $tags; $tags = "taw blog $tags";
 shift@_;
 push @p, [$title, $url, $tags];
}

my $del = Net::Delicious->new({user=>"taw", pswd=>"password"});
@p = reverse @p;
#while ($p[0][0] ne "Title of the earliest post we want") {shift @p};
for(@p)  {
 my %args = (
   url => $_->[1],
   description => $_->[0],
   tags => $_->[2],
 );
 print "Posting $_->[0]: "; # $title is out of scope here, take it from the record
 my $retval = $del->add_post(\%args);
 print $retval, "\n";
 exit unless $retval == 1;
 sleep 5;
}
And now we have easy access to things like the list of all blog posts about RLisp.
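The four-line record format the script reads (title, URL, tags, blank line) is also easy to handle in Ruby. The following is just a sketch of the parsing half, with no del.icio.us calls at all; the Post struct and the parse_posts name are made up for the illustration.

```ruby
# Hypothetical Ruby sketch of the POSTS parsing step (no network access).
Post = Struct.new(:title, :url, :tags)

# Each record is four lines: title, URL, tags, blank separator.
def parse_posts(text)
  text.lines.map(&:chomp).each_slice(4).map do |title, url, tags, _blank|
    # Prefix the same fixed tags the Perl script adds.
    Post.new(title, url, "taw blog #{tags}".strip)
  end
end

sample = <<~POSTS
  List of things that suck in Scheme
  http://t-a-w.blogspot.com/2006/07/list-of-things-that-suck-in-scheme.html
  scheme rant

  RLisp gets HTTP support
  http://t-a-w.blogspot.com/2006/07/rlisp-gets-http-support.html

POSTS

# Oldest post first, as in the Perl version.
parse_posts(sample).reverse.each do |post|
  puts "#{post.title} [#{post.tags}]"
end
```

A record with an empty tags line simply ends up with the two fixed tags.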

Free beer for the first 10 cool RLisp programs

New release of RLisp is available at http://taw.github.io/rlisp/.

Cool new features include:

  • Lexical scoping, so returning lambdas works as expected without any extra work. (set! var value) special form lets you modify bindings different than the innermost.
  • Standard library separated from the Ruby code, including a lot of the stuff you would be expecting from a minimal but reasonable Lisp-like environment.
  • More hooks into Ruby, so the integration becomes somewhat nicer. You don't have to call (ruby-eval "code code") all the time, unless you want to.
  • Official copyright statement added (MIT licence, no warranty, do what you want). Because there is no clear boundary between system and user program, GPL-like licences would make no sense.
  • Some minimal documentation.
  • Bugfixes, code cleanup, more examples, more features, the usual stuff.
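What "returning lambdas works as expected" means is easiest to show in Ruby, which is lexically scoped too. This is plain Ruby for comparison, not RLisp code, and make_counter is a made-up name.

```ruby
# Each call creates a fresh binding of count that the returned lambda closes over.
def make_counter
  count = 0
  -> { count += 1 } # modifies the captured binding, much like set! would
end

first = make_counter
second = make_counter

first.call
puts first.call  # the first counter has been called twice
puts second.call # the second counter has its own, independent count
```

With dynamic binding the lambda would look count up at call time, after make_counter has already returned, and fail.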
Now RLisp needs a field test. I guess that writing some cool small to medium sized programs is the best way to test the design and correct it if any major deficiencies are found.

So the first 10 people who write a cool RLisp program get a free beer from me. I would also like to add these programs to the RLisp repository, so people who want to learn RLisp can see some real programs, not just toy examples. I'm usually in Germany or Poland if you want to collect the prize :-D

The program should do something cool, and if possible be readable and take advantage of RLisp's features. Using existing Ruby modules in your program is perfectly ok, after all integration with Ruby is one of the main features of RLisp.

Monday, July 24, 2006

List of things that suck in Scheme

So here's another post in a great series that so far featured Ruby, Ocaml and various other things. So now the list of things that suck about Scheme, as defined in the R5RS standard. You may want to take a look at the standard, because it's reasonably short and well-written compared to the average standard.

  • No support for such basic data types as arrays (vectors are not resizable, so they don't count), and hash tables/dictionaries. Most programs need both, and in Scheme one needs to either reimplement them in every program or hack around this lack. Compare with handling of numerical types, which are overspecified far beyond needs of typical program.
  • The standard has horribly complex support for numeric types. Vast majority of programs require a single numeric type - bignum. Significant minority of programs require a second numeric type - hardware-native (typically IEEE754 double precision) floating point numbers. Number of programs that use rationals or complex numbers or the rest of the extremely complex type scheme is within statistical error equal to zero. I'd guess that regular expressions, XML handling, or POSIX file system operations are all >100x more common than those weird types, and resizable arrays and hash tables/dictionaries are like >1000x more common. One would think that the number system is at least any good if it takes significant portion of the standard ? Actually it sucks, it's impossible to easily extend it to add support for your own types (like vectors or matrices). In the end - huge loss.
  • Character type is totally broken. What a character really is is a very short string. If we use Unicode (and we do use Unicode), a string is a series of code points. Now the characters are one or more code points. For example U+0041 code point is simply Latin Capital Letter "A", but U+0065 U+0301 together make a single character Small Latin Letter "e" with Acute Accent (é). So no - code point is not a character. And Scheme explicitly prohibits implementing characters as strings - they are explicitly required to be separate types. This is totally broken. Most languages suck at Unicode support at the implementation level, but Scheme managed to do that at the standard level. I know it's older than Unicode, but i18n issues existed back then.
  • Scheme is not even minimally object oriented. So no generic comparisons (> a b), no shallow copy operator (dup x), no type queries (is_a? a 'blob). Half of that functionality is hacked in highly inelegant way, so you can ask objects whether they're strings by (string? x), but this is a new hack for every type. Goo gets it right.
  • Multiple value return. This is the semantic equivalent of Sam the ugliest dog. Functions can return (values 1 2) instead of (list 1 2) and then you can do some magic to get the extra values back. It was probably marginally faster than using lists/tuples/arrays to return multiple values. Oh yeah, assembly performance hacks and let's screw the semantics.
  • Highly excessive indentation levels. Compare Scheme:
    (define veclen
      (lambda (x y)
        (let ((x2 (* x x)) (y2 (* y y)))
          (let ((z2 (+ x2 y2)))
            (sqrt z2)
          )
        )
      )
    )
    with RLisp:
    (defun veclen (x y)
      (let x2 (* x x))
      (let y2 (* y y))
      (let z2 (+ x2 y2))
      (sqrt z2)
    )
    You can be a Lisp without such excess (it seems that Arc is also saner here). Oh yeah, and there are let* and letrec to confuse everyone (Ocaml has the same problem with let rec) .
  • Macro system seems harder to use and less powerful compared to Common Lisp's and RLisp's. But in the real life, all implementations of Scheme seem to provide "real" macros.
  • car/cdr/lambda - come on, just throw away the historical garbage. The only true names are hd/tl/fn.
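The code point versus character distinction from the list above can be checked in any Unicode-aware language. Here is a small sketch in Ruby (not Scheme); it assumes Ruby 2.5 or later for String#grapheme_clusters.

```ruby
# "A" is one code point and one character.
a = "A"
# "e" followed by a combining acute accent: two code points, one character.
e_acute = "e\u0301"

puts a.codepoints.length                # number of code points in "A"
puts e_acute.codepoints.length          # number of code points in the accented e
puts e_acute.grapheme_clusters.length   # number of user-perceived characters

# Each grapheme cluster is itself just a very short string.
puts e_acute.grapheme_clusters.first.class
```

So even here the "character" is represented as a short String, which is exactly the representation R5RS rules out.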

RLisp gets HTTP support

This isn't quite a hello-world, because I want to show how to use macros for building HTTP services. First, we need to tell Ruby to load the webrick module and bind the class name into the RLisp namespace (there is no automatic importing of constants, maybe there should be):

(ruby-eval "require 'webrick'") ; import module
(let HTTPServer (ruby-eval "WEBrick::HTTPServer")) 
Now let's create the server object and tell it to bind to port 1337 instead of default port 80. Configuration is simply a hash table:
; Configure the server
(let config [Hash new])
[config set 'Port 1337]
; Tell the class to make us a server object
(let server (send HTTPServer 'new config))
Now we could basically send mount_proc message to server object to bind a lambda to URL.
; Tell server to call our Hello World handler
(send server 'mount_proc "/hello"
   (lambda (req res)
       [res body= "<html><body><h3>Hello, world!</h3></body></html>"]
       [res field_set "Content-Type" "text/html"]
   )
)
But this is not very nice. Let's macro around this definition. First let's make a macro that lets us bind more easily:
; Define a macro for HTML mounts
(defsyntax def-server-html-mount args
   `(send ,[args get 0] 'mount_proc ,[args get 1]
       (lambda (req res)
           (send res 'field_set "Content-Type" "text/html")
           (send res 'body= (do ,@(tl (tl args))))
           (print res)
       )
   )
)
Now we can rewrite the original handler to:
; Tell server to call our Hello World handler
(def-server-html-mount server "/hello"
   "<html><body><h3>Hello, world!</h3></body></html>"
)
The mount is as good as it gets, but we'd still rather have nice syntax for HTML. So let's define a macro that creates html-generating functions:
; Define a macro for HTML tag function
(defsyntax def-html-tag (tagname)
   `(defun ,tagname args
       (+
       "<"  (send ',tagname to_s) ">"
       [args join]
       "</" (send ',tagname to_s) ">"
       )
   )
)
And define a few such functions:
(def-html-tag html)
(def-html-tag body)
(def-html-tag h3)
Could it be any more cute than that ?
(def-server-html-mount server "/hello2"
   (html (body (h3 "Macros Greet you")))
)
Happy about the results, we start the server:
; Tell the server to go !
(send server 'start)
RLisp is available for download from http://taw.github.io/rlisp/. Previous posts about RLisp are here [1] and here [2].
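For comparison, the def-html-tag trick can be approximated in plain Ruby with metaprogramming instead of macros. This is only a sketch: the HtmlTags module and def_html_tag name are invented here, and the WEBrick part is left out.

```ruby
# Hypothetical Ruby analogue of the def-html-tag macro.
module HtmlTags
  # Define a module-level function named after the tag that wraps
  # its (already rendered) arguments in an opening and closing tag.
  def self.def_html_tag(tagname)
    define_singleton_method(tagname) do |*args|
      "<#{tagname}>#{args.join}</#{tagname}>"
    end
  end

  def_html_tag :html
  def_html_tag :body
  def_html_tag :h3
end

page = HtmlTags.html(HtmlTags.body(HtmlTags.h3("Macros Greet you")))
puts page
```

The difference is that Ruby generates the functions at run time, while the RLisp macro expands into a defun before evaluation.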

Sunday, July 23, 2006

gaim sucks

Partial list of gaim's suckiness, as far as gadu gadu protocol support is concerned:

  • gaim does not support search (really, it doesn't, not even a bit). You can't check in the public directory whom you are speaking with.
  • gaim does not quietly reconnect if the connection is broken, even when I use a plugin for that (yeah, there's an actual plugin to get half-sane behaviour). With the plugin, it will pop up the contact list. Without it, it will flood me with error messages. And connection to gadu gadu servers gets broken all the time.
  • gaim does not encrypt connections to gadu gadu servers, and it does not support end-to-end encryption
  • gaim does not support emoticons
  • gaim does not display the last few old messages when opening a chat and does not have single keystroke history. You actually have to open history (Conversation menu > View log), then select month from the tree view, then select day, and then scroll down. Seriously, the guy who designed this UI must have been stoned.
  • on your contact list, you cannot have inactive contacts shaded out and on the bottom. You either see no inactive/hidden contacts (which is lame, because half of the people are in hidden state all the time), or you have them sorted alphabetically (which is even lamer, because you can't see who's there).

Code generator works

Code generator for the Renderman shading language compiler I'm writing for a master's thesis works. The world is one (small) step closer to raytracing graphic cards ;-). So now it can go all the way through, from source code to abstract syntax tree to assembly to pretty picture. So it can compile this thing:

color silly(point P; point Eye; color near_color; color far_color)
{
   vector ray = P-Eye;
   float ray_dist = sqrt(ray.ray);
   if(ray_dist < 0)
       ray_dist = 0;
   if(ray_dist > 1)
       ray_dist = 1;

   color c = near_color + (far_color-near_color) * ray_dist;

   return c;
}
To this assembly:
   add R1.xyz, R0.xyz, -R1.xyz
   dp3 R2.w, R1.xyz, R1.xyz
   mov_rsq R15.w, R2.w
   mov_rcp R15.w, S.w
   add R15, 0.0, -S.w + jmp 4 xyzw (>0)
   mov R3.w, S.w
8:
   add R15, R3.w, -1.0 + jmp 7 xyzw (>0)
9:
   add R3.xyz, R3.xyz, -R2.xyz
   mul R3.xyz, R3.xyz, R3.w
   add R0.xyz, R2.xyz, R3.xyz
   return
7:
   mov R3.w, 1.0
   jmp 9
4:
   mov R3.w, 0.0
   jmp 8
For now it only supports functions which pretend to be shaders, not "real" surface shaders. And more about our cool hardware here.

RLisp gets basic OO support

Update on the state of RLisp. Using Ruby metaprogramming facilities it's possible to do some basic OOP in RLisp:

$ cat examples/class.rl
(let Point [Class new])
[Point attr_accessor 'x]
[Point attr_accessor 'y]
[Point define_method 'to_s
 (lambda x
 (+
     "<"
     [[self x] to_s]
     ","
     [[self y] to_s]
     ">"
 ))
]
(let a [Point new])
[a x= 2]
[a y= 5]
(print a)
$ ./rlisp < examples/class.rl
<2,5>
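The RLisp calls above are thin wrappers around ordinary Ruby metaprogramming. As a rough comparison, the same class could be built in plain Ruby like this (send is used because attr_accessor and define_method were private in older Ruby versions):

```ruby
# Plain Ruby counterpart of examples/class.rl, built without the class keyword.
Point = Class.new
Point.send(:attr_accessor, :x, :y)
Point.send(:define_method, :to_s) do
  # Inside define_method, self is the Point instance.
  "<#{x},#{y}>"
end

a = Point.new
a.x = 2
a.y = 5
puts a # prints <2,5>
```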
And it has hash tables:
$ cat examples/hash.rl
(let ht [Hash new])
[ht set 'a 2]
[ht set 'b 4]
[ht set 'a 6]
(print (+ [ht get 'a] [ht get 'b]))
(print ht)
$ ./rlisp.rb <examples/hash.rl
10
{:a=>6, :b=>4}
Iterators:
$ cat examples/iter.rl
(let a '(1 2 3))
(sendi a 'each
 (lambda (i)
     (print i)
 )
)
$ ./rlisp.rb <examples/iter.rl
1
2
3
Lists (car/cdr/cons do not share storage, they create a new list):
$ cat examples/array.rl
(let a [Array new])
(print (car '(1 2 3)))
(print (cdr '(4 5 6)))
(print (cons 7 '(8 9)))
[a push 1]
[a push 2]
[a shift]
(print a)
$ ./rlisp.rb <examples/array.rl
1
(5 6)
(7 8 9)
(2)
And (drum drum drum drum) it even has macros:
$ cat examples/macro.rl
(defsyntax cond args
 `(if (>= (length ',args) 2)
      (if (eval (hd ',args))
          (hd (tl ',args))
          (cond ,@(tl (tl args)))
      )
      (if (eq? (length ',args) 1)
        (hd ',args)
        nil
      )
 )
)

(defun sign (x)
 (cond
     (>  x 0) 1
     (== x 0) 0
     -1
 )
)

(print (sign 42))
(print (sign 0))
(print (sign -8))
$ ./rlisp.rb <examples/macro.rl
1
0
-1

RLisp - Lisp naturally embedded in Ruby

I just wrote a Lisp interpreter embedded in Ruby. I used ANTLR 3.0ea7 for parsing (it's alpha-quality software, unfortunately).

$ cat examples/fib.rl
(defun fib (n)
(if (<= n 1)
  1
  (+ (fib (- n 1)) (fib (- n 2)))
 )
)
(map fib '(1 2 3 4 5))
$ ./rlisp.rb <examples/fib.rl
#<lambda:0 ...>
(1 2 3 5 8) 
The dialect is supposed to be vaguely based on Scheme, but the idea was for it to be naturally embedded in Ruby, so many things had to be changed compared to Scheme:
  • send function for method calls: (send 2 '+ 2) and syntactic sugar for it [obj method arguments], which expands to (send obj 'method arguments). So it's possible to write [2 + [3 * 4]], which expands to (send 2 '+ (send 3 '* 4)). The syntax was stolen from Objective C.
  • lists are implemented as arrays. So no cons, car or cdr. Now of course we need some replacement, and I haven't really thought about it. You can send get/set method to Array object, like that ('(a b c) set 0 'd) (or ['(a b c) get 0]).
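The bracket sugar is easy to model in Ruby itself. The sketch below is not how RLisp is implemented; it just treats a nested Ruby Array as a stand-in for a [receiver method arguments] form and evaluates it with send (eval_send is a made-up name).

```ruby
# Toy model of the [obj method args] sugar: an Array [receiver, :method, *args]
# stands for a bracket form; anything else evaluates to itself.
def eval_send(form)
  return form unless form.is_a?(Array)
  # Evaluate nested forms first, then dispatch the method call.
  receiver, method, *args = form.map { |part| eval_send(part) }
  receiver.send(method, *args)
end

# [2 + [3 * 4]] corresponds to 2.send(:+, 3.send(:*, 4))
puts eval_send([2, :+, [3, :*, 4]])
```

Because arrays are used for the forms themselves, this toy cannot pass a literal array as an argument; a real implementation would keep code and data apart.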
Things that would be required before RLisp is usable (even as a funky toy):
  • macro system - doh, that's the point of the whole thing
  • links to Ruby object system, so we can define classes, methods etc.
  • some support for iterators. I have no idea what syntax to select for iterators; (send-with-iterator '(1 2 3) 'each (lambda (i) (print i))) would probably do the trick, but it's kinda uglyish.
  • support for other basic Ruby objects like hash tables, regular expressions etc. Most of them need little more than (let a (send Hash 'new)), but some might not be very usable in such form.
  • exception handling and callcc support would be cool
Oh, and it uses dynamic binding like Emacs Lisp instead of lexical like Scheme/Common Lisp, because it was much easier to do ;-). It should probably be fixed later. My box on which the code repository is located is behind a nasty nat, so just mail me if you want the code. :-)

Tuesday, July 18, 2006

Language popularity metrics


We all agree with Bjarne Stroustrup of C++ fame when he says that there are only two kinds of languages - those that nobody uses and those that everybody bitches about. So let's ask Google what are the most popular languages nowadays:

  • "Java sucks" 33,200
  • "PHP sucks" 29,100
  • "C++ sucks" 14,100
  • "C sucks" 12,800
  • "Perl sucks" 9,170
  • "Cobol sucks" 1,010
  • "Ada sucks" 961
  • "Python sucks" 694
  • "Scheme sucks" 631 (most about color scheme sucking, not about the language)
  • "Ruby sucks" 602
  • "Lisp sucks" 593
  • "C# sucks" 555
  • "Fortran sucks" 205
  • "Prolog sucks" 127
  • "Visual Basic sucks" 79
  • "Smalltalk sucks" 74
  • "OCaml sucks" 61
  • "SML sucks" 7 (none related to the language)
I guess some of the rants are pretty old, so it includes also past popularity. Anyway - cool thing C/C++ are finally going down - they're getting very few new rants, as most people moved to something else. Now that rocks.

Another interesting thing - why do languages suck ? Java used to suck because it's "not C++", but nowadays it sucks because it's "not Smalltalk/Ruby".

How about Ruby ?
And Python sucks:

Monday, July 17, 2006

new kind of hacking

There are two kinds of objects, those with identity and those without. Objects without identity (value objects) are considered equivalent if they have the same fields, are usually used for boring stuff, and are usually immutable. On the other hand two objects with identity are considered different even if they have exactly the same state at the moment - they are usually mutable and interesting. It's kinda intuitive that a and b are the same thing, but c and d are not:

a = Complex.new(1.0, 2.0)
b = Complex.new(1.0, 2.0)
c = Array.new()
d = Array.new()
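In current Ruby the difference can be checked directly. The sketch below uses the Complex() conversion function rather than Complex.new (which is not public in recent Ruby versions), and relies on == for value equality and equal? for identity.

```ruby
a = Complex(1.0, 2.0)
b = Complex(1.0, 2.0)
c = Array.new
d = Array.new

# Value objects: same fields means "the same thing", and nobody cares
# that a and b happen to be two separate objects in memory.
puts a == b
puts a.equal?(b)

# Objects with identity: c and d start out with the same state,
# but changing one must not change the other.
c << :item
puts c.inspect
puts d.inspect
```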
Now what if we have some value objects and we want to attach state to them ? For example we want to mark some complex numbers as lucky while others as unlucky ? Doing it naively wouldn't work:
a = Complex.new(1.0, 2.0)
a.lucky = true
b = Complex.new(1.0, 2.0)
print b.lucky # -> false
And it's not obvious how to do that without introducing huge changes in the program. Well in Java at least. Now in Ruby on the other hand, because new is actually a normal method and not a magical constructor, we need just 3 lines of code...
class ValueClass
  def self.new(*args)
    @world ||= {}
    @world[args] = super(*args) unless @world.include? args
    @world[args]
  end
end

class Person < ValueClass
  attr_reader :first_name, :last_name
  attr_accessor :sucks
  def initialize(first_name, last_name)
    @first_name = first_name
    @last_name = last_name
  end

  def to_s
    "#{first_name} #{last_name} #{sucks ? 'sucks' : 'rocks'}\n"
  end
end

a = Person.new("Weasley", "Crusher")
a.sucks = true

b = Person.new("Weasley", "Crusher")
print b # -> Weasley Crusher sucks
and it correctly states that Weasley Crusher sucks. Even works with subclassing:
class Dog < Person
  def to_s
    "Dog #{first_name} #{last_name} #{sucks ? 'sucks' : 'rocks'}\n"
  end
end

a = Person.new("Bill", "")
a.sucks = true
print a # -> Bill sucks (print uses to_s, p would show the inspect form)

b = Dog.new("Bill", "")
print b # -> Dog Bill rocks

Saturday, July 15, 2006

Football rules

Of course by football I mean Unreal Tournament 2004 Bombing Run.

As an old Quake3Arena player, I'm not very enthusiastic about weapons and the other stuff in UT2004 (railgun equivalents don't instafrag ? come on), but the new modes are really cool. I mean, one can't play only FFA, Team Deathmatch and CTF all the time. Bombing Run is particularly funny, it's just like football only with guns :-D. Some other cool modes are Mutant (one player gets superpowers, everyone is trying to kill him, one who kills the mutant becomes new mutant) and Assault (well, a scenario).

last.fm got new look

It's so pretty now, isn't it ? :-)

Friday, July 14, 2006

Gmail sucks

Gmail is kinda cool for regular mails, but it's horrible for mailing lists.

The feature it lacks is "ignore this thread" button. Very often there are threads that are completely uninteresting, or maybe they were interesting at first, but became flamewars later. Now the notifier is flashing every 5 minutes and you don't know whether it's an actual mail or simply the flamewar going on.

There isn't even any obvious way to make the notifier ignore mailing list traffic. First, one cannot create filters on List-Id (only on From, To, Subject and body), and even if it worked, the Firefox Gmail notifier doesn't have such an option.

Kmail supposedly has this thing done right, but I'm not very enthusiastic about having one more big program running all the time on this small machine, and email being available only when I'm here. I'll probably just drop all mailing lists, they're a waste of time anyway ;-)

Metaprogramming koans

I found more koans for expanding awareness. Compared to the callcc koans, these are much simpler, except for the trick with evaluating a default block argument in instance context.

It seems that most people did it Scheme way (with hygienic macros analogue), but I followed the Common Lisp path to enlightenment with gensym and $gensym_counter ^^; That was pure fun evil. Of course binding variables in lexical context is almost always better than gensyming in Ruby.

Continuation koans

Zen Panda from Airy Nothing.
Callcc is the ultimate control structure, but it requires some getting used to. Continuation koans will help you meditate callcc to expand your awareness.
taw@taw-desktop:~/koans$ ./continuation_koans.rb throwcatch.rb
koan_01 has expanded your awareness
koan_02 has expanded your awareness
koan_03 has expanded your awareness
koan_04 has expanded your awareness
koan_05 has expanded your awareness
koan_06 has expanded your awareness
koan_07 has expanded your awareness
koan_08 has expanded your awareness
koan_09 has expanded your awareness
koan_10 has expanded your awareness
mountains are again merely mountains
As far as I know the only mainstream languages in which callcc is available are Ruby and Scheme (and maybe Smalltalk). There was once a Python fork (Stackless Python) with callcc, but it's dead. And there is some talk about adding callcc to Perl 6, but we all know it's the Duke Nukem Forever of programming languages ;-)
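For anyone who hasn't seen callcc in action, here is about the smallest useful sketch I can think of: an early exit from a loop. It is not one of the koans, and first_negative is a name made up for this example. On Ruby 1.9 and later callcc lives in the continuation library, on 1.8 it is built in, hence the guarded require.

```ruby
begin
  require 'continuation'   # needed on Ruby 1.9+; not present (or needed) on 1.8
rescue LoadError
end

# Returns the first negative number, or nil if there is none.
# Calling k makes the whole callcc block return immediately with that value.
def first_negative(numbers)
  callcc do |k|
    numbers.each { |n| k.call(n) if n < 0 }
    nil                    # reached only when nothing was negative
  end
end

p first_negative([3, 1, -4, 1, -5])   # -4
p first_negative([1, 2, 3])           # nil
```

This is pretty much what throw/catch gives you. The interesting part of continuations is that k can also be stored and called again later, which is where the getting used to comes in.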

Thursday, July 13, 2006

Autovivification in Ruby

W00t, one of my favourite Perl features can be ported to Ruby in just a single line of code. As everyone knows, in Perl it's possible to write:

$x{a}{b}{c}{d} = "e";
And all intermediate-level hashtables will be created automagically. This is often extremely useful. On the other hand in lesser languages like Python or Java the only way to set a value in a nested hash is to create intermediate-level hashtables by hand:
x={}
x["a"] = {}
x["a"]["b"] = {}
x["a"]["b"]["c"] = {}
x["a"]["b"]["c"]["d"] = "e"
This may be a small thing, but multi-level hashes are used so often that one actually misses this functionality. For a long time I thought that Ruby didn't have autovivification. And well, it doesn't have the full Perl thing, but it can get reasonably close with just this small definition:
def autovivifying_hash
   Hash.new {|ht,k| ht[k] = autovivifying_hash}
end
And now you can say:
x = autovivifying_hash
x["a"]["b"]["c"]["d"] = "e"
And that's going to be good enough most of the time :-)
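A short sketch of how this behaves in practice, including the one catch I know of: merely reading a missing key creates it, so use key? when you only want to look. The variant built on default_proc is an alternative spelling of the same idea, and the variable names are just for illustration.

```ruby
# The block given to Hash.new runs on every lookup of a missing key.
def autovivifying_hash
  Hash.new { |hash, key| hash[key] = autovivifying_hash }
end

x = autovivifying_hash
x["a"]["b"]["c"] = "d"
p x                    # {"a"=>{"b"=>{"c"=>"d"}}}

p x.key?("missing")    # false - key? does not trigger the block
x["missing"]           # a plain read...
p x.key?("missing")    # true  - ...has created an empty nested hash

# Same thing without a helper method: reuse the parent's default block.
y = Hash.new { |hash, key| hash[key] = Hash.new(&hash.default_proc) }
y[:one][:two][:three] = 3
p y[:one][:two][:three]   # 3
```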

Wednesday, July 12, 2006

List of things that suck in Ruby


As Bjarne Stroustrup said: "There are only two kinds of programming languages: those people always bitch about and those nobody uses". Well, let's bitch about Ruby. And not about lame things like performance or libraries, but about expressiveness.

  • It's not possible to change class of an existing object. Now, why the heck would anybody want to do that ? It is actually quite useful - you use object factory to create user interface widgets from XML, determining their look, and then move them to subclass to determine their behaviour. Without this we have to use some ugly tricks. evil.rb lets us do that in some cases, but not with UI widgets. Languages where it works: Perl.
  • Ruby has really nice and concise syntax with one nasty exception - one needs to write end almost all the time. Relying on whitespace instead would make the code look much nicer. Languages where it works: Python, Haskell.
  • Keyword arguments to function are not ordered. foo :a => 1, :b => 2 is identical to foo :b => 2, :a => 1, and that's often not what we want. Languages where it works: Perl, PHP.
  • irb doesn't remember command history between sessions. Languages where it works: Octave.
  • Most libraries use Java-style constants with huge namespace prefixes instead of real Ruby symbols. So we have Gtk::Widget::TEXT_DIR_RTL which evaluates to some number instead of :rtl which would simply be converted to number on call (actually it's converted to magical object, not to a number, but that makes little difference). This makes many APIs a lot harder to remember and a lot less natural. And adding Enums to Ruby would be going in completely wrong direction. Languages where it works: it sucks in all languages I know, but that's no excuse ;-).
  • Symbol is not Comparable. I don't care what's the order between :foo and :bar, but there should be some. Without it, it's impossible to sort structures that contain Symbols (to get canonical representations etc.), and that seriously limits Symbol's usability. Languages where it works: Prolog. It does not work in Scheme or Common Lisp.
I think all these issues are reasonably easy to fix.
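Until the Symbol one gets fixed, a workaround sketch is to sort by the String name instead. canonical_pairs is a hypothetical helper name for this example. As far as I know, Ruby 1.9 and later define Symbol#<=>, so there plain sort works too.

```ruby
# Sort Symbols by their names; works whether or not Symbol is Comparable.
symbols = [:foo, :bar, :baz]
p symbols.sort_by { |sym| sym.to_s }      # [:bar, :baz, :foo]

# Canonical representation of a Hash with Symbol keys:
# an Array of [key, value] pairs ordered by key name.
def canonical_pairs(hash)
  hash.sort_by { |key, _value| key.to_s }
end

p canonical_pairs({ :b => 2, :a => 1 })   # [[:a, 1], [:b, 2]]
```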

Monday, July 10, 2006

More movies


Some recently watched movies, in approximate order of coolness:

  • All About Eve - A great movie about how evil people can be in their quest for fame and greatness. There are well developed characters, great acting, and even an actual plot.
  • Hotel Rwanda - A genocide story. It's much more realistic and interesting than all the lame Holocaust movies. The characters actually feel real, completely unlike Schindler's List cardboard.
  • Chinatown - A film noir classic. It is evil, it is sick, it has complex plot, and it is fun.
  • Touch of Evil - Another film noir classic. A little less twisted than Chinatown, but with more cool camera effects. Like the must-see opening scene.
  • The Great Dictator - I thought a Charlie Chaplin movie would be an idiotic comedy about slipping on banana peels and stuff like that. Surprisingly it wasn't. The movie tries to be serious and funny at the same time, and it kinda works.
  • Rebecca - I kinda liked it, but it was so slow. And the characters somehow didn't seem very believable.
  • 2001: A Space Odyssey - A "hard science fiction" classic that is lamer than Star Trek: The Original Series as far as realism is concerned. Were people back then so scared of the computers or what ? And the monkey scenes are not even silly, they are plain dumb.
  • The Wizard of Oz - Now who the heck voted that to IMDB top list ? The movie throws away all social commentary from the original book, and replaces it with a children's story. So lame.

Monday, July 03, 2006

Unit testing

Good karma for Ruby and for Extreme Programming today. I've just tried to port libgmp-ruby from handmade testing system to Test::Unit. It is totally sweet:


#!/usr/bin/ruby

require 'test/unit'
require 'gmp'

class TC_Z < Test::Unit::TestCase
   def test_init_null
       assert_equal(GMP::Z.new(), 0, "GMP::Z.new() should initialize to 0")
   end

   def test_init_fixnum
       assert_equal(GMP::Z.new(1), 1, "GMP::Z.new(x : Fixnum) should initialize to x")
   end

   def test_init_z
       b = GMP::Z.new(1)
       assert_equal(GMP::Z.new(b), b, "GMP::Z.new(x : GMP::Z) should initialize to x")
   end

   def test_init_string
       assert_equal(GMP::Z.new("1"), 1, "GMP::Z.new(x : String) should initialize to x")
   end

   def test_init_bignum
       assert_equal(GMP::Z.new(2**32), 2**32, "GMP::Z.new(x : Bignum) should initialize to x")
   end
end
Executing this script runs all the tests:

$ ./unit_tests_1.rb
Loaded suite ./unit_tests_1
Started
.....
Finished in 0.006574 seconds.

5 tests, 5 assertions, 0 failures, 0 errors
Do you see any executable code, any list of tests to run ? Nah, it works by smoke and mirrors. We're going to leave XML BDSM to Java programmers, they seem to like it. ;-) It even found an actual bug in libgmp-ruby (something was pointing the wrong way around) in the first 5 minutes of testing. That's kinda cool. You can use multiple assertions per test, fixtures and so on. It still works by magic.

class TC_Q_Basic < Test::Unit::TestCase
   def setup
       @a=GMP::Q.new(100,11)
       @b=GMP::Q.new(200,17)
       @c=GMP::Z.new(40)
       @d=2**32
   end

   def test_add
       assert_equal(@a + @b, GMP::Q(3900, 187),       "GMP::Q should add GMP::Q correctly")
       assert_equal(@a + @c, GMP::Q(540,  11),        "GMP::Q should add GMP::Z correctly")
       assert_equal(@c + @a, GMP::Q(540,  11),        "GMP::Z should add GMP::Q correctly")
       assert_equal(@a +  2, GMP::Q(122,  11),        "GMP::Q should add Fixnum correctly")
       assert_equal(@a + @d, GMP::Q(47244640356, 11), "GMP::Q should add Bignum correctly")
       assert_equal( 2 + @a, GMP::Q(122,  11),        "Fixnum should add GMP::Q correctly")
       assert_equal(@d + @a, GMP::Q(47244640356, 11), "Bignum should add GMP::Q correctly")
   end
end
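The smoke and mirrors is mostly reflection: find every subclass of the test case class, find every method whose name starts with test_, run each on a fresh object. Here is a toy sketch of that idea. It is not Test::Unit's actual code, and TinyTestCase, run_all and TinyArithmeticTest are names made up for this example.

```ruby
# Toy runner showing test discovery by reflection (not Test::Unit itself).
class TinyTestCase
  @subclasses = []

  class << self
    attr_reader :subclasses
  end

  # Ruby calls this hook whenever somebody subclasses us.
  def self.inherited(klass)
    super
    TinyTestCase.subclasses << klass
  end

  def assert_equal(expected, actual, message = nil)
    return if expected == actual
    raise(message || "expected #{expected.inspect}, got #{actual.inspect}")
  end

  # Runs every test_* method of every known subclass on a fresh instance.
  def self.run_all
    results = { :tests => 0, :failures => 0 }
    TinyTestCase.subclasses.each do |klass|
      names = klass.public_instance_methods(true).map { |m| m.to_s }
      names.grep(/\Atest_/).sort.each do |name|
        results[:tests] += 1
        begin
          klass.new.send(name)
        rescue RuntimeError
          results[:failures] += 1
        end
      end
    end
    results
  end
end

class TinyArithmeticTest < TinyTestCase
  def test_addition
    assert_equal(4, 2 + 2, "2 + 2 should be 4")
  end

  def test_deliberately_broken
    assert_equal(5, 2 + 2, "this one fails on purpose")
  end
end

results = TinyTestCase.run_all
puts "#{results[:tests]} tests, #{results[:failures]} failures"   # 2 tests, 1 failures
```

The real library adds the part where the tests start on their own when the script ends, plus setup, teardown and proper reporting, but nobody has to write a list of tests anywhere.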

Sunday, July 02, 2006

Linux permissions system rant


Hello, today I want to rant about Linux permission system and how totally broken it is.

Summary of the system: In the Linux permission system each process gets one User-ID and a set of Group-IDs. Objects in the file system belong to some user, and may also be accessible to some group. Processes with the same User-ID can control each other (stop, look inside etc.). Also any process with User-ID 0 (root) can do whatever it wants to the system, including changing its own User-ID/Group-IDs. Some executable files on the system may have the SETUID/SETGID flag set, which means that when they're run they get an extra (effective) User-ID/Group-ID, that of the file's owner.

This is pretty much the same system Unix had 20 years ago. I think we should drop it completely and get one that actually works.

Here's a short list of things that are wrong with the system:

  • Group-IDs are per-process, not per-user. So if user fred is added to group audio, so he can play music on the computer now, the programs he is running still aren't in group audio ! So he has to log out and log in again before the change is effective. Reboot after every change ? Is it Windows 95 or what ?
  • Of course the permission system isn't even supposed to run like that. The only user who can control audio should be the one that is currently logged in at the physical console. One that is logged in remotely should not have any access to the audio system. This is pretty much unimplementable with the current permission scheme.
  • There are no sandboxes, in particular there are no sandboxes for normal users. So it's impossible to run untrusted programs without risk.
  • There is no real nouser/nogroup. Each program in nouser/nogroup can mess with other programs in nouser/nogroup.
  • There are no per-process root sandboxes either. So one cannot start a foobard server in a way that even if the server is compromised, it has access to nothing outside. If the server runs as foo User-ID it can mess with other servers with the same User-ID. Even if there's no other program with the same user-id now, it can still create a SETUID binary and control servers running with the same User-ID in the future.
  • Normal users cannot install programs using standard interfaces. They should be able to install whatever they want for themselves (even if system-wide installations still require administrator permissions).
  • For single-user systems, having to remember user and administrator password is silly. Well, Ubuntu is a bit better here using sudo instead of a root account.
  • Users cannot make public only one subdirectory of their home directory without granting some access to their whole directory. One might want public ~/public_html/ and private everything else. It can't be done (unless public_html is outside the normal directory hierarchy).
  • There are no guest accounts, which would be created just for one session without being able to affect other guests.
So it sucks for single-user desktop, it sucks for multi-user desktop, it sucks for public access terminals, and it sucks for the server. Why are we still keeping it ?
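The per-process Group-ID problem from the first point is easy to see for yourself. Below is a small sketch using Ruby's standard Process and Etc modules. The helper names are made up for this example, and it only looks at supplementary groups listed in the group database.

```ruby
require 'etc'

# Group-IDs this very process holds (inherited from the login session).
def process_group_ids
  (Process.groups + [Process.gid]).uniq.sort
end

# Group-IDs the group database currently lists for the given user name.
def database_group_ids(user_name)
  ids = []
  Etc.group { |entry| ids << entry.gid if entry.mem.include?(user_name) }
  ids.uniq.sort
end

# Groups already granted in the database but missing from this process,
# i.e. the ones that start working only after logging in again.
def pending_group_ids(user_name)
  database_group_ids(user_name) - process_group_ids
end

user = Etc.getlogin || ENV["USER"] || "nobody"
puts "uid #{Process.uid}, effective uid #{Process.euid}"
puts "process holds groups: #{process_group_ids.inspect}"
puts "not yet effective for #{user}: #{pending_group_ids(user).inspect}"
```

Add yourself to a new group, run it again from the same terminal, and the new group shows up in the last line until you log out and back in.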

Full Metal Panic: The Second Raid

Some more recently watched stuff - Full Metal Panic: The Second Raid is more like the original series, and not like Fumoffu. It is sometimes a bit darker than the original and I'd rather they didn't add those parts to an inherently cheerful series like FMP.

This is quite funny - in Full Metal Panic the part I like most is the filler, and the main plot is passable but not all that great. My favourite series is Fumoffu and it consists of 100% filler and pretty much no plot.

Oh well, there is enough filler in TSR too, so it isn't that bad :-)