Tuesday, October 09, 2007

Two months with Ruby on Rails

Migraine! by annia316 from flickr (CC-BY)
I've been coding Ruby on Rails for the last two months and this rant is long overdue. There are just so many thing that are wrong with Ruby on Rails. Being better than PHP or J2EE is not enough to get away from a quick bashing on my blog.

Views


I don't hate HAML any more. Total hatred was my first reaction to it but I more or less got used to it. The main problem with HAML, RHTML and probably all other solutions is not providing any sort of XSS protection. Hand-escaping all strings in views is almost PHP SQL injection hell all over again. The few times one needs to insert raw HTML in the output are far outweighed by the huge security problems caused by "insecure by default" model. And it wouldn't be hard to implement secure templating - just make a subclass of String meant for raw html and make default String and anything wichh to_ses get HTML-escaped.

Both HAML and RHTML are very powerful as templating languages. Ruby is simply very well suited for the job. Completely unlike Python and Java which needs hundreds of lame templating languages. With a few partials, helpers and RJS snippets it's usually hard to imagine a shorter and more natural way of writing it all.

One more nice thing about ERB - it can be used pretty much everywhere, like in database.yml for switching database adapter depending on whether it's JRuby or matz's Ruby, or in SQL snippets meant for initializing database. Maybe that's not a huge deal but there's not other language where it's so natural to do.

Controllers


Controllers are basically small bags of actions which actually do stuff. Separation between controllers and views has one huge wart - flash. It doesn't clearly belong in either. And if you want things like markup and links inside flash messages - a perfectly reasonable thing to do - it gets uglier that Perl on a bad day. I'm not sure what's the right way to do it, flash partials maybe ? Or full set of link helpers available inside controller.

Functional testing reuses controller objects without cleaning out their instance variables between requests. That's just wrong. It also reuses request and response objects for no particular reason. Oh, and it silently ignores 403s, flash[:error], doesn't follow redirects, and relies on cross-site request forgery to get any testing done - if forms include a security token in hidden field you cannot test by directly posting, you must use the actual form ! This is probably the most broken part of Ruby on Rails.

A good functional test would look something like that:
def test_that_edit_respects_item_ownershit
login
get :edit, :id => your_item
form.fill_and_sumbit :x => "Foo", :y => "Bar"
assert_raises Error403 {
get :edit, :id => somebody_elses_item
form.fill_and_sumbit :x => "Foo", :y => "Bar"
}
assert_equal "This item belongs to someone else.", flash[:error]
end


but it's a long way to get anywhere near that point. form-test-helper is probably a good start.

Models


Models in Ruby on Rails are based on an idea that you can either have a very high way view or go raw SQL but nothing in between.

Raw SQL wouldn't be that bad if they at least handled security somehow (execute accepting "%" would be a good start), strings returned from SQL were converted to Ruby objects (surprisingly timestamps get returned as Ruby objects in some databases), and results of execute supported map method. Or wait - it would be bad, it's SQL after all. Why are we still using RDBMSes in 2007, weren't they supposed to die together with Fortran or something ?

Good thing about models is how easy it is to move code between controllers and models. This barrier is much more permeable than controller-view barrier, resulting in easier refactoring and code looking better. Controller-view refactoring is usually much harder.

There's a lot of stuff that doesn't clearly fit in MVC like extensions to core classes, objects that are not backed by database, helpers and so on. It would be nice if it there was a place for putting tests of it.

Routes


There are two ways of doing routes, both bad. One is the old way as seen on screencasts. It ends with paths like /posts/123/destroy which are then fetched by web spiders deleting all your database. The new way is trying to make every controller fit REST model, so you end with DELETE /post_sharabilities/456 or something as stupid. If there is a good way of routing stuff I haven't seen it yet.

A good thing is that you can pretty much ignore it, use simple routes and filter out GETs in the controller. Controller filters are simply awesome, model filters are pretty good too. You can use them to handle things like authorization. One thing they unfortunately cannot handle is data integrity. Unfortunately Active Record hooks are too weak to handle things like ensuring that each person has exactly one primary email address. Why cannot RDBMSes just die ?

Testing


The first annoying thing about Ruby on Rails testing are fixtures. Each test runs inside a transaction so why are they wiped out and reapplied once for each test class ? And they really do not scale. There must be a better solution but I'm not sure what is it. One thing is certain - while mocha is great mocks aren't it, as often hundreds of objects must exist at the same time for testing to be useful.

Permeability of model-controller barrier also means that many things are only tested in controller tests (called "functional" tests in Ruby on Rals but I'm not sure if I like this abuse of terminology). The result - 90%+ in rcov report while half of the model methods are not tested in isolation.

Rake and Capistrano


Rake is simply awesome. It is to other building tools what Rails is to J2EE. Capistrano on the other hand, I have no idea why wasn't simply implemented on top of rake. Maybe it's time to take a look at Vlad the Deployer.

Plugins


Another nice thing about Rails is great hackability. Most behaviors can be easily hacked and most hacks can be easily extracted to plugins. A few things like schema dumper weren't that easily extensible but overall most of the stuff I wanted to hack was very simple to hack. It's also a great thing how 30 independently developed plugins each monkeypatching some Ruby or Rails behavior can work together with almost no conflicts.

Documentaton and console


Code grepping is usually the best documentation. api.rubyonrails.org was sometimes helpful but not always. Trying things out in script/console was usually enough to explore and debug model. Unfortunately it doesn't work with controllers as path helpers and controller action runners are simply not defined there, so I cannot jump from a failing test to console to find out what's going on.

Other stuff


TextMate is great. Usually I hate every program I spend more than half an hour with. In this case I only somewhat dislike some parts of it, what probably means less fussy programmers will just love it ;-)

26 comments:

  1. Anonymous09:37

    Most of the smaller things that annoy you can be easily fixed by installing appropriate gems or hacking them yourself. For almost every Rails problem, there's a couple of blog postings that at least try to deal with it :-)

    As for browsing the documentation, try gotapi.com or noobkit.com. The former is more general, the latter only Ruby oriented.

    ReplyDelete
  2. I agree with you in a lot of ways, especially the SQL being annoying. But there are a couple of in-between SQL levels -- you can do a fair bit of stuff with :conditions and :include, and all those sort of things. And by using the right :conditions syntax you can solve a lot of the escaping problems (but you probably know that). A useful feature is that anywhere that takes SQL can use the :conditions array syntax, with substitutions.

    As szeryf said, a lot of Rails is about finding a bunch of plugins to do what you want. And then hacking your own stuff. At least one of the semi-cool features of Rails (and Ruby) is that hacking is possible; one of the semi-uncool features is that it is necessary most of the time.

    ReplyDelete
  3. Simon: For what I can tell :include, :conditions and stuff quickly break, as they don't support complex actions (mass update, mass delete) or reports (group by, getting info from multiple columns) etc. Are "%" in execute and conversion of results to Ruby objects that much to ask for ?

    I know I can hack it all. I do hack it all. It would just be nicer if things I consider basic worked out of the box.

    ReplyDelete
  4. Anonymous14:40

    your comment on models being all or nothing shows you didn't read the docs or understand the methods of array and hash
    http://www.ruby-doc.org/core/classes/Array.html#M002210


    http://api.rubyonrails.org/classes/ActiveRecord/Base.html#M001044

    :include => [contact => phone ]

    basic?
    http://api.rubyonrails.com/classes/ActiveRecord/Associations/ClassMethods.html

    ReplyDelete
  5. Anonymous18:28

    Simon said, "At least one of the semi-cool features of Rails (and Ruby) is that hacking is possible; one of the semi-uncool features is that it is necessary most of the time."

    When I tried playing with RoR 2 years ago, within 30 minutes I needed to do something that wasn't built into the framework already. For something so "new" and "cool", it amazed me that I'd need to break free of the RoR sandbox that quickly. It's sad to see that this is still the case now.

    Other languages may lack the ease of use provided by RoR... but getting complex stuff to work never seemed quite as hackish.

    ReplyDelete
  6. Flash is an ugly hack that is best generalized into "flows", something that most java frameworks have an answer to: such as Spring Web Flow, Seam with its stateful Conversation model, or more exotically, Seaside and Rife with continuations. Others such as Wicket and Tapestry prefer session-scoped widgets/components instead, with the object scope of widgets being persistent between requests, and managed by a retention policy.

    Ironically, virtually none of the "agile languages" have managed to duplicate this sort of basic functionality in their own framework, requiring the developer to stuff everything in the session, use half-measures like 'flash', or reinvent their own system. CherryFlow seems to be the only serious effort at this, and it was abandoned years ago.

    ReplyDelete
  7. Pardon, I didn't mean to imply Seaside was a java framework, and it's arguable that Smalltalk is about as agile as it gets ... but the likelihood of smalltalk going mainstream again is about as likely as <clever analogy here>

    ReplyDelete
  8. Anonymous21:03

    I hear you and agree on some of your points.

    Especially with Capistrano. I think the whole need of such a separate tool shows how deployment is a bug black hole in Ruby/Rails as a platform. At least, when compared to what has been standardized in .war/.ear files in Java world.

    I recently did a post on the same subject of problems in rails, with some other annoyances than you have.

    ReplyDelete
  9. Anonymous21:38

    If you want to follow redirects use an integration test. The functional tests are supposed to just execute your controller method and let you test that in isolation. From the look of your pseudo test it looks like you want an integration test.

    I use the request and response objects in my functional (and integration) tests sometimes to set custom MIME accepts types in the request header, to check the response headers, etc. You could also use it to check the response status (like 403).

    Or you could check the response status using 'assert_response 403'. It doesn't 'ignore' 403 or flash[:error], but it doesn't raise an exception either -- and it would be weird if it did. In an HTTP client raising Error403 might make sense, but in the server that's not an exception, it's something you coded for and want to test for directly.

    Nice blog, some of your points are valid, others misinformed (which is understanable). Let us know what you think in 2 more months!

    ReplyDelete
  10. Anonymous: Do you agree that unexpected 403 in tests indicates that something is wrong, and all tests which validly return 403 should be tested ? But the way it works now, unless you do assert_response :success/:redirect after almost every single action (what massively violates DRY principle) unexpected 403s pass undetected and that's wrong.

    Testing already raises an exception instead of silently returning 500, so why should 4xx family of errors be handled any differently ?

    And redirects to the same controller should be followed. POST/PUT which returns a redirect and the following GET are logically a single action, it's just an implementation detail (browser refresh, history, bookmarkability etc.) that they're implemented in two steps.

    ReplyDelete
  11. Anonymous21:52

    One more point -- the setup method of a functional creates a new controller instance every test method. It lasts for the length of a test method. Usually one functional test method makes exactly one request. If you want controller instance variables cleared out every request, you'd need to re-instantiate the control (which makes sense to me -- having it automagically 'clear out' instance variables of the controller but leaving the controller object itself alone would be surprising to say the least).

    If you want to execute multiple request in one test you probably want an integration test.

    ReplyDelete
  12. Anonymous22:52

    Taw,

    I'm not sure I follow. If your controller (or code called by it) raises an exception, it will be propagated to the test. The rails request process catches exceptions that escape the controller, which ideally should never happen, and translates them into 500 errors. But you're not testing Rails in a functional test, you're testing your controller. So if your controller raises (and does not catch) an exception, it will get propagated to the test. That's how exceptions work in Ruby (and most languages); it'd be weird if it didn't.

    If your controller explicitly sets the response status to 500 (or 400, or 403, or 201), no exception will be raised, and the response status will be whatever you coded it to be in your controller. And you can test it with assert_response, or more explicitly by checking @response.status.

    Let's say my controller looks like this (pseudo-code):

    if user_input_is_good
    create_widget
    head :created # (201)
    elsif duplicate_widget
    head :conflict # (409)
    elsif invalid_widget
    head :unprocessable_entity (422)
    end

    How would I test each condition? Well, I'd write three different test methods in my controller test, passing valid, conflicting, and invalid data. And I'd assert that the response was 201, 409, or 422 as appropriate. It'd confuse the heck out of me if the test weaved in magic code that translated certain response codes into exceptions.

    ***Especially since in production, no exceptoin would ever be raised!*** If my controller sets the HTTP response status to 4xx, Rails dutifully returns that in the HTTP response; no exception is raised. Why should tests behave differently than production code in this case?

    To answer your question, if I coded my controller to return 403, I set up a test to verify that 403 is returned in certain circumstances, and indeed the controller returned 403 when the test was excecuted, then no, I would say nothing was wrong -- my code was working as I designed it.

    ReplyDelete
  13. Anonymous22:56

    "And redirects to the same controller should be followed. POST/PUT which returns a redirect and the following GET are logically a single action, it's just an implementation detail (browser refresh, history, bookmarkability etc.) that they're implemented in two steps."

    It's an implementation detail to the browser perhaps, but I have to verify that my code follows the conventions properly. How would else would I test the first POST request in isolation otherwise? Again, I think you're looking for a higher level test, which is an integration test:

    http://weblog.jamisbuck.org/2006/3/9/integration-testing-in-rails-1-1

    "Functional tests are very narrowly focused on testing a single constroller and the interactions between the models it employs."

    ReplyDelete
  14. Anonymous23:06

    Taw,

    Sorry to bang up on you so much, but this one is way off base:

    "It ends with paths like /posts/123/destroy which are then fetched by web spiders deleting all your database."

    If a GET request deletes data then you deserve whatever you get (pun intended). That's just bad web design. It has nothing to do with Rails or routes.

    ReplyDelete
  15. Anonymous23:10

    OK you've lost me one this one as well:

    "Unfortunately Active Record hooks are too weak to handle things like ensuring that each person has exactly one primary email address."

    Can't you just stick a check for the number of primary email addresses in before_validate, before_validate_on_create, or before_validate_on_update? You know, add an error and cancel the save operation if an attempt is made to add a second primary email address? I've done stuff like this quite often; works fine.

    ReplyDelete
  16. Anonymous: Unfortunately multiple operations modify database concurrently so before_validate hooks are not enough. And ActiveRecord doesn't have anything like an on_commit hook.

    Imagine two operations happening, both trying to add new primary email address. The important thing is that one starts before the other commits, so SELECTing all existing email addresses returns only old ones (456) in each case.

    BEGIN;
    SELECT id FROM email_addresses WHERE user_id = 123;
    INSERT INTO email_adresses(user_id, address, is_primary) VALUES (123,'foo@gmail.com', 1);
    UPDATE email_addresses SET is_primary=0 WHERE id in (456);
    COMMIT

    BEGIN;
    SELECT id FROM email_addresses WHERE user_id = 123;
    INSERT INTO email_adresses(user_id, address, is_primary) VALUES (123,'foo@yahoo.com', 1);
    UPDATE email_addresses SET is_primary=0 WHERE id in (456);
    COMMIT

    And user 123 now has two primary email addresses. One way to protect against such stuff is evil database-side triggers. Or rearranging database so that it's not possible (like keep primary_email_address_id on users table). I don't know if there's any pretty way of doing that stuff.

    ReplyDelete
  17. Always nice with a good rant :-)

    ReplyDelete
  18. Anonymous00:08

    Taw,

    Check out optimistic and/or pessimistic locking. Rails has support for both. Locking on the aggregate root object (Person) will solve your multiple primary emails problem.

    Martin Fowler has a good discussion of concurrency and locking in his book 'Patterns of Enterprise Application Architecture'. It's a fundamental problem with concurrency, not just relational databases. There are different solutions with different tradeoffs, and you can implement any of them just fine in Rails.

    ReplyDelete
  19. I share some of your gripes with Rails. Have you tried out the Ruby NetBeans builds? The OS X daily builds are more stable than the Linux ones in my experience, but they both work pretty well. Being able to set breakpoints, snoop around local vars, step into Rails iteslf, etc. are pretty nice.

    ReplyDelete
  20. sjs: I never cared much for single-stepping debuggers. Unit testing, Object#inspect and script/console were so far sufficient for all my Ruby debugging needs.

    ReplyDelete
  21. Awe, come on!

    Keep hating Haml! I love a good flaming.

    I'm sick and tired of people *liking* what I make. All I'm saying is, please consider hating Haml again. Like the olden days.

    ReplyDelete
  22. Hampton: I have limited supply of hatred and I've almost ran out of it hating so many things. So please forgive me if I don't have sufficient amount of hatred left to keep hating HAML more than just a little.

    ReplyDelete
  23. Anonymous17:52

    taw: Again, your "on_commit" problem is just a lack of knowledge of the framework. Raising an exception inside of an after_save callback would do exactly what you're looking for. You're still inside the transaction at that point, and can easily rollback.

    ReplyDelete
  24. Anonymous: after_save is not good enough. If a bunch of objects are saved in one transaction, after_save of some will be called before some others will be saved. At the very least there should be a before_commit hook which would be called after all objects were saved but before database COMMIT.

    ReplyDelete
  25. You can setup an "after_filter" in your controllers that will do whatever you want to the response.body part. You can setup a regex that will strip all XSS possibilities from your output.

    ReplyDelete
  26. Anonymous23:54

    Good writeup for a beginner's understanding to it. I don't personally love rails or rhtml, but...they work, and they pay the bills.
    Here's my own pet plugin:
    http://code.google.com/p/ruby-roger-useful-functions/wiki/EnglishLikeQueries
    Take care.

    ReplyDelete