The best kittens, technology, and video games blog in the world.

Saturday, July 17, 2010

Another example of Ruby being awesome - %W

cuteness by sparkleice from flickr (CC-NC-ND)

And there I was thinking I knew everything about Ruby, at least as far as its syntax goes...

As you might have figured out from my previous posts, I'm totally obsessed with string escaping hygiene - I would never send "SELECT * FROM reasons_why_mysql_sucks WHERE reason_id = #{id}" to an SQL server even if I was absolutely totally certain that id is a valid integer and nothing can possibly go wrong here. Sure, I might be right 99% of the time, but it only takes a single such mistake to screw up the system. And not only with SQL - it's the same with generated HTML, generated shell commands and so on.

And speaking of shell commands - the system function accepts either a string which it then evaluates according to shell rules (big red flag), or a list of arguments which it uses to fork+exec right away. Of course we want the latter - except it's really goddamn ugly. Faced with a choice between this insecure but reasonable-looking way of starting MongoDB shard servers:


system "mongod --shardsvr --port '#{port}' --fork --dbpath '#{data_dir}' \
--logappend --logpath '#{logpath}' --directoryperdb"

And this secure but godawful (to_s is necessary as port is an integer, and system won't take that):

system *["mongod", "--shardsvr", "--port", port, "--fork",
"--dbpath", data_dir, "--logappend",
"--logpath", logpath, "--directoryperdb"].map(&:to_s)

Even I have my doubts.

And then I found something really cool in Ruby syntax that totally solves the problem. Now I was totally aware of the %w[foo bar] syntax Ruby copied from Perl's qw[foo bar] - while occasionally useful, it's really little more than constructing a string and then calling #split on it.

And I thought I was also aware of %W - which obviously would work just like %w except evaluating code inside. Except that's not what it does! %W[foo #{bar}] is not "foo #{bar}".split - it's ["foo", "#{bar}"]! And using a real parser of course, so you can use as many spaces inside that code block as you want.

system *%W[mongod --shardsvr --port #{port} --fork --dbpath #{data_dir}
--logappend --logpath #{logpath} --directoryperdb]


There's nothing in Perl able to do that. Not only is it totally secure, it looks even better than the original insecure version, as you don't need to insert all those 's around arguments (which only half-protected them anyway, but were better than nothing), and you can break it into multiple lines without \s.

%W always does the right thing - %W[scp #{local_path} #{user}@#{host}:#{remote_path}] will keep the whole remote address together - and if the code block returns an empty string or nil, you'll get an empty string there in the resulting array. I sort of wish there was some way of adding extra arguments with *args-like syntax like in other contexts, but %W[...] + args does exactly that, so it's not a big deal.
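To make the difference concrete, here's a minimal sketch comparing %W with the naive build-a-string-then-split approach. The values of port, data_dir, logpath, and extra are made up for illustration (data_dir deliberately contains a space).

```ruby
# Hypothetical values, chosen so that naive splitting goes wrong
port     = 27018
data_dir = "/var/lib/mongo data"   # note the space in the path
logpath  = nil

naive = "mongod --port #{port} --dbpath #{data_dir}".split
smart = %W[mongod --port #{port} --dbpath #{data_dir}]

p naive  # the path is torn into two separate arguments
p smart  # the path stays one argument, and port is already a String

# nil interpolates to an empty string, which still occupies a slot
p %W[--logpath #{logpath}]

# Extra arguments are appended with ordinary Array concatenation
extra = ["--quiet"]
p smart + extra
```

Since every element of smart is already a String, it can be splatted into system without the map(&:to_s) step.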

By the way, it seems to me that all % constructors undeservingly get a really bad reputation as some sort of ugly Perl leftover in Ruby community. This is so wrong - what's ugly is excessive escaping with \ which they help avoid. Which regexp for Ruby executables looks less bad, the one with way too many \/s - /\A(\/usr|\/usr\/local|\/opt|)\/bin\/j?ruby[\d.]*\z/, or one which avoids them all thanks to %r - %r[\A(/usr|/usr/local|/opt|)/bin/j?ruby[\d.]*\z]?


By the way - yes I used []s inside even though they were the big demarcator. That's another great beauty of % constructions - if you demarcate with some sort of braces like [], (), <>, or {} - it will only close once every matched pair inside is closed - so unlike traditional singly and doubly quoted strings % can be nested infinitely deep without a single escape character! (Perl could do that one as well)
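Here's a quick sketch of both points, assuming a handful of made-up paths: balanced delimiters nest without escapes, and the %r version of the regexp matches exactly the same strings as the slash-escaped one.

```ruby
# Balanced delimiters nest without any escaping
nested = %q[outer [inner [innermost]] done]
p nested

# The same regexp written both ways
slashes = /\A(\/usr|\/usr\/local|\/opt|)\/bin\/j?ruby[\d.]*\z/
percent = %r[\A(/usr|/usr/local|/opt|)/bin/j?ruby[\d.]*\z]

# Hypothetical paths to try the two regexps on
paths = ["/usr/bin/ruby", "/opt/bin/jruby1.5", "/bin/ruby1.9.1", "/home/me/ruby"]
paths.each{|path|
  puts "#{path}: #{!!(path =~ slashes)} #{!!(path =~ percent)}"
}
```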


And speaking of things that Ruby copied from Perl, and then made them much more awesome, here's a one-liner to truncate a bunch of files after 10 lines, with optional backups. Which language gets even close to matching that? ($. in both Perl and Ruby will keep increasing from file to file, so you cannot use that)

ruby -i.bak -ple 'ARGF.skip if ARGF.file.lineno > 10' files*.txt

Friday, July 16, 2010

Arrays are not integer-indexed Hashes

Cabooki by elycefeliz from flickr (CC-NC-ND)

We use a separate Array type even though Ruby Hashes can be indexed by integers perfectly well (unlike Perl hashes which implicitly convert all hash keys to strings, and array keys to integers). Hypothetically, we could get rid of them altogether and treat ["foo", "bar"] as syntactic sugar for {0=>"foo", 1=>"bar"}.

Now there are obviously some performance reasons for this - these are mostly fixable and a single data structure can perform well in both roles. And it would break backwards compatibility rather drastically, but let's ignore all that and imagine we're designing a completely fresh language which simply looks a lot like Ruby.

What would work


First, a lot of things work right away like [], []=, ==, size, clear, replace, and zip.



The first incompatibility is with each - for hashes it yields both keys and values, for arrays only values, and we'd need to decide one way or the other - I think yielding both makes more sense, but then there are all those non-indexable enumerables which won't be able to follow this change, so there are good reasons to only yield values as well. In any case, each_pair, each_key, and each_value would be available.


Either way, one more change would be necessary here - each and everything else would need to yield elements sorted by key. There are performance implications, but they're not so bad, and it would be nicer API.


Hash's methods keys, values, invert, and update all make perfect sense for Arrays. With keys sorted, first, last, and pop would work quite well. push/<< would be slightly nontrivial - but making it add #succ of the last key (or 0 for empty hashes) would work well enough.


Collection tests like any?, all?, one?, none? are obvious once we decide each, and so is count. map/collect adapts to hashes well enough (yielding both key and value, and returning new value).


Array methods like shuffle, sort, sample, uniq, and flatten which ignore indexes (but not their relative positions) would do likewise for hashes, so flattening {"a"=>[10,20], "b"=>30} would result in [10,20,30] ("a" yields before "b").


Enumerable methods like min/max/min_by/max_by, find, find_index, inject would do likewise.

include? checks values for Arrays and keys for hashes - we can throw that one out (or decide one way or the other, values make more sense to me), and use has_key?/has_value? when it matters.

reverse should just return values, but reverse_each should yield real keys.


I could go on like this. My point is - a lot of this stuff can be made to work really well. Usually there's a single behavior sensible for both Arrays, and Hashes, and if you really need something different then keys, values, or pairs would usually be a suitable solution.

What doesn't work


Unfortunately some things cannot be made to work. Consider this - what should be the return value of {0 => "zero", 1 => "one"}.select{|k,v| v == "one"}?


If we treat it as a hash - let's say a mapping of numbers to their English names, there is only one correct answer, and everything else is completely wrong - {1=>"one"}.


On the other hand if we treat it as an array - just an ordered list of words - there is also only one correct answer, and everything else is completely wrong - {0=>"one"}.

These two are of course totally incompatible. And an identical problem affects a lot of essential methods. Deleting an element renumbers items for an array, but not for a hash. shift/unshift/drop/insert/slice make no sense for hashes, and methods like group_by and partition have two valid and conflicting interpretations. It is, pretty much, unfixable.
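The conflict is easy to see in actual present-day Ruby, where the two types are separate. A small sketch (variable names are just for illustration):

```ruby
words_hash  = {0 => "zero", 1 => "one"}
words_array = ["zero", "one"]

# Hash#select keeps the original key attached to the surviving value
hash_result = words_hash.select{|k, v| v == "one"}
# Array#select renumbers: the survivor now sits at index 0
array_result = words_array.select{|v| v == "one"}

p hash_result                # key 1 survives
p array_result.index("one")  # index is now 0

# Deleting shows the same split
words_hash.delete(0)
words_array.delete_at(0)
p words_hash.keys            # the remaining key is still 1
p words_array.index("one")   # the remaining element moved to index 0
```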


So what went wrong? Thinking that Arrays are indexed by integers was wrong!


In {0=>"zero",1=>"one"} association between keys and values is extremely strong - key 0 is associated with value "zero", and key 1 with value "one". They exist as a pair and everything that happens to the hash happens to pairs, not to keys or values separately - there are no operations like insert_value, delete_value which would just shift remaining values around from one key to another. This is the nature of hashes.


Arrays are not at all like that. In ["zero", "one"] association between 0 and "zero" is very weak. The real keys are not 0, and 1 - they're two objects devoid of any external meaning, whose only property is their relative partial order.


To implement array semantics on top of hashes, we need a class like Index.new(greater_than=nil, less_than=nil). Then a construction like this would have the semantics we desire.

arr = {}
arr[Index.new(arr.last_key, nil)] = "zero"
arr[Index.new(arr.last_key, nil)] = "one"



If we use these instead of integers, hashes can perform all array operations correctly.

# shift
arr.delete(arr.first_key)

# unshift
arr[Index.new(nil, arr.first_key)] = "minus one"

# select - indexes for "zero" and "two" in result have correct order
["zero", "one", "two"].select{|key, value| value != "one"}

# insert - nth_key only needs each
arr[Index.new(arr.nth_key(0), arr.nth_key(1))] = "one and half"


And so the theory is satisfied. We have a working solution, even if a highly impractical one. Of course all these Index objects are rather hard to use, so the first thing we'd do is subclass Hash so that arr[i] would really mean arr[arr.nth_key(i)] and so on, and there's really no point yielding them in #each and friends... oh wait, that's exactly where we started.

In other words, unification of arrays and hashes is impossible - at least unless you're willing to accept a monstrosity like PHP where numerical and non-numerical indexes are treated differently, and half of array functions accept a boolean flag asking if you'd rather have it behave like an array or like a hash.

Random sampling or processing data streams in Ruby

7 14 10 by BernieG10 from flickr (CC-NC-ND)

It might sound like I'm tackling a long solved problem here - sort_by{rand}[0, n] is a well known idiom, and in more recent versions of Ruby you can use even simpler shuffle[0, n] or sample(n).

They all suffer from two problems. The minor one is that quite often I want elements in the sample to be in the same relative order as in the original collection (this in no way implies sorted) - which can be dealt with by a Schwartzian transform to [index, item] space, sampling that, sorting results, and transforming out to just item.
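A minimal sketch of that transform, for the in-memory case only. The method name ordered_sample is made up for this example and is not part of Ruby's standard library:

```ruby
module Enumerable
  # Hypothetical helper: random sample that keeps original relative order
  def ordered_sample(n)
    pairs  = each_with_index.to_a   # decorate: [item, index]
    chosen = pairs.sample(n)        # plain random sample of the pairs
    # sort by original position, then strip the index again
    chosen.sort_by{|item, idx| idx}.map{|item, idx| item}
  end
end

letters = %w[d a c b f e]
picked = letters.ordered_sample(3)
p picked   # three letters, in the same relative order as in letters
```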

The major problem is far worse - for any of these to work, the entire collection must be loaded to memory, and if that was possible, why even bother with random sampling? More often than not, the collection I'm interested in sampling is something disk-based that I can iterate only once with #each (or twice if I really really have to), and I'm lucky if I even know its #size in advance.

By the way - this is totally unrelated, but I really hate #length method with passion - collections have sizes, not "lengths" - for a few kinds of collections we can imagine them arranged in a neat ordered line, and so their size is also length, but it's really lame to name a method after special case instead of far more general "size" - hashtables have sizes not lengths, sets have sizes not lengths, and so on - #length should die in fire!

When size is known


So we have a collection we can only iterate once - for now let's assume we're really lucky and we know exactly how many elements it has - this isn't all that common, but it happens every now and then. As we want n elements out of size, probability of each element being included is n/size, and so select{ n > rand(size) } will nearly do the trick - even keeping samples in the right order... except it will only return approximately n elements.

If we're sampling 1000 out of a billion we might not really care all that much, but it turns out it's not so difficult to do better than that. Sampling n elements out of [first, *rest] collection neatly reduces to: [first, *rest.sample(n-1)] with n/size probability, or rest.sample(n) otherwise. Except Ruby doesn't have decent tail-call optimization, so we'll use counters for it.


module Enumerable
  def random_sample_known_size(wanted, remaining=size)
    if block_given?
      each{|it|
        if wanted > rand(remaining) 
          yield(it)
          wanted -= 1
        end
        remaining -= 1
      }
    else
      rv = []
      random_sample_known_size(wanted, remaining){|it| rv.push(it) }
      rv
    end
  end
end


This way of sampling has an extra feature: it can yield samples one at a time and never needs to store any in memory - something you might appreciate if you want to take a couple million elements out of 10 billion or so. Not only will you avoid loading them to memory, you will be able to use the results immediately, instead of only when the entire input finishes.


This is only possible if collection size is known - if we don't know if there's 1 element ahead or 100 billion, there's really no way of deciding what to put in the sample.

If you cannot fit even the sample in memory at once, and don't know collection size in advance - it might be the easiest thing to iterate twice, first to compute the size, and then to yield random records one at a time (assuming collection size doesn't change between iterations at least). CPU and sequential I/O are cheap, memory and random I/O are expensive.
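Here's a rough sketch of the two-pass approach for a line-based file, reusing the counter-based selection from random_sample_known_size. The method name two_pass_sample and the throwaway demo file are assumptions made for this example:

```ruby
require "tempfile"

# Hypothetical helper: count lines first, then yield a random sample of them
def two_pass_sample(path, wanted)
  remaining = 0
  File.foreach(path){ remaining += 1 }   # pass 1: count records
  File.foreach(path){|line|               # pass 2: yield chosen records
    if wanted > rand(remaining)
      yield(line)
      wanted -= 1
    end
    remaining -= 1
  }
end

# Demo on a temporary file of 1000 lines
demo = Tempfile.new("records")
1000.times{|i| demo.puts("record #{i}") }
demo.close

sampled = []
two_pass_sample(demo.path, 10){|line| sampled << line.chomp }
puts sampled
```

Only the two counters live in memory; every selected record is handed to the block as soon as it is read.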

Russian Blue by Adam Zdebel from flickr (CC-NC-ND)

When size is unknown

Usually we don't know collection size in advance, so we need to keep a running sample - initialize it with the first n elements, and then for each element that arrives replace a random one from the sample with probability n / size_so_far.

The first idea would be something like this:

module Enumerable
  def random_sample(wanted)
    rv = []
    size_so_far = 0
    each{|it|
      size_so_far += 1
      j = rand(size_so_far)
      rv.delete_at(j) if wanted == rv.size and wanted > j
      rv.push(it) if wanted > rv.size
    }
    rv
  end
end


It suffers from a rather annoying performance problem - we're keeping the sample in a Ruby Array, and while they're optimized for adding and removing elements at both ends, deleting something from the middle is an O(size) memmove.

We could replace rv.delete_at(j); rv.push(it) with rv[j] = it to gain performance at cost of item order in the sample... or we could do that plus a Schwartzian transform into [index, item] space to get correctly ordered results fast. This only matters once sample size reaches tens of thousands, before that brute memmove is simply faster than evaluating extra Ruby code.

module Enumerable
  def random_sample(wanted)
    rv = []
    size_so_far = 0
    each{|it|
      size_so_far += 1
      j = wanted > rv.size ? rv.size : rand(size_so_far)
      rv[j] = [size_so_far, it] if wanted > j
    }
    rv.sort.map{|idx, it| it}
  end
end



This isn't what stream processing looks like!

The algorithms are as good as they'll get, but API is really not what we want. When we actually do have an iterate-once collection, we usually want to do more than just collect a sample. So let's encapsulate such continuously updated sample into Sample class:

class Sample
  def initialize(wanted)
    @wanted = wanted
    @size_so_far = 0
    @sample = []
  end
  def add(it)
    @size_so_far += 1
    j = @wanted > @sample.size ? @sample.size : rand(@size_so_far)
    @sample[j] = [@size_so_far, it] if @wanted > j
  end
  def each
    @sample.sort.each{|idx, it| yield(it)}
  end
  def total_size
    @size_so_far
  end
  include Enumerable
end

It's a fully-featured Enumerable, so it should be really easy to use. #total_size will return count of all elements seen so far - calling that #size would conflict with the usual meaning of number of times #each yields. You can even nondestructively access the sample, and then keep updating it - usually you wouldn't want that, but it might be useful for scripts that run forever and periodically save partial results.
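As a sketch of the "periodically save partial results" case, here's a loop that peeks at the running sample every 250 elements while the stream keeps going. The Sample class is repeated from above only so the snippet runs on its own; the stream of integers and the snapshot interval are made up:

```ruby
# Sample class repeated from above so this snippet is self-contained
class Sample
  include Enumerable
  def initialize(wanted)
    @wanted = wanted
    @size_so_far = 0
    @sample = []
  end
  def add(it)
    @size_so_far += 1
    j = @wanted > @sample.size ? @sample.size : rand(@size_so_far)
    @sample[j] = [@size_so_far, it] if @wanted > j
  end
  def each
    @sample.sort.each{|idx, it| yield(it)}
  end
  def total_size
    @size_so_far
  end
end

sample = Sample.new(5)
snapshots = []
(1..1000).each{|n|
  sample.add(n)
  # Peek at the running sample without disturbing it
  snapshots << [sample.total_size, sample.to_a] if n % 250 == 0
}
snapshots.each{|seen, items| puts "after #{seen}: #{items.inspect}" }
```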

To see how it can be used, here's a very simple script, which reads a possibly extremely long list of URLs, and prints a sample of 3 by host. By the way notice autovivification of Samples inside the Hash - it's a really useful trick, and Ruby's autovivification can do a lot more than Perl's.

require "uri"
sites = Hash.new{|ht,k| ht[k] = Sample.new(3)}
STDIN.each{|url|
  url.chomp!
  host = URI.parse(url).host rescue next
  sites[host].add(url)
}
sites.sort.each{|host, url_sample|
  puts "#{host} - #{url_sample.total_size}:"
  url_sample.each{|u| puts "* #{u}"}
}


So enjoy your massive data streams.

Thursday, July 15, 2010

Synchronized compressed logging the Unix way

tiny tiny kitten 4 weeks old by GeorgeH23 from flickr (CC-NC-ND)
In good Unix tradition if a program generates some data, in general it should write it to STDOUT, and you'll redirect it to the right file yourself.

There are two problems with that, both easily solvable in separation:
  • If it's a lot of data, you want to store it compressed. It would be bad Unix to put compression directly in the program - the right way is to pipe its output through gzip with program | gzip >logfile.gz. gzip is really fast, and usually adequate.
  • You want to be able to see what were the last lines written out by the program at any time. Especially if it appears frozen. Sounds trivial, but thanks to a horrible misdesign of libc, and everything else based on it, data you write gets buffered before being actually written - a totally reasonable thing - and there are no limits whatsoever how long it can stay in buffers! Fortunately it is possible to turn this misfeature off with a single line of STDOUT.sync=true or equivalent in other languages.
Unfortunately while both fixes involve a single line of obvious code - there's no easy way to solve them together. Even if you flushed all data from the program to gzip, gzip can hold onto it indefinitely. Now unlike libc which is simply broken, gzip has a good reason - compression doesn't work on one byte at a time - it takes a big chunk, compresses it, and only then writes it all out.


Still, even if it has good reasons not to flush data as soon as possible, it can and very much should flush it every now and then - with flushing every few seconds the reduction in compression ratio will be insignificant, and it will be possible to find out why the program froze almost right away. The underlying zlib library totally has this feature - unfortunately the command line gzip utility doesn't expose it.
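To see what flushing buys you, here's a small sketch using Ruby's zlib binding with an in-memory StringIO standing in for the log file (the log lines are made up). It compares how many bytes have reached the underlying IO before and after an explicit flush:

```ruby
require "zlib"
require "stringio"

raw = StringIO.new                 # stands in for logfile.gz
gz  = Zlib::GzipWriter.new(raw)

gz.print("first log line\n")
before_flush = raw.string.bytesize   # the line itself is still buffered in zlib
gz.flush                              # force pending compressed data out
after_flush = raw.string.bytesize

puts "bytes on the underlying IO: #{before_flush} before flush, #{after_flush} after"

gz.print("second log line\n")
gz.finish                             # end the gzip stream, leave raw open

# The finished stream still decompresses normally
roundtrip = Zlib::GzipReader.new(StringIO.new(raw.string)).read
puts roundtrip
```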

So I wrote this:

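#!/usr/bin/env ruby

require 'thread'
require 'zlib'

def gzip_stream(io_in, io_out, flush_freq)
  fh = Zlib::GzipWriter.wrap(io_out)
  lock = Mutex.new
  # Background thread: flush compressed data every flush_freq seconds
  Thread.new{
    loop{
      # A bare return is not allowed inside a thread block,
      # so compute a flag under the lock and break out of the loop instead
      done = lock.synchronize{
        fh.flush if !fh.closed? and fh.pos > 0
        fh.closed?
      }
      break if done
      sleep flush_freq
    }
  }
  io_in.each{|line|
    lock.synchronize{
      fh.print(line)
    }
  }
  # Close under the lock so the flusher never touches a stream mid-close
  lock.synchronize{ fh.close }
end

gzip_stream(STDIN, STDOUT, 5)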

gzip_stream(STDIN, STDOUT, 5)


It reads lines on stdin, writes them to stdout, and flushes every 5 seconds (or whatever you configure) in a separate Ruby thread. Ruby green threads are little more than a wrapper over select() in case you're wondering. The check that fh.pos is non-zero is required as flushing before you write something seems to result in invalid output.

Now you can program | gzip_stream >logfile.gz without worrying about data getting stuck on the way (if you flush in your program that is).

Friday, July 02, 2010

Palestinian problem fixed

good Shabbos by anomalous4 from flickr (CC-BY)

This solution is really really simple, has plenty of precedent in history, and yet I haven't heard it proposed seriously by anyone before. So here it goes:
Palestinians should convert to Judaism

According to Israeli laws, anybody who converts to Judaism can trivially get Israeli citizenship. Once a sufficient number of Palestinians get citizenship, not only will they no longer be persecuted as much (Israeli Arabs are still treated as second class citizens, but the level of discrimination is much much less) - together with existing non-hawkish Israeli citizens they will constitute a majority of voters who will ensure that Israel becomes a country that's much more peaceful and friendly toward its neighbours. As a bonus on top of that such mass conversion would immensely piss off radicals on both sides.

Now you will very likely complain about freedom of religion and such stuff. But why are these people Muslim in the first place? Because their ancestors converted to religion of previous conquerors of these lands - some were forced to, others out of opportunism, perhaps some even out of genuine conviction - it doesn't matter. Is following religious choices of ancestors really worth all the death and suffering that exists in Palestine right now? I say no. Especially since the most painful part of such conversion - genital mutilation - they all underwent already.


This wouldn't be without precedent - not only religions like Islam and Christianity, but also languages like Latin, English, and Spanish, national identities, and cultures have all spread largely by members of the losing group adopting habits of members of the winning group. In fact it would be a lot more difficult to point to any major religion or nationality which didn't spread in such a way.


It's a topic for a whole other post, but ignoring this is the primary reason why I hate most analysis of humans in terms of evolutionary biology so much - variance of human behaviour is almost entirely culture-driven, not gene-driven like they all pretend, and culture doesn't only spread from parents to children.

Anyway, it wouldn't be necessary for everyone to convert - as long as the portion of Palestinians who do is high enough - let's say half - it would alter current conflict dynamics enough to solve it. The die-hard few who care too much about Islam can then safely practice their faith in the emerging unified single state (with a huge Jewish majority, even if a lot of that would be rather unenthusiastic converts). As the law doesn't forbid it - children of Palestinian converts could choose whichever religion they wanted - Judaism, Islam, Catholicism, Baha'i, Spaghetti-Monsterism, or whatever. In practice the vast majority would simply follow the religion of their parents.

jewish cat by Shira Golding from flickr (CC-NC)

Alternatives


Everything else having been tried and having failed already - there is really only one alternative to such mass conversion - waiting a few decades for both Israel and Palestine (and if possible all other countries of the region) to undergo demographic transition from short lives and big families to long lives and small families. Countries after demographic transition are a lot less belligerent.

This is very much happening - but very slowly. According to the CIA World Factbook, total fertility (number of children per woman) between 1990 and 2010 fell from 7.0 to 4.9 for Gaza Strip (still ridiculously high), from 5.0 to 3.12 for West Bank, and from 2.9 to 2.72 for Israel - moving them away from values typical for the world's war zones like Afghanistan's 5.50 and Democratic Republic of Congo's 6.11, and toward the civilized world's 2.1 and less.

One day it will very likely happen - all will undergo demographic transition, and while they might still hate each other as much as they do now - they will express it like Japanese and Koreans - by flaming each other on the Internet - not in the good old way of bombs and bullets.

This will of course take many decades. My solution solves the whole problem overnight. Any takers?

Wednesday, June 09, 2010

Best sources of DHA omega-3 essential fatty acid

Axolotl by Ethan Hein from flickr (CC-NC-SA)

This is surprisingly difficult to find out, so I decided to share the results with everyone. But first, background:
  • Animals need omega-3 and omega-6 essential fatty acids
  • The same enzymes are used for omega-3 and omega-6 processing, so too much of one will interfere with the other. People used to have diets with about 1:1 omega-3:omega-6. Today ratio is more like 1:20, and that little omega-3 we eat is mostly ALA.
  • Three most important omega-3 are ALA (18 carbons), EPA (20 carbons), and DHA (22 carbons).
  • Brains are made largely out of DHA.
  • Land plants produce no EPA / DHA. None whatsoever. Zero.
  • Some land plants produce adequate amounts of ALA, but even this is uncommon.
  • Algae produce quite a lot of EPA / DHA.
  • Animals including humans can convert ALA to EPA, and then DHA, but this is a painfully slow and inefficient process; and over-saturation of omega-6 and many conditions interfere with even that much.
Based on this some people believe it would be wise to try to increase amount of omega-3 fatty acids in diets. Hard evidence is rather lacking, this is however to be expected as hard evidence of anything about diet is essentially nil. It's almost only short term studies of crappy proxies, and there are millions of reasons why this is just wrong. Anyway, concerning supplementation:
  • All mixed omega-3/omega-6/omega-9 supplements are waste of money - you're eating too much omega-6 already, and you can make as much omega-9 as you wish yourself.
  • You don't want generic "omega-3" supplements - most of these are ALA, which is of very limited use. You want DHA. At worst EPA. ALA is little more than filler, it's not bad for you but it's less relevant, and much easier to get via normal diet anyway.
  • If you're surprised why it's so hard to get DHA, this is possibly highly relevant.
Hopefully now you see why I'm measuring DHA, not anything else. And to keep science proper what I'm interested in is "% of calories coming from DHA", not "grams of DHA per portion" or anything like it. Portions are whatever manufacturer says they are, "per 100g" measures mostly tell you how much water foods have, and only "% of calories from" measures the right thing.
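As a rough sketch of the arithmetic: fat provides about 9 kcal per gram, so the share of calories from DHA is grams of DHA times 9, divided by total calories. The helper name and both example foods below are made up purely to show why "per 100g" figures mislead; they are not values from any database.

```ruby
KCAL_PER_GRAM_FAT = 9.0   # standard approximation for dietary fat

# Hypothetical helper: percentage of a food's calories that comes from DHA
def percent_calories_from_dha(dha_grams, total_kcal)
  100.0 * dha_grams * KCAL_PER_GRAM_FAT / total_kcal
end

# Made-up foods: [grams of DHA, total kcal], both per 100g
watery_food = [0.5, 50.0]    # mostly water, few calories
oily_food   = [1.0, 900.0]   # nearly pure oil

p percent_calories_from_dha(*watery_food)   # 9% of calories from DHA
p percent_calories_from_dha(*oily_food)     # 1% of calories from DHA
```

Per 100g the oily food has twice the DHA, but per calorie the watery one delivers nine times more.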

Escher Symmetry by Pieter Musterd from flickr (CC-NC-ND)

Best Sources of DHA


I dug through the USDA National Nutrient Database, and this is what I found.
  • The data only contains "food" not supplement pills and such. These will of course contain highest concentrations. Rarely eaten foods like dolphin meat are not in the database, so I have no idea how nutritious dolphin sashimi would be.
  • The best source of DHA is unsurprisingly - fish oil. Salmon oil is 18.2% DHA and 34.2% omega-3 altogether; other fish oils are pretty good too, but not as much. Other oils like cod liver, sardine, and menhaden are 8.5%-10.9% DHA, 18.8%-26.6% total omega-3. Herring oil is a less impressive 4.2%/11.1%. Fish oil also contains the most mercury and other poisons, so enjoy that. Once ultra-refined you won't need to worry about poisoning but it's more supplementation than food.
  • The second best source is caviar/roe, with 7.7%-13.6% DHA, and 13.7%-24.2% total omega-3. Might be expensive to turn it into a major part of your diet.
  • The third best source is seal oil with 6.5-12% DHA, 14.0-27.7% omega-3. Let's see if you can buy some legally outside Canada. So far we're totally out of luck.
  • Finally something more useful. The fourth best source is salmon. There's wide range of nutritious value from 2%/4% to 8.9%/13.7% - depending on where they're caught and what's their diet. Fortunately there doesn't seem to be a big difference between wild and farmed salmon, so either will work. Unfortunately the same mercury poisoning problem applies as to fish oil - poisonous substances are stored in oil so the oilier (and more useful for us) the fish the more toxic, and there is no way to escape that.
  • Fifth best source is mackerel. Like salmon, it can have as little as 1.5%/2.8% or as much as 8.7%/14.7%.
USDA says the next best source is dried parsley leaves at 7.3%/9.9% - which is most likely a massive measurement error, as there are no other plants anywhere, and it really makes little sense.

Old friend by JennyHuang from flickr (CC-BY)


Other than that, it's fish, fish, fish, mollusks, jellyfish, crustaceans, and more fish. My hopes for finding something that's not fish are getting slimmer and slimmer, so I'm just going to skip all of them now (you can probably see the pattern), and only focus on things which are not fish/seafood.
  • Brains. Beef/lamb/pork brains have 3%-5.4% DHA and 4.4%-7.7% total omega-3. Not surprisingly, as that's what animals primarily need DHA for. And we simply throw away this most nutritious part.
  • Whale oil - 3.9%/8.3%. Whale meat on the other hand is pretty useless at 0.2%/0.5%. Not that you'll find much of either at the nearest supermarket. It's technically not a fish. Anyway, between brains and ocean creatures we pretty much ran out of good sources, the next source is:
  • Roasted squirrel - and that at mere 0.5%/0.6%
  • Chicken can be anywhere from 0.1%/0.2% to 0.5%/0.8%
  • Egg yolk - 0.3%-0.4% DHA, 0.4%-0.7% total.
  • Whole egg - 0.2%-0.3% DHA, 0.2%-0.8% total. Pretty much all of that in yolk.
  • We're long past useful concentrations anyway, so I won't be listing them. Next on the list are caribou, green turtles, turkeys, lamb kidneys, frog legs, guineahen, lamb hearts, squab/pigeon, pork livers, bear, raccoon, pork lung (and other beef/lamb/pork offal), and emu. Pork/beef/lamb meat doesn't register other than as rounding error.
So to summarize:
  • Get supplements;
  • Or eat a lot of fish and other seafood;
  • Or eat a ridiculous amount of poultry, eggs, and game meat;
  • Or you're fucked.
There's no way to get enough DHA in anything resembling standard Western diet. Simply no way. Fruits and vegetables contain none, even "organic" ones. Actually fast food contains more as it often uses eggs and poultry as ingredients, but that's still not that useful.

One more long-term option would be to genetically engineer some common oily plants like soy or canola to produce some EPA/DHA - even if humans wouldn't eat them, if they're used as animal feed, we'd benefit indirectly quite a lot.

Actually someone already genetically engineered pigs to produce 4x omega-3 fatty acids, including 2x DHA, which sounds like the most urgent reminder that we need GMO now, and Luddites should not be in charge of policy.

EDIT: GM soybeans with more omega-3 (this most likely means ALA) and higher stability so they don't need partial hydrogenation have just been approved in the US. If people moved from the usual partially hydrogenated soybean oil to that it would be a massive health benefit. Of course our Euro-Luddites + CAP-paid farmer lobby coalition will probably ban it until long after the "America vs Europe" picture gets reversed. It's already much closer than you think.

Wednesday, June 02, 2010

What is Internet good for?

im in ur tube, blockin ur internets by the boy on the bike from flickr (CC-NC-SA)

Here's a quick poll, pick the most fitting answer:
  • I spend too much time on the Internet, and I'm aware of this
  • I spend too much time on the Internet, but I'm deluding myself about it
There's no need to bother with the third option, as people who don't use too much Internet are extremely unlikely to ever read this blog.

But in the spirit of "What have the Romans ever done for us?" - what is the Internet really good for? Is it worth all the time we spend on it?

For the last two weeks in the spirit of Alicorn's luminosity - by the way definitely read that linked article, her first attempt at that was a major tl;dr but these "seven shiny stories" are so short and insightful you're probably better off spending some time reading them than whatever else you're typically wasting your time on the Internet with, and it promoted her to my second favourite lesswrong writer after Eliezer.

So as I was saying before I interrupted myself, for the last two weeks I've been making a log of my daily activities and how much satisfaction I actually got from a given day.

Now many people who have learned economics 101 and are treating it too seriously think that whatever we're doing must by definition be the things we most enjoy, our claims to the contrary notwithstanding - so people who say they want to get thinner, but eat fuckloads of pizza actually prefer eating pizza and being fat to not eating pizza and being thin. This point of view is usually something worth considering - people usually say they want things they feel they're "supposed to want". On the other hand, the amount of evidence that we don't do what's best for us is ridiculously overwhelming.

A very very short list of such examples would include:
  • hyperbolic discounting - there's only one "mathematically consistent" way of treating values over time (exponential discounting), and the evidence is completely unambiguous that we're not doing so.
  • rodent experiments show also rather unambiguously that brains have separate systems for "wanting something" and "liking something". They are of course connected, so other things being equal if we like something more we will probably want it more - but this influence is far far less than total identity assumed by economics 101. Once you're aware that people might "want" things they don't really "like", and "not want" things they "like" (by the way - I'm using the words "want" and "like" rather vaguely - natural languages are spectacularly bad when analyzing humans - quite surprising actually as they seem to have been originally developed largely for social use) the entire utilitarian / consequentialist framework of analysis collapses.
  • Happiness research in spite of all its ambiguity at least shows that simple models of happiness are plain wrong. If you have time to waste on the Internet - TED is filled with good talks about happiness, not just the one linked.
  • Even disregarding these, you'd need to have ridiculously good information on yourself to decide what's the best thing to do. And it would be a massive understatement to say that we don't have it. And it's really really difficult to measure yourself. The idea behind "living luminously" is that increasing your self-awareness of your own mental state even somewhat might lead to highly positive results.
  • and many many more
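The hyperbolic discounting point from the list above is easier to see with numbers. This is only a rough sketch: the reward sizes, the 10 day gap, and the parameters `k = 1` and `delta = 0.9` are all made up for illustration, and `choose` is a hypothetical helper, not anything from the research. What it shows is that a hyperbolic discounter flips preference as the rewards get closer, while an exponential discounter never does.

```javascript
// Present value under exponential discounting: value * delta^delay.
// delta is an illustrative per-day discount factor.
function exponentialValue(value, delayDays, delta) {
  return value * Math.pow(delta, delayDays);
}

// Present value under hyperbolic discounting: value / (1 + k * delay).
// k is an illustrative steepness parameter.
function hyperbolicValue(value, delayDays, k) {
  return value / (1 + k * delayDays);
}

// Which reward looks better when the decision is made `lead` days
// before the smaller reward becomes available?
function choose(discount, lead) {
  var small = discount(50, lead);       // 50 units, sooner
  var large = discount(100, lead + 10); // 100 units, 10 days later
  return small > large ? "small" : "large";
}

function hyperbolic(value, delay) { return hyperbolicValue(value, delay, 1); }
function exponential(value, delay) { return exponentialValue(value, delay, 0.9); }

// Compare the choice made a month ahead with the choice made on the day.
console.log("hyperbolic:", choose(hyperbolic, 30), "->", choose(hyperbolic, 0));
console.log("exponential:", choose(exponential, 30), "->", choose(exponential, 0));
```

With exponential discounting the ratio between the two rewards stays the same no matter how far away they are, which is why it's the only time-consistent option.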

Economics 101

By the way - and if you don't enjoy how I keep straying away from whatever is my main subject all the time, you probably shouldn't be reading this blog - I'm in no way disparaging "economics 101" thinking. I find it really sad that virtually every single person in the world pretty much belongs to these two classes:
  • People who don't understand Economics 101. They fail to get even such basic notions like comparative advantage or externalities.
  • People who get Economics 101 - and take it far too seriously. If you even briefly look at assumptions behind all its theorems, none of them is even approximately true in the real world. Very often you get lucky and this toolkit lets you predict things about the real world decently enough, but this is about as often not the case.

This is not to say Economics 101 should be thrown away. All models are wrong, some are useful. Or from a closely related perspective - all abstractions leak. If you don't perform sanity checks, and blindly trust everything the models tell you, they will lead you far astray (you could always argue that it's not models' fault, it's fault of the way you're applying them - but this is a purely theoretical distinction).

So for example in theory economics 101 models say that laissez-faire international trade policy should outperform any kind of intervention, but in practice countries which practice export subsidies by currency manipulation like the East Asia are better off than those with less laissez-faire trade policy like Europe, which in turn are better off than those that try to follow the route of import substitution via high tariffs like Latin America.

By the way if you find this curious, the standard answer to this puzzle is that benefits of economy of scale overwhelm the losses from going against comparative advantage - pumping subsidies into a narrow range of related industries lets your country specialize in those, and import everything else - while limiting imports of a wide range of products to protect diverse local industries means they will all be small and weak. Economics 101 completely ignores the dominant factor of scale advantages, focusing on an undoubtedly real but less important factor instead.

Similar misapplication of economics 101 says that minimum wage laws invariably increase unemployment. This was universally believed by nearly all economists a few decades ago according to some surveys cited by Wikipedia, and holding such belief became almost the canonical way of signaling that you're "economically savvy". And not surprisingly it turned out to be false, all research showing either no effect whatsoever, or effect that is really tiny compared to the huge increase in well-being of the working poor. The economists have finally figured that out, and they're more or less evenly divided on the question - even those standing against the minimum wage laws typically holding much more nuanced views - and yet some naive hard-liners still hold this as a measure of "economic enlightenment".
I'm in your PC, stealing your internets by the boy on the bike from flickr (CC-NC-SA)

My logs


Getting back to the subject, for each day in a bit over the last two weeks I recorded my activities, and some measures of satisfaction. The logs weren't terribly detailed, and "recording one's satisfaction" is exactly the kind of thing which would never get published in any reputable peer-reviewed journal. That's not to say that it's useless - "the official way of doing science" has led to many important discoveries, but for many questions it's been rather impotent, and it's important that people try different ways of finding things out too, if for nothing else then to fill in the blind spots of the mainstream science.

So my logs, of however dubious methodology they are, seem to point to the following correlations. I won't even bother pretending to have any "statistical significance" in all that of course. Even forgetting about the small sample of just two weeks, measuring "statistical significance" necessarily assumes that samples are essentially independent, and they're nothing like that. The list is:
  • Physical exercise of all kinds - definitely positive - this isn't really surprising, as this is something that's highly enjoyable once started, but it takes effort to begin, so the hyperbolic discounting excuse applies
  • With video games it's mixed. First person shooter games like online Modern Warfare 2 have positive correlations, but Total War games negatively correlate with my end-of-the-day satisfaction, even though they're not really frustrating or anything most of the time.
  • Cleaning up my GTD system - definitely positive
  • Being productive at work, and in general getting done things I want to get done, especially the long postponed ones - definitely positive
  • Reduction in caffeine consumption - mildly negative, but that doesn't really imply anything about long term effects of different levels of caffeine, and it wasn't even as bad as I expected
  • Watching TV series, and reading books - mildly negative; this might be a false result, or the effect might be real but minor, in any case - there's no reason to do much more of those
  • And the largest and rather surprising correlation - spending more time online has a huge negative correlation with my satisfaction levels
Now the standard economics 101 disclaimer applies - these measures are necessarily marginal - so finding out that I'm better off exercising more and Internetting less implies only that I would be better off exercising a bit more than now, and Internetting a bit less than now, and it's more than likely that at some point increasing amount of exercise and decreasing Internet use will make me worse off.
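If you want to try the same kind of log on yourself, the "correlation" part is easy enough to compute. Here is a minimal sketch of a plain Pearson correlation coefficient. The function name `pearson` and the sample numbers are invented for illustration, they are not my actual logs, and with a sample this small the result is a hint at best.

```javascript
// Pearson correlation coefficient between two equally long series.
// Returns a number between -1 and 1, or NaN if either series is constant.
function pearson(xs, ys) {
  if (xs.length !== ys.length || xs.length < 2) {
    throw new Error("need two equally long series with at least 2 points");
  }
  var n = xs.length;
  var sum = function(a, b) { return a + b; };
  var meanX = xs.reduce(sum, 0) / n;
  var meanY = ys.reduce(sum, 0) / n;
  var cov = 0, varX = 0, varY = 0;
  for (var i = 0; i < n; i++) {
    var dx = xs[i] - meanX;
    var dy = ys[i] - meanY;
    cov += dx * dy;
    varX += dx * dx;
    varY += dy * dy;
  }
  return cov / Math.sqrt(varX * varY);
}

// Made-up example values: hours online per day vs end-of-day satisfaction.
var hoursOnline = [2, 5, 3, 8];
var satisfaction = [7, 4, 6, 2];
console.log(pearson(hoursOnline, satisfaction));
```

A negative number means days with more of the activity tended to be the less satisfying ones. It says nothing about which way the causation goes.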

It's an interesting find that I seem to actually enjoy my work - I should probably put that one in my CV for future reference. And it's nice to get some insight on what kinds of recreational activities work better for me than others (due to small sample size, less repeatable events like those involving interaction with other people are not included). But the big find is that Internet is bad for me, and let's focus on that.

What is Internet good for?


Do you remember what life was before the Internet? It was horrible! We had to copy games from friends on stacks of floppies instead of just bittorrenting them! But really, what good is Internet for?
  • Email and IM are always far superior means of communication than phone calls (I really hate those, they should all die in fire); and are so much faster and easier than driving all the way to meet someone in person that they usually win, even if the throughput is somewhat less.
  • There's shopping - at which Internet really excels, most of the time. At least when you know exactly what you want, otherwise not so much.
  • There is information - but I'm far from happy about it. For some kinds of thing that you want to find, if they follow "keyword keyword of keyword" pattern, you can usually google or bing it out in seconds. Otherwise, all search engines become nearly useless, even if this information is somewhere. And it requires a lot of knowledge to turn a problem into unique keywords - very often you know little more than "X doesn't work", or "I'm not happy about Y", and search engines won't help you with those at all. This assumes information is even online in the first place - as very often it's not - it's really sad how nearly all research papers have been successfully pay-walled. It seems that "information wants to be free" only when the information in question is something on the top 100 bestsellers list of one kind or another, and not to the long tail which contains most of the real value.
  • There's Google Maps and similar sites, which are far superior to paper maps.
  • There are some funny things online - but they're swamped by such amounts of unfunny repetitive material that I really doubt Internet is even good for that. I dare you, go to let's say /r/funny on reddit - which is supposedly about the most recent funny stuff online (or at least reddittors seem to believe they find stuff first, and everyone else copies stuff from them) - and how many things you'll find there that will make you laugh, and are not nearly ancient? And it's the same on nearly every other place which is supposedly filled with funny stuff. People keep going there because occasionally something good turns out, but it's so rare it's probably not worth it.
  • There is news - and again the flood problem applies. I have yet to find any RSS feed with only important news. Everyone seems to believe the right way to do news online is to just throw 10+ trivia items a day - Obama said something, one minor celebrity divorced another minor celebrity, stock prices decreased somewhere, IDF shot a few unarmed civilians somewhere else - as if knowing things like these made you better off in any way. The choice is to either get a ridiculous amount of political trivia, or just ignore the news altogether.
  • There are all kinds of social networking sites - and I'm increasingly doubting their value. I have a Facebook account (and accounts on some other sites) and I might even use it occasionally, but I don't see that my life would be that much worse if Facebook and the rest didn't exist.
  • There are blogs, wikis, and similar places where you can contribute your knowledge, which can give tremendous amount of satisfaction to the contributor. If you add them all up, they provide a lot of value for readers as long as search engines manage to find a relevant blog post or wiki article, which is always in doubt. On the other hand, I have serious doubts about reading everything on a blog, or unfocused browsing on wikis. I haven't yet seen a blog which had consistently good posts - much less consistently good and relevant to my interests.
  • There are online video games, for some things playing with people is more fun than playing against computer.

These seem to cover the main points. And what's obvious is that the most valuable online activities - email, highly focused search, shopping, maps - take rather little time. On the other hand, the ones that take a lot of time - like all the reddits, forums, social sites, wikis, blogs, etc. - don't provide that terribly much value per time spent.


Unless you actually measure how you use internet, it's really easy to overestimate how much value spending time on it gives you - as you're far more likely to remember the high points which didn't really take long - as opposed to relatively pointless activities which took most of your online time. Human memory just works like that.


This distinction would be pointless if it was impossible to make a distinction between the two - if reduction in the bad kind of internet use required essentially proportional reduction in the good kind. Fortunately it seems to me that this is fairly straightforward - good things like email (this of course assuming you have a working spam filter / and all mailing lists etc. go somewhere else than your inbox), shopping, maps, directed search - have different entry points than less useful things like social sites, wikis, news, and funnies etc. Sometimes you'll look for something specific and in the process accidentally fall into a wiki trap, but this shouldn't be too common with some self-awareness. Much more often you waste a ridiculous amount of time by wanting to "quickly check if there's anything good on X" and having hyperbolic discounting ("just one more link") turn that into a disaster.

It doesn't mean reducing your lolcat consumption to zero - only that you should force yourself to make an up-front decision "I'm going to start looking at funnies now, even though I know well enough it will probably take the next few hours" and have a realistic idea how good this time will be; instead of fooling yourself it will be quick and only filled with the good stuff, as seems typical now.

tl;dr - using internet only when you have a clear goal, and not for vague "maybe there's something good" is good for you.

Saturday, May 22, 2010

Empire Total War mods - no walls, libertarians everywhere

The look of nobody home by Tjflex2 from flickr (CC-NC-ND)

Rome Total War was highly moddable. Medieval 2 Total War somewhat less as all data files were in obfuscated packages - but once you unpacked them it was as good as Rome. My M2TW mod is generated by a bunch of simple regexps, with which I can experiment as much as I like, turning features on and off and changing their magnitude in seconds.

I knew Empire Total War wouldn't be as easy, but nothing prepared me for the pain I suffered. But I'll complain as I go.

Libertarians everywhere

First to see how modding works I wrote a very small mod - turning off taxes adds +10 happiness. Not quite literally, as it seems impossible to have anything triggered by zero taxes, so I gave all non-zero taxes extra -10 happiness, and every government type gets extra +10 happiness - net result being what I wanted, except it looks a bit silly in game.
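The arithmetic of that workaround, written out as a tiny sketch. This is not how the game stores anything, it's just the two modifiers described above added together; the names `GOVERNMENT_BONUS`, `NONZERO_TAX_PENALTY` and `modHappinessDelta` are made up for illustration.

```javascript
// The two modifiers the mod adds, as described above.
var GOVERNMENT_BONUS = 10;      // every government type gets +10 happiness
var NONZERO_TAX_PENALTY = -10;  // every non-zero tax level gets -10 happiness

// taxLevel: 0 means taxes are turned off, anything above 0 is some tax level.
function modHappinessDelta(taxLevel) {
  var penalty = taxLevel > 0 ? NONZERO_TAX_PENALTY : 0;
  return GOVERNMENT_BONUS + penalty;
}

console.log(modHappinessDelta(0)); // net bonus with taxes off
console.log(modHappinessDelta(2)); // bonus and penalty cancel out
```

So with any taxes at all the two modifiers cancel, and only a zero-tax province keeps the bonus, which is why the in-game tooltip shows both a bonus and a penalty.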

Now why did I do so? The answer is my favourite "less micromanagement". I don't particularly enjoy having to babysit conquered provinces, chasing rebels around, and counting how many units I can move and how many I need to leave. This is simply not fun. So I decided that every province has a sizable population of armed libertarians who will gladly shoot every protester for me as long as I set taxes to zero. When taxes are not zero they blog climate change denial or something, I don't care.

This is all a stop-gap measure for the first few turns after conquest - if you ever want to get any taxes out of the province, as you usually do, you'll need to deal with taxpayers' happiness eventually. By the way zero taxes essentially means the province loses money every turn, as it increases administration cost of every other province in your empire.

Happiness +10 is enough most of the time, but when you conquer someone's capital and have to face -30, on top of all industrialization, religion etc., you might need to deal with rebellions anyway. And as such regions usually bring a lot of money, you probably want to set higher taxes anyway.

If you want to tweak this bonus, use DBEditor for it.

No walls

And now the big mod - removing all walls. I'm not ideologically opposed to settlement fortifications - in fact my M2TW mod makes settlements more difficult to take. However:
  • ETW sieges are broken due to stupid AI and stupid pathfinding
  • Because they're broken, all ETW sieges follow just two boring scripts:

    • either: approach diagonally with infantry from 3 directions, bayonet charge everything;
    • or: approach diagonally with howitzers with carcass shot and a few line infantry units, once you start bombarding them AI will charge you one by one and you win without losses
    everything else fails as your units are too stupid to pathfind in more interesting strategies
  • After 1710 or so nearly every settlement outside Americas has city walls
  • And so after the first few years, 90% of battles are unbelievably boring
Field battles on the other hand are much more interesting. An unfortunate side effect of this mod is that while in vanilla town watch can use settlement fortifications to defend it from small enemy armies reasonably well, now it's completely useless. I can live with that.

Unfortunately, this has a second order side effect of forcing you to be more aggressive. If before you'd be quite willing to have your armies further away from the front as town watch could serve as good enough first line of defense, now you need to have your armies closer to the borders, preferably on their side of it.

So how did I write this mod? It was truly painful. First, ETW has building features, and the fact that settlement fortifications building creates walls in the battle is supposedly encoded by such features. I ran into the first problem, as DBEditor is incapable of removing features - only adding or modifying them. To do such thing you need to go to PackFileManager, clone a table, make sure it's named exactly like the original one (so it will be overridden, not merged), and remove rows from that in DBEditor.

Except for some reason PackFileManager incorrectly mixes up slashes and backslashes, so I needed to fix the mod file from a hex editor.

Except it turned out in the end ETW ignores this feature, and simply hard-codes walls. So much for my effort.

It was plan B time. First I needed to remove all walls. And while db files are a pain to edit, it is nothing compared to the pain of editing startpos.esf which contains campaign information:
  • Start campaign, look at every single settlement on a map and write down which settlement has walls (as there's no search function in EsfEditor)
  • Manually find every such settlement in esf tree, and do all necessary changes (change True to False in one node, delete another node).
This is of course wholly incompatible with any other esf mod, like those which enable you to play minor factions.

After that I needed to make it impossible to build walls. I haven't figured out how to do that (I suspect if I remove walls completely the game will simply crash, and no other building is unbuildable) - so I made it require late technology of mass production, and take 999 turns to build. Sort of good enough.

And all this was possible only after a lot of effort of modding community members who created tools like DBEditor, EsfEditor, PackFileManager, documented what they could etc. This is borderline unmoddable, and I fully expect the next generation of Total War games after Napoleon (which is essentially extra scenario for ETW) to be completely impossible to mod. But hey, maybe they'll sell more DLC this way.

How to install

In case you want to install these mods, you need to:
  • Download startpos.esf and replace one in C:\Program Files (x86)\Steam\steamapps\common\empire total war\data\campaigns\main (or similar), first of course making a backup copy - this will remove starting walls. (small warning - it's a big file and the server I'm hosting it on is pretty slow; other files are tiny)
  • Download mod_no_walls.pack and put it in data directory together with all other packs - this will make walls unbuildable.
  • Optionally download mod_tax_break.pack and put it into data directory - this will give you +10 happiness bonus for no taxes.
  • Download ModManager and unpack it.
  • Start ModManager, select mods you want to use, and click Launch.

Greasemonkey and jQuery easier than ever

Douc Langur (Pygathrix nemaeus) by ucumari from flickr (CC-NC-ND)

Last year I wrote a short Greasemonkey tutorial, in which I explained how to use it with jQuery for some really simple scripts. Since then it became even easier, so here are a few more useful scripts.

Since version 0.8 Greasemonkey has @require feature, in which your scripts can be made to depend on some external Javascript files - like jQuery. Unfortunately there are a few gotchas:
  • jQuery 1.4 doesn't work with Greasemonkey, you need to use jQuery 1.3 (or keep pestering them until it's fixed)
  • @require only works at installation time, you cannot use it with "New User Script..." feature, nor can you change @requires from the editor. If you simply use published scripts and don't write anything - you don't need to worry. If you write your own scripts you need to prepare something.user.js file somewhere, then open it from Firefox (copy&paste its file path to Firefox URL bar, then click Install).

Show spoilers on tvtropes

Let's start with something really trivial. I don't care about spoilers. Maybe 1% of "spoilers" significantly diminish viewing pleasure, the vast majority of them do not. I knew that Vader is Luke's father, that Snape killed Dumbledore, and that Titanic sank before watching/reading, and it made it no less enjoyable.


On the other hand I'm quite annoyed that to read tvtropes I needed to constantly select "spoiler" text for everything to see it - but no more. This trivial script solves it entirely:


// ==UserScript==
// @name           Show all spoilers on tvtropes
// @namespace      http://t-a-w.blogspot.com/
// @include        http://tvtropes.org/*
// @require        http://ajax.googleapis.com/ajax/libs/jquery/1.3.2/jquery.min.js
// ==/UserScript==

$(".spoiler").removeClass('spoiler');

It's really simple - @name is unique name for the script, which should really be a very short description. We're not going to rely on @namespace at all, so put anything you feel like there.


@include is a pattern of URLs for which the script should be executed. @require is the jQuery library we're including. And thanks to jQuery, removing spoiler tags is really easy.

You can download this script here.


Always sort by seeders on The Pirate Bay

An extremely annoying thing about The Pirate Bay search engine is how it sorts results by its idea of "relevance" - usually giving you some dead ISO of three year old Ubuntu version when you obviously want the most recent one. And nearly always sorting by seeders is the right thing to do.

There is actually a GreaseMonkey script that claims to do exactly that, but it doesn't always work correctly, and the author decided to obfuscate the source for lulz. I'll have none of that, so I just wrote my own. Again, thanks to jQuery it's truly trivial.


// ==UserScript==
// @name           Always order by seed count
// @namespace      http://t-a-w.blogspot.com/
// @include        http://thepiratebay.org/*
// @require        http://ajax.googleapis.com/ajax/libs/jquery/1.3.2/jquery.min.js
// ==/UserScript==

$("input[name='orderby']").val(7);




You can download this script here.

Download Stewart and Colbert from bittorrent

Now a more complicated and perhaps more useful script. One of the recent developments on the Internet I hate the most are geographic restrictions - Americans can view whatever they like, for everyone else it's "Sorry, Videos are not currently available in your country" or other such nonsense.

Personally I don't care about licensing restrictions which led to this - I'm not going to accept development like that if I can help it. Fortunately in this case I can - all Stewart and Colbert is available on bittorrent.

The script does the following:
  • Find all dates on the page. They're not kind enough to use consistent class for those, I found at least 3, and perhaps I still missed a few.
  • Convert every date from "Month DD, YYYY" to "YYYY MM DD" format, as this is what most torrents use. Half of the script is just doing that because Javascript doesn't have anything like Ruby's Date.parse.
  • Append link to bittorrent site next to the date. I'm using yourbittorrent here, as ThePirateBay is down due to overload far too often.
So every time someone on Reddit posts an "omg Colbert totally pwned Glenn Beck" link, I can actually see the pwnage now thanks to this script.


// ==UserScript==
// @name           View Stewart and Colbert on bittorrent
// @namespace      http://t-a-w.blogspot.com/
// @description    No more "Sorry, Videos are not currently available in your country"
// @include        http://www.thedailyshow.com/*
// @include        http://www.colbertnation.com/*
// @require        http://ajax.googleapis.com/ajax/libs/jquery/1.3.2/jquery.min.js
// ==/UserScript==

var month_names = {
  'January': '01',
  'February': '02',
  'March': '03',
  'April': '04',
  'May': '05',
  'June': '06',
  'July': '07',
  'August': '08',
  'September': '09',
  'October': '10',
  'November': '11',
  'December': '12'
};

$(".date, .airDate, .clipDate").each(function() {
  var txt = $(this).text();
  var m = txt.match(/(\S+)\s*(\d+),\s*(\d{4})/);
  // Skip elements without a recognizable "Month DD, YYYY" date
  if(!m || !month_names[m[1]]) {
    return;
  }
  // Zero-pad the day so the query is always "YYYY MM DD"
  var day = m[2].length < 2 ? "0" + m[2] : m[2];
  var date = m[3] + "+" + month_names[m[1]] + "+" + day;
  var url;
  if(window.location.hostname == "www.thedailyshow.com") {
    url = 'http://www.yourbittorrent.com/?q=Daily+Show+' + date;
  } else {
    url = 'http://www.yourbittorrent.com/?q=Colbert+Report+' + date;
  }
  $(this).append("<div><a href='"+ url + "'>Download on bittorrent</a></div>");
});

You can download the script here.
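If you want to play with the date handling outside the browser, it can be pulled out into plain functions with no jQuery or DOM involved. This is just a sketch of the same idea, not part of the downloadable script: `MONTH_NUMBERS`, `toTorrentDate` and `searchUrl` are names I made up here, and the regexp is slightly stricter than the one in the script above.

```javascript
var MONTH_NUMBERS = {
  January: "01", February: "02", March: "03", April: "04",
  May: "05", June: "06", July: "07", August: "08",
  September: "09", October: "10", November: "11", December: "12"
};

// Returns "YYYY MM DD" for the first "Month DD, YYYY" date found in text,
// or null if there is no such date or the month name is not recognized.
function toTorrentDate(text) {
  var m = text.match(/([A-Za-z]+)\s+(\d{1,2}),\s*(\d{4})/);
  if (!m || !MONTH_NUMBERS.hasOwnProperty(m[1])) {
    return null;
  }
  var day = m[2].length < 2 ? "0" + m[2] : m[2]; // zero-pad single digits
  return m[3] + " " + MONTH_NUMBERS[m[1]] + " " + day;
}

// Builds a search URL in the same style as the script above,
// with spaces in the query turned into "+".
function searchUrl(show, text) {
  var date = toTorrentDate(text);
  if (date === null) {
    return null;
  }
  var query = encodeURIComponent(show + " " + date).replace(/%20/g, "+");
  return "http://www.yourbittorrent.com/?q=" + query;
}

console.log(toTorrentDate("Aired: May 5, 2010"));
console.log(searchUrl("Daily Show", "May 20, 2010"));
```

Returning null instead of throwing makes it easy to skip page elements that only look like they contain a date.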

See how easy it all was? Now start writing your own scripts and share them.

Thursday, May 20, 2010

Everybody Draws Muhammad Day and lessons in cultural relativism

May 20th is Everybody Draw Mohammed Day - a day when you too can join the defense of Free Speech by drawing a cartoon. Or if you really suck at drawing you can make a shop of Muhammad like me.



Prophet Muhammad, artist's conception
If you think I'm wrong, draw a better one

Lesson #1 - Principle of dissimilarity


Recently I have been convinced by some TTC audiobooks that Jesus probably existed as a historical person. You know what's the best argument for it? The criterion of dissimilarity.

It's not any pomo deconstructionism - just plain old historical analysis. The idea is that authors have an agenda, but life doesn't always agree with it. So every time the text admits to something that is clearly against the authors' agenda, it suggests it probably had some basis in reality - because they wouldn't make this up on purpose. A few examples from Jesus' life:
  • Crucifixion was embarrassing kind of death reserved for slaves, rebels, and other lowest status scum. If someone was making up the story they would have Jesus die in battle, or die some other "respectable kind of death". So actual Jesus was most likely actually crucified.
  • All the messy explanations how "everybody thought Jesus was born in middle-of-nowhere Nazareth but actually he was born in Bethlehem like King David" suggest that Jesus was probably born in Nazareth. If they were making this up, they'd skip the Nazareth story altogether.
  • Jesus was baptized by John the Baptist - being baptized by someone was signaling submission and inferiority to that person. This story is clearly embarrassing to Bible writers, they put words like "I should be baptized by you" in John the Baptist's mouth, and the last Gospel omits it altogether.
The sheer volume of such cases suggests that Bible is an "enhanced" story of some actual life, as opposed to being completely made up. There's no question that all the supernatural bits have been inserted, and it's widely believed that the real life stories were quite significantly massaged - but it does such a bad job at covering a lot of embarrassing parts like these that the hypothesis "Bible is vaguely based on a life of a real person, and these embarrassing stories were widely known so couldn't be ignored" is much more compatible with the evidence than "Bible is completely made up" - and as their Bayesian priors are not terribly different in the first place, historicity of Jesus can be reasonably believed in. The "Bible is real" story on the other hand is both against the priors and against the evidence, so let's not even go there.

Rambo Cat by Gerard Girbes from flickr (CC-NC-ND)

Lesson #2 - Why is Muhammad widely considered a pedophile?


The criterion of dissimilarity must be truly infuriating to the true believers - it essentially says that every time they like some part it's probably false, and every time they dislike some part it's probably true. By the way this applies to all historical texts, not just religious ones - parts of Commentarii de Bello Gallico that are overly sympathetic to Julius Caesar should be looked at with more suspicion than parts which talk about his failures, and so on.

So, what does it all have to do with Muhammad being a pedophile? Muslim texts clearly say that he was married to Aisha when she was 6, and fucked her when she was 9 year old.

Now how likely is it that this particular bit was made up? If you had a prophet who didn't fuck children, would you casually make up a story that he actually did? That'd go against all rules of proper writing. And it's not one isolated verse somewhere - as Wikipedia says "references to Aisha's age by early historians are frequent", and nobody questioned that back then.

So regardless of our beliefs, we can be fairly certain that
Muhammed fucked a 9 year old girl

And now some time for cultural relativism - this was considered fairly unremarkable back then. Our modern culture is obsessed about young people's sexuality and as a civilization we essentially lost the ability to propagate the species, but historically it was entirely normal for much younger people to marry, fuck, and have babies - biologically late teens are the optimal time to have children. In completely typical Ancient Rome - girls typically married in their mid teens, very soon after puberty. The idea was simple:
  • Regardless of societal constraints, many people will get sexually active once they hit puberty
  • Of people who get sexually active, many will get pregnant
  • Nearly everywhere except for modern times, it's really difficult to either provide for your kids without a husband, or get a husband if you already have kids
  • To avoid risking that, it's better to marry your daughters sooner

This was the baseline normal case for girls.

Western pedophile obsession (where all sexuality of people under 20 or so is a big taboo - this has little to do with what psychology calls "pedophilia") is a highly atypical case. 7th century Arabic children marriage, and fucking of prepubescent 9 year old girls was also rather atypical.

Not that it matters what's typical and what's not - abject poverty, illiteracy, and lack of broadband Internet are the historical norm, and yet I much prefer the atypical modern situation.

Anyway, what we see here is a conflict of values - Muhammed fucking a 9 year old was acceptable within his culture, and is not considered acceptable within ours - not even for most modern Muslim countries. Aisha most likely didn't mind getting fucked, and nobody was weirded out by this, in spite of modern fiction of "age of consent". She was most likely not scarred for life or anything like that - these laws exist primarily to make adult bigots feel good, not to "protect" children, and evidence is clear that plenty of teenagers (and an occasional pre-teen) have sex and enjoy it.



Lesson #3 - Freedom of speech


But that's not all. On top of one value dissonance about sex with pre-teens we have a second value dissonance about free speech and religious respect.

In Western culture since the Enlightenment, we have been very strongly attached to the idea that there is nothing that cannot be criticized. Yes, laws of different countries prohibit different kinds of speech (including the United States, where the Supreme Court's favourite activity is making exceptions to the First Amendment, and Elena Kagan doesn't seem any different here) - but every time such a law is applied the media freak out and people feel highly uneasy about it. Even people who want to limit freedom of speech considerably still universally consider it to be the default case, with exceptions to be few and made only when "necessary".

These beliefs are not shared by many, apparently including modern Muslim societies. It seems that they approach the Muhammed cartoons from another direction - that people's religious beliefs should be respected, and their holy figures shouldn't be mocked - with freedom of speech being far lower in their order of priorities than that.

They don't even see the cartoons as a freedom of speech issue, just like 7th century Arabs didn't see fucking a 9 year old girl as a child abuse issue, and we don't see the cartoons as a blasphemy issue. Different cultures have different perspectives.

This basic cultural relativism does not mean that all cultures are equally wrong, or equally right. It would be intellectually dishonest to think that your culture is uniquely correct about things, and its beliefs are some sort of human universals - history shows that there are hardly any true human universals, and we might even get rid of death and taxes one day. But you are still free to follow your own culture's value system.

If you think that freedom of speech is valuable, and fucking prepubescent girls is creepy, mocking that and not giving a shit about angry Pakistanis is all fine.

In other words - enjoy the Everybody Draws Muhammad Day.