taw's blog: October 2009

Wednesday, October 28, 2009

Medieval 2 Total War Concentrated Vanilla 0.07

Here's another release of my Medieval 2 Total War Concentrated Vanilla mod.

Old releases suffered from the too-much-money problem, just like the pre-crash Goldman Sachs, so I decided to scale it down a bit. Instead I introduced massive free garrisons.

Here's the full list of changes relative to vanilla:

For more dynamic game, all units move 75% faster on strategic map.
One big problem with vanilla was that money didn't matter. No matter if you had a lot or not, there wasn't all that much to spend it on. So now all buildings take 1 turn to build, but are 50% more expensive. This drastically increases importance of money.
To add some strategic richness, mines give 2x the profits, and cost 3x as much (x2, +50% like all other buildings). Some settlements with rich mines like Zagreb or Vienna are worth a lot, maybe even worth starting a war over.
There used to be 50% increase in settlement and merchant trade, and mine profits used to be 3x, but this was a serious overkill. If you manage well, settlements and merchants will give you more money than in vanilla as you can build relevant buildings faster.
There also no longer double population growth penalty/bonus for tax rates - it was an overkill.
To help small nations, King's Purse increased 2x (that is - free money a nation gets unrelated to its settlements - from 1500-2500/turn to 3000-5000/turn).
To reduce annoyance with rebels, rebel and pirate spawn rates reduced 10x. (I'm not actually sure if it reduced them as much as the file says it would, but it feels more or less right now).
Cavalry, especially heavy cavalry was seriously overpowered. Small groups of knights could easily crush infantry armies ten times their budget, so infantry was useless other than free militias to patrol cities, and cheap grunts to operate siege equipment. Because with cavalry-centric warfare missile units are useless, this limited army composition even further - and destroyed rich tactical variety that M2TW could possibly have. So all cavalry is now 50% more expensive to recruit and upkeep. Cavalry is still extremely powerful when used right, except now other kinds of armies are useful too.
Bodyguards were particularly overpowered - in early periods they simply destroyed everything, were numerous, self-healing, essentially free (you get new candidates for adoption if you lose old generals all the time), and as every faction has identical bodyguards they severely reduced tactical richness of the game. So bodyguards units are the half size now (upkeep cost is proportional to unit size, and they're cavalry, so mod's bodyguard unit costs 75% of vanilla's). Old versions of the mod reduced their hitpoints from 2 to 1, but it no longer happens, as this made bodyguards overnerfed in late periods. Unfortunately there no way to get the balance right - as bodyguards stay the same, but other units massively improve, either bodyguards will be too strong early on, or too weak late on. I moved the balance point somewhat towards the weakening.
Vanilla sieges are ridiculously easy, with a single unit of ballistae being able to take down large stone walls. This made sieges was too easy, and as a consequence turned the game into a siegefest. So now gates and walls (but not towers) are 5x stronger. Serious artillery will be needed to take anything but the weakest walls.
Towers now have 2x fire rate for normal ammo, and 1x fire rate for flaming ammo. Increasing flaming ammo rate too much as old versions did made siege equipment like siege towers almost useless. But high normal fire rate should help defenders a lot, even for undermanned settlements that AI leaves. This wouldn't be necessary if AI was competent, but we know well it's not, and leaves valuable settlements defended by one unit of Peasants far too often. This balances AI stupidity considerably, and also frees the player to play more strategically by making defense easier (this is not as essential as in old versions of the mod due to increased free garrisons).
Tower activation range is 8x larger (a bit of reversal to Rome Total War). So you cannot shut down towers just by scaring off defenders with some arrows like in vanilla. Strong walls, and massive towers' firepower together mean sieges will be very difficult. Prepare a lot of artillery, or spies, or massively overwhelming force that can take severe loses before giving up. I'm yet to take a well-manned Fortress in this mod (but then AI often mans its settlements insufficiently, and there are other options like waiting their food supplies out, or provoking a sally).
For consistency with making sieges harder, spies cost 2x to recruit and upkeep. And AI is quite competent at counter-espionage operations, so spy flood doesn't work all that often.
To help defenders in sieges, missile infantry ammo increased 2x. It very rarely affects field battles, as in those they rarely get to expend even half of their ammo.
A big change is doubling free garrison in cities. Village 0, Town 4, Large Town 6, Small City 8, Large City 10, Huge City 12.
The biggest change is introducing free garrison in castles. Motte&Bailey 1, Wooden Castle 2, Castle 4, Fortress 7, Citadel 12. In vanilla castles are either left barely defended, or are massive bleeding money hole - failure both ways.
Just like with cities, castles' basic infantry units can be free garrison, but not cavalry or expensive infantry. Basic infantry is one with morale up to 5, and cost up to 155. So Peasants up to Armored Sergeants, and Peasant Archers up to Genoese Crossbowmen, Venetian Archers, and Hand Gunners can all be free when garrisoned. On the other hand Dismounted Knights of all kinds, and most faction-specific elite units cannot. I guess it might introduce some faction unfairness, but I can live with it.

Overall, what you get is much tactically and strategically richer game, with a lot larger differences between factions, but one that still feels a lot like vanilla Medieval 2 Total War. In the mod defense is easy, so you don't have to mindlessly attack everyone.

The main problem left is the spectacularly stupid AI, about which I cannot really do that much.

I also did some other experiments, that didn't get to the mod:

Making ocean-traveling ships available from the beginning (well, as soon as you can build sufficient port infrastructure) enables fights with natives. But Americas are disconnected from the mainland Europe, and it doesn't really feel like it's part of the same game. Try it once to see what fighting Aztecs is like, then forget they're even part of the game.
Making Mongols (or maybe even Timurids) come earlier. It's a lot more fun to fight Mongols, but then if they come very early, they will totally crush AI on auto-resolve. Unfortunately auto-resolve favours stronger armies too much, so they won't even suffer too much loses, and you'll have to deal with them yourself. One full stack of Pavise Crossbowmen and Feudal Knights led by semi-decent General (or equivalent) can defeat one full Mongol stack without that much trouble in field, and well defended settlement with ballista towers completely destroys them. It's another thing to try once. I haven't fought Timurids yet.
Making Gunpowder available immediately unfortunately counters the effects of very strong walls, and yet good gunpowder infantry and cavalry won't be available until much later. I'm not even sure if gunpowder units are better than plain Pavise Crossbowmen, they don't seem to be from the stats, and I never had an opportunity to play much with them.

Anyway, download the mod here.

Blog posts about previous releases: 0.02, 0.04, 0.06.

Thursday, October 15, 2009

Folk Complexity Theory

There's one kind of posts on this blog that get the most hostile reactions. It's not politics (nobody reads those), flaming programming languages, or computer games - the most controversial subject is Big O analysis! Here are relevant posts:

Basically I claim that Big O is useless in most practical situations. And I get flamed a lot. At first I tried to explain to the flamers how they are mistaken, then I basically ignored them after explanations proven to be pointless, but now I see that it's all just big misunderstanding. What I call Big O analysis is a fairly precise construct of computer science. But that's not how most people use the Big O notation - they follow Folk Complexity Theory, which I want to describe here.

The Official Complexity Theory

First, here's a brief repetition of Algorithms and Data Structures 201, or whatever your course explaining this was.

An algorithm is O(f(n)) if in a particular model of computations, if there exist some constants n₀, and c, such that for all input lengths greater than n, computation time is less than c × f(n). A problem is O(f(n)) if in particular model of computations there exists a O(f(n)) algorithm that solves it.

Models of computations are things like different varieties of Random access machine, and different models usually only differ by some logarithmic factors. Sometimes you want to assume that adding two numbers takes constant time, sometimes that it takes time proportional to logarithm of their sum, and details like that. If you disregard logarithmic factors, you don't have to be all that precise about which variety of RAM you're using. That's about it.

In particular, if the problem is only defined for constant-sized input, then by definition it has complexity O(1). There exists some c, such that for all possible sizes, running time of the algorithm is less that c × 1.

So breaking AES-128 is O(1). As is breaking RSA-4096. Breaking RSA-n for arbitrary n isn't, but that's entirely unrelated problem. Breaking AES-n is meaningless concept, as AES isn't defined for arbitrary n.

In particular classes like P, and NP, don't depend much on what reasonable computation model you're using. They're defined on Deterministic Turing Machines for reference (which is polynomial factor slower than RAM), but if something is P on DTM, it's P on RAM and vice versa.

How cryptography handles it

As I got a lot of flames about my claim that breaking RSA-4096 is O(1), let me explain how cryptography handles it. So cryptography doesn't use Big O notation much.

Security is measured in bits - a cryptosystem has k bits of security if the attacker must perform 2^k times as many operations as legitimate user to defeat security of the cipher. For some types of security key size and security are related - for block ciphers security = key size, for hash collisions security = key size/2. For public key cryptography these bit measures can be only approximately specified in terms of key size, and whatever formula we have we're completely uninterested in its asymptotic form, as all feasible key sizes are small, and constant factors are of great interest to us.

The Folk Complexity Theory

And now the main subject of the post - The Folk Complexity Theory. When people say some algorithm is O(f(n)), they actually mean that for "normal" sizes of inputs, on actual computers, it's expected running time is between two "small" constants times f(n) times reasonable number of log(n) factors as long as they don't push the constant too high.

O is used sort of like Θ of the official theory. Folk O(n) algorithm is not considered to be O(n²).
So bubble sort is O(n²), as number of operation tends to be something like 10n². Logarithmic factor for addressing and arithmetic is not considered at all.
Merge sort is O(n×log(n)), as number of operation tends to be something like 10n×log(n). Notice how log factor for number of operations is included here as it's significantly higher than allowed small constants; but log factor for cost of addressing and arithmetic is not, as on normal computers it's constant up to 2³², which is very big n. And only doubles for 2⁶⁴, which is too ridiculously high n to bother. Exactly the same reasoning can be used for both log factors, so this is highly peculiar.
Quick sort is O(n×log(n)), the Folk Complexity Theory always deals with typical case, not worst case.
Breaking AES-128 is O(2ⁿ) because the constant is too big to be allowed. It doesn't matter that there's no n here, and AES is not defined for other key sizes. The problem must be reformulated to have n, no matter if it makes any sense or not. Number of operation like 2¹⁴⁰ is simply "not allowed", while 2¹²×2ⁿ is.
Breaking RSA-4096 is likewise O(2ⁿ), even though there's no n here at all.
Forging signature of n-byte message signed with RSA-4096 is, uhm... O(2ⁿ). Wait, n is already something else... O(2^k) then maybe? The problem is reformulated as RSA-k because large constants are not allowed, and usually there's no point saying O(2^k+n), as n is too small to matter for all practical n.
As an added bonus "NP" usually means Θ(2ⁿ), for real or imagined n. This is spectacularly incorrect (most problems are not decision problems; there can be no Θ bounds even for NP-complete problems until NP != P is proven, and every P problem is NP), but doesn't cause much confusion in practice.

This is the source of all confusion - I was talking about the Official Complexity Theory, while most Redditors interpreted it as Folk Complexity Theory statements, in which they were obviously untrue.

Is Folk Complexity Theory useful?

It's quite amusing that so many people flame me for saying the Official Complexity Theory is overrated, and don't even know it. But I'm not going to blame them - who needs it once they're past exams anyway? In practice Folk Complexity Theory is actually better heuristic than the Official kind. We program real computers. On real input sizes. ceil(log₂(n)) factors shouldn't be ignored, but ceil(log_4294967296(n)) factors which represent cost of arithmetic and addressing on most 32-bit computers hardware totally can and should. And so should all log(log(n)), which nobody ever bother with past exams.

The Folk Complexity Theory has weird status - it's neither mathematics like the Official theory, nor practice like actual benchmarking. It's just a bunch of heuristics dressed as something much more formal than they are. Often even fairly useful heuristics. Just nothing more than that, and not worth the flames.

The Web 2.0 of Metaphors

After hearing that something is Wikipedia of whatever it is about, and getting slightly annoyed by it, I decided, in interest of science, to compile popularity ranking of "the X of Y" metaphors, by Google counting.

I found quite a few false positives, like "the livejournal of Nikki", and "the Linux of your choice", but upholding the glorious scientific tradition of ignoring every methodological issue that cannot be fixed, I completely disregarded them.

"the windows of" - 15,100,000
"the internet of" - 7,790,000
"the apple of" - 6,470,000
"the google of" - 3,160,000
"the youtube of" - 766,000
"the microsoft of" - 397,000
"the imdb of" - 307,000
"the craigslist of" - 303,000
"the livejournal of" - 290,000
"the ebay of" - 277,000
"the amazon of" - 262,000
"the facebook of" - 237,000
"the myspace of" - 235,000
"the linux of" - 197,000
"the wikipedia of" - 180,000
"the yahoo of" - 142,000
"the flickr of" - 124,000
"the ibm of" - 109,000
"the blogspot of" - 108,000
"the web 2.0 of" - 92,900
"the bing of" - 90,900
"the firefox of" - 63,200
"the aol of" - 63,100
"the napster of" - 61,200
"the msn of" - 52,100
"the digg of" - 23,600

It seems the most popular activity is explaining one web service as metaphor of another:

Is Craigslist the "Napster" of the Sex Industry
YouTube Redesign: Becoming the Google of Video?
Is Facebook: the Google of Social Networks
Google Earth Is the AOL of the Geoweb
MySpace is the AOL of Social Media/Web 2.0
YouTube.com — the Flickr of Video

There are also some unusual finds:

The Wikipedia of Slutty Dressing
The Craigslist of Antibiotic Resistance
Is Python the Apple of Programming Languages?

Another interesting find is that while I've been expecting a lot of "the X of sex", only Wikipedia (36,200) and Facebook (29,500) have decent number of matches.

"The X of porn" is a lot more popular, here Google (378,000), YouTube (354,000), and surprisingly Napster (22,200) are taking the lead.

The other most popular Internet activity, spam, got almost no hits.

Wednesday, October 14, 2009

Efficient go hypothesis

Efficient market hypothesis states that you cannot really beat the market by much, all the prices are about right, and everybody behaves it more or less rational way. Yes, treating it too seriously leads to ketchup economics thinking, but it's often a good starting point - if you postulate that some prices are seriously wrong, you better come up with a story why they're wrong.

But let's forget about economics for a moment. Something very similar to efficient-market hypothesis applies to the game of go. Unlike most other games like chess and football, go is extremely balanced.

Black player, who moves first, pays 6.5 points for the privilege, what makes both players' chances of winning as close to 50:50 as it gets.
If one player is stronger, the weaker player gets as many free stones as necessary to get the chance of winning very close to 50:50.

What's interesting is that during the game, it's impossible to make a move that highly increases your advantage, or "beats the market" - if you could get significantly ahead by playing a stone, then the position was already unbalanced before you've done so due to this possibility. Values of positions consist of complex balance of territory, influence, flexibility, good and bad aji, and many other things - if you ask a computer program like gnugo to estimate balance of the board, you will get widely varying numbers with each moves, each totally ridiculous.

Just like on the market we cannot give correct formula for a price, but we know it's right because people trade at it; in go we don't have a formula for value of a position, but we know it's even if good moves by two players led there. If you postulate imbalance, you better come up with a story telling which move was bad and why.

Go players often perform some complex trades of moves called joseki - standard sequences that lead to more or less even results. Again, why even? Because neither you nor your opponent would purposefully play bad moves to fall behind, and if there are no good moves available it means one of you was already losing earlier.

Joseki example - every single move here and the sequence are about even

This requires significant shift in thinking compared to games where you act to get as far ahead of your opponent. You can play go like an index fund - just try to stay even with each move, and let your opponent make some mistakes and fall behind you.

That doesn't mean there's no room for creativity - sometimes there's only one good move to make, most of the time you have multiple good options with subtly different balances. These tiny variations are enough to win - there is no possibility of draw in go (not counting super-ko and such exceptional situations), so even if you get just 1 point ahead through the entire game, 0.01 point advantage per move, that's just enough. It's kind of like high frequency trades, which earn tiny profits, but with sufficient volume it's worth it.

Most professional games end with just a few points advantage for one side or the other. Or one of the players falls behind just a bit, and then risks it all on a great all-board fight to avoid the few point loss.

So as you can see, go is just like market. Except it never leads to crises costing trillions of dollars.

Sunday, October 04, 2009

Practical O(n) problem impossible to solve even for n=1

In computational complexity a fairly popular claim seen in most textbooks equates class P with practically feasible algorithms, and everything not known to be P (but usually within NP, "not known to be P" is an awkward thing we need to say until P!=NP is finally proven, even though we all know it to be true already) with not practically feasible algorithms. While it's often noted that it might be untrue for some cases, with some P algorithms being impractical, and some not known to be P algorithms to be practical, such disclaimers usually consider these cases to be contrived and not worth any further consideration.

Here's a really simple example of a genuinely useful problem in P (linear even) that's not practical at all.

Utterly impractical O(n) problem

Imagine problem of falsifying digital signature for a message in some reasonable digital signature scheme. Inputs are public key of constant length, and message of length n. The algorithm is as follows:

Extract private key from the public key by brute force, or some other algorithm (constant time)
Sign the message (linear time in n required just to read the message, and no more than linear time necessary)

So we have an O(n) algorithm, and because we need to read entire input to sign the message, no faster algorithm is possible. Therefore the problem is Θ(n).

Yet, unless someone breaks the public key system and provides vastly faster method of extracting private key based on the public key, we won't be able to solve it even for n=1.

This problem is of great practical importance - the entire online banking system would fail, and billions of dollars of fraud would happen if someone could practically solve it. So it's not fair to call utterly unfeasible P problems contrieved.