This post is a blogified version of a lightning talk I gave at BarCamp London 5. It was inspired by Chris Ball's Favourite Unicode Codepoints post. It's going to be in a weird talk/blogpost hybrid form that I hope my readers will excuse.
First, I want to say that this talk is not going to convey any useful information whatsoever. You won't learn anything about internationalization, or anything else from it. I'm doing it just because it's going to be fun and awesome.
First up, the famous mirror trick, where text appears upside down or mirrored left to right. None of these are dedicated Unicode characters like "mirrored e" or "upside down a". They're just a bunch of characters that happen to look like that - for example "upside down p" (as in pet) is obviously "d" (as in dog). If there's no good Latin letter, a letter from another script is used, like Cyrillic or the IPA phonetic alphabet. It will be more or less noticeable depending on your font.
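For the curious, the trick boils down to a lookup table plus reversing the text. Here is a minimal Python sketch of it. The `FLIPPED` table and the `upside_down` function are my own hypothetical names, and the table covers only a handful of letters, so treat it as an illustration rather than a full implementation.

```python
# Hypothetical, deliberately tiny lookalike table; a real one would cover the whole alphabet.
FLIPPED = {
    "a": "\u0250",  # borrowed from IPA
    "b": "q",
    "d": "p",
    "e": "\u01dd",  # borrowed from extended Latin
    "n": "u",
    "p": "d",
    "q": "b",
    "t": "\u0287",  # borrowed from IPA
    "u": "n",
}


def upside_down(text):
    # Turning text by 180 degrees also reverses the reading order,
    # so walk the string backwards and swap each letter for its lookalike.
    # Letters missing from the table are passed through unchanged.
    return "".join(FLIPPED.get(ch, ch) for ch in reversed(text))


print(upside_down("pet"))
```

How convincing the result looks still depends on the font, exactly as with the hand-made version.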
Here's a real Unicode character - Skull and Crossbones, arrr! It's used as a danger signal, so it's arguably common enough for inclusion in Unicode.
This one I totally don't get. It's just a random icon that somehow got into Unicode. Unicode is huge, so they have very low standards for inclusion. Maybe it was in Microsoft Wingdings or something like that and they thought that was a good enough reason to include it.
I half-get this one. The top three lines are the Japanese postal mark. Where the rest of the face comes from and how it got into Unicode is a mystery to me. It was probably included in some JIS standard as a joke, and Unicode copied it, or something along these lines.
Operators from the APL programming language got into Unicode too. APL is like the Perl of the 1960s. This operator doesn't feel too good because it has to program in APL.
It's called Arabic ligature Uighur Kirghiz yeh with hamza above with alef maksura isolated form, and it's exactly what it says it is. It looks rather ordinary for this list, but it might be the character with the longest name.
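If you want to check the "longest name" guess yourself, here is a rough Python sketch using the standard `unicodedata` module. It looks the character up by the name quoted above and then scans every code point for the longest name. The answer depends on which Unicode version your Python ships with, so I'm not promising it will agree with me.

```python
import sys
import unicodedata

# Name copied from the paragraph above, spelled the way the Unicode database has it.
NAME = "ARABIC LIGATURE UIGHUR KIRGHIZ YEH WITH HAMZA ABOVE WITH ALEF MAKSURA ISOLATED FORM"

# Find the character that carries this name and report its code point and name length.
ch = unicodedata.lookup(NAME)
print(f"U+{ord(ch):04X}", len(NAME))

# Scan every code point for the longest name in this interpreter's Unicode data.
# unicodedata.name(c, "") returns "" for unnamed code points instead of raising.
longest = max(
    (unicodedata.name(chr(cp), "") for cp in range(sys.maxunicode + 1)),
    key=len,
)
print(unicodedata.unidata_version, len(longest), longest)
```

Newer Unicode versions keep adding characters, so a recent Python may well crown a different winner.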
Another Arabic one. Most ligatures are for just 2 or 3 characters, but the compatibility decomposition of this one is a whopping 18 characters. It means something like "May Allah bless him and grant him peace" and is used when Prophet Muhammad is mentioned. By the way I had a really funny picture of Muhammad that I wanted to put here, but I somehow cannot find it.
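The 18-character figure can be checked with Python's standard `unicodedata` module. This sketch assumes the ligature in question is U+FDFA, which is my own identification based on the meaning given above. The expansion shows up under NFKD, the compatibility form. NFD, the canonical form, leaves the character alone.

```python
import unicodedata

# Assumption: the ligature described above is U+FDFA.
ch = "\ufdfa"
print(unicodedata.name(ch))

# Canonical decomposition (NFD) versus compatibility decomposition (NFKD).
nfd = unicodedata.normalize("NFD", ch)
nfkd = unicodedata.normalize("NFKD", ch)
print(len(nfd), len(nfkd))

# List the individual code points the ligature expands to.
for c in nfkd:
    print(f"U+{ord(c):04X}", unicodedata.name(c))
```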
How many loops are there?
This letter is very spidery, so you'd better be careful or it will bite you.
Sometimes it's not enough to be greater than, or even much greater than something else. Oh no, you need to be very much greater than. I think TeX is spoiling mathematicians and they come up with way too many symbols, and then we have to support them.
The polar opposite of the previous character. It's not greater than, nor is it less than. We kinda have a symbol for that already - U+003D EQUALS SIGN. OK, I know it's about partial orders, and it means that two objects cannot be compared, but it's not any less funny for knowing that.
This is a very sad symbol. Not only is its heart heavy, it's also black. Is it a waste of a codepoint or what? It's just a random icon, not a meaningful "character".
That's my personal favorite for the "worst waste of a codepoint" award. Not only is "Floral Heart Bullet" not a character, they even included a reversed rotated version of it in Unicode. It's an icon, not a character.
We really need a punctuation mark that says "WTF". This entire list is one big interrobang use case, am I right?
The last one is not a character, but the entire Tibetan script. It looks absolutely beautiful.
If you have any questions related to this talk/blogpost, just put them in the comments.