Wednesday, March 25, 2009

why isn't life more like Unicode?

It's 12:03pm in Berlin. The sun is shining. Cars whiz up and down Hauptstrasse outside this café. Wherever Randall Munroe may be it is now definitely Wednesday. I'm supposed to be sorting out contractual issues, which ought to mean writing an e-mail and sending it but turns out to mean writing e-mails and putting them in the Drafts Folder and wandering Etherworld in search of solace. A frame of mind in which I naturally turn to XeTeX. The webpage now has screenshots of various multilingual documents, which look awfully nice; I could use XeTeX to typeset a book.

One of the screenshots was of the Arabic version of "What is Unicode?"; I went over to the Unicode website to have a look around, and came upon the Last Resort Font:

Last Resort Font

The Last Resort font is a collection of glyphs to represent types of Unicode characters. These glyphs are designed to allow users to recognize that an encoded value is one of the following:

  • a specific type of Unicode character
  • in the Private Use Area (no private agreement exists)
  • unassigned (reserved for future assignment)
  • one of the illegal character codes.
These glyphs are used as the backup of "last resort" to any other font; if the font cannot represent any particular Unicode character, the appropriate "missing" glyph from the Last Resort font is used instead. This provides users with the ability to tell what sort of character it is, and gives them a clue as to what type of font they would need to display the characters correctly. (For more information, see The Unicode Standard, Version 5.0, Section 5.3 Unknown and Missing Characters, pages 155-156.)

Which sounds like the sort of thing we need for daily life. In its absence things go horribly wrong and no one knows why, because unrepresentable characters have no appropriate "missing" glyph.
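For the curious, the four buckets in the quoted description can be roughly approximated in Python with the standard `unicodedata` module. This is only a sketch, not how the Last Resort font itself decides: the function name `last_resort_category` is made up, and treating noncharacters and out-of-range values as the "illegal" codes is my own assumption. What counts as "unassigned" also depends on the Unicode version bundled with your Python.

```python
import unicodedata


def last_resort_category(code_point: int) -> str:
    """Roughly sort a code point into the buckets the quoted text lists.

    Hypothetical helper for illustration; the bucket names are mine.
    """
    # Outside the Unicode code space: treated as "illegal" (assumption).
    if not 0 <= code_point <= 0x10FFFF:
        return "illegal"
    # Noncharacters: U+FDD0..U+FDEF plus the last two code points of
    # every plane (U+FFFE, U+FFFF, U+1FFFE, ...). Also treated as
    # "illegal" here, which is an assumption.
    if 0xFDD0 <= code_point <= 0xFDEF or (code_point & 0xFFFE) == 0xFFFE:
        return "illegal"
    category = unicodedata.category(chr(code_point))
    if category == "Co":  # Private Use Area
        return "private use"
    if category == "Cn":  # no character assigned (yet)
        return "unassigned"
    # Anything else is a real character of some specific type.
    return f"assigned ({category})"


if __name__ == "__main__":
    for cp in (0x41, 0xE000, 0x0378, 0xFFFF, 0x110000):
        print(f"U+{cp:04X}: {last_resort_category(cp)}")
```

Daily life, alas, ships with no such lookup table.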
