Unicode typefaces
From Wikipedia, the free encyclopedia
Unicode typefaces (also known as UCS fonts and Unicode fonts) contain a wide range of characters, letters, digits, glyphs, symbols, ideograms, logograms, etc., drawn from many languages and scripts around the world and collectively mapped into the Universal Character Set (UCS), the international standard ISO/IEC 10646. A single font can thus include a vast range of characters from different languages.
The Unicode (ISO/IEC 10646 UCS) standard does not encode fonts (collections of graphical shapes called glyphs). Instead, it assigns each abstract character a specific code point and defines how characters interact in context, for example through combining characters. It also defines precomposed versions of most letter/diacritic combinations in normal use, which simplifies conversion to and from legacy (locale-specific) encodings and allows applications to use Unicode as an internal text format without having to implement combining characters. Different encodings, which use different numbers of bytes, can represent the same Unicode code point; a code point identifies a character, not a glyph.
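The relationship between precomposed characters, combining sequences, and encodings can be illustrated with a short sketch. It uses Python's standard `unicodedata` module; the choice of "é" as the sample character and the helper name `code_points` are assumptions made for illustration only.

```python
import unicodedata

# U+00E9 is the precomposed character "é"; "e" followed by U+0301
# (COMBINING ACUTE ACCENT) is the equivalent combining sequence.
precomposed = "\u00e9"
combining = "e\u0301"


def code_points(text):
    """Return the code points of text in U+XXXX notation (illustrative helper)."""
    return [f"U+{ord(ch):04X}" for ch in text]


# NFC composes the combining sequence; NFD decomposes the precomposed character.
composed = unicodedata.normalize("NFC", combining)
decomposed = unicodedata.normalize("NFD", precomposed)

# The same code point is serialized with a different number of bytes
# by each encoding.
encoded = {
    name: precomposed.encode(name)
    for name in ("utf-8", "utf-16-be", "utf-32-be")
}

print(code_points(precomposed), code_points(combining))
print(composed == precomposed, decomposed == combining)
for name, data in encoded.items():
    print(name, len(data), "bytes:", data.hex())
```

An application that normalizes its input to NFC can often compare and store such text without handling combining characters itself, which is the simplification the precomposed forms are meant to allow.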
Many fonts have kerning pairs, which implement better spacing between letters. Many scripts have special orthographic rules that require certain combinations of letterforms (alternative shapes of the same letter) to be combined into special ligature forms (mixed characters). These rules are vast and complex, and correct rendering requires script-shaping technologies (also known as rendering technologies or smart-font engines) to tell the operating system and user agent how to output the different characters and character parts that form ligatures. These complex instructions are embedded inside fonts. The user's operating system uses one or more rendering engines to translate Unicode strings into displayable glyphs.
Computer fonts use various techniques to display characters or glyphs. A bitmap font contains a grid of dots, known as pixels, forming an image of each glyph in each face and size. Outline fonts (also known as vector fonts) use drawing instructions or mathematical formulæ to describe each glyph. Stroke fonts use a series of specified lines (for the glyph's border) together with additional information that defines the profile, or size and shape, of the line in a specific face and size; together these describe the appearance of the glyph. For more information, see Computer font.
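As a rough illustration of the bitmap approach described above, the sketch below stores one glyph as a grid of on/off pixels and prints it. The 8x8 shape for "A" is invented for this example and is not taken from any real font; `GLYPH_A` and `render` are hypothetical names.

```python
# Hypothetical 8x8 bitmap for the letter "A". Each integer is one row of
# pixels, with the most significant bit as the leftmost pixel.
GLYPH_A = [
    0b00011000,
    0b00100100,
    0b01000010,
    0b01000010,
    0b01111110,
    0b01000010,
    0b01000010,
    0b00000000,
]


def render(rows, width=8, on="#", off="."):
    """Turn rows of pixel bits into text, one line per row."""
    lines = []
    for row in rows:
        bits = format(row, f"0{width}b")  # zero-padded binary string
        lines.append("".join(on if bit == "1" else off for bit in bits))
    return "\n".join(lines)


print(render(GLYPH_A))
```

Because the grid is fixed, a bitmap font needs a separate image for each size, whereas an outline font can scale one description of the glyph.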
As of July 2006, no Unicode font includes all the characters defined in the current revision of the ISO/IEC 10646 standard. Many fonts are continually updated to incorporate characters that were previously omitted or that were added in a newer version of the standard. Fonts may also be updated to correct errors in past versions.
The UCS has over 1.1 million code points (1,114,112, arranged in 17 planes of 65,536 each), but only the first 65,536 (Plane 0, the Basic Multilingual Plane or BMP) had entered common use before 2000. See the Mapping of Unicode characters article for more information on the other planes (Plane 1: SMP, Plane 2: SIP, Plane 14: SSP, Planes 15 and 16: reserved for PUA) and the character blocks they contain for the scripts of different languages.
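The plane arithmetic can be sketched in a few lines: a code point's plane is its value divided by 65,536, discarding the remainder. The function names and the "other" label for planes not mentioned in the text are assumptions for illustration.

```python
PLANE_SIZE = 0x10000   # 65,536 code points per plane
PLANE_COUNT = 17       # planes 0 through 16

# Abbreviations for the planes named in the text above.
PLANE_NAMES = {0: "BMP", 1: "SMP", 2: "SIP", 14: "SSP", 15: "PUA", 16: "PUA"}


def plane_of(code_point):
    """Return the plane number (0-16) that contains code_point."""
    if not 0 <= code_point < PLANE_SIZE * PLANE_COUNT:
        raise ValueError(f"not a UCS code point: {code_point:#x}")
    return code_point // PLANE_SIZE


def plane_name(code_point):
    """Return the plane abbreviation, or "other" for planes not listed above."""
    return PLANE_NAMES.get(plane_of(code_point), "other")


print(PLANE_SIZE * PLANE_COUNT)                      # total number of code points
print(plane_of(0x9AA8), plane_name(0x9AA8))          # a CJK ideograph in the BMP
print(plane_of(0x1D11E), plane_name(0x1D11E))        # a musical symbol in the SMP
print(plane_of(0x10FFFF), plane_name(0x10FFFF))      # the last code point
```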
The first Unicode font (with a very large character set, supporting many Unicode blocks) was Lucida Sans Unicode, developed by Charles Bigelow and Kris Holmes in March 1993 and shipped with Windows NT 3.1. The second was the Unihan font, developed by Ross Paterson in 1993. The third was the Everson Mono Unicode font, developed by Michael Everson and released in 1995.
Issues
Unicode contains typographical ambiguities: some of the unified Chinese characters are drawn differently in different regions. For example, code point U+9AA8 (骨) is typographically different in simplified Chinese and traditional Chinese. This has implications for the idea that a single typeface can satisfy the needs of all locales.<ref>Ken Lunde, CJKV Information Processing, O'Reilly Inc, 1999. Page 128, "CJKV character form differences"</ref>
Application of Unicode typefaces
Despite these issues, Unicode is now the base character set for many new standards and protocols, and is built into the architecture of operating systems (Microsoft Windows, Apple Mac OS X, and many versions of Unix), programming languages (Perl, Python, Java, Common Lisp, APL), libraries (IBM International Components for Unicode (ICU), along with the Pango, Graphite, Scribe, Uniscribe, and ATSUI rendering engines), font formats (TrueType and OpenType), and so on. Many other standards are also being upgraded to Unicode compliance.
Utility software
Utility software can be used to see exactly which characters a font file includes. Examples are the Character Map applet included with Windows 2000/XP, MainType (by HighLogic; commercial, with a 40-day trial version), BabelMap (by Andrew West; free, donation-ware), Unicode Font Viewer (by Mike Lischke; freeware), and Quick Key (by Nathanael Jones; free, open source).
List of Unicode fonts
Only a few of the many Unicode fonts are listed below: those most commonly used by the mainstream majority of users around the world on the major platforms. A longer list can be found in the "Unicode fonts" section of the List of typefaces article. The Free software Unicode typefaces article gives more detail on free typefaces.
| Font | Characters | Glyphs | Kerning pairs | Version | Font family | Font style | Font type | Serif style | Other info |
|---|---|---|---|---|---|---|---|---|---|
| Arial | 1,419 | 1,674 | 909 | 3.00 | Arial | Regular | OTF+TTO | Normal Sans | Comes with Microsoft Windows. |
| Arial Unicode MS | 38,917 | 50,377 | 0 | 1.00 | Arial Unicode MS | Regular | OTF+TTO | Normal Sans | Comes with Microsoft Office. |
| Bitstream Cyberbit | 32,910 | 29,934 | 935 | 2.0 beta | Bitstream Cyberbit | Roman | TT | Cove | Freeware, for non-commercial use only. |
| Cardo | 2,879 | 2,882 | 216 | 0.098 (2004) | Cardo | Regular | TT | Cove | Freeware, for non-commercial and non-profit uses only. |
| Caslon Roman | 3,684 | 3,686 | 0 | 001.000 16-12-2001 | Caslon | Roman | TT | - | BSD-like license. |
| Code2000 | 51,239 | 61,864 | 115 | 1.16 | Code2000 | Regular | TT | Any | Shareware. A reduced version, Code2001, is available as freeware. |
| Charis SIL | 1,958 | 3,084 | 0 | 4.002 | Charis SIL | Regular | TT | Any | OFL |
| Chryſanþi Unicode (Chrysanthi Unicode) | 4,818 | 4,383 | 0 | 3.1 | Chrysanthi Unicode | Regular | TT | Cove | Freeware. |
| ClearlyU | - | 9,538 | 0 | 1.9 | - | - | - | - | Freeware. |
| DejaVu fonts (DejaVu Sans) | 3,525 | 3,611 | 2,558 | 2.8 | DejaVu Sans | Book | TT | Normal Sans | Freeware. |
| Doulos SIL | 1,958 | 3,083 | 0 | 4.014 | Doulos SIL | Regular | TT | Any | OFL |
| Everson Mono (Everson Mono Unicode) | 4,893 | 4,899 | 0 | 4.1.3<ref>Version info of Everson Mono Unicode 3.2b4 font is "Macromedia Fontographer 4.1.3 2003-02-13".</ref> | Everson Mono Unicode | Regular | TT | Any | Monospaced. Shareware. |
| FreeSerif | 3,914 | 5,257 | 0 | 1.52 | FreeSerif | Medium | TT | Cove | GPL. Sans serif (FreeSans) and monospaced (FreeMono) variants. |
| Gentium (Gentium Regular) | 1,469 | 1,699 | 2,857 | 1.0.2 (2005) | Gentium | Regular | TT | Any | OFL |
| GNU Unifont | 33,580 | 33,583 | 0 | 001.000 | unifont | Medium | Bitmap | Any | GPL |
| Junicode | 2,235 | 2,256 | 0 | 0.6.12 | Junicode | Regular | TT | Any | GPL |
| Linux Libertine (Regular TT) | 1,982 | 1,985 | 0 | 2.2.0 | Linux Libertine | Regular | OTF+TT | Any | GPL, OFL |
| Lucida Grande | 2,245 | 2,826 | 0 | 5.0d8e1 (Revision 1.002) | Lucida Grande | Regular | - | Normal Sans | Comes with Mac OS X. Any proportion. |
| Lucida Sans Unicode | 1,765 | 1,776 | 0 | 2.00 | Lucida Sans Unicode | Regular | OTF+TTO | Normal Sans | Comes with Microsoft Windows. |
| Microsoft Sans Serif | 2,301 | 2,257 | 0 | 1.41 | Microsoft Sans Serif | Regular | OTF+TTO | Normal Sans | Comes with Microsoft Windows. |
| New Gulim | 46,567 | 49,284 | 0 | 3.10 | New Gulim | Regular | TT | Obtuse Cove | Came with MS Office 2000. Any Proportion. |
| Tahoma | 1,912 | 2,034 | 674 | 3.14 | Tahoma | Regular | OTF+TTO | Normal Sans | Comes with Microsoft Windows. |
| Times New Roman | 1,419 | 1,674 | 867 | 3.00 | Times New Roman | Regular | OTF+TTO | Cove | Comes with Microsoft Windows. |
| TITUS Cyberbit Basic | 9,341 | 10,044 | 0 | 3.0 (2000) (Revision 4.00) | TITUS Cyberbit Basic | Regular | TT | Cove | Freeware. |
| Y.OzFontN | 21,360 | 59,678 | 0 | 9.41 | Y.OzFontN | Regular | TT | Any | Freeware. Sans-serif (for Japanese) and Monospace (for Latin). |
- Note:
- OTF+TTO → Font type: OpenType font with TrueType outlines.
- TT → Font type: TrueType font.
Comparison of fonts
The number of characters that the font versions listed above include in each Unicode block (or range) is given below.
0000-077F
- N (a number) = The font includes this many characters in that range.
- ✓ = The font includes some or most of the characters in that range.
- X = No characters are included in the font for that range or Unicode block.
- - = Data not available now.
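A per-block count of this kind can be sketched as follows, given the set of code points a font maps. The block boundaries are those of the Unicode Standard for a few blocks in the 0000-077F range; the `sample` set describes a hypothetical font and is not data from any typeface listed above, and `coverage` is an illustrative name.

```python
# A few Unicode blocks from the 0000-077F range: (name, first, last).
BLOCKS = [
    ("Basic Latin", 0x0000, 0x007F),
    ("Latin-1 Supplement", 0x0080, 0x00FF),
    ("Greek and Coptic", 0x0370, 0x03FF),
    ("Cyrillic", 0x0400, 0x04FF),
]


def coverage(code_points, blocks=BLOCKS):
    """Count the code points in each block; "X" marks a block with none."""
    counts = {}
    for name, first, last in blocks:
        n = sum(1 for cp in code_points if first <= cp <= last)
        counts[name] = n if n else "X"
    return counts


# Hypothetical font: printable ASCII plus the Greek lowercase letters
# U+03B1 through U+03C9.
sample = set(range(0x20, 0x7F)) | set(range(0x03B1, 0x03CA))

for block, count in coverage(sample).items():
    print(f"{block}: {count}")
```

In practice the set of code points would come from the font's character map, which tools such as those listed under Utility software can display.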