Binary Ordered Compression for Unicode

From Wikipedia, the free encyclopedia

BOCU-1 is a MIME-compatible Unicode compression scheme. BOCU stands for Binary Ordered Compression for Unicode. BOCU-1 combines the wide applicability of UTF-8 with the compactness of SCSU. The encoding is useful for compressing short strings, and it preserves code point order: comparing two encoded strings byte by byte gives the same result as comparing them by code point. For larger amounts of Unicode text, general-purpose compressors such as zip and bzip2 are usually more efficient.
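The order-preservation property can be illustrated with a short sketch. Python's standard library has no BOCU-1 codec, so this example does not encode BOCU-1 at all; as an assumption for illustration, it uses UTF-8 (which also preserves code point order) as a stand-in and contrasts it with UTF-16BE (which does not). The function name and sample strings are hypothetical.

```python
def preserves_code_point_order(strings, encoding):
    """Return True if sorting by encoded bytes matches sorting by code points.

    Illustration only: UTF-8 stands in for an order-preserving encoding,
    because the standard library does not ship a BOCU-1 codec.
    """
    by_code_point = sorted(strings, key=lambda s: [ord(c) for c in s])
    by_bytes = sorted(strings, key=lambda s: s.encode(encoding))
    return by_code_point == by_bytes


# Sample code points: U+0061, U+00E9, U+4E2D, U+FF5E, U+10000.
samples = ["a", "\u00e9", "\u4e2d", "\uff5e", "\U00010000"]

# UTF-8 byte order agrees with code point order.
print(preserves_code_point_order(samples, "utf-8"))

# UTF-16BE encodes U+10000 as the surrogate pair D800 DC00, which sorts
# before the bytes FF 5E of U+FF5E, so binary order is not preserved.
print(preserves_code_point_order(samples, "utf-16-be"))
```

An encoding with this property lets software sort or index the encoded bytes directly, without decoding them first.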

SCSU was created as a Unicode compression scheme with a byte/code point ratio similar to that of language-specific code pages. Although it fulfills the criteria for an IANA charset and is registered with IANA, it has not been widely adopted. SCSU is not suitable for MIME “text” media types; for example, it cannot be used directly in email and similar protocols. It also requires a complicated encoder design for good performance.
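To make the byte/code point ratio concrete, the following sketch measures it for a short Russian string under a language-specific code page and under two Unicode encodings. It shows only the code-page baseline that a compression scheme aims to approach; it does not implement SCSU, for which Python's standard library has no codec. The function name and sample text are assumptions chosen for illustration.

```python
def bytes_per_code_point(text, encoding):
    """Return the average number of encoded bytes per code point of text."""
    # len() of a Python 3 str counts code points, not bytes.
    return len(text.encode(encoding)) / len(text)


sample = "Привет"  # six Cyrillic letters (hypothetical sample text)

for encoding in ("cp1251", "utf-8", "utf-16-be"):
    print(encoding, bytes_per_code_point(sample, encoding))
```

For this sample, the Windows-1251 code page needs one byte per code point, while UTF-8 and UTF-16BE each need two.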

SCSU has been adopted as an official Unicode Technical Standard. BOCU-1 has not been officially adopted by the Unicode Consortium, but Unicode Technical Note #6 describes the encoding in more detail.
