Romanization of Arabic
Because Arabic has a number of phonemes with no equivalent in English or other European languages, several different transliteration methods have been devised to represent certain Arabic characters. The methods differ because they pursue conflicting goals.
Problems
Any transliteration system for Arabic has to make a number of decisions that depend on its intended field of application. The root of the problem is that unvocalized Arabic writing does not contain enough information for a reader unfamiliar with the language to pronounce it accurately. An exact equivalent of, e.g., صدام حسين would be ṣdʾm ḥsyn, which is meaningless to an untrained reader. A "full transliteration" adds information that is not in the text and has to be supplied by a speaker of Arabic: ṣaddām ḥusayn. Newspapers and popular books usually use not a transliteration but a transcription: instead of transliterating each written letter, they try to reproduce the sound of the words according to the orthographic rules of the target language, e.g. Saddam Hussein. For spelling differences that depend on the target language, compare English Omar Khayyam with German Omar Chajjam, both for عمر خيام (unvocalized ʿmr ḫyʾm, vocalized ʿumar ḫayyām).
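The letter-by-letter rendering described above can be sketched in a few lines of Python. This is only an illustration: the table `LETTERS` and the function `transliterate_skeleton` are hypothetical names, the table covers just the letters of the two examples, and the Latin symbols are the ones used in the paragraph above rather than those of any single standard.

```python
# Hypothetical letter table: only the letters needed for the two examples,
# with the Latin symbols used in the paragraph above.
LETTERS = {
    "ص": "ṣ", "د": "d", "ا": "ʾ", "م": "m",
    "ح": "ḥ", "س": "s", "ي": "y", "ن": "n",
    "ع": "ʿ", "ر": "r", "خ": "ḫ",
}


def transliterate_skeleton(text: str) -> str:
    """Map each written letter to one Latin symbol and add nothing else."""
    # Characters without an entry (such as the space) pass through unchanged.
    return "".join(LETTERS.get(ch, ch) for ch in text)


print(transliterate_skeleton("صدام حسين"))  # ṣdʾm ḥsyn
print(transliterate_skeleton("عمر خيام"))   # ʿmr ḫyʾm
```

The output contains no short vowels and no doubled consonants, because neither is written in the unvocalized source; supplying them is exactly the extra information that a "full transliteration" or a transcription has to add.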
Most issues in romanization concern the choice between transliterating and transcribing; others concern what should be romanized:
- Transliteration ignores assimilation (sandhi) of the article before "solar letters": al-shams, not the transcribed ash-shams / aš-Šams / asch-Schams (German) / asj-Sjams (Dutch) / ach-chams (French).
- A transliteration must render the "tied tā" (tāʾ marbūṭa, ة) faithfully with a symbol of its own, whereas a transcription renders only the sound: "a" like any other "a", or "at" like any other "at" (in a vocalized text, where the preceding vowel is already written, nothing vs. "t").
- "Broken alif" (alif maqṣūra, ى) must be transliterated with a special symbol, but is transcribed like an ordinary alif when it stands for a long a (ā).
- Nunation follows the same rule as everything else: transliteration renders what is written, transcription what is heard.
A transcription may reflect the language as spoken by the people of Baghdad, or the formal standard language as spoken by a preacher in the mosque or a TV news reader. A transcription is free to add phonological information (such as vowels) or morphological information (such as word boundaries). A transliteration is ideally fully reversible: a machine must be able to convert it into Arabic script and back.
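The reversibility requirement can be illustrated with the single-character SATTS symbols listed in the comparison table in this article. The following is a minimal sketch, not an implementation of the SATTS specification: the names `PAIRS`, `to_satts` and `from_satts` are hypothetical, the multi-character entries (AEA, LA, AL) are left out, and the input is assumed to contain only Arabic letters and spaces.

```python
# (Arabic letter, SATTS symbol) pairs copied from the comparison table;
# multi-character entries are deliberately omitted in this sketch.
PAIRS = [
    ("ء", "E"), ("ا", "A"), ("ب", "B"), ("ت", "T"), ("ث", "C"),
    ("ج", "J"), ("ح", "H"), ("خ", "O"), ("د", "D"), ("ذ", "Z"),
    ("ر", "R"), ("ز", ";"), ("س", "S"), ("ش", ":"), ("ص", "X"),
    ("ض", "V"), ("ط", "U"), ("ظ", "Y"), ("ع", "`"), ("غ", "G"),
    ("ف", "F"), ("ق", "Q"), ("ك", "K"), ("ل", "L"), ("م", "M"),
    ("ن", "N"), ("ه", "~"), ("و", "W"), ("ي", "I"), ("ة", "@"),
    ("ى", "/"),
]

TO_LATIN = dict(PAIRS)
# Inverting the table is only safe because every symbol occurs exactly once.
TO_ARABIC = {latin: arabic for arabic, latin in PAIRS}


def to_satts(text: str) -> str:
    """Replace each Arabic letter by its symbol; other characters pass through."""
    return "".join(TO_LATIN.get(ch, ch) for ch in text)


def from_satts(text: str) -> str:
    """Invert to_satts symbol by symbol."""
    return "".join(TO_ARABIC.get(ch, ch) for ch in text)


original = "صدام حسين"
encoded = to_satts(original)
print(encoded)                          # XDAM HSIN
print(from_satts(encoded) == original)  # True
```

The round trip works because the mapping is one-to-one in both directions. A scheme in which two letters share a symbol, or in which one letter is written with two symbols that can also occur separately, loses this property.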
A transliteration may be criticized as flawed for any of the following reasons:
- A "loose" transliteration is ambiguous: it may render several Arabic phonemes identically, or use digraphs for a single phoneme (such as sh) that can be confused with two adjacent phonemes;
- Symbols representing phonemes may be considered too similar (e.g., ` and ' or ʿ and ʾ for ayin and hamza);
- ASCII transliterations using capital letters to disambiguate phonemes are easy to type but may be considered unaesthetic.
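The digraph problem in the first point can be made concrete with a small sketch. Both tables below are hypothetical and cover three letters only; `LOOSE` writes šīn with the digraph sh, while `STRICT` uses the single symbol š, as several columns of the comparison table do.

```python
# Two hypothetical three-letter tables: one with a digraph, one without.
LOOSE = {"س": "s", "ه": "h", "ش": "sh"}
STRICT = {"س": "s", "ه": "h", "ش": "š"}


def romanize(text: str, table: dict) -> str:
    """Replace each Arabic letter by its entry in the given table."""
    return "".join(table.get(ch, ch) for ch in text)


two_letters = "سه"  # sīn followed by hāʼ
one_letter = "ش"    # šīn

# The loose table maps both inputs to the same output, so the result
# cannot be converted back to Arabic script without guessing.
print(romanize(two_letters, LOOSE) == romanize(one_letter, LOOSE))    # True

# The strict table keeps the two inputs apart.
print(romanize(two_letters, STRICT) == romanize(one_letter, STRICT))  # False
```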
A further problem is that a transliteration which represents the letters exactly may easily be misread by readers who do not know Arabic, in particular where the definite article is involved (written "al" in Arabic, but not necessarily pronounced as such). For instance, an-nur (or an-nuur, or an-noor) would be more correctly transliterated along the lines of alnnur. The hyphen is added and the unpronounced 'l' removed for the convenience of the uninformed reader, who would otherwise pronounce an 'l', probably fail to recognize the word as nur, pronounce only one 'n', and be confused by the role of the double 'n'. Alternatively, if the shadda is not transliterated (since it is strictly not a letter), a hypercorrect transliteration would be alnur, which presents similar problems for the same reader.
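The two treatments of the definite article can be compared in a short sketch. It assumes a non-empty noun stem that is already romanized with one Latin symbol per consonant (as in DIN 31635), so digraph spellings such as "sh" are not handled; the names `SUN_LETTERS` and `with_article` are hypothetical.

```python
import unicodedata

# The fourteen "solar" consonants, written with one symbol each.
# NFC normalization makes sure every letter is a single code point.
SUN_LETTERS = set(unicodedata.normalize("NFC", "tṯdḏrzsšṣḍṭẓln"))


def with_article(stem: str, assimilate: bool) -> str:
    """Prefix the article, optionally assimilating it to a solar letter."""
    stem = unicodedata.normalize("NFC", stem)
    first = stem[0]
    if assimilate and first in SUN_LETTERS:
        # Transcription: the l of the article is replaced by the first consonant.
        return f"a{first}-{stem}"
    # Transliteration: the article is rendered as it is written.
    return f"al-{stem}"


for stem in ("nur", "qamar"):
    print(with_article(stem, assimilate=False), with_article(stem, assimilate=True))
# al-nur an-nur
# al-qamar al-qamar
```

Only the assimilating form depends on the first consonant of the noun; the transliterated form is always al- followed by the stem, which is why it can be converted back mechanically.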
Finally, these difficulties can consume effort out of proportion to their practical importance. A reader who knows Arabic will normally be able to reconstruct the original however it has been transliterated or transcribed, while a reader who does not know Arabic will not normally understand any of the systems anyhow, and will simply be confused.
Transliteration standards
- Deutsche Morgenländische Gesellschaft (1936): Adopted by the International Convention of Orientalist Scholars in Rome. It is the basis for the very influential Hans Wehr dictionary (ISBN 0-87950-003-4). [1]
- ISO/R 233 (1961): Replaced by ISO 233 in 1984 but still encountered.
- BS 4280 (1968): Developed by the British Standards Institution. [2]
- SATTS (1970s): One-to-one mapping to Latin Morse equivalents; used by the US military.
- UNGEGN (1972): United Nations Group of Experts on Geographical Names. [3]
- DIN 31635 (1982): Developed by the Deutsches Institut für Normung (German Institute for Standardization).
- ISO 233 (1984).
- Qalam (1985): A system that focuses upon preserving the spelling, rather than the pronunciation, and uses mixed case. [4]
- ISO 233-2 (1993): Simplified transliteration.
- Buckwalter Transliteration (1990s): Developed at Xerox by Tim Buckwalter [5]; does not require unusual diacritics. [6]
- ALA-LC (1997). [7]
- SAS: Spanish Arabists School (José Antonio Conde and others, early 19th century onwards). [8]
A table comparing romanizations using DIN 31635, ISO 233, ISO/R 233, UN, ALA-LC, and Encyclopaedia of Islam systems is available here: [9].
Comparison table
| Letter | Name | SATTS | UNGEGN | ALA-LC | DIN-31635 | ISO 233 | ISO/R 233 | Qalam | SAS | SM | IPA |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ﺀ | hamza | E | ʼ, — | —, ’ | ʾ | ˈ, ˌ | —, ’ | ' | ʾ (zero word-initially) | ' (disappears after 'al-' and where alif waṣl is. | [ʔ] |
| ﺍ | ʼalif | A | ā | ā | ā | ʾ | ā | aa | a, i, u (syllable-initial); ā (lengthening) | aa | various, including [aː] |
| ﺏ | bāʼ | B | b | b | b | b | b | b | b | b | [b] |
| ﺕ | tāʼ | T | t | t | t | t | t | t | t | t | [t] |
| ﺙ | ṯāʼ | C | th | th | ṯ | ṯ | ṯ | th | ṯ | ç | [θ] |
| ﺝ | ǧīm, jīm, gīm | J | j | j | ǧ | ǧ | ǧ | j | ŷ | j | [ʤ] / [ʒ] / [ɡ] / [j] |
| ﺡ | ḥāʼ | H | ḩ | ḥ | ḥ | ḥ | ḥ | H | ḥ | ḥ | [ħ] |
| ﺥ | ḫāʼ | O | kh | kh | ḫ | ẖ | ẖ | kh | j | x | [x] |
| ﺩ | dāl | D | d | d | d | d | d | d | d | d | [d] |
| ﺫ | ḏāl | Z | dh | dh | ḏ | ḏ | ḏ | dh | ḏ | đ | [ð] |
| ﺭ | rāʼ | R | r | r | r | r | r | r | r | r | [r] |
| ﺯ | zāy | ; | z | z | z | z | z | z | z | z | [z] |
| ﺱ | sīn | S | s | s | s | s | s | s | s | s | [s] |
| ﺵ | šīn | : | sh | sh | š | š | š | sh | š | š | [ʃ] |
| ﺹ | ṣād | X | ş | ṣ | ṣ | ṣ | ṣ | S | ṣ | ṣ | [sˁ] |
| ﺽ | ḍād | V | ḑ | ḍ | ḍ | ḍ | ḍ | D | ḍ | ḍ | [dˁ] |
| ﻁ | ṭāʼ | U | ţ | ṭ | ṭ | ṭ | ṭ | T | ṭ | ṭ | [tˁ] |
| ﻅ | ẓāʼ | Y | z̧ | ẓ | ẓ | ẓ | ẓ | Z | ẓ | đ̣ | [ðˁ] / [zˁ] |
| ﻉ | ʻayn | ` | ʻ | ʻ | ʿ | ʿ | ʿ | ` | ʿ | ř | [ʕ] / [ʔˁ] |
| ﻍ | ġayn | G | gh | gh | ġ | ġ | ḡ | gh | g | ğ | [ɣ] / [ʁ] |
| ﻑ | fāʼ | F | f | f | f | f | f | f | f | f | [f] |
| ﻕ | qāf | Q | q | q | q | q | q | q | q | q | [q] |
| ﻙ | kāf | K | k | k | k | k | k | k | k | k | [k] |
| ﻝ | lām | L | l | l | l | l | l | l | l | l | [l], [lˁ] (in Allah only) |
| ﻡ | mīm | M | m | m | m | m | m | m | m | m | [m] |
| ﻥ | nūn | N | n | n | n | n | n | n | n | n | [n] |
| ﻩ | hāʼ | ~ | h | h | h | h | h | h | h | h | [h] |
| ﻭ | wāw | W | w | w | w | w | w | w | w (consonantal); ū (lengthening) | w (consonantal); o (lengthening) | [w], [uː] |
| ﻱ | yāʼ | I | y | y | y | y | y | y | y (consonantal); ī (lengthening) | y (consonantal); e (lengthening) | [j], [iː] |
| ﺁ | ʼalif mamdūda | AEA | ā | ā, ʼā | ʾā | ʾâ | ā, ʼā | ā | 'aa | [ʔaː] | |
| ﺓ | tāʼ marbūṭa | @ | h, t | h, t | h, t | ẗ | h, t | h, t | t (zero when in absolute state) | ŧ | [a], [at] |
| ﻯ | ʼalif maqṣūra | / | y | y | ā | ỳ | ae | à | à | [aː] | |
| ﻻ | lām ʼalif | LA | lā | lā | lā | laʾ | lā | la | lʾ (with hamza); lā (with lengthening alif) | usually treated as laam then alif: laa | [laː] |
| ال | ʼalif lām | AL | al- | al- | al- | ʾˈal | al- | al | al- | al-; ál- when assimilation occurs | |
Online
- Main article: Arabic Chat Alphabet
Online communication is often restricted to an ASCII environment in which neither the Arabic letters themselves nor Roman characters with diacritics are available. This problem is shared by most speakers of languages that use non-Roman alphabets, or heavily modified ones. An ad hoc solution is to use Arabic numerals that mirror or resemble the relevant Arabic letters.
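A sketch of how such digits might be expanded into a romanization with diacritics is given below. The digit assignments are commonly cited conventions, but usage varies between regions and communities, so the table `CHAT_DIGITS` is an assumption rather than a standard; the function name `expand_digits` is likewise hypothetical.

```python
# Assumed digit conventions: digit -> (Arabic letter it resembles, Latin symbol).
# Real-world usage varies; this table is illustrative only.
CHAT_DIGITS = {
    "2": ("ء", "ʾ"),
    "3": ("ع", "ʿ"),
    "5": ("خ", "ḫ"),
    "6": ("ط", "ṭ"),
    "7": ("ح", "ḥ"),
    "9": ("ص", "ṣ"),
}


def expand_digits(text: str) -> str:
    """Replace each chat digit by its Latin symbol; leave other characters alone."""
    return "".join(CHAT_DIGITS[ch][1] if ch in CHAT_DIGITS else ch for ch in text)


print(expand_digits("3arabi"))   # ʿarabi
print(expand_digits("mar7aba"))  # marḥaba
```

The sketch treats every listed digit as a letter, so it would also rewrite digits that are meant as actual numbers; a practical tool would need some way of telling the two apart.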
External links
- Online en > ar transliteration tool
- SATTS: Roman-to-Arabic mappings
- Omniglot: Arabic alphabet, pronunciation and language
- J'raxis·Com: The Arabic Script
- Table comparing Romanization systems
- Learn the Arabic Script Online

