Generating Combining Forms in Devanagari

Unicode has been a boon for Indic computing. Before the arrival of Unicode, Indic scripts were usually encoded through hacks that repurposed Latin and Extended Latin code points. This resulted in numerous non-standard fonts that were mutually incompatible. Every application or website followed its own encoding scheme. Thanks to Unicode, Indic scripts now have a uniform standard, though legacy fonts are still in use, albeit in a much more restricted way.

The introduction of Unicode radically changed Indic font encodings by adopting the logical order of the characters rather than the visual order. But it has its own limitations. In legacy fonts, the display of the script could be twisted and extended to suit our own needs, because the scripts were encoded as glyph pieces.
  
Even in Unicode it is possible to generate some of the glyph pieces, using Unicode control characters. 
 

As the title says, half-forms of Devanagari consonants can be generated using the Zero Width Joiner (ZWJ).

The half-forms are created with the following sequence:

<Consonant> + <Virama> + <ZWJ> 
 
<ल> + <्> + <‍> (Invisible Control Character) ->  ल्‍
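The sequence above can also be built programmatically. The following is a minimal Python sketch; the helper name `half_form` is hypothetical and only for illustration. It assumes the standard code points U+094D (Devanagari sign virama) and U+200D (zero width joiner). Whether a half-form actually appears on screen still depends on the font and the shaping engine.

```python
import unicodedata

VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA
ZWJ = "\u200D"     # ZERO WIDTH JOINER (invisible control character)

def half_form(consonant: str) -> str:
    """Return <consonant> + <virama> + <ZWJ>, the sequence that requests a half-form.

    Hypothetical helper: it only builds the code point sequence;
    the rendering itself is up to the font.
    """
    return consonant + VIRAMA + ZWJ

la_half = half_form("\u0932")  # U+0932 is the letter LA

# List the code points that make up the sequence
for ch in la_half:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

Printing the code points is a handy way to check that the invisible ZWJ is really present in a string.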
 
The ZWJ character can be found in the Character Map, if the IME doesn’t support inputting of control characters.
 
Some of the Devanagari half forms:
  
क्‍  च्‍   छ्‍‍‍‍   ज्‍   व्‍   स्‍   ह्‍ 
  
One might expect a “Repha” when ZWJ is used to generate the half-form of /ra/ र . But strangely, <ra-virama-zwj> is mapped to the Marathi Eye-lash Ra.
 
र्‍
 
Some consonants like ङ ट ठ ड do not have any half-forms. Instead, they produce stacked conjuncts, or the following consonant changes shape to combine with them while they retain their own shape.
 
Apart from the ability to write unusual combinations such as क्‍ष (<KA> + <VIRAMA> + <ZWJ> + <SSA>), ब्‍र etc., this has some practical uses as well. In some fonts like Arial Unicode MS, archaic conjuncts are formed by default. By using ZWJ, the display of the conjuncts can be controlled.
 
क्व क्‍व
ग्न ग्‍न

(If you don't have Arial Unicode MS, both forms in each pair ought to look the same.)
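In each pair above, the two strings differ only by a ZWJ placed after the virama. As a rough sketch, the second form can be derived from the first by inserting a ZWJ after every virama that is followed by a consonant. The function name `with_half_forms` is hypothetical, and the consonant test assumes only the basic range U+0915 to U+0939 (nukta consonants and other cases are ignored here).

```python
VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA
ZWJ = "\u200D"     # ZERO WIDTH JOINER

def is_basic_consonant(ch: str) -> bool:
    # Assumption: only the basic consonants KA..HA (U+0915..U+0939) are handled
    return len(ch) == 1 and "\u0915" <= ch <= "\u0939"

def with_half_forms(text: str) -> str:
    """Insert ZWJ after each virama that is directly followed by a consonant."""
    out = []
    for i, ch in enumerate(text):
        out.append(ch)
        nxt = text[i + 1] if i + 1 < len(text) else ""
        # A virama already followed by ZWJ, or at the end of the text, is left alone
        if ch == VIRAMA and is_basic_consonant(nxt):
            out.append(ZWJ)
    return "".join(out)

kva = "\u0915\u094D\u0935"  # KA + VIRAMA + VA
print([f"U+{ord(c):04X}" for c in with_half_forms(kva)])
```

Because a virama that already has a ZWJ after it is skipped, running the function twice gives the same result as running it once.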

 

ZWJ and ZWNJ have many more uses in the Indic blocks. We shall look at them later… perhaps all in a single article.
