python26 hw2.py Xhw1.txt Xhw2.txt
The hw2 program converts the normalized headword in Xhw1.txt to its spelling in the SLP1 phonetic transliteration. This applies only to dictionaries whose headwords are Sanskrit; it doesn’t apply to the dictionaries whose headwords are in English (ae and mwe).
Some dictionaries show the headword in Devanagari, and some dictionaries show the headword in the International Alphabet of Sanskrit Transliteration, or IAST.
There are several methods of transliteration from Devanagari to the Roman script; this is discussed in a Wikipedia article.
In the original Cologne digitizations, Devanagari is represented in the Kyoto-Harvard transliteration (also called Harvard-Kyoto transliteration).
For dictionaries whose headwords are coded in the Harvard-Kyoto transliteration, hw2 uses a system of transcoding to convert the headword spelling to the SLP1 transliteration. This transcoding is governed by an xml file named hk_slp1.xml.
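The idea of rule-driven transcoding from Harvard-Kyoto to SLP1 can be illustrated with a minimal sketch. It is not the code in hw2 and it does not read hk_slp1.xml: the rule table HK_TO_SLP1 and the function hk_to_slp1 are hypothetical names, the table is hand-written from the standard Harvard-Kyoto and SLP1 conventions, and the rules actually used by the Cologne files may differ. The sketch tries a two-character rule before a one-character rule at each position and copies any unmatched character unchanged.

```python
# Hypothetical, hand-written subset of Harvard-Kyoto -> SLP1 rules.
# This table is an assumption for illustration; it is NOT read from hk_slp1.xml.
HK_TO_SLP1 = {
    "ai": "E", "au": "O", "RR": "F", "lR": "x", "R": "f",
    "kh": "K", "gh": "G", "G": "N", "ch": "C", "jh": "J", "J": "Y",
    "Th": "W", "T": "w", "Dh": "Q", "D": "q", "N": "R",
    "th": "T", "dh": "D", "ph": "P", "bh": "B",
    "z": "S", "S": "z",
}


def hk_to_slp1(text):
    """Transcode an HK string to SLP1 using longest-match-first lookup."""
    out = []
    i = 0
    while i < len(text):
        # Prefer a two-character rule, then fall back to a one-character rule.
        for width in (2, 1):
            chunk = text[i:i + width]
            if chunk in HK_TO_SLP1:
                out.append(HK_TO_SLP1[chunk])
                i += len(chunk)
                break
        else:
            # No rule matched: the letter is spelled the same in both schemes.
            out.append(text[i])
            i += 1
    return "".join(out)


print(hk_to_slp1("kRSNa"))
print(hk_to_slp1("dharma"))
```

Under these assumed rules, the HK spelling kRSNa becomes kfzRa and dharma becomes Darma. A plain lookup table like this cannot tell the diphthong ai from a followed by i, which is one reason a real transcoder needs more context than the sketch uses.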
In the original Cologne digitizations, IAST is coded in an Anglicized Sanskrit (AS) scheme developed by Thomas Malten. The AS scheme uses letter-number combinations to represent the diacritic marks on a letter. For instance, the IAST form rāma is represented in AS coding as ra1ma, where ‘a1’ indicates the letter ‘a’ with the macron diacritical mark.
For dictionaries whose headwords are coded in the Anglicized Sanskrit scheme, hw2 uses a system of transcoding to convert the headword spelling to the SLP1 transliteration. This transcoding is governed by an xml file named as_slp1.xml. In addition, any capitalization in the headword is removed.
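A small sketch may help show how the two steps fit together for AS-coded headwords. It is an illustration only, not the hw2 code, and it does not read as_slp1.xml. The names AS_TO_SLP1 and as_to_slp1 are hypothetical. The table covers only the long vowels, on the assumption that the digit 1 marks a macron on i and u just as it does on a. The sketch also assumes that capitalization is removed before transcoding, which matters because SLP1 gives uppercase letters their own meaning.

```python
import re

# Hypothetical subset of AS -> SLP1 rules for macron vowels only.
# Assumption: '1' marks a macron on i and u as it does on a (a1 = IAST a-macron).
# The real rules are in as_slp1.xml and are not reproduced here.
AS_TO_SLP1 = {"a1": "A", "i1": "I", "u1": "U"}

# An AS token is taken here to be one lowercase letter followed by one digit.
_AS_TOKEN = re.compile(r"[a-z][0-9]")


def as_to_slp1(headword):
    """Lowercase an AS-coded headword, then replace known letter-digit codes."""
    lowered = headword.lower()  # remove capitalization before transcoding
    # Codes missing from the table are left unchanged rather than guessed.
    return _AS_TOKEN.sub(
        lambda m: AS_TO_SLP1.get(m.group(0), m.group(0)), lowered
    )


print(as_to_slp1("Ra1ma"))
```

With this table, the AS spelling Ra1ma is lowercased to ra1ma and then transcoded to rAma, the SLP1 spelling of rāma.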
The end result of applying the appropriate transcoding to the headword spelling is that for each dictionary X, Xhw2.txt uses the same transliteration (SLP1) for spelling the headword.
The system of transcoding mentioned above is implemented by a Python module called transcoder.py.
There is a functionally identical PHP implementation, transcoder.php, that is used in the various displays.