Persian to English Translation Problems of Topicalization Process in Apertium Platform
Abstract
Machine translation from Persian to English faces several problems caused by morphological, lexical, and structural divergences between the two languages. The task becomes more difficult when the source language (SL) has specific characteristics that a machine translation system must deal with. This article presents some syntactic problems that the Apertium shallow-transfer rule-based machine translation (RBMT) platform encounters when translating structures with topicalization from Persian to English, and attempts to solve them within the Apertium structural transfer module. The resulting Apertium system is then evaluated using the word error rate (WER) and position-independent error rate (PER) metrics, and its quality is compared with that of Google Translate, a statistical machine translation system. The Apertium Persian monolingual dictionary was extracted from the frequent words of the Wikipedia Persian Monolingual Corpus and the Persian side of the Mizan English-Persian Parallel Corpus. The results show that the syntactic translation problems arise mainly from Persian syntactic structures with topicalized constituents, which are difficult for the Apertium structural transfer module to handle. One way to solve them is to write new structural transfer rules that translate these structures more adequately.
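
For readers unfamiliar with the two evaluation metrics named in the abstract, the following is a minimal sketch of how WER and PER can be computed for a single sentence pair. It is an illustration only and not the evaluation script used in this study. The function names `wer` and `per` are hypothetical, whitespace tokenization is assumed, and PER is computed with one common bag-of-words formulation; other tools may normalize differently.

```python
from collections import Counter


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref = reference.split()  # assumption: simple whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def per(reference: str, hypothesis: str) -> float:
    """Position-independent error rate: like WER, but word order is ignored."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Number of words shared by both sentences, counted as multisets.
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)
```

Because PER ignores word order, a hypothesis that contains all the reference words in a different order scores a PER of zero but a nonzero WER. This makes the pair of metrics useful for topicalization, where the main difficulty is constituent order rather than word choice.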