Persian to English Translation Problems of Topicalization Process in Apertium Platform

  • Parya Razmdideh
  • Abbas Ali Ahangar
  • Seyyed Mojtaba Sabbagh Jafari
  • Gholamreza Haffari


Machine translation from Persian to English encounters several problems arising from morphological, lexical, and structural divergences between the two languages. The task becomes more difficult when the source language (SL) has specific characteristics that cannot be avoided in the machine translation process. This article presents some syntactic problems that the Apertium shallow-transfer rule-based machine translation (RBMT) platform encounters in translating structures with topicalization from Persian to English, and attempts to solve them within the Apertium structural transfer module. The resulting Apertium system is then evaluated using the word error rate (WER) and position-independent error rate (PER) metrics, and its quality is compared with that of Google Translate, a statistical machine translation system. The Apertium Persian monolingual dictionary was extracted from the frequent words of the Wikipedia Persian Monolingual Corpus and the Persian side of the Mizan English-Persian Parallel Corpus. The results show that the syntactic translation problems arise mainly from Persian syntactic structures with topicalized constituents, which are difficult for the Apertium structural transfer module to handle. One way to solve them is to write new structural transfer rules that translate these structures more adequately.
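
For readers unfamiliar with the two evaluation metrics named above, the following is a minimal sketch of how WER and PER are commonly computed at the word level. It is not the authors' evaluation code: the function names, the whitespace tokenization, and the particular PER formulation (bag-of-words matches, normalized by reference length) are assumptions made for illustration, and the English sentence pair is invented rather than taken from the article's test set.

```python
from collections import Counter


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,         # deletion
                            curr[j - 1] + 1,     # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref, hyp) / len(ref)


def per(reference, hypothesis):
    """Position-independent error rate (one common formulation, assumed here).

    Words are compared as multisets, so word order is ignored.
    """
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)


# Invented example: the hypothesis keeps a fronted (topicalized) object,
# so every word is correct but the order differs from the reference.
reference = "I bought this book yesterday"
hypothesis = "this book I bought yesterday"
print("WER:", wer(reference, hypothesis))
print("PER:", per(reference, hypothesis))
```

Because PER ignores word order while WER does not, a translation that contains the right words in the wrong order scores well on PER and poorly on WER. That contrast is what makes the pair of metrics informative for word-order phenomena such as topicalization.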

Author Biographies

Parya Razmdideh
Ph.D. Candidate of Linguistics, University of Sistan and Baluchestan, Iran
Abbas Ali Ahangar
Associate Professor of Linguistics, University of Sistan and Baluchestan, Iran
Seyyed Mojtaba Sabbagh Jafari
Assistant Professor of Computer Engineering, Vali-e-Asr University of Rafsanjan, Iran
Gholamreza Haffari
Lecturer at Faculty of Information Technology, Monash University, Australia