Errors in Machine Translated and Crowdsourced Post-Edited Texts
Abstract
The first objective of this study was to identify the most and the least frequent error types in Google Translate (GT) raw output and in its crowdsourced post-edited versions, according to Vilar et al.’s (2006) typology. The second objective was to compare the error analysis results for the two outputs in order to assess how significant the decrease in the number of errors was in the post-edited texts. To this end, four English sports news texts were uploaded to Google Translator Toolkit (GTT), an online collaborative environment for post-editing the automatic translations produced by GT. Eleven M.A. students of translation studies, who were categorized as non-professional translators, were then invited to the online environment via email to modify the machine translations. The error analysis revealed that Incorrect Words and Unknown Words were, respectively, the most and the least frequent error types in both outputs. The study also showed that the number of errors decreased by less than fifty percent in the post-edited texts. In addition, factors that could improve the quality of crowdsourced post-edited output and the applicability of GTT were examined on the basis of the literature, an online interview with the participants, and the researchers’ own observations.
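The comparison described above can be illustrated with a minimal sketch. It assumes each annotated error is recorded as one of the five top-level categories of Vilar et al.'s (2006) typology; the function names and the sample annotations are hypothetical and are not the study's data.

```python
from collections import Counter

# Top-level categories of Vilar et al.'s (2006) error typology.
CATEGORIES = ["Missing Words", "Word Order", "Incorrect Words",
              "Unknown Words", "Punctuation"]


def summarize(annotations):
    """Count errors per category and return (counts, most_frequent, least_frequent)."""
    counts = Counter({category: 0 for category in CATEGORIES})
    counts.update(annotations)
    most = max(CATEGORIES, key=lambda c: counts[c])
    least = min(CATEGORIES, key=lambda c: counts[c])
    return counts, most, least


def percent_decrease(before, after):
    """Percentage by which the error count dropped from `before` to `after`."""
    if before == 0:
        raise ValueError("cannot compute a decrease from zero errors")
    return 100.0 * (before - after) / before


# Hypothetical annotations, for illustration only (not the study's counts).
raw = (["Incorrect Words"] * 6 + ["Missing Words"] * 3 + ["Word Order"] * 2
       + ["Punctuation"] * 2 + ["Unknown Words"])
post = (["Incorrect Words"] * 4 + ["Missing Words"] * 2 + ["Word Order"]
        + ["Punctuation"])

_, raw_most, raw_least = summarize(raw)
_, post_most, post_least = summarize(post)
drop = percent_decrease(len(raw), len(post))
print(raw_most, raw_least, post_most, post_least, round(drop, 1))
```

With these made-up annotations, Incorrect Words is the most frequent category and Unknown Words the least frequent in both lists, and the total falls from 14 to 8 errors, a decrease of under fifty percent.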
Keywords:
Crowdsourcing, crowdsourced post-editing, machine translation, translation errors, Google Translator Toolkit
References
Anastasiou, D., & Gupta, R. (2011, November 14). Comparison of crowdsourcing translation with machine translation. (A. Foster, & P. Rafferty, Eds.) Journal of Information Science, 37(6), 637–659.
Boudreau, K., & Lakhani, K. (2013, April). Using the crowd as an innovation partner. Harvard Business Review, 91(4), 61–69.
Brabham, D. (2008). Crowdsourcing as a model for problem solving: An introduction and cases. Convergence: The International Journal of Research into New Media Technologies, 14(1), 75-90. Retrieved from www.cvg.sagepub.com
Brabham, D. (2010). Moving the crowd at Threadless: Motivations for participation in a crowdsourcing application. Information, Communication & Society, 13(8), 1122–1145.
Brabham, D. (2013). Crowdsourcing. Cambridge, MA: MIT Press.
Cheung, S. (2012). How Companies Can Leverage Crowdsourcing. (Master's thesis), Massachusetts Institute of Technology, Cambridge, Massachusetts. Retrieved from web.mit.edu/smadnick/www/wp/2012-02.pdf
Dombek, M. (2012, February 29). Translation crowdsourcing: The Facebook way – in search of crowd motivation. Retrieved May 18, 2018, from Issuu: https://issuu.com/dublincityuniversity/docs/magdalenadombek
Dombek, M. (2014). A study into the motivations of internet users contributing to translation crowdsourcing: The case of Polish Facebook user-translators. (Doctoral dissertation), Dublin City University, Dublin, Ireland. Retrieved May 23, 2018, from http://doras.dcu.ie/19774/
European Commission. (2012). Crowdsourcing Translation: Studies on multilingualism and translation. Brussels & Luxembourg, Luxembourg: Directorate-General for Translation. Retrieved August 16, 2017, from https://publications.europa.eu/en/publication-detail/-/publication/85558431-cfb4-4ff7-817d-5ad1338dc4b1/language-en/format-PDF/source-70609970
Hertel, G., Niedner, S., & Herrmann, S. (2003). Motivation of software developers in open source projects: An Internet-based survey of contributors to the Linux kernel. Research Policy, 32(7), 1159–1177.
Hessellund, L. (2014). Crowdsourcing in the translation industry - An emerging trend in a globalised world: A study of Danish and Dutch user-translators on Facebook. (Master's thesis), Aarhus University, Aarhus, Denmark. Retrieved May 14, 2018, from http://pure.au.dk/portal/files/79510036/Crowdsourcing_in_the_Translation_Industry_An_Emerging_Trend_in_a_Globalised_World.pdf
Howe, J. (2006, June 1). The Rise of Crowdsourcing. (N. Thompson, Editor) Retrieved May 16, 2018, from Wired Magazine: https://www.wired.com/2006/06/crowds/
Howe, J. (2008). Crowdsourcing: Why the power of the crowd is driving the future of business. New York: Random House.
Jenkins, H., Clinton, K., Purushotma, R., Weigel, M., & Robinson, A. (2006). Confronting the challenges of participatory culture: Media education for the 21st century. Illinois: MacArthur Foundation.
Jiménez-Crespo, M. (2017). Crowdsourcing and Online Collaborative Translations. Amsterdam & Philadelphia: John Benjamins.
Katmada, A., Satsiou, A., & Kompatsiaris, I. (2016). Incentive mechanisms for crowdsourcing platforms. In F. Bagnoli, A. Satsiou, I. Stavrakakis, & P. Nesi (Eds.), Internet Science: Third International Conference (pp. 3-18). Florence.
Lakhani, K., Bo Jeppesen, L., Lohse, P., & Panetta, J. (2006). The value of openness in scientific problem solving. Retrieved May 21, 2018, from Harvard Business School: https://www.hbs.edu/faculty/Publication%20Files/07-050.pdf
McDonough Dolmaya, J. (2011). The ethics of crowdsourcing. Linguistica Antverpiensia, New Series – Themes in Translation Studies, 10(10), 97-110.
McDonough Dolmaya, J. (2012). Analyzing the crowdsourcing model and its impact on public perceptions of translation. The Translator, 18(2), 167-191.
Mitchell, L. (2015). Community post-editing of machine-translated user-generated content. Dublin City University, Dublin, Ireland. Retrieved from doras.dcu.ie/20463/
O’Brien, S. (2011). Collaborative translation. In Y. Gambier, & L. van Doorslaer (Eds.), Handbook of Translation Studies, Volume 2 (pp. 17-20). Amsterdam & Philadelphia: John Benjamins.
Olohan, M. (2014). Why Do You Translate? Motivation to Volunteer and TED Translation. (C. O'Sullivan, Ed.) Translation Studies, 7(1), 17-33.
O'Neill, M. (2015, April 1). 7 Google Crowdsourcing Projects That Help Us Today. Retrieved July 31, 2018, from MakeUseOf: https://www.makeuseof.com/tag/7-google-crowdsourcing-projects-help-us-today/
O'Reilly, T. (2007). What is Web 2.0: Design patterns and business models for the next generation of software. Communications & Strategies, 16(1), 17-37.
Probst, A. (2017). The effect of error type on pause length in post-editing machine translation output. (Master's thesis), Tilburg University, Communication and Information Sciences, Tilburg, Netherlands. Retrieved May 28, 2018, from arno.uvt.nl/show.cgi?fid=144937
Pym, A. (2011, January). Translation research terms: A tentative glossary for moments of perplexity and dispute. In A. Pym (Ed.), Translation Research Projects (Vol. 3, pp. 75-99). Tarragona: Intercultural Studies Group. Retrieved June 16, 2018, from http://isg.urv.es/publicity/isg/publications/trp_3_2011/index.htm
Summers, N. (2014, July 26). Google sets up a community site to help improve Google Translate. Retrieved May 23, 2018, from The Next Web: https://thenextweb.com/google/2014/07/25/google-sets-community-site-help-improve-google-translate/
TED. (2009, May 13). TED Translators. Retrieved May 22, 2018, from TED: https://www.ted.com/about/programs-initiatives/ted-translators
Toffler, A. (1980). The third wave: The classic study of tomorrow. New York: Bantam Books.
Turovsky, B. (2016, November 15). Found in translation: More accurate, fluent sentences in Google Translate. Retrieved May 5, 2018, from Google Blog: https://blog.google/products/translate/found-translation-more-accurate-fluent-sentences-google-translate/
Vashee, K. (2010, January 16). The Continuing Evolution of Automated Translation Technology: RbMT vs. SMT. Retrieved May 23, 2018, from proz.com: https://www.proz.com/translation-articles/articles/2855/
Vilar, D., Xu, J., D’Haro, L. F., & Ney, H. (2006). Error analysis of statistical machine translation output. The 5th International Conference on Language Resources and Evaluation (pp. 697–702). Genoa: European Language Resources Association. Retrieved May 29, 2018, from http://www.lrec-conf.org/proceedings/lrec2006/papers.htm
Winkler, K. (2014, July 28). Google enters crowdsourcing with launch of Translate Community. Retrieved May 24, 2008, from EDUKWEST: http://www.edukwest.com/google-enters-crowdsourcing-launch-translate-community/
License
Copyright Licensee: Iranian Journal of Translation Studies. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution–NonCommercial 4.0 International (CC BY-NC 4.0 license).