DeepL VS. ChatGPT: MACHINE TRANSLATION EVALUATION

Authors

  • Yulia Milka Nugraheni, Gadjah Mada University
  • Adi Sutrisno

DOI:

https://doi.org/10.36277/jurnalprologue.v10i2.174

Keywords:

Machine Translation Evaluation, Error Analysis, BLEU, DeepL, ChatGPT

Abstract

This research evaluates the performance of DeepL and ChatGPT in translating academic text through both human and machine evaluation, and aims to give readers an overview of the translations the two systems produce. DeepL and ChatGPT are machine translation tools that use recent natural language processing technology (Jiao et al., 2023). The evaluation used Koponen's (2010) Error Analysis and the automated machine translation metric Bilingual Evaluation Understudy (BLEU) of Papineni et al. (2002). Qualitative and quantitative methods were applied together in order to draw a stronger conclusion. The results show that DeepL performed better than ChatGPT. In the Error Analysis, 25 errors were found in the DeepL translation and 26 in the ChatGPT translation. In the BLEU evaluation, the DeepL translation scored 0.9446657236 and the ChatGPT translation scored 0.9211813372.
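
As a rough illustration of how a BLEU score like those reported in the abstract is computed, the sketch below implements the metric's two ingredients: clipped (modified) n-gram precision and the brevity penalty. It is not the authors' evaluation script. The function names `ngrams` and `bleu`, the single-reference setup, whitespace tokenization, and the lack of smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU with one reference and uniform n-gram weights.

    Assumption: whitespace tokenization and no smoothing (hypothetical setup).
    """
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, any zero precision gives BLEU = 0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, scaled by the brevity penalty.
    return bp * math.exp(sum(log_precisions) / max_n)


# A machine translation compared against a human reference translation.
score = bleu("the cat sat on the mat", "the cat sat on the mat")
```

A score close to 1, as reported for both systems, means the machine output shares nearly all of its 1- to 4-grams with the reference translation. In practice an established implementation with proper tokenization and smoothing is usually preferable to a hand-written one.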

Published

2024-09-30

How to Cite

Nugraheni, Y. M., & Sutrisno, A. (2024). DeepL VS. ChatGPT: MACHINE TRANSLATION EVALUATION. Jurnal Prologue, 10(2), 411–426. https://doi.org/10.36277/jurnalprologue.v10i2.174