Neural versus phrase-based MT quality: An in-depth analysis on English–German and English–French
- Citation data: Computer Speech & Language, ISSN: 0885-2308, Vol: 49, Page: 52-70
- Computer Science; Mathematics
Within the field of statistical machine translation, the neural approach (NMT) is currently advancing the state-of-the-art performance traditionally achieved by phrase-based approaches (PBMT), and is rapidly becoming the dominant technology in machine translation. Indeed, in the last IWSLT and WMT evaluation campaigns on machine translation, NMT outperformed well-established state-of-the-art PBMT systems on many different language pairs. To understand in what respects NMT provides better translation quality than PBMT, we perform a detailed analysis of neural versus phrase-based statistical machine translation outputs, leveraging high-quality post-edits performed by professional translators on the IWSLT data. In this analysis, we focus on two language directions with different characteristics: English–German, known to be particularly hard because of morphological and syntactic differences, and English–French, where PBMT systems typically reach outstanding quality and thus represent a strong competitor for NMT. Our analysis provides useful insights into which linguistic phenomena are best modelled by neural models, such as the reordering of verbs and nouns, while pointing out other aspects that remain to be improved, such as the correct translation of proper nouns.