WMT: What are the differences between the WMT14, WMT15 and WMT16 datasets?

Each year, the Workshop on Statistical Machine Translation (WMT) holds a conference that focuses on new tasks, papers, and findings in the field of machine translation.

Take the parallel dataset News Commentary as an example. There is a version of News Commentary in WMT14, WMT15, WMT16 and so on.

How much does the dataset differ between conferences? Is this documented anywhere?

Topic machine-translation dataset nlp

Category Data Science


The most straightforward way is to check the findings paper published each year to summarize the results of that WMT edition. It includes tables with the figures for each of the datasets supplied for the competition, such as the number of sentences and words per language pair.

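You can find all the papers in the ACL Anthology by searching for "Findings of the 20XX Conference on Machine Translation" with the target year filled in. For 2015 and earlier, the title is "Findings of the 20XX Workshop on Statistical Machine Translation".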

Note that in 2016, WMT changed its name from "Workshop on Statistical Machine Translation" to "Conference on Machine Translation" while keeping the WMT acronym.
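
If you also want to measure the difference yourself, you can compare two downloaded releases directly. The following is only a minimal sketch, assuming each release is a plain-text file with one sentence per line; the function names and the file names in the comments are hypothetical and not part of any WMT tooling.

```python
from pathlib import Path


def load_sentences(path):
    """Read one sentence per line, stripping whitespace and skipping blank lines."""
    with Path(path).open(encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def compare_releases(old_sentences, new_sentences):
    """Return line counts and unique-sentence overlap for two corpus releases."""
    old_set, new_set = set(old_sentences), set(new_sentences)
    return {
        "old_lines": len(old_sentences),
        "new_lines": len(new_sentences),
        "shared_unique": len(old_set & new_set),
        "only_in_old": len(old_set - new_set),
        "only_in_new": len(new_set - old_set),
    }


# Tiny in-memory stand-in for two releases. With real data you would call
# load_sentences() on your own files instead (hypothetical names such as
# "news-commentary-old.en" and "news-commentary-new.en").
old_release = ["Hello world.", "This is a test.", "This is a test."]
new_release = ["This is a test.", "A new sentence."]

for name, value in compare_releases(old_release, new_release).items():
    print(f"{name}: {value}")
```

Exact string matching is a deliberate simplification here: a sentence that was re-tokenized or re-normalized between releases will be counted as different, so the overlap figures are a lower bound.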
