One of the biggest challenges hindering progress in low-resource and multilingual machine translation is the lack of good evaluation benchmarks. Current evaluation benchmarks either lack good coverage of low-resource languages, consider only restricted domains, or are low quality because they are constructed using semi-automatic procedures. In this work, we introduce the FLORES-101 evaluation benchmark, consisting of 3001 sentences extracted from English Wikipedia and covering a variety of different topics and domains. These sentences have been translated in 101 languages by professional translators through a carefully controlled process. The resulting dataset enables better assessment of model quality on the long tail of low-resource languages, including the evaluation of many-to-many multilingual translation systems, as all translations are multilingually aligned. By publicly releasing such a high-quality and high-coverage dataset, we hope to foster progress in the machine translation community and beyond.
— Lees op arxiv.org/abs/2106.03193
Blijf op de hoogte
Wekelijks inzichten over AI governance, cloud strategie en NIS2 compliance — direct in je inbox.
[jetpack_subscription_form show_subscribers_total="false" button_text="Inschrijven" show_only_email_and_button="true"]Bescherm AI-modellen tegen aanvallen
Agentic AI ThreatsRisico's van autonome AI-systemen
AI Governance Publieke SectorVerantwoorde AI voor overheden
Cloud SoevereiniteitSoeverein in de cloud — het kan
NIS2 Compliance ChecklistStap-voor-stap naar NIS2-compliance
Klaar om van data naar doen te gaan?
Plan een vrijblijvende kennismaking en ontdek hoe Djimit uw organisatie helpt.
Plan een kennismaking →Ontdek meer van Djimit
Abonneer je om de nieuwste berichten naar je e-mail te laten verzenden.