
Arxiv. Unsupervised Cross-lingual Representation Learning at Scale by John Doe

Book Information

Title: Arxiv. Unsupervised Cross-lingual Representation Learning at Scale
Year: 2020-04-08
PPI: 300
Language: English
Mediatype: texts
Subject: AI, Facebook, Commoncrawl, common crawl, language, languages, research, arxiv, bigdata, big data, crawl, crawls, crawler, web crawler, web crawl
Collection: opensource, community
Uploader: adydunch7642
Identifier: 1911.02116-arxiv-org

Description

Abstract: This paper shows that pretraining multilingual language models at scale leads to significant performance gains for a wide range of cross-lingual transfer tasks. We train a Transformer-based masked language model on one hundred languages, using more than two terabytes of filtered CommonCrawl data. Our model, dubbed XLM-R, significantly outperforms multilingual BERT (mBERT) on a variety of cross-lingual benchmarks, including +14.6% average accuracy on XNLI, +13% average F1 score on MLQA, and +2.4% F1 score on NER. XLM-R performs particularly well on low-resource languages, improving 15.7% in XNLI accuracy for Swahili and 11.4% for Urdu over previous XLM models. We also present a detailed empirical analysis of the key factors that are required to achieve these gains, including the trade-offs between (1) positive transfer and capacity dilution and (2) the performance of high and low resource languages at scale. Finally, we show, for the first time, the possibility of multilingual modeling without sacrificing per-language performance; XLM-R is very competitive with strong monolingual models on the GLUE and XNLI benchmarks. We will make our code, data and models publicly available.