Meta Releases Open Source AI Machine Translation Model

In a project called "No Language Left Behind," Meta has built an artificial intelligence model, NLLB-200, that can translate text across 200 different languages. Modeling techniques and lessons learned from the model will be used to improve and extend translations on Facebook, Instagram and Wikipedia, the company said in a news release.

NLLB-200 was designed with a focus on African languages, for which it can be difficult to find sufficient data to train an AI model. For example, there are 20 million native speakers of Luganda, a language of central Uganda, but "examples of this written language are extremely difficult to find on the internet," Meta explained. "The reality is that a handful of languages dominate the web, so only a fraction of the world can access content and contribute to the web in their own language. We want to change this by creating more inclusive machine translations systems — ones that unlock access to the web for the more than 4 billion people around the world that are currently excluded because they do not speak one of the few languages content is available in."

The company worked with professional translators to develop a benchmark for automatically assessing NLLB-200's translation quality and to carry out a human evaluation of the model's output. After measuring the quality of NLLB-200's output in each of the 200 languages, Meta found that it outperforms previous models by an average of 44 percent.

"Africa is a continent with very high linguistic diversity, and language barriers exist day-to-day. We are pleased to announce that 55 African languages will be included in this machine translation research, making it a major breakthrough for our continent," said Balkissa Ide Siddo, public policy director for Africa at Meta, in a statement. "In the future, imagine visiting your favorite Facebook group, coming across a post in Igbo or Luganda, and being able to understand it in your own language with just a click of a button — that's where we hope research like this leads us. Highly accurate translations in more languages could also help to spot harmful content and misinformation, protect election integrity, and curb instances of online sexual exploitation and human trafficking." 

Meta is releasing NLLB-200 as open source and publishing research tools for extending the model to more languages and technologies. It also plans to distribute up to $200,000 in grants for nonprofit organizations to develop real-world applications for the model.

Meta has also published a demo that uses NLLB-200 to translate children's stories from around the world.

About the Author

Rhea Kelly is editor in chief for Campus Technology, THE Journal, and Spaces4Learning.
