Ai2 releases new language models competitive with Meta’s Llama

There’s a new AI model family on the block, and it’s one of the few that can be reproduced from scratch.

On Tuesday, Ai2, the nonprofit AI research organization founded by the late Paul Allen, released OLMo 2, the second family of models in its OLMo series. (OLMo’s short for “Open Language Model.”) While there’s no shortage of “open” language models to choose from (see: Meta’s Llama), OLMo 2 meets the Open Source Initiative’s definition of open source AI, meaning the tools and data used to develop it are publicly available.

The Open Source Initiative, the long-running institution aiming to define and “steward” all things open source, finalized its open source AI definition in October. But the first OLMo models, released in February, met the criterion as well.

“OLMo 2 [was] developed start-to-finish with open and accessible training data, open-source training code, reproducible training recipes, transparent evaluations, intermediate checkpoints, and more,” Ai2 wrote in a blog post. “By openly sharing our data, recipes, and findings, we hope to provide the open-source community with the resources needed to discover new and innovative approaches.”

There are two models in the OLMo 2 family: one with 7 billion parameters (OLMo 2 7B) and one with 13 billion parameters (OLMo 2 13B). Parameters roughly correspond to a model’s problem-solving skills, and models with more parameters generally perform better than those with fewer parameters.

Like most language models, OLMo 2 7B and 13B can perform a range of text-based tasks, like answering questions, summarizing documents, and writing code.

To train the models, Ai2 used a data set of 5 trillion tokens. Tokens represent bits of raw data; 1 million tokens is equal to about 750,000 words. The training set included websites “filtered for high quality,” academic papers, Q&A discussion boards, and math workbooks “both synthetic and human generated.”
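For a sense of scale, the token-to-word ratio above can be applied to the size of the training set. The snippet below is only a back-of-the-envelope sketch: it assumes the rough 750,000-words-per-million-tokens rule of thumb holds across the whole data set, and the function and variable names are illustrative, not anything from Ai2.

```python
# Rule of thumb quoted above: 1 million tokens is about 750,000 words.
# This is an approximation, not an exact property of any tokenizer.
WORDS_PER_MILLION_TOKENS = 750_000


def tokens_to_words(tokens: int) -> int:
    """Estimate a word count from a token count using the ratio above."""
    return tokens * WORDS_PER_MILLION_TOKENS // 1_000_000


# OLMo 2's reported training set size: 5 trillion tokens.
training_tokens = 5_000_000_000_000
estimated_words = tokens_to_words(training_tokens)
print(f"{estimated_words:,} words")  # 3,750,000,000,000 words
```

By that estimate, the training set works out to roughly 3.75 trillion words.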

Ai2 claims the resulting models are competitive, performance-wise, with open models like Meta’s Llama 3.1 release.

“Not only do we observe a dramatic improvement in performance across all tasks compared to our earlier OLMo model but, notably, OLMo 2 7B outperforms Llama 3.1 8B,” Ai2 writes. “OLMo 2 [represents] the best fully-open language models to date.”

The OLMo 2 models and all of their components can be downloaded from Ai2’s website. They’re available under the Apache 2.0 license, meaning they can be used commercially.

There’s been some debate recently over the safety of open models, what with Llama models reportedly being used by Chinese researchers to develop defense tools. When I asked Ai2 engineer Dirk Groeneveld in February whether he was concerned about OLMo being abused, he told me that he believes the benefits ultimately outweigh the harms.

“Yes, it’s possible open models may be used inappropriately or for unintended purposes,” he said. “[However, this] approach also promotes technical advancements that lead to more ethical models; is a prerequisite for verification and reproducibility, as these can only be achieved with access to the full stack; and reduces a growing concentration of power, creating more equitable access.”
