Meta execs obsessed over beating OpenAI’s GPT-4 internally, court filings reveal

Executives and researchers leading Meta’s AI efforts obsessed over beating OpenAI’s GPT-4 model while developing Llama 3, according to internal messages unsealed by a court on Tuesday in one of the company’s ongoing AI copyright cases, Kadrey v. Meta.

“Honestly… Our goal needs to be GPT-4,” said Meta’s VP of Generative AI, Ahmad Al-Dahle, in an October 2023 message to Meta researcher Hugo Touvron. “We have 64k GPUs coming! We need to learn how to build frontier and win this race.”

Though Meta releases open AI models, the company’s AI leaders were far more focused on beating competitors that don’t typically release their models’ weights, like Anthropic and OpenAI, and instead gate them behind an API. Meta’s execs and researchers held up Anthropic’s Claude and OpenAI’s GPT-4 as a gold standard to work toward.

The French AI startup Mistral, one of the biggest open competitors to Meta, was mentioned several times in the internal messages, but the tone was dismissive.

“Mistral is peanuts for us,” Al-Dahle said in a message. “We should be able to do better,” he said later.

Tech companies are racing to upstage each other with cutting-edge AI models these days, but these court filings reveal just how competitive Meta’s AI leaders truly were — and seemingly still are. At several points in the message exchanges, Meta’s AI leads talked about how they were “very aggressive” in obtaining the right data to train Llama; at one point, an exec even said that “Llama 3 is literally all I care about,” in a message to coworkers.

Plaintiffs in this case allege that Meta’s executives occasionally cut corners in their race to ship AI models, training on copyrighted books in the process.

Touvron noted in a message that the mix of datasets used for Llama 2 “was bad,” and talked about how Meta could use a better mix of data sources to improve Llama 3. Touvron and Al-Dahle then talked about clearing the path to use the LibGen dataset, which contains copyrighted works from Cengage Learning, Macmillan Learning, McGraw Hill, and Pearson Education.

“Do we have the right datasets in there[?]” said Al-Dahle. “Is there anything you wanted to use but couldn’t for some stupid reason?”

Meta CEO Mark Zuckerberg has previously said he’s trying to close the performance gap between Llama’s AI models and closed models from OpenAI, Google, and others. The internal messages reveal the intense pressure within the company to do so.

“This year, Llama 3 is competitive with the most advanced models and leading in some areas,” said Zuckerberg in a letter from July 2024. “Starting next year, we expect future Llama models to become the most advanced in the industry.”

When Meta ultimately released Llama 3 in April 2024, the open AI model was competitive with leading closed models from Google, OpenAI, and Anthropic, and outperformed open options from Mistral. However, the data Meta used to train its models — data Zuckerberg reportedly gave the green light to use, despite its copyright status — are facing scrutiny in several ongoing lawsuits.



Lisa Holden
Lisa Holden is a news writer for LinkDaddy News. She writes about health, sports, tech, and more. Some of her favorite topics include the latest trends in fitness and wellness, the best ways to use technology to improve your life, and the latest developments in medical research.
