‘Open’ model licenses often carry concerning restrictions

This week, Google released a family of open AI models, Gemma 3, that quickly garnered praise for their impressive efficiency. But as a number of developers lamented on X, Gemma 3’s license makes commercial use of the models a risky proposition.

It’s not a problem unique to Gemma 3. Companies like Meta also apply custom, non-standard licensing terms to their openly available models, and the terms present legal challenges for companies. Some firms, especially smaller operations, worry that Google and others could “pull the rug” on their business by asserting the more onerous clauses.

“The restrictive and inconsistent licensing of so-called ‘open’ AI models is creating significant uncertainty, particularly for commercial adoption,” Nick Vidal, head of community at the Open Source Initiative, a long-running institution aiming to define and “steward” all things open source, told TechCrunch. “While these models are marketed as open, the actual terms impose various legal and practical hurdles that deter businesses from integrating them into their products or services.”

Open model developers have their reasons for releasing models under proprietary licenses as opposed to industry-standard options like Apache and MIT. AI startup Cohere, for example, has been clear about its intent to support scientific — but not commercial — work on top of its models.

But Gemma and Meta’s Llama licenses in particular have restrictions that limit the ways companies can use the models without fear of legal reprisal.

Meta, for instance, prohibits developers from using the “output or results” of Llama 3 models to improve any model besides Llama 3 or “derivative works.” It also prevents companies with over 700 million monthly active users from deploying Llama models without first obtaining a special, additional license.

Gemma’s license is generally less burdensome. But it does grant Google the right to “restrict (remotely or otherwise) usage” of Gemma that Google believes is in violation of the company’s prohibited use policy or “applicable laws and regulations.” 

These terms don’t just apply to the original Llama and Gemma models. Models based on Llama or Gemma must also adhere to the Llama and Gemma licenses, respectively. In Gemma’s case, that includes models trained on synthetic data generated by Gemma.

Florian Brand, a research assistant at the German Research Center for Artificial Intelligence, believes that — despite what tech giant execs would have you believe — licenses like Gemma and Llama’s “cannot reasonably be called ‘open source.’”

“Most companies have a set of approved licenses, such as Apache 2.0, so any custom license is a lot of trouble and money,” Brand told TechCrunch. “Small companies without legal teams or money for lawyers will stick to models with standard licenses.”

Brand noted that AI model developers with custom licenses, like Google, haven’t aggressively enforced their terms yet. However, the threat is often enough to deter adoption, he added.

“These restrictions have an impact on the AI ecosystem — even on AI researchers like me,” said Brand.

Han-Chung Lee, director of machine learning at Moody’s, agrees that custom licenses such as those attached to Gemma and Llama make the models “not usable” in many commercial scenarios. So does Eric Tramel, a staff applied scientist at AI startup Gretel.

“Model-specific licenses make specific carve-outs for model derivatives and distillation, which causes concern about clawbacks,” Tramel said. “Imagine a business that is specifically producing model fine-tunes for their customers. What license should a Gemma-data fine-tune of Llama have? What would the impact be for all of their downstream customers?”

The scenario that deployers most fear, Tramel said, is that the models are a trojan horse of sorts.

“A model foundry can put out [open] models, wait to see what business cases develop using those models, and then strong-arm their way into successful verticals by either extortion or lawfare,” he said. “For example, Gemma 3, by all appearances, seems like a solid release — and one that could have a broad impact. But the market can’t adopt it because of its license structure. So, businesses will likely stick with perhaps weaker and less reliable Apache 2.0 models.”

To be clear, certain models have achieved widespread distribution in spite of their restrictive licenses. Llama, for example, has been downloaded hundreds of millions of times and built into products from major corporations, including Spotify.

But they could be even more successful if they were permissively licensed, according to Yacine Jernite, head of machine learning and society at AI startup Hugging Face. Jernite called on providers like Google to move to open license frameworks and “collaborate more directly” with users on broadly accepted terms.

“Given the lack of consensus on these terms and the fact that many of the underlying assumptions haven’t yet been tested in courts, it all serves primarily as a declaration of intent from those actors,” Jernite said. “[But if certain clauses] are interpreted too broadly, a lot of good work will find itself on uncertain legal ground, which is particularly scary for organizations building successful commercial products.”

Vidal said there’s an urgent need for AI models that companies can freely integrate, modify, and share without fear of sudden license changes or legal ambiguity.

“The current landscape of AI model licensing is riddled with confusion, restrictive terms, and misleading claims of openness,” Vidal said. “Instead of redefining ‘open’ to suit corporate interests, the AI industry should align with established open source principles to create a truly open ecosystem.”


