Inception emerges from stealth with a new type of AI model

Inception, a new Palo Alto-based company started by Stanford computer science professor Stefano Ermon, claims to have developed a novel AI model based on “diffusion” technology. Inception calls it a diffusion-based large language model, or a “DLM” for short.

The generative AI models receiving the most attention now can be broadly divided into two types: large language models (LLMs) and diffusion models. LLMs, built on the transformer architecture, are used for text generation. Meanwhile, diffusion models, which power AI systems like Midjourney and OpenAI’s Sora, are mainly used to create images, video, and audio.

Inception’s model offers the capabilities of traditional LLMs, including code generation and question-answering, but with significantly faster performance and reduced computing costs, according to the company.

Ermon told TechCrunch that he has been studying how to apply diffusion models to text for a long time in his Stanford lab. His research was based on the idea that traditional LLMs are relatively slow compared to diffusion technology.

With LLMs, “you cannot generate the second word until you’ve generated the first one, and you cannot generate the third one until you generate the first two,” Ermon said.

Ermon was looking for a way to apply a diffusion approach to text because, unlike LLMs, which work sequentially, diffusion models start with a rough estimate of the data they're generating (e.g., a picture) and then bring the data into focus all at once.
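The contrast Ermon describes can be illustrated with a toy sketch. This is not Inception's method, and no real model is involved: the `TARGET` sentence, the `<mask>` placeholder, and both function names are assumptions made up for illustration. The only point is to show how the number of sequential steps grows with output length in one scheme and stays fixed in the other.

```python
import random

# Hypothetical "ideal" output that a real model would have to predict.
TARGET = ["the", "quick", "brown", "fox", "jumps"]

def autoregressive_generate(target):
    """One token per step: token i cannot be produced before tokens 0..i-1."""
    output, steps = [], 0
    for token in target:
        output.append(token)  # stand-in for one model call conditioned on the prefix
        steps += 1
    return output, steps

def diffusion_style_generate(target, num_steps=3, seed=0):
    """Start from a fully masked draft and refine all positions on every pass."""
    rng = random.Random(seed)
    draft = ["<mask>"] * len(target)
    for step in range(1, num_steps + 1):
        # Each pass looks at every position; in a real system this would be
        # a single parallel model call rather than a Python loop.
        for i in range(len(draft)):
            still_masked = draft[i] == "<mask>"
            last_pass = step == num_steps
            if still_masked and (last_pass or rng.random() < 0.5):
                draft[i] = target[i]
    return draft, num_steps

ar_output, ar_steps = autoregressive_generate(TARGET)
diff_output, diff_steps = diffusion_style_generate(TARGET)
```

In this sketch the sequential scheme needs one step per token, while the diffusion-style scheme uses a fixed number of refinement passes regardless of length. Whether that translates into real speedups depends on how expensive each pass is, which the toy code does not model.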

Ermon hypothesized that generating and modifying large blocks of text in parallel was possible with diffusion models. After years of trying, Ermon and a student of his achieved a major breakthrough, which they detailed in a research paper published last year.

Recognizing the advancement’s potential, Ermon founded Inception last summer, tapping two former students, UCLA professor Aditya Grover and Cornell professor Volodymyr Kuleshov, to co-lead the company.

While Ermon declined to discuss Inception’s funding, TechCrunch understands that the Mayfield Fund has invested.

Inception has already secured several customers, including unnamed Fortune 100 companies, by addressing their critical need for reduced AI latency and increased speed, Ermon said.

“What we found is that our models can leverage the GPUs much more efficiently,” Ermon said, referring to the computer chips commonly used to run models in production. “I think this is a big deal. This is going to change the way people build language models.”

Inception offers an API as well as on-premises and edge device deployment options, support for model fine-tuning, and a suite of out-of-the-box DLMs for various use cases. The company claims its DLMs can run up to 10x faster than traditional LLMs while costing 10x less.

“Our ‘small’ coding model is as good as [OpenAI’s] GPT-4o mini while more than 10 times as fast,” a company spokesperson told TechCrunch. “Our ‘mini’ model outperforms small open-source models like [Meta’s] Llama 3.1 8B and achieves more than 1,000 tokens per second.”

“Tokens” is industry parlance for bits of raw data. One thousand tokens per second is an impressive speed indeed, assuming Inception’s claims hold up.
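To put the throughput figure in perspective, here is a small back-of-the-envelope calculation. The 500-token response length and the 100 tokens-per-second comparison rate are hypothetical numbers chosen for illustration, not measurements from Inception or any other vendor; only the 1,000 tokens-per-second figure comes from the company's claim.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to emit num_tokens at a steady rate, ignoring network and queueing delays."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second

RESPONSE_TOKENS = 500       # hypothetical response length
CLAIMED_RATE = 1_000        # tokens per second, per the company's claim
COMPARISON_RATE = 100       # hypothetical slower rate, one tenth of the claim

fast = generation_seconds(RESPONSE_TOKENS, CLAIMED_RATE)      # 0.5 seconds
slow = generation_seconds(RESPONSE_TOKENS, COMPARISON_RATE)   # 5.0 seconds
```

Under these assumptions, a response that would take five seconds at the slower rate would arrive in half a second at the claimed rate.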
