Inception emerges from stealth with a new type of AI model


Inception, a new Palo Alto-based company started by Stanford computer science professor Stefano Ermon, claims to have developed a novel AI model based on “diffusion” technology. Inception calls it a diffusion-based large language model, or a “DLM” for short.

The generative AI models receiving the most attention now can be broadly divided into two types: large language models (LLMs) and diffusion models. LLMs, built on the transformer architecture, are used for text generation. Meanwhile, diffusion models, which power AI systems like Midjourney and OpenAI’s Sora, are mainly used to create images, video, and audio. 

Inception’s model offers the capabilities of traditional LLMs, including code generation and question-answering, but with significantly faster performance and reduced computing costs, according to the company.

Ermon told TechCrunch that he has been studying how to apply diffusion models to text for a long time in his Stanford lab. His research was based on the idea that traditional LLMs are relatively slow compared to diffusion technology.   

With LLMs, “you cannot generate the second word until you’ve generated the first one, and you cannot generate the third one until you generate the first two,” Ermon said. 

Ermon was looking for a way to apply a diffusion approach to text because, unlike LLMs, which work sequentially, diffusion models start with a rough estimate of the data they’re generating (e.g., a picture) and then bring the data into focus all at once.

Ermon hypothesized that generating and modifying large blocks of text in parallel was possible with diffusion models. After years of trying, Ermon and a student of his achieved a major breakthrough, which they detailed in a research paper published last year.
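To make the sequential-versus-parallel contrast concrete, here is a toy Python sketch. It is not Inception's method, which the company has not described here. A random word choice stands in for a real model call, and the function names, the tiny vocabulary, and the "unmask a few positions per step" schedule are all illustrative assumptions. The only thing it shows is how many model calls each style needs to produce the same number of tokens.

```python
import random

# Hypothetical toy vocabulary and mask symbol, for illustration only.
VOCAB = ["the", "cat", "sat", "on", "a", "mat"]
MASK = "<mask>"


def autoregressive_generate(length, rng):
    """Sequential style: one (stand-in) model call per token.

    Token i cannot be produced until tokens 0..i-1 exist.
    """
    tokens, calls = [], 0
    for _ in range(length):
        # Stand-in for a model call conditioned on everything in `tokens`.
        tokens.append(rng.choice(VOCAB))
        calls += 1
    return tokens, calls


def diffusion_style_generate(length, steps, rng):
    """Parallel style: start fully masked, refine many positions per call.

    Each step is one (stand-in) model call that fills in a share of the
    still-masked positions, so the number of calls is bounded by `steps`,
    not by `length`.
    """
    tokens = [MASK] * length
    calls = 0
    for step in range(steps):
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break
        # Spread the remaining masked positions over the remaining steps.
        remaining_steps = steps - step
        k = -(-len(masked) // remaining_steps)  # ceiling division
        for i in rng.sample(masked, k):
            tokens[i] = rng.choice(VOCAB)
        calls += 1
    return tokens, calls


if __name__ == "__main__":
    rng = random.Random(0)
    _, ar_calls = autoregressive_generate(16, rng)
    _, diff_calls = diffusion_style_generate(16, 4, rng)
    print(f"sequential calls: {ar_calls}, parallel-refinement calls: {diff_calls}")
```

With 16 tokens and 4 refinement steps, the sequential loop makes 16 stand-in calls and the parallel loop makes 4. Whether that translates into real-world speedups depends on the cost and quality of each step, which this sketch does not model.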

Recognizing the advancement’s potential, Ermon founded Inception last summer, tapping two former students, UCLA professor Aditya Grover and Cornell professor Volodymyr Kuleshov, to co-lead the company. 

While Ermon declined to discuss Inception’s funding, TechCrunch understands that the Mayfield Fund has invested.

Inception has already secured several customers, including unnamed Fortune 100 companies, by addressing their critical need for reduced AI latency and increased speed, Ermon said.

“What we found is that our models can leverage the GPUs much more efficiently,” Ermon said, referring to the computer chips commonly used to run models in production. “I think this is a big deal. This is going to change the way people build language models.”

Inception offers an API as well as on-premises and edge device deployment options, support for model fine-tuning, and a suite of out-of-the-box DLMs for various use cases. The company claims its DLMs can run up to 10x faster than traditional LLMs while costing 10x less.

“Our ‘small’ coding model is as good as [OpenAI’s] GPT-4o mini while more than 10 times as fast,” a company spokesperson told TechCrunch. “Our ‘mini’ model outperforms small open-source models like [Meta’s] Llama 3.1 8B and achieves more than 1,000 tokens per second.”

“Tokens” is industry parlance for bits of raw data. One thousand tokens per second is an impressive speed indeed, assuming Inception’s claims hold up.




Lisa Holden
Lisa Holden is a news writer for LinkDaddy News. She writes about health, sport, tech, and more. Some of her favorite topics include the latest trends in fitness and wellness, the best ways to use technology to improve your life, and the latest developments in medical research.
