Stable Diffusion 3 arrives to solidify early lead in AI imagery against Sora and Gemini

Stability AI has announced Stable Diffusion 3, the latest and most powerful version of the company’s image-generating AI model. While details are scant, it’s clearly an attempt to fend off the hype around recently announced competitors from OpenAI and Google.

We’ll have a more technical breakdown of all this soon, but for now you should know that Stable Diffusion 3 (SD3) is based on a new architecture and will work on a variety of hardware (though you’ll still need something beefy). It’s not out yet, but you can sign up for the waitlist.

SD3 uses an updated “diffusion transformer,” a technique pioneered in 2022, revised in 2023 and only now reaching scalability. Sora, OpenAI’s impressive video generator, apparently works on similar principles (Will Peebles, co-author of the original diffusion transformer paper, went on to co-lead the Sora project). SD3 also employs “flow matching,” another new technique that similarly improves quality without adding too much overhead.
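For a rough sense of what flow matching means in practice, here is a minimal sketch of one common formulation, in which a model learns to predict the constant velocity along a straight line from a noise sample to a data sample. Stability AI has not said which variant SD3 uses, so the straight-line path, the function names and the toy predictor below are all assumptions made for illustration, not the company’s method.

```python
def flow_matching_pair(noise, data, t):
    """Build one training example for a straight-line path (an assumed variant).

    Returns the point x_t that lies a fraction t of the way from noise to data,
    and the velocity the model would be asked to predict at that point.
    """
    x_t = [(1.0 - t) * n + t * d for n, d in zip(noise, data)]
    target_velocity = [d - n for n, d in zip(noise, data)]
    return x_t, target_velocity


def flow_matching_loss(predict_velocity, noise, data, t):
    """Mean squared error between predicted and target velocity."""
    x_t, target = flow_matching_pair(noise, data, t)
    predicted = predict_velocity(x_t, t)
    return sum((p - q) ** 2 for p, q in zip(predicted, target)) / len(target)


# Hypothetical stand-in for a trained network: it always predicts zero velocity.
def zero_predictor(x_t, t):
    return [0.0 for _ in x_t]


noise = [0.0, 0.0]  # in real training this would be sampled at random
data = [2.0, 4.0]   # a two-number stand-in for an image
print(flow_matching_pair(noise, data, 0.5))
print(flow_matching_loss(zero_predictor, noise, data, 0.5))
```

In a real system the predictor would be a large neural network and the loss would be averaged over many random noise samples, images and values of t. Generating an image then amounts to starting from noise and following the predicted velocities step by step.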

The model suite ranges from 800 million parameters (less than the commonly used SD 1.5) to 8 billion parameters (more than SD XL), with the intent of running on a variety of hardware. You’ll probably still want a serious GPU and a setup intended for machine learning work, but you aren’t limited to an API like you generally are with OpenAI and Google models. (Anthropic, for its part, has not focused on image or video generation publicly, so it isn’t really part of this conversation.)
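To see why the parameter count matters for hardware, a back-of-the-envelope calculation helps. The sketch below assumes 16-bit weights (2 bytes per parameter) and counts only the weights of the image model itself; text encoders, activations and framework overhead would add to the total, and the helper name is ours, not Stability’s.

```python
def weight_memory_gib(num_params, bytes_per_param=2):
    """Approximate memory for model weights alone, in GiB.

    bytes_per_param=2 assumes 16-bit weights; use 4 for 32-bit.
    """
    return num_params * bytes_per_param / 2**30


# Parameter counts from the announcement; the labels are informal.
for label, num_params in [("smallest SD3", 800_000_000),
                          ("largest SD3", 8_000_000_000)]:
    print(f"{label}: about {weight_memory_gib(num_params):.1f} GiB of weights")
```

Under those assumptions the smallest model’s weights come to about 1.5 GiB and the largest to about 14.9 GiB, which is the difference between fitting comfortably on a modest consumer GPU and needing a high-end one.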

On X, formerly Twitter, Stability AI boss Emad Mostaque notes that the new model is capable of multimodal understanding, as well as video input and generation, all things that his rivals have emphasized in their API-driven offerings. Those capabilities are still theoretical, but it sounds like there is no technical barrier to them being included in future releases.

It’s impossible to compare these models, of course, since none has really been released and all we have to go on are competing claims and cherry-picked examples. But Stable Diffusion has one definite advantage: its presence in the zeitgeist as the go-to model for doing any kind of image generation anywhere, with few intrinsic limitations in method or content. (Indeed, SD3 will almost surely usher in a new era of AI-generated porn, once people get past the safety mechanisms.)

Stability AI seems to want Stable Diffusion to be the white-label generative AI that you can’t do without, rather than the boutique generative AI you aren’t sure you need. To that end, the company is upgrading its tooling as well, to lower the bar for use, though as with the rest of the announcement, these improvements are left to the imagination.

Interestingly, the company has put safety front and center in its announcement, stating:

“We have taken and continue to take reasonable steps to prevent the misuse of Stable Diffusion 3 by bad actors. Safety starts when we begin training our model and continues throughout the testing, evaluation, and deployment. In preparation for this early preview, we’ve introduced numerous safeguards. By continually collaborating with researchers, experts, and our community, we expect to innovate further with integrity as we approach the model’s public release.”

What exactly are these safeguards? No doubt the preview will delineate them somewhat, and then the public release will be further refined, or censored depending on your perspective on these things. We’ll know more soon, and in the meantime will be diving into the technical side of things to better understand the theory and methods behind this new generation of models.
