Nvidia’s new tool lets you run GenAI models on a PC



Nvidia, ever keen to incentivize purchases of its latest GPUs, is releasing a tool that lets owners of GeForce RTX 30 Series and 40 Series cards run an AI-powered chatbot offline on a Windows PC.

Called Chat with RTX, the tool allows users to customize a GenAI model along the lines of OpenAI’s ChatGPT by connecting it to documents, files and notes that it can then query.

“Rather than searching through notes or saved content, users can simply type queries,” Nvidia writes in a blog post. “For example, one could ask, ‘What was the restaurant my partner recommended while in Las Vegas?’ and Chat with RTX will scan local files the user points it to and provide the answer with context.”

Chat with RTX defaults to AI startup Mistral’s open source model but supports other text-based models including Meta’s Llama 2. Nvidia warns that downloading all the necessary files will eat up a fair amount of storage — 50GB to 100GB, depending on the model(s) selected.

Currently, Chat with RTX works with text, PDF, .doc/.docx and .xml formats. Pointing the app at a folder containing any supported files loads them into the model's local data set. In addition, Chat with RTX can take the URL of a YouTube playlist and load transcriptions of the playlist's videos, enabling the selected model to answer questions about their contents.
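Conceptually, that ingestion step amounts to a filtered folder scan: only files with supported extensions make it into the local data set. A minimal sketch of the idea in Python (this is an illustration, not Nvidia's actual code; the function name `collect_documents` is made up for this example):

```python
from pathlib import Path

# File types Chat with RTX can ingest, per Nvidia's documentation
SUPPORTED = {".txt", ".pdf", ".doc", ".docx", ".xml"}

def collect_documents(folder: str) -> list[Path]:
    """Recursively gather every supported file under `folder`.

    Mirrors, conceptually, what happens when you point Chat with RTX
    at a directory: unsupported files are simply skipped.
    """
    root = Path(folder)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED
    )
```

In the real app this list of files would then be chunked and indexed so the model can search it at question time; the sketch only covers the filtering step.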

Now, there are certain limitations to keep in mind, which Nvidia, to its credit, outlines in a how-to guide.

Image: Chat with RTX. Image Credits: Nvidia

Chat with RTX can’t remember context, meaning that the app won’t take previous questions into account when answering follow-ups. For example, if you ask “What’s a common bird in North America?” and follow that up with “What are its colors?,” Chat with RTX won’t know that you’re talking about birds.

Nvidia also acknowledges that the relevance of the app’s responses can be affected by a range of factors, some easier to control for than others, including the phrasing of the question, the performance of the selected model and the size of the data set. Asking for facts covered in a couple of documents is likely to yield better results than asking for a summary of a document or set of documents. And response quality generally improves with larger data sets, and with pointing Chat with RTX at more content about a specific subject, Nvidia says.

So Chat with RTX is more a toy than anything to be used in production. Still, there’s something to be said for apps that make it easier to run AI models locally — which is something of a growing trend.

In a recent report, the World Economic Forum predicted “dramatic” growth in affordable devices that can run GenAI models offline, including PCs, smartphones, internet-of-things devices and networking equipment. The reasons, the WEF said, are the clear benefits: not only are offline models inherently more private, since the data they process never leaves the device they run on, but they also offer lower latency and are more cost-effective than cloud-hosted models.

Of course, democratizing tools to run and train models opens the door to malicious actors; a cursory Google search yields many listings for models fine-tuned on toxic content from unscrupulous corners of the web. But proponents of apps like Chat with RTX argue that the benefits outweigh the harms. We’ll have to wait and see.


Lisa Holden
Lisa Holden is a news writer for LinkDaddy News. She covers health, sports, tech and more. Some of her favorite topics include the latest trends in fitness and wellness, the best ways to use technology to improve your life, and the latest developments in medical research.
