Exclusive: Google’s Gemini is forcing contractors to rate AI responses outside their expertise


Generative AI may look like magic, but behind the development of these systems are armies of employees at companies like Google, OpenAI, and others, known as “prompt engineers” and analysts, who rate the accuracy of chatbots’ outputs to improve their AI.

But a new internal guideline passed down from Google to contractors working on Gemini, seen by TechCrunch, has raised concerns that Gemini could become more prone to giving regular people inaccurate information on highly sensitive topics, like healthcare.

To improve Gemini, contractors working with GlobalLogic, an outsourcing firm owned by Hitachi, are routinely asked to evaluate AI-generated responses according to factors like “truthfulness.”

Until recently, these contractors were able to “skip” certain prompts, and thus opt out of evaluating the AI-written responses to those prompts, if the prompt was far outside their domain expertise. For example, a contractor with no scientific background could skip a prompt that asked a niche question about cardiology.

But last week, GlobalLogic announced a change from Google that contractors are no longer allowed to skip such prompts, regardless of their own expertise.

Internal correspondence seen by TechCrunch shows that previously, the guidelines read: “If you do not have critical expertise (e.g. coding, math) to rate this prompt, please skip this task.”

But now the guidelines read: “You should not skip prompts that require specialized domain knowledge.” Instead, contractors are being told to “rate the parts of the prompt you understand” and include a note that they don’t have domain knowledge. 

This has led to direct concerns about Gemini’s accuracy on certain topics, as contractors are sometimes tasked with evaluating highly technical AI responses about issues like rare diseases that they have no background in.

“I thought the point of skipping was to increase accuracy by giving it to someone better?” one contractor noted in internal correspondence, seen by TechCrunch.

Contractors can now skip prompts in only two cases, the new guidelines show: if the prompts are “completely missing information” like the full prompt or response, or if they contain harmful content that requires special consent forms to evaluate.

Google did not respond to TechCrunch’s requests for comment by press time.



Lisa Holden
Lisa Holden is a news writer for LinkDaddy News. She writes health, sport, tech, and more. Some of her favorite topics include the latest trends in fitness and wellness, the best ways to use technology to improve your life, and the latest developments in medical research.
