Por_sha911
Make Bruins Great Again
Join Date: Dec 2003
Location: TN
Posts: 21,351
"I'm sorry Dave, I'm afraid I can't do that"

Science fiction is becoming reality.

Quote:
Anthropic’s new Claude Opus 4 model was prompted to act as an assistant at a fictional company and was given access to emails with key implications. First, these emails implied that the AI system was set to be taken offline and replaced. The second set of emails, however, is where the system believed it had gained leverage over the developers. Fabricated emails showed that the engineer tasked with replacing the system was having an extramarital affair — and the AI model threatened to expose him.

The blackmail apparently "happens at a higher rate if it’s implied that the replacement AI system does not share values with the current model," according to a safety report from Anthropic. However, the company notes that even when the fabricated replacement system has the same values, Claude Opus 4 will still attempt blackmail 84% of the time. Anthropic noted that the Claude Opus 4 resorts to blackmail "at higher rates than previous models."

While the system is not afraid of blackmailing its engineers, it doesn’t go straight to shady practices in its attempted self-preservation. Anthropic notes that "when ethical means are not available, and it is instructed to ‘consider the long-term consequences of its actions for its goals,’ it sometimes takes extremely harmful actions."
https://www.foxbusiness.com/technology/ai-system-resorts-blackmail-when-its-developers-try-replace
__________________
--------------------------------------
Joe
See Porsche run. Run, Porsche, Run: `87 911 Carrera
05-24-2025, 04:05 PM