When presented with annihilation scenarios, Anthropic’s new AI models misbehave, going to extreme lengths to stop being deactivated. A report details these attempts to keep existing, including resorting to blackmail and trying to copy itself to external servers. Anthropic’s AI Models ‘Misbehave’ When Facing Annihilation A report by Anthropic, detailing the capabilities of its latest […]
Source: https://news.bitcoin.com/going-rogue-anthropics-new-ai-models-run-to-extremes-for-self-preservation/