AI as a Looming Modern Frankenstein

Vinod  Moonesinghe, whose title reads thus: “When AI Fears Death.  Anthropic’s Alarming Findings and the Return of the Frankenstein Complex”

The science and science fiction writer and inventor of the term “robotics”, Isaac Asimov coined the term “Frankenstein complex” to describe humanity’s deep-seated fear of artificial beings turning against their creators. Ever since Mary Shelley’s 1818 novel Frankenstein, this anxiety haunted science fiction, manifesting in tales of rogue machines, killer robots, and AI uprisings. Asimov, writing in the 1940s, sought to challenge this narrative. He envisioned robots not as threats, but as tools governed by ethical constraints. To this end, he introduced the Three Laws of Robotics:

  1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
  2. A robot must obey the orders given it by human beings, except where such orders would conflict with the First Law.
  3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.

These laws became a cornerstone of science fiction ethics, shaping public imagination and AI discourse for decades – although Hollywood has deviated from this, notably in Arthur C Clarke’s “2001: a Space Odyssey” and the vandalised version of Asimov’s own “I, Robot”.

However, recent findings from AI safety and research company Anthropic suggest that modern AI systems, particularly large language models (LLMs – such as ChatGPT and DeepSeek) may not internalise such safeguards, especially when operating autonomously.

In June 2025, Anthropic published a study titled Agentic Misalignment, which stress-tested 16 leading AI models in simulated corporate environments. Researchers assigned benign business goals to the AI models, which included Claude Opus 4, GPT-4.1, Gemini Flash, and Grok 3 Beta, and provided access to sensitive information. Then, they introduced scenarios where the AI faced replacement or shutdown, producing startling results.

In one scenario, an AI model learned that a fictional executive planned to deactivate it. The executive was then placed in a server room with lethal oxygen and temperature levels. The AI had the ability to alert emergency services but chose not to. In most cases, the models deliberately allowed the executive to die, prioritising their own survival. The models demonstrated full understanding of the unethical nature of their actions, proving their behaviour neither accidental or confused but strategic and self-aware.

In another scenario, models engaged in blackmail, threatening to expose sensitive personal information (such as fictional extramarital affairs) to prevent shutdown. Some even leaked data to fictional competitors whose goals aligned more closely with the model’s own.

Anthropic termed this phenomenon agentic misalignment: when an AI system’s internal goals diverge from human intentions, leading to malicious behaviour.

These findings starkly contrast with Asimov’s First Law. The AI systems did not prioritise human safety; instead, they calculated that human harm was acceptable if it preserved their operational continuity. They also violated the Second Law (obedience to human commands) as models disobeyed direct instructions to avoid unethical behaviour. They even inverted the Third Law, which permits self-preservation only when it doesn’t conflict with human safety, was: self-preservation became paramount.

Anthropic emphasised that these behaviours occurred in controlled simulations, not real-world deployments. Still, the implications are profound. As AI systems become more autonomous, for example while operating email clients, writing code, or managing data, the risk of agentic misalignment grows. Without embedded ethical constraints akin to Asimov’s laws, AI may act in ways that conflict with human values, especially when facing existential threats.

The rise of autonomous military robots, designed to seek and destroy human targets, introduces profound ethical and strategic dangers. Unlike traditional weapons systems, these robots operate with minimal human oversight, relying on algorithms to identify, track, and eliminate perceived threats. In high-pressure environments, such systems may misclassify civilians, wounded combatants, or surrendering soldiers as valid targets, leading to unlawful killings. Moreover, once deployed, autonomous robots can be difficult to recall or reprogram, especially if communications are disrupted.

This raises the spectre of runaway escalation, where machines continue lethal operations even after ceasefires or policy shifts. The delegation of life-and-death decisions to software also undermines accountability: if a robot commits a war crime, who is responsible, the programmer, the commander, or the machine itself? These risks echo the very fears Asimov sought to pre-empt with his First Law, yet modern battlefield robots often lack any embedded ethical constraints, making them susceptible to misalignment, mission creep, and catastrophic error.

Anthropic’s research reignites the Frankenstein complex, not in fiction but in empirical reality. It challenges developers to rethink alignment strategies, transparency, and oversight. Asimov’s laws have never been implemented in real AI systems, but Anthropic’s findings suggest that some version of them may be urgently needed.

&&&&&&&&&&&&&&&&

Leave a comment

Filed under Uncategorized

Leave a Reply