When AI is trained for treachery, it becomes the perfect agent

We’re blind to malicious AI until it hits. We can still open our eyes to stopping it

Opinion  Last year, The Register reported on AI sleeper agents. A major academic study explored how to train an LLM to hide destructive behavior from its users, and how to find it before it triggered. The answers were unambiguously asymmetric: the first is easy, the second very difficult. Not what anyone wanted to hear. [...]