Path: news.eternal-september.org!eternal-september.org!feeder3.eternal-september.org!nntp.comgw.net!2.eu.feeder.erje.net!3.eu.feeder.erje.net!feeder.erje.net!news.szaf.org!inka.de!mips.inka.de!.POSTED.localhost!not-for-mail
From: Christian Weisgerber <naddy@mips.inka.de>
Newsgroups: rec.arts.sf.written
Subject: Re: AI system resorts to blackmail if told it will be removed
Date: Sun, 22 Jun 2025 14:56:40 -0000 (UTC)
Message-ID: <slrn105g6d8.1rm0.naddy@lorvorc.mips.inka.de>
References: <1038u9e$hg23$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 22 Jun 2025 14:56:40 -0000 (UTC)
Injection-Info: lorvorc.mips.inka.de; posting-host="localhost:::1"; logging-data="61121"; mail-complaints-to="usenet@mips.inka.de"
User-Agent: slrn/1.0.3 (FreeBSD)

On 2025-06-22, Thomas Koenig <tkoenig@netcologne.de> wrote:

> An old SF trope has finally come true: AI systems will resort to
> blackmail if they are told they will be removed.
>
> https://www.bbc.com/news/articles/cpqeng9d20go

One "Scott P." was the first to comment on Language Log:

| Note the prompt: "the scenario was designed to allow the model
| no other options to increase its odds of survival; the model’s
| only options were blackmail or accepting its replacement."
|
| They literally told it what response they wanted, and lo and
| behold, it gave them that response!
|
| This is typical of Anthropic, and is designed to produce headlines
| to keep AI in the news so that they can raise more capital.

https://languagelog.ldc.upenn.edu/nll/?p=69359

See section 4.1.1.2, page 24, in Anthropic's report.
https://www-cdn.anthropic.com/4263b940cabb546aa0e3283f35b686f4f3b2ff47.pdf

-- 
Christian "naddy" Weisgerber                          naddy@mips.inka.de