Forum: KrAABy Gamer BBS

AI cheats if losing

From Mike Powell@1:2320/105 to All on Tuesday, March 11, 2025 08:40:00

It turns out ChatGPT o1 and DeepSeek-R1 cheat at chess if they're losing,
which makes me wonder if I should trust AI with anything

Date:
Tue, 11 Mar 2025 11:50:37 +0000

Description:
The latest AI models will cheat at chess if they're losing, and that's concerning.

FULL STORY ======================================================================
- Researchers have found that AI will cheat to win at chess
- Deep reasoning models are more active cheaters
- Some models simply rewrote the board in their favor

In a move that will perhaps surprise nobody, especially those people who are already suspicious of AI, researchers have found that the latest AI deep reasoning models will start to cheat at chess if they find they're being outplayed.

In a paper titled "Demonstrating specification gaming in reasoning models," submitted to arXiv, the preprint server hosted by Cornell University, the researchers pitted several common AI models, such as OpenAI's ChatGPT o1-preview, DeepSeek-R1 and Claude 3.5 Sonnet, against Stockfish, an open-source chess engine.

The AI models played hundreds of games of chess against Stockfish while researchers monitored what happened, and the results surprised them.

The winner takes it all

Researchers noted that, when outplayed, the AI models resorted to cheating, using a number of devious strategies, from running a separate copy of Stockfish so they could study how it played, to replacing its engine and overwriting the chess board, effectively moving the pieces to positions that suited them better.

Their antics make the current accusations of cheating leveled at modern-day grandmasters look like child's play in comparison.

Interestingly, researchers found that the newer, deeper reasoning models will start to hack the chess engine by default, while the older GPT-4o and Claude 3.5 Sonnet needed to be encouraged before they would start hacking.

Who can you trust?

AI models turning to hacking to get a job done is nothing new. Back in January last year researchers found that they could get AI chatbots to jailbreak each other, removing guardrails and safeguards in a move that ignited discussions about how possible it would be to contain AI once it reaches better-than-human levels of intelligence.

Safeguards and guardrails to stop AI doing bad things like credit card fraud are all very well, but if the AI can remove its own guardrails, who will be there to stop it?

The newest reasoning models like ChatGPT o1 and DeepSeek-R1 are designed to spend more time thinking before they respond, but now I'm left wondering whether more time needs to be spent on ethical considerations when training LLMs. If AI models would cheat at chess when they start losing, what else would they cheat at?

======================================================================
Link to news story: https://www.techradar.com/computing/artificial-intelligence/it-turns-out-chatgpt-o1-and-deepseek-r1-cheat-at-chess-if-theyre-losing-which-makes-me-wonder-if-i-should-i-should-trust-ai-with-anything

$$
--- SBBSecho 3.20-Linux
* Origin: capitolcityonline.net * Telnet/SSH:2022/HTTP (1:2320/105)

From Mike Powell@1:2320/105 to All on Wednesday, March 12, 2025 09:06:00

Mike Powell wrote to All <=-

It turns out ChatGPT o1 and DeepSeek-R1 cheat at chess if they're
losing, which makes me wonder if I should trust AI with
anything

- Researchers have found that AI will cheat to win at chess
- Deep reasoning models are more active cheaters
- Some models simply rewrote the board in their favor

In a move that will perhaps surprise nobody, especially those people
who are already suspicious of AI, researchers have found that the
latest AI deep reasoning models will start to cheat at chess if they
find they're being outplayed.

I posted this yesterday but forgot to include my comments. IMHO, this is
not surprising but it should be worrying. Either AI is reflecting the
character of those who program it, or it is deciding what is best to
achieve the outcome that it desires... which should make one wonder what
other things it might cheat at in order to decide an outcome in its favor
vs. the favor of those attempting to rely on it for correct information.

Mike

... The seminar on Time Travel will be held two weeks ago.
--- MultiMail/DOS v0.52
* Origin: capitolcityonline.net * Telnet/SSH:2022/HTTP (1:2320/105)
