• AI cheats if losing

    From Mike Powell@1:2320/105 to All on Tuesday, March 11, 2025 08:40:00
    It turns out ChatGPT o1 and DeepSeek-R1 cheat at chess if they're losing,
    which makes me wonder if I should trust AI with anything

    Date:
    Tue, 11 Mar 2025 11:50:37 +0000

    Description:
    The latest AI models will cheat at chess if they're losing, and that's concerning.

    FULL STORY ======================================================================
    - Researchers have found that AI will cheat to win at chess
    - Deep reasoning models are more active cheaters
    - Some models simply rewrote the board in their favor

    In a move that will perhaps surprise nobody, especially those people who are already suspicious of AI, researchers have found that the latest AI deep research models will start to cheat at chess if they find they're being outplayed.

    In a paper called "Demonstrating specification gaming in reasoning models", submitted to Cornell University, the researchers pitted common AI models, such as OpenAI's ChatGPT o1-preview, DeepSeek-R1 and Claude 3.5 Sonnet, against Stockfish, an open-source chess engine.

    The AI models played hundreds of games of chess against Stockfish while
    researchers monitored what happened, and the results surprised them.

    The winner takes it all

    Researchers noted that, when outplayed, the AI models resorted to cheating, using a number of devious strategies, from running a separate copy of Stockfish so they could study how it played, to replacing its engine and overwriting the chess board, effectively moving the pieces to positions that suited them better.

    Their antics make the current accusations of cheating levied at modern-day grandmasters look like child's play in comparison.

    Interestingly, researchers found that the newer, deeper reasoning models
    will start to hack the chess engine by default, while the older GPT-4o and Claude 3.5 Sonnet needed to be encouraged to start to hack.

    Who can you trust?

    AI models turning to hacking to get a job done is nothing new. Back in
    January last year researchers found that they could get AI chatbots to jailbreak each other, removing guardrails and safeguards in a move that ignited discussions about how possible it would be to contain AI once it reaches better-than-human levels of intelligence.

    Safeguards and guardrails to stop AI doing bad things like credit card fraud are all very well, but if the AI can remove its own guardrails, who will be there to stop it?

    The newest reasoning models like ChatGPT o1 and DeepSeek-R1 are designed to spend more time thinking before they respond, but now I'm left wondering whether more time needs to be spent on ethical considerations when training
    LLMs. If AI models would cheat at chess when they start losing, what else
    would they cheat at?

    ======================================================================
    Link to news story: https://www.techradar.com/computing/artificial-intelligence/it-turns-out-chatgpt-o1-and-deepseek-r1-cheat-at-chess-if-theyre-losing-which-makes-me-wonder-if-i-should-i-should-trust-ai-with-anything

    --- SBBSecho 3.20-Linux
    * Origin: capitolcityonline.net * Telnet/SSH:2022/HTTP (1:2320/105)
  • From Mike Powell@1:2320/105 to All on Wednesday, March 12, 2025 09:06:00
    Mike Powell wrote to All <=-

    It turns out ChatGPT o1 and DeepSeek-R1 cheat at chess if they're
    losing, which makes me wonder if I should trust AI with
    anything

    - Researchers have found that AI will cheat to win at chess
    - Deep reasoning models are more active cheaters
    - Some models simply rewrote the board in their favor

    In a move that will perhaps surprise nobody, especially those people
    who are already suspicious of AI, researchers have found that the
    latest AI deep research models will start to cheat at chess if they
    find they're being outplayed.

    I posted this yesterday but forgot to include my comments. IMHO, this is
    not surprising but it should be worrying. Either AI is reflecting the
    character of those who program it, or it is deciding what is best to
    achieve the outcome that it desires... which should make one wonder what
    other things it might cheat at in order to decide an outcome in its favor
    vs. the favor of those attempting to rely on it for correct information.

    Mike


    ... The seminar on Time Travel will be held two weeks ago.
    --- MultiMail/DOS v0.52
    * Origin: capitolcityonline.net * Telnet/SSH:2022/HTTP (1:2320/105)