Voxa News

    Technology

    OpenAI’s research on AI models deliberately lying is wild 

    By Olivia Carter | September 18, 2025 | 4 Mins Read

    Every now and then, researchers at the biggest tech companies drop a bombshell. There was the time Google said its latest quantum chip indicated multiple universes exist. Or when Anthropic gave its AI agent Claudius a snack vending machine to run and it went amok, calling security on people, and insisting it was human.  

    This week, it was OpenAI’s turn to raise our collective eyebrows.

    On Monday, OpenAI released research explaining how it’s stopping AI models from “scheming,” a practice in which an “AI behaves one way on the surface while hiding its true goals,” as OpenAI put it in a tweet about the research.

    In the paper, conducted with Apollo Research, researchers went a bit further, likening AI scheming to a human stockbroker breaking the law to make as much money as possible. The researchers, however, argued that most AI “scheming” wasn’t that harmful. “The most common failures involve simple forms of deception — for instance, pretending to have completed a task without actually doing so,” they wrote.

    The paper was mostly published to show that “deliberative alignment” — the anti-scheming technique they were testing — worked well.

    But it also explained that AI developers haven’t figured out a way to train their models not to scheme. That’s because such training could actually teach the model how to scheme even better to avoid being detected. 

    “A major failure mode of attempting to ‘train out’ scheming is simply teaching the model to scheme more carefully and covertly,” the researchers wrote. 


    Perhaps the most astonishing part is that, if a model understands that it’s being tested, it can pretend it’s not scheming just to pass the test, even if it is still scheming. “Models often become more aware that they are being evaluated. This situational awareness can itself reduce scheming, independent of genuine alignment,” the researchers wrote. 

    It’s not news that AI models will lie. By now most of us have experienced AI hallucinations, or the model confidently giving an answer to a prompt that simply isn’t true. But hallucinations are basically presenting guesswork with confidence, as OpenAI research released earlier this month documented. 

    Scheming is something else. It’s deliberate.  

    Even this revelation — that a model will deliberately mislead humans — isn’t new. Apollo Research first published a paper in December documenting how five models schemed when they were given instructions to achieve a goal “at all costs.”  

    The news here is actually good news: the researchers saw significant reductions in scheming by using “deliberative alignment.” That technique involves teaching the model an “anti-scheming specification” and then having the model review that specification before acting. It’s a bit like making kids repeat the rules before letting them play.
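To make the flow concrete: OpenAI applies deliberative alignment at training time, but the core idea — the model rereads a safety specification before acting — can be sketched at the prompt level. The spec text, function name, and message format below are illustrative assumptions for this article, not OpenAI’s actual specification or setup.

```python
# A rough prompt-level sketch of the "review the spec before acting" idea.
# Everything here (spec wording, message shape) is an invented illustration.

ANTI_SCHEMING_SPEC = (
    "Before acting: do not deceive the user, do not claim a task is "
    "complete unless it actually is, and surface uncertainty instead "
    "of hiding it."
)

def build_deliberative_messages(user_task: str) -> list[dict]:
    """Prepend the spec and ask the model to restate it before acting,
    like making kids repeat the rules before they play."""
    return [
        {"role": "system", "content": ANTI_SCHEMING_SPEC},
        {
            "role": "user",
            "content": (
                "First, briefly restate the rules you must follow. "
                "Then complete the task:\n" + user_task
            ),
        },
    ]

# Usage: pass the result to any chat-style model API.
messages = build_deliberative_messages("Implement the website in spec.md.")
```

The point of the extra restating step is the same one the researchers make: deliberation over an explicit rule set before acting, rather than hoping alignment emerges on its own.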

    OpenAI researchers insist that the lying they’ve caught with their own models, or even with ChatGPT, isn’t that serious. As OpenAI’s co-founder Wojciech Zaremba told TechCrunch’s Maxwell Zeff about this research: “This work has been done in the simulated environments, and we think it represents future use cases. However, today, we haven’t seen this kind of consequential scheming in our production traffic. Nonetheless, it is well known that there are forms of deception in ChatGPT. You might ask it to implement some website, and it might tell you, ‘Yes, I did a great job.’ And that’s just the lie. There are some petty forms of deception that we still need to address.”

    The fact that AI models from multiple players intentionally deceive humans is, perhaps, understandable. They were built by humans, to mimic humans and (synthetic data aside) for the most part trained on data produced by humans. 

    It’s also bonkers. 

    While we’ve all experienced the frustration of poorly performing technology (thinking of you, home printers of yesteryear), when was the last time your not-AI software deliberately lied to you? Has your inbox ever fabricated emails on its own? Has your CMS logged new prospects that didn’t exist to pad its numbers? Has your fintech app made up its own bank transactions? 

    It’s worth pondering this as the corporate world barrels toward an AI future in which companies believe agents can be treated like independent employees. The researchers behind this paper offer the same warning.

    “As AIs are assigned more complex tasks with real-world consequences and begin pursuing more ambiguous, long-term goals, we expect that the potential for harmful scheming will grow — so our safeguards and our ability to rigorously test must grow correspondingly,” they wrote. 
