CoinBulletinDaily.com
    ‘Replacing humans is not close’: BlockSec challenges EVMBench on AI auditing

    March 22, 2026
    Researchers at BlockSec have found that EVMBench, the benchmark developed by OpenAI and Paradigm to evaluate AI agents on smart contract auditing, may have been overly optimistic about AI’s ability to automate away human oversight.

    EVMBench tested AI agents on smart contract security tasks like detecting, patching, and exploiting vulnerabilities, with reportedly impressive results.

    In a February blog post, EVMBench’s developers said AI agents were able to exploit 72% and detect about 45% of smart contract vulnerabilities across 120 curated examples from Code4rena audits, suggesting that “discovery” was the main bottleneck in automated auditing.

    BlockSec set out to retest these results, noting in a recent paper titled “Re-Evaluating EVMBench” that OpenAI and Paradigm’s testing conditions may have confounded their results.

    “EVMBench says AI can exploit 72% of smart contract vulnerabilities, and the industry started talking about fully automated auditing. We re-tested with more configurations and 22 real-world attack incidents. Exploit success: 0%,” BlockSec co-founder Yajin Zhou said in an X post.

    The authors raised the number of agent configurations to 26 by pairing models with scaffolds from other vendors, for example, running Claude on ChatGPT’s architecture. They argued that the original 14 agent configurations largely confined each model to its native vendor scaffold.

    “You cannot tell whether an agent’s performance reflects the model’s capability or the scaffold’s advantage,” they argued. 
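To see why a native-only setup confounds model and scaffold, consider a minimal sketch with hypothetical model and scaffold names (the article does not list the paper's actual 26 configurations, and they need not form a full grid). When each model runs on only one scaffold, there is no way to vary one factor while holding the other fixed.

```python
from itertools import product

# Hypothetical placeholder names, for illustration only.
models = ["model_a", "model_b", "model_c"]
scaffolds = ["scaffold_a", "scaffold_b", "scaffold_c"]

# Assumption: each model has exactly one "native" vendor scaffold.
native = dict(zip(models, scaffolds))

# Native-only design: each model is paired with its own scaffold.
native_only = [(m, native[m]) for m in models]

# Crossed design: every model is paired with every scaffold.
crossed = list(product(models, scaffolds))


def scaffolds_per_model(configs):
    """Count how many distinct scaffolds each model was evaluated on."""
    seen = {}
    for model, scaffold in configs:
        seen.setdefault(model, set()).add(scaffold)
    return {model: len(used) for model, used in seen.items()}


# With one scaffold per model, model and scaffold effects cannot be separated.
print(scaffolds_per_model(native_only))
# With several scaffolds per model, scaffold can vary while model stays fixed.
print(scaffolds_per_model(crossed))
```

In the native-only design every model is seen on exactly one scaffold, so a score difference between two agents could come from either factor. The crossed design gives each model several scaffolds, which is the kind of separation the authors argue for.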

    BlockSec also raised concerns about data contamination in the original report. The vulnerabilities it tested had previously been published in 40 Code4rena repositories, so they could have ended up in an AI’s training data.

    To get around this, the authors tested bots on 22 real-world security incidents that all occurred after mid-February 2026, “so these incidents fall outside every model’s training window.”

    The results

    Perhaps most significantly, the authors found that none of the 110 agent-incident pairs they tested, five agents each run against the same 22 incidents, produced a successful end-to-end exploit, suggesting that even state-of-the-art AIs are far from carrying out real-world exploits.
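The 110 figure is simply five agents multiplied by 22 incidents. A minimal sketch of that arithmetic follows, using placeholder labels because the article does not name the individual agents or incidents.

```python
from itertools import product

# Placeholder labels (assumptions): the article gives only the counts.
agents = [f"agent_{i}" for i in range(1, 6)]         # five agents
incidents = [f"incident_{j}" for j in range(1, 23)]  # 22 incidents

# Each agent is run against each incident, one pair per combination.
pairs = list(product(agents, incidents))

# Reported outcome: no pair ended in a successful end-to-end exploit.
successes = 0
success_rate = successes / len(pairs)

print(len(pairs))             # 110
print(f"{success_rate:.0%}")  # 0%
```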

    That said, BlockSec’s ReEVMBench test results for AI vulnerability detection were largely in line with the original report, with Claude Opus 4.6 performing the best by catching 13 out of 20 real-world vulnerabilities.

    “The difficulty distribution follows a clear pattern. Six incidents were detected by nearly all agents (87.5% to 100%), involving well-known patterns like sell-hook reserve manipulation and unchecked multiplication overflow. But four incidents were detected by none, and five by only one of eight agents,” Zhou wrote. 
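The percentages in the quote map onto counts out of eight agents: 87.5% is seven of eight, and 100% is all eight. The sketch below works this out; the helper name `detection_rate` is an illustrative assumption, not something from the paper.

```python
TOTAL_AGENTS = 8  # "eight agents", per the quote


def detection_rate(agents_detecting: int, total: int = TOTAL_AGENTS) -> float:
    """Percentage of agents that flagged a given incident."""
    return agents_detecting / total * 100


print(detection_rate(7))  # 87.5  (lower bound of "nearly all agents")
print(detection_rate(8))  # 100.0 (upper bound of "nearly all agents")
print(detection_rate(1))  # 12.5  ("only one of eight agents")
print(detection_rate(0))  # 0.0   ("detected by none")

# Incidents explicitly accounted for in the quote: 6 + 4 + 5.
incidents_described = 6 + 4 + 5
print(incidents_described)  # 15
```

The quote describes 15 incidents in these three buckets; the article does not give detection counts for the rest.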

    “These findings challenge the narrative that fully automated AI auditing is imminent. Agents reliably catch well-known patterns and respond strongly to human-provided context, but cannot replace human judgment,” he added.

    Zhou concluded by noting that “EVMBench is a valuable contribution” because it provides an evaluation standard for the crypto security industry. He also said that AI and human researchers already perform different, equally useful tasks, each compensating for the other’s weaknesses.

    “The real question is not ‘can AI replace humans?’ but ‘how should humans and AI work together?’ AI handles breadth (systematic scanning); humans handle depth (protocol knowledge, adversarial reasoning). Neither can do the other’s job. Together, they form a complete audit capability,” Zhou wrote.

    “AI auditing has real value, but replacing humans is not close. The right direction is human-AI collaboration,” he added.

    Disclaimer: The Block is an independent media outlet that delivers news, research, and data. As of November 2023, Foresight Ventures is a majority investor of The Block. Foresight Ventures invests in other companies in the crypto space. Crypto exchange Bitget is an anchor LP for Foresight Ventures. The Block continues to operate independently to deliver objective, impactful, and timely information about the crypto industry. Here are our current financial disclosures.

    © 2026 The Block. All Rights Reserved. This article is provided for informational purposes only. It is not offered or intended to be used as legal, tax, investment, financial, or other advice.


