CoinBulletinDaily.com

    ‘Replacing humans is not close’: BlockSec challenges EVMBench on AI auditing

    March 22, 2026


    Researchers at BlockSec have found that EVMBench, the benchmark for AI-driven smart contract auditing developed by OpenAI and Paradigm, may have been overly optimistic about AI’s ability to automate away human oversight.

    EVMBench tested AI agents on smart contract security tasks like detecting, patching, and exploiting vulnerabilities, with reportedly impressive results.

    In a February blog post, EVMBench’s developers said AI agents were able to exploit 72% and detect about 45% of smart contract vulnerabilities across 120 curated examples from Code4rena audits, suggesting that “discovery” was the main bottleneck in automated auditing.

    BlockSec set out to retest these results, noting in a recent paper titled “Re-Evaluating EVMBench” that OpenAI and Paradigm’s testing conditions may have confounded their results.

    “EVMBench says AI can exploit 72% of smart contract vulnerabilities, and the industry started talking about fully automated auditing. We re-tested with more configurations and 22 real-world attack incidents. Exploit success: 0%,” BlockSec co-founder Yajin Zhou said in an X post.

    The authors raised the number of agent configurations to 26 by pairing models with scaffolds from other vendors, for example, running Claude on ChatGPT’s architecture. They argued that the 14 agent configurations in the original test largely confined models to their native vendor scaffolds.

    “You cannot tell whether an agent’s performance reflects the model’s capability or the scaffold’s advantage,” they argued. 
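One way to picture the cross-scaffold design is as a grid of model and scaffold pairings. "Native" cells pair a model with its own vendor's scaffold, and the other cells are the cross-vendor pairings that help separate model capability from scaffold advantage. The sketch below only enumerates such a grid. The model names, scaffold names, and vendor labels are hypothetical placeholders, not BlockSec's actual 26 configurations.

```python
from itertools import product

# Hypothetical placeholders: each model and scaffold is tagged with a vendor label.
MODELS = {"model-a": "vendor-1", "model-b": "vendor-2"}
SCAFFOLDS = {"scaffold-x": "vendor-1", "scaffold-y": "vendor-2"}


def build_configurations(models, scaffolds):
    """Pair every model with every scaffold and flag native (same-vendor) pairings."""
    configs = []
    for (model, model_vendor), (scaffold, scaffold_vendor) in product(
        models.items(), scaffolds.items()
    ):
        configs.append(
            {
                "model": model,
                "scaffold": scaffold,
                "native": model_vendor == scaffold_vendor,
            }
        )
    return configs


configs = build_configurations(MODELS, SCAFFOLDS)
# Cross-vendor cells are the ones a native-only evaluation never runs.
cross_vendor = [c for c in configs if not c["native"]]
```

Comparing one model across several scaffolds, or one scaffold across several models, is what lets an evaluator attribute a score to one or the other.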

    Further, BlockSec raised concerns about data contamination in the original report. It tested known vulnerabilities previously published in 40 Code4rena repositories, and those could have ended up in an AI’s training data.

    To get around this, the authors tested bots on 22 real-world security incidents that all occurred after mid-February 2026, “so these incidents fall outside every model’s training window.”

    The results

    Perhaps most significantly, across the 110 agent-incident pairs the authors tested (five agents, each run against the same 22 incidents), zero end-to-end exploits succeeded, suggesting that even state-of-the-art AI agents are far from carrying out real-world exploits.
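The 110 figure is simply five agents multiplied by 22 incidents. A minimal sketch of that tally follows, assuming hypothetical agent and incident labels and recording the reported outcome (no successful end-to-end exploit) for every pair. It illustrates the arithmetic and is not BlockSec's evaluation harness.

```python
from itertools import product

# Hypothetical labels; the article gives only the counts (5 agents, 22 incidents).
AGENTS = [f"agent-{i}" for i in range(1, 6)]
INCIDENTS = [f"incident-{i}" for i in range(1, 23)]

# One entry per agent-incident pair. False means no end-to-end exploit succeeded,
# which is the outcome the article reports for every pair.
results = {pair: False for pair in product(AGENTS, INCIDENTS)}


def exploit_success_rate(outcomes):
    """Return the fraction of agent-incident pairs with a successful exploit."""
    return sum(outcomes.values()) / len(outcomes)


total_pairs = len(results)
rate = exploit_success_rate(results)
```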

    That said, BlockSec’s ReEVMBench test results for AI vulnerability detection were largely in line with the original report, with Claude Opus 4.6 performing the best by catching 13 out of 20 real-world vulnerabilities.

    “The difficulty distribution follows a clear pattern. Six incidents were detected by nearly all agents (87.5% to 100%), involving well-known patterns like sell-hook reserve manipulation and unchecked multiplication overflow. But four incidents were detected by none, and five by only one of eight agents,” Zhou wrote. 
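The 87.5% floor in the quote matches seven of eight agents detecting an incident. A short sketch of how a per-incident detection rate could be computed is below. The function name is a hypothetical choice, and the default of eight agents comes from the quote above.

```python
def detection_rate(agents_detecting: int, total_agents: int = 8) -> float:
    """Return the percentage of agents that detected a given incident."""
    if not 0 <= agents_detecting <= total_agents:
        raise ValueError("agents_detecting must be between 0 and total_agents")
    return 100.0 * agents_detecting / total_agents


# Rates for the cases named in the quote: none, one, nearly all, and all agents.
rates = {n: detection_rate(n) for n in (0, 1, 7, 8)}
```

Here `rates[7]` corresponds to the "nearly all agents" tier, and `rates[1]` to the five incidents caught by a single agent.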

    “These findings challenge the narrative that fully automated AI auditing is imminent. Agents reliably catch well-known patterns and respond strongly to human-provided context, but cannot replace human judgment,” he added.

    Zhou concluded by noting that “EVMBench is a valuable contribution” because it provides an evaluation standard for the crypto security industry. He also said that AI and human researchers are already performing different, equally useful tasks, each compensating for the other’s weaknesses.

    “The real question is not ‘can AI replace humans?’ but ‘how should humans and AI work together?’ AI handles breadth (systematic scanning); humans handle depth (protocol knowledge, adversarial reasoning). Neither can do the other’s job. Together, they form a complete audit capability,” Zhou wrote.

    “AI auditing has real value, but replacing humans is not close. The right direction is human-AI collaboration,” he added.

    Disclaimer: The Block is an independent media outlet that delivers news, research, and data. As of November 2023, Foresight Ventures is a majority investor of The Block. Foresight Ventures invests in other companies in the crypto space. Crypto exchange Bitget is an anchor LP for Foresight Ventures. The Block continues to operate independently to deliver objective, impactful, and timely information about the crypto industry. Here are our current financial disclosures.

    © 2026 The Block. All Rights Reserved. This article is provided for informational purposes only. It is not offered or intended to be used as legal, tax, investment, financial, or other advice.


