Google’s new AI agent rewrites code to automate vulnerability fixes

Google DeepMind has deployed a new AI agent designed to autonomously find and fix critical security vulnerabilities in software code. The system, aptly named CodeMender, has already contributed 72 security fixes to established open-source projects in the last six months.

Identifying and patching vulnerabilities is a notoriously difficult and time-consuming process, even with the aid of traditional automated methods like fuzzing. Google DeepMind’s own research, including AI-based projects such as Big Sleep and OSS-Fuzz, has proven effective at discovering new zero-day vulnerabilities in well-audited code. This success, however, creates a new bottleneck: as AI accelerates the discovery of flaws, the burden on human developers to fix them intensifies.

CodeMender is engineered to address this imbalance. It functions as an autonomous AI agent that takes a comprehensive approach to code security. Its capabilities are both reactive, allowing it to patch newly discovered vulnerabilities instantly, and proactive, enabling it to rewrite existing code to eliminate entire classes of security flaws before they can be exploited. This frees human developers and project maintainers to dedicate more of their time to building features and improving software functionality.

The system operates by leveraging the advanced reasoning capabilities of Google’s recent Gemini Deep Think models. This foundation allows the agent to debug and resolve complex security issues with a high degree of autonomy. To achieve this, the system is equipped with a set of tools that permit it to analyse and reason about code before implementing any changes. CodeMender also includes a validation process to ensure any modifications are correct and do not introduce new problems, known as regressions.

While large language models are advancing rapidly, a mistake when it comes to code security can have costly consequences. CodeMender’s automatic validation framework is therefore essential. It systematically checks that any proposed changes fix the root cause of an issue, are functionally correct, do not break existing tests, and adhere to the project’s coding style guidelines. Only high-quality patches that satisfy these stringent criteria are surfaced for human review.
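The four criteria above can be read as a gate that every candidate patch must clear before a person sees it. The following is a minimal sketch of that idea, not CodeMender's actual implementation: the `Patch` class, its boolean fields, and the function names are all hypothetical, and in a real system each check would be the result of running tools rather than a stored flag.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Patch:
    """Hypothetical stand-in for a proposed code change and its check results."""
    description: str
    fixes_root_cause: bool
    functionally_correct: bool
    tests_pass: bool
    follows_style: bool


# Each entry pairs a human-readable label with a predicate over the patch.
# The labels mirror the four criteria named in the article.
CHECKS: List[Tuple[str, Callable[[Patch], bool]]] = [
    ("fixes root cause", lambda p: p.fixes_root_cause),
    ("functionally correct", lambda p: p.functionally_correct),
    ("existing tests pass", lambda p: p.tests_pass),
    ("follows style guidelines", lambda p: p.follows_style),
]


def failed_checks(patch: Patch) -> List[str]:
    """Return the labels of every criterion the patch does not satisfy."""
    return [label for label, check in CHECKS if not check(patch)]


def surface_for_review(patches: List[Patch]) -> List[Patch]:
    """Keep only patches that satisfy all criteria."""
    return [p for p in patches if not failed_checks(p)]
```

In this sketch a patch that merely silences a crash without fixing its root cause is filtered out, even if the test suite still passes.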

To enhance its code fixing effectiveness, the DeepMind team developed new techniques for the AI agent. CodeMender employs advanced program analysis, utilising a suite of tools including static and dynamic analysis, differential testing, fuzzing, and SMT solvers. These instruments allow it to systematically scrutinise code patterns, control flow, and data flow to identify the fundamental causes of security flaws and architectural weaknesses.
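Of the techniques listed, differential testing is the easiest to illustrate in a few lines: run the original and the patched code on the same inputs and report any input where the results differ. The sketch below uses two toy functions invented for this example (a length-prefix parser and a patched version that clamps the length); it says nothing about how CodeMender's own tooling is built.

```python
import random
from typing import Callable, Iterable, List


def original_parse_length(data: bytes) -> int:
    """Toy original: trusts the one-byte length prefix."""
    return data[0]


def patched_parse_length(data: bytes) -> int:
    """Toy patch: clamps the declared length to the bytes actually present."""
    return min(data[0], len(data) - 1)


def make_well_formed(rng: random.Random, max_len: int = 16) -> bytes:
    """Build an input whose length prefix matches its payload."""
    payload = bytes(rng.randrange(256) for _ in range(rng.randrange(max_len)))
    return bytes([len(payload)]) + payload


def differential_test(
    f: Callable[[bytes], int],
    g: Callable[[bytes], int],
    inputs: Iterable[bytes],
) -> List[bytes]:
    """Return every input on which the two implementations disagree."""
    return [x for x in inputs if f(x) != g(x)]
```

On well-formed inputs the two versions should agree, which is evidence that the patch preserves existing behaviour. On a malformed input such as `b"\x05ab"`, where the prefix claims more bytes than exist, they disagree, which is the intended effect of the fix.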

The system also uses a multi-agent architecture, where specialised agents are deployed to tackle specific aspects of a problem. For example, a dedicated large language model-based critique tool reveals the differences between original and modified code. This allows the primary agent to verify that its proposed changes do not introduce unintended side effects and to self-correct its approach when necessary.
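The critique tool described here is LLM-based, and its internals are not public. As a much simpler stand-in, the sketch below uses a plain textual diff from Python's standard `difflib` module to extract exactly which lines a patch removed and added, which is the kind of input a reviewer (human or model) could then inspect for side effects. The function name `changed_lines` is an assumption made for this example.

```python
import difflib
from typing import List, Tuple


def changed_lines(original: str, modified: str) -> Tuple[List[str], List[str]]:
    """Return (removed, added) lines between two versions of a source file."""
    a = original.splitlines()
    b = modified.splitlines()
    removed: List[str] = []
    added: List[str] = []
    # get_opcodes() describes how to turn `a` into `b` as a sequence of
    # 'equal', 'replace', 'delete', and 'insert' operations over line ranges.
    for tag, i1, i2, j1, j2 in difflib.SequenceMatcher(None, a, b).get_opcodes():
        if tag in ("replace", "delete"):
            removed.extend(a[i1:i2])
        if tag in ("replace", "insert"):
            added.extend(b[j1:j2])
    return removed, added
```

For a patch that only inserts a length check before a copy, `removed` comes back empty and `added` contains the single new line, making it easy to confirm that nothing else in the file changed.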

In one practical example, CodeMender addressed a vulnerability where a crash report indicated a heap buffer overflow. Although the final patch only required changing a few lines of code, the root cause was not immediately obvious. By using a debugger and code search tools, the agent determined that the true problem was incorrect stack management of Extensible Markup Language (XML) elements during parsing, located elsewhere in the codebase. In another case, the agent devised a non-trivial patch for a complex object lifetime issue, modifying a custom system for generating C code within the target project.

Beyond simply reacting to existing bugs, CodeMender is designed to proactively harden software against future threats. The team deployed the agent to apply -fbounds-safety annotations to parts of libwebp, a widely used image compression library. These annotations instruct the compiler to add bounds checks to the code, which can prevent an attacker from exploiting a buffer overflow to execute arbitrary code.

This work is particularly relevant given that a heap buffer overflow vulnerability in libwebp, tracked as CVE-2023-4863, was used by a threat actor in a zero-click iOS exploit in 2023. DeepMind notes that with these annotations in place, that specific vulnerability, along with most other buffer overflows in the annotated sections, would have been rendered unexploitable.
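To see why a bounds check makes an overflow unexploitable, it helps to model memory as one flat array in which a buffer sits next to unrelated data. The sketch below is a conceptual model in Python, not compiler output: the `MEMORY` layout and both functions are invented for illustration, and a real compiler-inserted check would typically trap and terminate the program rather than raise an exception.

```python
# Flat "memory": an 8-byte buffer at offset 0, followed by unrelated data.
MEMORY = bytearray(b"AAAAAAAA" + b"SECRET!!")
BUFFER_BASE = 0
BUFFER_COUNT = 8  # the element count an annotation would declare


def unchecked_read(base: int, index: int) -> int:
    """Models a raw pointer access: nothing stops index from leaving the buffer."""
    return MEMORY[base + index]


def checked_read(base: int, count: int, index: int) -> int:
    """Models an access guarded by a bounds check against the declared count."""
    if not 0 <= index < count:
        raise IndexError(f"index {index} outside buffer of {count} elements")
    return MEMORY[base + index]
```

Reading index 8 through `unchecked_read` silently returns the first byte of the neighbouring data, while `checked_read` refuses the same access. The declared count is what turns a silent memory-safety bug into a detectable failure.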

The AI agent’s proactive code fixing involves a sophisticated decision-making process. When applying annotations, it can automatically correct new compilation errors and test failures that arise from its own changes. If its validation tools detect that a modification has broken functionality, the agent self-corrects based on the feedback and attempts a different solution.
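The feedback loop described above can be sketched as a bounded retry: propose a change, validate it, and feed any failure message into the next proposal. Everything in the example below is hypothetical, including the `repair_loop` name, the toy proposer, and the toy validator. It shows only the control flow, under the assumption that validation returns a pass/fail flag plus a message.

```python
from typing import Callable, List, Optional, Tuple


def repair_loop(
    propose: Callable[[List[str]], str],
    validate: Callable[[str], Tuple[bool, str]],
    max_attempts: int = 3,
) -> Optional[str]:
    """Propose candidates until one validates or the attempt budget runs out."""
    feedback: List[str] = []
    for _ in range(max_attempts):
        candidate = propose(feedback)
        ok, message = validate(candidate)
        if ok:
            return candidate
        feedback.append(message)  # the next proposal can react to this failure
    return None


def toy_propose(feedback: List[str]) -> str:
    """Toy proposer: each round of feedback unlocks a more complete change."""
    attempts = [
        "annotate",
        "annotate + fix compile error",
        "annotate + fix compile error + fix test",
    ]
    return attempts[min(len(feedback), len(attempts) - 1)]


def toy_validate(candidate: str) -> Tuple[bool, str]:
    """Toy validator: stands in for compiling and running the test suite."""
    if "fix compile error" not in candidate:
        return False, "compilation error"
    if "fix test" not in candidate:
        return False, "test failure"
    return True, "ok"
```

With the default budget of three attempts the toy loop fails once on compilation, once on tests, and then succeeds. With a budget of two it gives up and returns `None`, which corresponds to a patch that is never surfaced.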

Despite these promising early results, Google DeepMind is taking a cautious and deliberate approach to deployment, with a strong focus on reliability. At present, every patch generated by CodeMender is reviewed by human researchers before being submitted to an open-source project. The team is gradually increasing its submissions to ensure high quality and to systematically incorporate feedback from the open-source community.

Looking ahead, the researchers plan to reach out to maintainers of critical open-source projects with CodeMender-generated patches. By iterating on community feedback, they hope to eventually release CodeMender as a publicly available tool for all software developers.

The DeepMind team also intends to publish technical papers and reports in the coming months to share their techniques and results. This work represents the first steps in exploring the potential of AI agents to proactively fix code and fundamentally enhance software security for everyone.

