https://www.theregister.com/2026/02/24/ai_finding_bugs/
https://www.anthropic.com/news/claude-code-security

Anthropic recently touted its Claude Code Security tool’s discovery of more than 500 vulnerabilities in production open-source codebases, positioning AI-powered code scanning as a transformative capability for cybersecurity teams. Security researchers, however, are pushing back on these claims, arguing that vulnerability discovery alone provides limited value without corresponding validation and remediation: only two or three of the roughly 500 reported vulnerabilities have actually been fixed, and the absence of CVE assignments suggests the security process remains incomplete. Finding vulnerabilities, they argue, was never the bottleneck in security operations. One researcher, drawing on his time managing vulnerability response at the Microsoft Security Response Center, noted that the introduction of AI multiplied incoming reports by up to 200 times while adding significant noise from unvalidated findings that lacked concrete impact assessments.

The proliferation of AI-generated vulnerability reports is creating serious operational challenges for already overwhelmed open-source maintainers and security teams. In 2025 the National Vulnerability Database faced a backlog of approximately 30,000 CVE entries awaiting analysis, and nearly two-thirds of reported open-source vulnerabilities lacked severity scores. The burden has forced some prominent projects to close their bug bounty programs entirely; the curl project shut its program down after being inundated with poorly crafted AI-generated reports that maintainers could not efficiently process. Security researchers warned that, rather than helping the open-source community, the flood of unvalidated vulnerability candidates from AI tools may be accelerating the collapse of open-source security infrastructure by overwhelming maintainers who lack the resources to properly triage and address the volume of reports.

Security experts acknowledge that AI has dramatically reduced the cost of vulnerability discovery, with models becoming increasingly adept at exploring codebases and reasoning across complex software components. The hard work, however, begins after discovery: validating findings, confirming affected versions, assessing real-world impact, coordinating with maintainers, and developing patches that fit each project’s architecture. As AI security tools become more powerful and widespread, researchers predict that security teams will face an escalating deluge of patches, upgrades, and emergency fixes, constrained by maintainers’ capacity to prioritise, test, and implement changes safely. The competitive advantage in this emerging landscape will belong not to those generating the most vulnerability findings, but to organisations that can efficiently convert discoveries into prioritised, low-disruption remediation actions.
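The triage-first posture the researchers describe can be made concrete with a small sketch. The fields and scoring weights below are purely illustrative assumptions (the article specifies no scoring scheme): the point is only that ranking rewards validated, high-impact, low-disruption fixes and pushes unvalidated AI-generated candidates to the back of the queue.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    """An AI-generated vulnerability candidate (fields are hypothetical)."""
    identifier: str
    validated: bool        # has a human confirmed the finding is real?
    has_poc: bool          # is there a working proof of concept?
    severity: float        # 0.0-10.0, CVSS-style base score (assumed scale)
    patch_complexity: int  # 1 (trivial fix) .. 5 (architectural change)

def triage_score(f: Finding) -> float:
    """Rank remediation value: validated, high-impact, low-disruption first.
    Weights are illustrative, not taken from the article."""
    score = f.severity
    score += 5.0 if f.validated else -5.0  # unvalidated reports sink
    score += 2.0 if f.has_poc else 0.0    # demonstrated impact rises
    score -= f.patch_complexity           # prefer low-disruption fixes
    return score

def remediation_queue(findings: list[Finding]) -> list[Finding]:
    """Order candidates by descending triage score."""
    return sorted(findings, key=triage_score, reverse=True)
```

Under this scheme, a validated medium-severity bug with a trivial patch outranks a high-severity but unvalidated AI report, which mirrors the researchers’ argument that raw discovery volume is not where the value lies.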