Security

#Article #NLP #LanguageModel #AIAgents #One-Line Notes
Issue Date: 2025-10-31
Introducing Aardvark: OpenAI’s agentic security researcher, OpenAI, 2025.10
Comment

Original post:

> In benchmark testing on “golden” repositories, Aardvark identified 92% of known and synthetically-introduced vulnerabilities, demonstrating high recall and real-world effectiveness.

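Reportedly, Aardvark detected about 92% of the known and synthetically introduced vulnerabilities. I wonder how well models like Claude perform on this kind of task.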
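As a side note, "recall" in the quote is the fraction of known or seeded vulnerabilities that the tool flags. Below is a minimal sketch of that calculation. The identifiers and counts are hypothetical, chosen only so the arithmetic lands on 92%; they are not taken from the OpenAI post.

```python
def recall(known: set[str], detected: set[str]) -> float:
    """Fraction of known vulnerabilities that appear among the detections."""
    if not known:
        raise ValueError("known set must be non-empty")
    return len(known & detected) / len(known)


# Hypothetical benchmark: 25 seeded vulnerabilities, 23 of them flagged,
# plus one spurious finding that does not affect recall.
known = {f"VULN-{i}" for i in range(25)}
detected = {f"VULN-{i}" for i in range(23)} | {"FALSE-POSITIVE-1"}

print(f"recall = {recall(known, detected):.0%}")
```

Recall alone says nothing about false positives, so comparing tools on this number would also need precision on the same repositories.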