Signal: Your AI Code Audit Is Not a Pentest

The signal: AI security tools are genuinely good now. The noise: thinking that means you're done.
The Lorikeet Security case study with Flowtriq is one of the clearest illustrations we've seen of where AI-assisted security review ends and where manual penetration testing begins — and why confusing the two is an expensive mistake.
What Actually Happened
Flowtriq builds workflow automation tooling for mid-market teams. Before engaging Lorikeet for a pentest, their team ran a thorough AI-assisted code review using Claude. They took it seriously — opened PRs, wrote regression tests, merged fixes. The AI audit closed real vulnerabilities: XSS, SQL injection, template injection, weak cryptography. None of that is trivial. The attack surface was genuinely smaller when Lorikeet arrived.
Lorikeet still reported five additional findings: two High severity, one Medium, and two Low. All were exploitable in production.
The Clear Signal: Two Different Surfaces
This is the part worth understanding. Every finding Lorikeet caught after the AI audit had one thing in common — none of them lived in the source code.
- Session rate limiting — only visible by actually hammering an endpoint and watching what happens
- Anti-forgery token edge cases — only visible by replaying requests with malformed tokens at runtime
- Deprecated TLS on the production listener — inherited from infrastructure config outside the repo entirely
- Files left on the document root — not in source, not referenced anywhere, just sitting there
- Missing security headers on a subdomain — came from reverse proxy config nobody had audited recently
AI reads source. Pentests probe the live system. Those are two different things.
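To make the rate limiting bullet concrete, here is a minimal Python sketch of the kind of probe a tester might script. It is not Lorikeet's tooling. The names `first_throttled_attempt` and `make_fake_endpoint` are hypothetical, and the sketch assumes throttling shows up as an HTTP 429 status, which is only one of several ways a real system might push back (lockouts, CAPTCHAs, and silent drops are others).

```python
from typing import Callable, Optional


def first_throttled_attempt(send: Callable[[], int], attempts: int = 50) -> Optional[int]:
    """Call send() up to `attempts` times.

    Returns the 1-based attempt number that first received HTTP 429,
    or None if the endpoint never throttled within the budget.
    Assumption: the target signals rate limiting with status 429.
    """
    for attempt in range(1, attempts + 1):
        if send() == 429:
            return attempt
    return None


def make_fake_endpoint(limit: int) -> Callable[[], int]:
    """Hypothetical stand-in for a real HTTP call: allows `limit` requests, then throttles."""
    calls = {"n": 0}

    def send() -> int:
        calls["n"] += 1
        return 200 if calls["n"] <= limit else 429

    return send


if __name__ == "__main__":
    print(first_throttled_attempt(make_fake_endpoint(limit=10)))  # 11
    print(first_throttled_attempt(lambda: 200))  # None: never throttled
```

In practice `send` would wrap a real request against a system you are authorized to test. A result of `None` is the interesting one: the endpoint accepted every attempt, and no amount of reading the repo would have shown that.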
The Noise Alert
The noise is the assumption that a clean AI scan equals a secure product. It doesn't — it equals a cleaner codebase. What happens when that code is deployed, how the infrastructure around it is configured, how session logic behaves under adversarial conditions — none of that is visible from a git clone.
For no-code and low-code builders especially, this matters. Your runtime environment — the hosting layer, the reverse proxy, the auth provider config, the headers your platform sends — is often outside your direct control and definitely outside what an AI code reviewer can see.
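A header check is one runtime test you can approximate yourself, even on a platform you don't control. The sketch below assumes you have already captured a response's headers as a plain mapping. The `EXPECTED_HEADERS` baseline and the function name `missing_security_headers` are illustrative choices, not a list taken from the case study.

```python
from typing import Iterable, List, Mapping

# Illustrative baseline only; adjust to your own security policy.
EXPECTED_HEADERS = (
    "Strict-Transport-Security",
    "Content-Security-Policy",
    "X-Content-Type-Options",
    "X-Frame-Options",
)


def missing_security_headers(
    response_headers: Mapping[str, str],
    expected: Iterable[str] = EXPECTED_HEADERS,
) -> List[str]:
    """Return the expected headers absent from a response.

    Header names are compared case-insensitively, since HTTP header
    names are not case-sensitive.
    """
    present = {name.lower() for name in response_headers}
    return [name for name in expected if name.lower() not in present]


if __name__ == "__main__":
    # Hypothetical headers captured from a subdomain's response.
    captured = {
        "content-type": "text/html",
        "x-content-type-options": "nosniff",
    }
    print(missing_security_headers(captured))
    # ['Strict-Transport-Security', 'Content-Security-Policy', 'X-Frame-Options']
```

One way to get the mapping is `dict(urllib.request.urlopen(url).headers)` from the standard library. Run it against every hostname you serve, not just the main one: the Flowtriq finding was on a subdomain.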
The Verdict
AI security audits: strong signal for catching code-level vulnerabilities fast and cheaply. Use them continuously.
Manual pentests: necessary for validating the running system — session behavior, infrastructure posture, anything that only exists when the app is live. Do them periodically, especially before compliance deadlines or major launches.
The two stages aren't competing. Flowtriq's AI pass meant Lorikeet's testers could skip the obvious stuff and go straight to the runtime surface. All five findings were closed within 48 hours of the report.
That's the model: AI raises the floor, and manual testing finds what's left.