Signal: Your AI Code Audit Is Not a Pentest

The signal: AI security tools are genuinely good now. The noise: thinking that means you're done.

The Lorikeet Security case study with Flowtriq is one of the clearest illustrations we've seen of where AI-assisted security review ends and where manual penetration testing begins — and why confusing the two is an expensive mistake.

What Actually Happened

Flowtriq builds workflow automation tooling for mid-market teams. Before engaging Lorikeet for a pentest, their team ran a thorough AI-assisted code review using Claude. They took it seriously — opened PRs, wrote regression tests, merged fixes. The AI audit closed real vulnerabilities: XSS, SQL injection, template injection, weak cryptography. None of that is trivial. The attack surface was genuinely smaller when Lorikeet arrived.

Lorikeet still reported five additional findings: two High severity, one Medium, two Low. All were exploitable in production.

The Clear Signal: Two Different Surfaces

This is the part worth understanding. Every finding Lorikeet caught after the AI audit had one thing in common: none of them could be found by reading the source code.

Session rate limiting — only visible by actually hammering an endpoint and watching what happens
Anti-forgery token edge cases — only visible by replaying requests with malformed tokens at runtime
Deprecated TLS on the production listener — inherited from infrastructure config outside the repo entirely
Files left on the document root — not in source, not referenced anywhere, just sitting there
Missing security headers on a subdomain — came from reverse proxy config nobody had audited recently

AI reads source. Pentests probe the live system. Those are two different things.
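
To make "probing the live system" concrete, here is a minimal sketch of the kind of check behind the rate limiting finding: send a burst of requests and see whether the server ever pushes back. This is not Lorikeet's tooling. The function names, the staging URL in the usage note, and the choice of HTTP 429 as the throttle signal are all assumptions for illustration; a real application might respond with a 403, a lockout page, or a CAPTCHA instead.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional


def fetch_status(url: str) -> int:
    """Send one GET and return the HTTP status code, including 4xx/5xx."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is what we want here.
        return err.code


def probe_rate_limit(
    url: str,
    attempts: int = 50,
    send: Callable[[str], int] = fetch_status,
) -> dict:
    """Fire `attempts` sequential requests and report when throttling starts.

    Assumption: the server signals throttling with HTTP 429.
    """
    codes = [send(url) for _ in range(attempts)]
    throttled_at: Optional[int] = next(
        (i + 1 for i, code in enumerate(codes) if code == 429), None
    )
    return {
        "attempts": attempts,
        "throttled_at": throttled_at,  # 1-based request number, or None
        "rate_limited": throttled_at is not None,
    }
```

Run against a system you are authorized to test, a call such as probe_rate_limit("https://staging.example.test/login") (a placeholder URL) that comes back with rate_limited set to False after 50 attempts is the sort of result no static review can produce, because the answer depends on how the deployed endpoint behaves. The send parameter is injectable so the logic can be exercised without touching a network.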

The Noise Alert

The noise is the assumption that a clean AI scan equals a secure product. It doesn't — it equals a cleaner codebase. What happens when that code is deployed, how the infrastructure around it is configured, how session logic behaves under adversarial conditions — none of that is visible from a git clone.

For no-code and low-code builders especially, this matters. Your runtime environment — the hosting layer, the reverse proxy, the auth provider config, the headers your platform sends — is often outside your direct control and definitely outside what an AI code reviewer can see.
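
Even without repo access, you can see what your platform actually sends. The sketch below compares a live response's headers against a short baseline list. The list itself is an assumption, not taken from the case study, which does not name the headers that were missing on Flowtriq's subdomain; adjust it to your own policy. The function names and the example URL are likewise hypothetical.

```python
import urllib.request
from typing import List, Mapping

# Assumed baseline; the case study does not say which headers were missing.
EXPECTED_HEADERS = (
    "Strict-Transport-Security",
    "Content-Security-Policy",
    "X-Content-Type-Options",
    "X-Frame-Options",
    "Referrer-Policy",
)


def missing_security_headers(headers: Mapping[str, str]) -> List[str]:
    """Return the expected headers absent from a response (case-insensitive)."""
    present = {name.lower() for name in headers}
    return [name for name in EXPECTED_HEADERS if name.lower() not in present]


def fetch_headers(url: str) -> dict:
    """Fetch a URL and return its response headers as a plain dict."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return dict(resp.headers.items())


if __name__ == "__main__":
    # Placeholder host; check every subdomain, not just the main app.
    url = "https://app.example.test/"
    print(url, "is missing:", missing_security_headers(fetch_headers(url)))
```

Running this against each subdomain separately matters, because headers are often set per virtual host at the proxy or platform layer, so the main app can pass while a forgotten subdomain fails.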

The Verdict

AI security audits: strong signal for catching code-level vulnerabilities fast and cheaply. Use them continuously.

Manual pentests: necessary for validating the running system — session behavior, infrastructure posture, anything that only exists when the app is live. Do them periodically, especially before compliance deadlines or major launches.

The two stages aren't competing. Flowtriq's AI pass meant Lorikeet's testers skipped the obvious stuff and went straight to the runtime surface. All five findings closed within 48 hours of the report.

That's the model. AI raises the floor; manual testing finds what's left.

Read the full Lorikeet Security case study →