Accuracy benchmark

How accurate are the deep rules?

Each of GoForLaunch's structural data-flow rules is measured against paired fixtures: a vulnerable sample that must be flagged (recall) and a correctly-scoped safe sample that must not be (specificity). The numbers below are recomputed from the live scanner engine on every page load — they cannot drift from the rules they describe.

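Recall: 100%. 34/34 vulnerable fixtures correctly flagged.

Specificity: 100%. 37/37 safe fixtures correctly cleared (no false positives).

Fixtures: 71. 34 vulnerable + 37 safe across 21 rules. A rule listed below with 0/0 on one side has no fixture of that kind, so that ratio is undefined rather than a pass.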
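As a rough illustration of how recall and specificity fall out of fixture results, the tally can be sketched as follows. The `FixtureResult` shape and the function names are hypothetical, not taken from lib/scanner/benchmark-cases.ts; the only point is the arithmetic and the handling of an empty side.

```typescript
// Hypothetical fixture result shape; the real fixture format may differ.
type FixtureResult = {
  rule: string;
  kind: "vulnerable" | "safe";
  flagged: boolean; // did the scanner report a finding on this fixture?
};

type Ratio = { hit: number; total: number; pct: number | null };

function ratio(hit: number, total: number): Ratio {
  // 0/0 is undefined, so report null instead of a misleading 100%.
  return { hit, total, pct: total === 0 ? null : (100 * hit) / total };
}

function scoreFixtures(results: FixtureResult[]): { recall: Ratio; specificity: Ratio } {
  const vulnerable = results.filter((r) => r.kind === "vulnerable");
  const safe = results.filter((r) => r.kind === "safe");
  return {
    // Recall: vulnerable fixtures that were flagged.
    recall: ratio(vulnerable.filter((r) => r.flagged).length, vulnerable.length),
    // Specificity: safe fixtures that were left alone.
    specificity: ratio(safe.filter((r) => !r.flagged).length, safe.length),
  };
}

// A rule with only a vulnerable fixture, like a row showing specificity 0/0.
const vulnerableOnly = scoreFixtures([
  { rule: "example-rule", kind: "vulnerable", flagged: true },
]);

// A rule with one fixture of each kind, both handled correctly.
const bothKinds = scoreFixtures([
  { rule: "example-rule", kind: "vulnerable", flagged: true },
  { rule: "example-rule", kind: "safe", flagged: false },
]);
```

Returning `null` for an empty side keeps a 0/0 row from being read as a perfect score.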

By rule

IDOR: recall 4/4 · specificity 7/7

Reads the actual where-keys of find/update/delete in [id] routes; flags id-only queries with no owner column and no post-fetch ownership check.
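The decision described here can be approximated in a few lines. This is a minimal sketch, not the shipped rule: the `QueryShape` type, the `looksLikeIdor` name, and the owner-column list are all assumptions made for illustration, and it presumes something upstream has already extracted the where-keys from the route.

```typescript
// Assumed owner-column names; a real schema may use different ones.
const OWNER_COLUMNS = new Set(["userId", "ownerId", "orgId"]);

// Hypothetical summary of one find/update/delete call in an [id] route.
type QueryShape = {
  whereKeys: string[]; // keys of the `where` object passed to the call
  hasPostFetchOwnerCheck: boolean; // e.g. a later `row.userId !== session.user.id` guard
};

function looksLikeIdor(q: QueryShape): boolean {
  const scopedByOwner = q.whereKeys.some((k) => OWNER_COLUMNS.has(k));
  // Flag only when neither the query nor a later check ties the row to its owner.
  return !scopedByOwner && !q.hasPostFetchOwnerCheck;
}

// where: { id } with no ownership check anywhere
const idOnly = looksLikeIdor({ whereKeys: ["id"], hasPostFetchOwnerCheck: false });
// where: { id, userId }
const ownerScoped = looksLikeIdor({ whereKeys: ["id", "userId"], hasPostFetchOwnerCheck: false });
// where: { id } followed by an ownership comparison on the fetched row
const checkedAfter = looksLikeIdor({ whereKeys: ["id"], hasPostFetchOwnerCheck: true });
```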

Mass mutation: recall 2/2 · specificity 2/2

Flags Prisma deleteMany/updateMany with no (or empty) where clause — a single call that rewrites every row.
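A sketch of the condition, assuming the call's method name and argument object have already been extracted; `BulkArgs` and `isUnscopedBulkCall` are hypothetical names used only for illustration.

```typescript
// Hypothetical, simplified view of the argument passed to deleteMany/updateMany.
type BulkArgs = { where?: Record<string, unknown>; data?: Record<string, unknown> };

function isUnscopedBulkCall(method: string, args?: BulkArgs): boolean {
  if (method !== "deleteMany" && method !== "updateMany") return false;
  const where = args?.where;
  // A missing `where` and an empty `where: {}` both match every row.
  return where === undefined || Object.keys(where).length === 0;
}

const noArgs = isUnscopedBulkCall("deleteMany"); // deleteMany()
const emptyWhere = isUnscopedBulkCall("updateMany", { where: {}, data: { active: false } });
const scoped = isUnscopedBulkCall("deleteMany", { where: { orgId: "org_1" } });
```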

Concurrency: recall 1/1 · specificity 1/1

Flags read-modify-write (findUnique → update/upsert) check-then-act races; ignores idempotent findUnique → delete.
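One way to picture this rule is as a scan over the ordered calls in a handler. The sketch below assumes the calls have already been reduced to a `{ model, method }` list; that representation and the `hasCheckThenActRace` name are illustrative assumptions, not the scanner's internals.

```typescript
// Hypothetical, ordered summary of ORM calls found in one handler.
type OrmCall = { model: string; method: string };

function hasCheckThenActRace(calls: OrmCall[]): boolean {
  const readModels = new Set<string>();
  for (const c of calls) {
    if (c.method === "findUnique") {
      readModels.add(c.model);
    } else if ((c.method === "update" || c.method === "upsert") && readModels.has(c.model)) {
      // A write that follows a read of the same model may act on stale data.
      return true;
    }
    // findUnique followed by delete falls through: it is treated as idempotent.
  }
  return false;
}

const readThenUpdate = hasCheckThenActRace([
  { model: "wallet", method: "findUnique" },
  { model: "wallet", method: "update" },
]);
const readThenDelete = hasCheckThenActRace([
  { model: "session", method: "findUnique" },
  { model: "session", method: "delete" },
]);
```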

Supabase RLS: recall 2/2 · specificity 4/4

Flags PostgREST .delete()/.update() with no filter in the chain; context-aware so matches inside strings/comments are ignored.
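To make "context-aware" concrete, here is a minimal sketch that blanks out comments and string literals before looking for an unfiltered mutation. It is an assumption-laden simplification: `maskNonCode` and `hasUnfilteredMutation` are hypothetical names, template-literal interpolation and regex literals are not handled, a chain is assumed to end at the next `;`, only a subset of filter methods is listed, and any `.update(`/`.delete(` call is matched regardless of the client it belongs to.

```typescript
// Replace the contents of comments and string literals with spaces so later
// checks only see real code. Length and line breaks are preserved.
function maskNonCode(src: string): string {
  let out = "";
  let i = 0;
  while (i < src.length) {
    const ch = src[i];
    const next = src[i + 1];
    if (ch === "/" && next === "/") {
      // Line comment: blank until end of line.
      while (i < src.length && src[i] !== "\n") { out += " "; i++; }
    } else if (ch === "/" && next === "*") {
      // Block comment: blank through the closing */.
      const end = src.indexOf("*/", i + 2);
      const stop = end === -1 ? src.length : end + 2;
      while (i < stop) { out += src[i] === "\n" ? "\n" : " "; i++; }
    } else if (ch === '"' || ch === "'" || ch === "`") {
      // String literal: blank through the matching quote, honoring escapes.
      const quote = ch;
      out += " "; i++;
      while (i < src.length && src[i] !== quote) {
        if (src[i] === "\\") { out += " "; i++; } // skip the backslash
        if (i < src.length) { out += src[i] === "\n" ? "\n" : " "; i++; }
      }
      if (i < src.length) { out += " "; i++; } // closing quote
    } else {
      out += ch; i++;
    }
  }
  return out;
}

// Subset of supabase-js filter methods, for illustration.
const FILTER_CALL = /\.(?:eq|neq|gt|gte|lt|lte|in|is|match|filter)\s*\(/;

function hasUnfilteredMutation(src: string): boolean {
  const code = maskNonCode(src);
  const mutation = /\.(?:delete|update)\s*\(/g;
  let m: RegExpExecArray | null;
  while ((m = mutation.exec(code)) !== null) {
    // Treat everything up to the next semicolon as the rest of the chain.
    const end = code.indexOf(";", m.index);
    const rest = code.slice(m.index, end === -1 ? code.length : end);
    if (!FILTER_CALL.test(rest)) return true;
  }
  return false;
}
```

Masking first, then matching, is what keeps a pattern mentioned in a comment or a help string from being reported.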

Input Validation: recall 1/1 · specificity 2/2

Alias tracking: recall 2/2 · specificity 2/2

Missing-await auth: recall 2/2 · specificity 1/1

JWT decode: recall 2/2 · specificity 1/1

Identity fallback: recall 2/2 · specificity 1/1

Taint (SQL): recall 1/1 · specificity 1/1

Taint (SSRF): recall 1/1 · specificity 0/0

Taint (path): recall 1/1 · specificity 0/0

Taint (eval): recall 1/1 · specificity 0/0

Taint (interproc): recall 1/1 · specificity 1/1

Taint (clean): recall 0/0 · specificity 1/1

Billing: recall 1/1 · specificity 2/2

Launch Config: recall 1/1 · specificity 2/2

Scalability: recall 1/1 · specificity 2/2

Launch Assets: recall 4/4 · specificity 4/4

Build Config: recall 2/2 · specificity 1/1

Email: recall 2/2 · specificity 2/2

Methodology

  • Every fixture is a real code snippet run through the same deterministic engine the product ships (71 fixtures total). No LLM passes are involved in these numbers.
  • Recall counts vulnerable fixtures that are flagged. Specificity counts safe fixtures that are correctly left alone — the false-positive control.
  • Context-awareness is part of the safe set: vulnerable patterns appearing only inside strings or comments must not be flagged.
  • The full fixture list lives in lib/scanner/benchmark-cases.ts and is asserted in CI by tests/scanner-benchmark.test.ts.
  • The engine has three layers: fast regex rules; AST-lite structural rules with one-hop alias tracking; and a recursive-descent parser feeding a cross-module interprocedural taint engine. The taint engine tracks request input into dangerous sinks through helpers defined in other files, clears taint at sanitizers, and drops import edges that don't resolve uniquely.
  • What this is not: a real-world accuracy guarantee. These numbers show that the rules behave as designed on a curated set; on an arbitrary repo, false-positive and false-negative rates will be higher. Taint is cross-module within the scanned set but doesn't follow third-party packages, and the regex rules and optional LLM passes are not part of this set. GoForLaunch is a high-precision scanner for the Next.js/Supabase/Stripe stack: strong, but not a full replacement for mature SAST like CodeQL.
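The core idea behind the taint layer mentioned in the methodology (request input flowing into a dangerous sink, with taint cleared at sanitizers) can be illustrated on a single straight-line function. This is a deliberately tiny sketch under stated assumptions: the `Stmt` representation and `findTaintedSinks` are hypothetical, and it makes no attempt at the cross-module, interprocedural tracking the engine is described as doing.

```typescript
// Hypothetical straight-line statement forms for one function body.
type Stmt =
  | { kind: "source"; target: string } // target = request input
  | { kind: "assign"; target: string; from: string[] } // target derived from other variables
  | { kind: "sanitize"; target: string; from: string } // target = sanitizer(from)
  | { kind: "sink"; name: string; args: string[] }; // dangerous call

function findTaintedSinks(stmts: Stmt[]): string[] {
  const tainted = new Set<string>();
  const hits: string[] = [];
  for (const s of stmts) {
    switch (s.kind) {
      case "source":
        tainted.add(s.target);
        break;
      case "assign":
        // Taint propagates if any input is tainted; otherwise the target is clean.
        if (s.from.some((v) => tainted.has(v))) tainted.add(s.target);
        else tainted.delete(s.target);
        break;
      case "sanitize":
        // A sanitizer's result is treated as clean, whatever its input was.
        tainted.delete(s.target);
        break;
      case "sink":
        if (s.args.some((a) => tainted.has(a))) hits.push(s.name);
        break;
    }
  }
  return hits;
}

// Request input concatenated into a query string and passed to a sink.
const unsanitized = findTaintedSinks([
  { kind: "source", target: "q" },
  { kind: "assign", target: "sql", from: ["q"] },
  { kind: "sink", name: "db.query", args: ["sql"] },
]);

// Same flow, but the input passes through a sanitizer first.
const sanitized = findTaintedSinks([
  { kind: "source", target: "q" },
  { kind: "sanitize", target: "safeQ", from: "q" },
  { kind: "assign", target: "sql", from: ["safeQ"] },
  { kind: "sink", name: "db.query", args: ["sql"] },
]);
```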