How it works
The two-stage review pipeline: a fast summary in about 10 seconds, then a deep review in 30 to 90 seconds.
Every pull request triggers two reviews, not one. A fast summary lands in the PR description within about 10 seconds so you have something to read while the deeper analysis runs. The deep review follows in 30 to 90 seconds with severity-tagged inline comments. This page walks through the pipeline and the design choices behind it, which is useful if you want to understand what the bot is doing while you wait, or if you're debugging why a review didn't appear.
The high-level flow
When you open a PR or push a new commit, GitHub sends a webhook to Revvu. The webhook handler does as little as possible: it verifies the signature and puts two jobs on the queue, one for the fast summary and one for the deep review. Both run in parallel. The summary writes to the PR description. The deep review writes inline comments on changed lines.
- GitHub sends a webhook the moment the PR opens or you push.
- The webhook handler verifies the signature and queues two jobs.
- The summary job runs fast — about 10 seconds — and updates the PR description.
- The deep review job runs longer — 30 to 90 seconds for a typical PR — and posts inline comments.
- Each step inside the deep review is independently retryable, so a transient failure in one step doesn't restart the whole pass.
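The handoff described above can be sketched roughly as follows. This is a minimal illustration, not Revvu's actual handler: it assumes the signature arrives in GitHub's `X-Hub-Signature-256` header (an HMAC-SHA256 of the raw body), and the `Job` shape, the `enqueue` callback, and the returned status codes are hypothetical names chosen for the example.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical job shape; the real queue payload is not documented on this page.
type Job = {
  kind: "summary" | "deep-review";
  repo: string;
  prNumber: number;
  headSha: string;
};

// Recompute the HMAC-SHA256 of the raw body and compare it to the header value.
function verifySignature(rawBody: string, header: string | undefined, secret: string): boolean {
  if (!header) return false;
  const expected = "sha256=" + createHmac("sha256", secret).update(rawBody).digest("hex");
  const a = Buffer.from(expected);
  const b = Buffer.from(header);
  // timingSafeEqual throws when lengths differ, so compare lengths first.
  return a.length === b.length && timingSafeEqual(a, b);
}

// Verify the delivery, then enqueue both jobs. `enqueue` stands in for a queue client.
function handleWebhook(
  rawBody: string,
  signatureHeader: string | undefined,
  secret: string,
  enqueue: (job: Job) => void,
): number {
  if (!verifySignature(rawBody, signatureHeader, secret)) return 401;
  const event = JSON.parse(rawBody);
  const base = {
    repo: event.repository.full_name,
    prNumber: event.pull_request.number,
    headSha: event.pull_request.head.sha,
  };
  enqueue({ kind: "summary", ...base });
  enqueue({ kind: "deep-review", ...base });
  return 202; // acknowledged; the actual work happens in the queued jobs
}
```

Keeping the handler this small is what makes a fast acknowledgement plausible: nothing slow runs before the response goes back to GitHub.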
Why two stages
Reviewers want two different things from an AI reviewer, and they want them on different timelines. They want a quick scan they can read while context is still loaded — what the PR does, where the risk is. And they want a thorough analysis that's worth waiting for — line-level findings with reasoning. Trying to deliver both in a single call means the fast thing is too slow and the thorough thing is too shallow. Splitting them lets each stage optimize for its own job.
Inside the deep review
The deep review is more than a single LLM call. It's a sequence of steps, each one independently retryable, that work together to produce comments that are accurate, on the right lines, and not duplicates of comments already on the PR.
- Context gathering: Before any analysis runs, the bot pulls the changed files, the surrounding file contents, related test files, and a couple of levels of imports. A finding is more useful when the bot can see what calls the changed function.
- File grouping: On larger PRs, related files are grouped together so they get reviewed in the same call. A 12-file refactor might fan out as six groups in parallel rather than 12 isolated reviews.
- Parallel analysis: Each group runs as its own step, with concurrency capped so the bot stays within the upstream rate limit. Each group has its own time budget, so one slow group doesn't hold up the others.
- Fix detection: Comments from previous pushes are checked one by one to see whether the underlying issue has been fixed in this push. Fixed comments get a friendly acknowledgement and their threads are auto-resolved.
- Deduplication: Before posting, new findings are matched against existing comments. A near-duplicate that says the same thing in different words is filtered out.
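To make the deduplication step concrete, here is one way near-duplicate matching could work. The page doesn't say how the bot actually compares findings, so treat this as an assumption: the sketch uses word-overlap (Jaccard) similarity on comments that sit on nearby lines of the same file. The `Finding` shape, the 0.6 threshold, and the 3-line window are all illustrative values.

```typescript
// Hypothetical finding shape for illustration.
type Finding = { path: string; line: number; body: string };

// Lowercase, replace punctuation with spaces, and collect the distinct words.
function tokens(text: string): Set<string> {
  const words = text.toLowerCase().replace(/[^a-z0-9\s]/g, " ").split(/\s+/).filter(Boolean);
  return new Set(words);
}

// Jaccard similarity: shared words divided by total distinct words, in [0, 1].
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 1;
  let shared = 0;
  a.forEach((t) => {
    if (b.has(t)) shared++;
  });
  return shared / (a.size + b.size - shared);
}

// Same file, nearby line, and mostly the same words: treat as a duplicate.
function isNearDuplicate(candidate: Finding, existing: Finding, threshold = 0.6, lineWindow = 3): boolean {
  return (
    candidate.path === existing.path &&
    Math.abs(candidate.line - existing.line) <= lineWindow &&
    jaccard(tokens(candidate.body), tokens(existing.body)) >= threshold
  );
}

// Keep only the new findings that don't match any existing comment.
function dedupe(newFindings: Finding[], existing: Finding[]): Finding[] {
  return newFindings.filter((f) => !existing.some((e) => isNearDuplicate(f, e)));
}
```

Word overlap is a deliberately simple stand-in. Catching comments that say the same thing in entirely different words would need something semantic, such as embeddings or a model call, which this sketch does not attempt.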
Reliability by design
Every step in the pipeline runs as its own retryable unit. If posting comments fails because of a rate limit, only the post step retries — the analysis isn't repeated. If the model times out on one file group, the other groups still complete. Non-essential steps like the in-progress status row, context enrichment, and per-repo learnings are wrapped in try/catch and degrade silently if the underlying service has a hiccup. Your review still gets posted.
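The two behaviours in this paragraph, step-level retry and silent degradation for non-essential steps, can be sketched as two small wrappers. This is a simplified, synchronous illustration under stated assumptions: the names `runStep`, `bestEffort`, and `runReview` are hypothetical, the attempt count of 3 is arbitrary, and a real pipeline would presumably be asynchronous and wait between attempts.

```typescript
// Run one step with its own retry budget. Only this step is re-run on failure;
// results from earlier steps are plain values and are never recomputed.
function runStep<T>(name: string, fn: () => T, maxAttempts = 3): T {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return fn();
    } catch (err) {
      lastError = err;
    }
  }
  throw new Error(`step "${name}" failed after ${maxAttempts} attempts: ${String(lastError)}`);
}

// Non-essential step: swallow the failure and continue with a fallback value.
function bestEffort<T>(fn: () => T, fallback: T): T {
  try {
    return fn();
  } catch {
    return fallback;
  }
}

// A three-step slice of the pipeline, with step implementations passed in.
function runReview(steps: {
  loadLearnings: () => string[];
  analyze: (learnings: string[]) => string[];
  post: (findings: string[]) => void;
}): string[] {
  // Learnings are optional context: if the service is down, review without them.
  const learnings = bestEffort<string[]>(steps.loadLearnings, []);
  // Analysis runs once; its result is reused however often posting retries.
  const findings = runStep("analyze", () => steps.analyze(learnings));
  runStep("post", () => steps.post(findings));
  return findings;
}
```

Because `findings` is computed before the post step starts, a rate-limited post retries on its own without triggering another analysis pass.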
Performance
A typical PR — a handful of files, a few hundred changed lines — sees the summary land in about 10 seconds and the deep review complete in 30 to 90 seconds. Larger PRs scale by file groups: a 30-file PR fans out into more groups, but each group still runs in parallel, so the wall-clock time grows slowly rather than linearly.
| Stage | Typical time | Output |
|---|---|---|
| Webhook handoff | Under 50 ms | Acknowledgement only |
| Fast summary | ~10 seconds | PR description block |
| Deep review (small PR, 1–3 files) | 30–45 seconds | Inline comments + check run |
| Deep review (medium PR, 4–10 files) | 45–75 seconds | Inline comments + check run |
| Deep review (large PR, 11+ files) | 60–120 seconds | Inline comments + check run |
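A back-of-the-envelope model shows why wall-clock time grows slowly with PR size. It assumes groups run in waves limited by the concurrency cap and that each group takes roughly the same time. The function name and every number in the usage note are made up for illustration; they are not measurements of the real system.

```typescript
// Rough model: groups run in waves of at most `concurrency` at a time,
// and each wave takes about `secondsPerGroup`.
function estimateWallClockSeconds(groupCount: number, concurrency: number, secondsPerGroup: number): number {
  if (groupCount <= 0) return 0;
  if (concurrency < 1) throw new Error("concurrency must be at least 1");
  const waves = Math.ceil(groupCount / concurrency);
  return waves * secondsPerGroup;
}

// For comparison: the same work done one group at a time.
function estimateSequentialSeconds(groupCount: number, secondsPerGroup: number): number {
  return Math.max(0, groupCount) * secondsPerGroup;
}
```

With an illustrative cap of 5 and 30 seconds per group, 3 groups finish in one wave (30 seconds) and 15 groups finish in three waves (90 seconds), whereas running 15 groups sequentially would take 450 seconds under the same assumptions.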
The pipeline in detail
Here's the full sequence the deep review runs through, from webhook to comment. Each numbered step is its own retryable unit.
Step 1: Fetch the PR diff and head SHA from GitHub
Step 2: Create a status check on the PR (shows "in progress")
Step 3: Enrich context — file contents, adjacent tests, import graph
Step 4: Load repo-specific rules (.cursorrules, tsconfig, project conventions)
Step 5: Pull learnings the team has taught the bot on this repo
Step 6: Fetch our own previous comments on this PR
Step 6b: Group related files for parallel analysis
Step 7: Run analysis per file group (in parallel)
Step 8: Assess whether previously-flagged issues are now fixed
Step 9: Deduplicate new findings against existing comments
Step 10: Reconcile — split into fixed, persisting, novel
Step 11: Post novel inline comments + acknowledge fixed ones
Step 12: Auto-resolve threads for fixed comments
Step 13: Mark the status check as complete
Step 14: Save the review record to the database
What the bot doesn't do
The bot doesn't store your code. Diffs and file contents are pulled from GitHub at review time, processed in memory, and discarded. Only the review comments themselves and metadata (timestamps, finding categories, severities) are saved — not the source code they refer to. The bot also doesn't review every PR automatically if you don't want it to: auto-review is a per-repo toggle in repo settings, and you can always trigger a review on demand by mentioning the bot in a comment.