// docs · v1.0

Safety, cost, and cache

Three pre-send guards — secret scanner, .commitbrief/ guard, cost preflight — plus the local response cache and how to manage it.

CommitBrief runs three guards before any LLM call and keeps a local response cache so re-running a review on an unchanged diff is essentially free.

Secret scanner

Pre-send guard that flags credential-shaped strings in the diff before the provider call. Prevents accidental upload of API keys, private keys, tokens, etc. to a third-party LLM.

What it scans

The diff — only added lines (+-prefixed, excluding the +++ b/path header). Removed and context lines are skipped — the goal is to catch new leaks, not re-flag history that is already on disk somewhere.
User-authored rules content — any non-default COMMITBRIEF.md or OUTPUT.md. These join the system prompt verbatim, so a credential pasted into either file would leak just as surely as one in the diff. Embedded defaults are presumed clean and skipped.
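
The added-line filter described above can be sketched roughly as follows. This is an illustration only, not CommitBrief's actual implementation: the function name `added_lines` is hypothetical, and numbering lines by their position in the diff text is an assumption.

```python
def added_lines(diff_text: str) -> list[tuple[int, str]]:
    """Return (diff_line_number, content) for each added line in a unified diff."""
    result = []
    for number, line in enumerate(diff_text.splitlines(), start=1):
        # "+++ b/path" is the new-file header, not an added line.
        # Caveat of this simple check: an added line whose own content
        # starts with "++ " would also be skipped.
        if line.startswith("+++ "):
            continue
        if line.startswith("+"):
            # Drop the leading "+" marker; removed ("-") and context (" ")
            # lines never reach this branch.
            result.append((number, line[1:]))
    return result
```

Only the tuples returned here would be handed to the pattern matcher; everything else in the diff is ignored.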

Patterns matched

Name	Pattern (regex)
AWS Access Key	`AKIA[0-9A-Z]{16}`
GitHub Token	`gh[pousr]_[A-Za-z0-9]{36,}`
GitLab Token	`glpat-[A-Za-z0-9_-]{20,}`
Anthropic API Key	`sk-ant-[A-Za-z0-9_-]{40,}`
OpenAI API Key	`sk-(?:proj-|live-)?[A-Za-z0-9]{40,}`
JWT	`eyJ[A-Za-z0-9_-]{8,}\.eyJ[A-Za-z0-9_-]{8,}\.[A-Za-z0-9_-]{8,}`
Stripe Live Key	`sk_live_[A-Za-z0-9]{24,}`
PEM Private Key	`-----BEGIN [A-Z ]*PRIVATE KEY-----`
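
As a rough sketch of how a table like this might be applied, the snippet below compiles the eight patterns and reports, per line, which pattern names matched. The names `PATTERNS` and `scan` are hypothetical, and the OpenAI pattern is written here with a plain `|` alternation, which is an assumption about the intended regex.

```python
import re

# Pattern names and expressions copied from the table above.
PATTERNS = {
    "AWS Access Key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "GitHub Token": re.compile(r"gh[pousr]_[A-Za-z0-9]{36,}"),
    "GitLab Token": re.compile(r"glpat-[A-Za-z0-9_-]{20,}"),
    "Anthropic API Key": re.compile(r"sk-ant-[A-Za-z0-9_-]{40,}"),
    "OpenAI API Key": re.compile(r"sk-(?:proj-|live-)?[A-Za-z0-9]{40,}"),
    "JWT": re.compile(r"eyJ[A-Za-z0-9_-]{8,}\.eyJ[A-Za-z0-9_-]{8,}\.[A-Za-z0-9_-]{8,}"),
    "Stripe Live Key": re.compile(r"sk_live_[A-Za-z0-9]{24,}"),
    "PEM Private Key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def scan(lines):
    """Map line number -> list of matching pattern names.

    `lines` is an iterable of (line_number, text) pairs. The matched
    substring itself is never stored, only the pattern name.
    """
    findings = {}
    for number, text in lines:
        names = [name for name, rx in PATTERNS.items() if rx.search(text)]
        if names:
            findings[number] = names
    return findings
```

Returning names rather than matches mirrors the reporting rule in this section: a caller can print line numbers and pattern names without ever holding the secret.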

What the user sees

⚠  Possible secrets detected in diff (2 line(s)):
   line 42: AWS Access Key
   line 87: GitHub Token, OpenAI API Key

   Send to LLM anyway? [y/N]:

The matched substring is never printed — only line numbers and pattern names. This keeps the secret out of stderr (and any CI log that captures it) even when the scanner fires.

Behavior matrix

Context	Outcome
TTY, no `--allow-secrets`, no `--yes`	Prompt; default `no` → abort.
TTY, `--yes`	Prompt still shows. `--yes` does not bypass the scanner.
Non-TTY, no `--allow-secrets`	Abort with `Aborted (non-interactive); pass --allow-secrets to override.`
Any context, `--allow-secrets`	No prompt; the review proceeds.
`guard.secret_scan: false` in config	Skip the scanner entirely.

The matched-line list is still printed to stderr in the --allow-secrets case — you opted in, but you still see what was flagged.
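
The matrix above reduces to a small decision function. The sketch below is illustrative only: the function name, parameter names, and the three outcome strings are hypothetical, and it models only the prompt/abort/proceed decision, not what gets printed.

```python
def secret_guard_outcome(found_secrets: bool, is_tty: bool,
                         allow_secrets: bool = False,
                         scan_enabled: bool = True,
                         yes: bool = False) -> str:
    """Return 'proceed', 'prompt', or 'abort' for the secret-scanner guard."""
    # `yes` is accepted but deliberately unused: per the matrix,
    # --yes does not bypass the secret scanner.
    if not scan_enabled or allow_secrets or not found_secrets:
        return "proceed"
    # Something was flagged and no override is active.
    return "prompt" if is_tty else "abort"
```

For example, `secret_guard_outcome(True, is_tty=True, yes=True)` still returns `"prompt"`, while the same call with `is_tty=False` returns `"abort"`.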

Disabling

# Per-invocation
commitbrief --staged --allow-secrets

# Per-config (whole binary)
commitbrief config set guard.secret_scan false

The whole-binary opt-out is appropriate if you have a separate secrets-management layer (gitleaks, trufflehog, pre-commit hooks) doing the same job and the prompt is just noise.

Cost preflight

Pre-send guard that estimates the dollar cost of a review and prompts (or, in non-interactive runs, aborts) when the estimate exceeds the configured threshold. Catches “oops, I pasted a 50k-line generated file” before tokens are actually spent.

When it runs

Right after the cache lookup, before the provider call. A cache hit skips the preflight entirely (no provider call, no cost).

How the estimate is computed

estimated_input_tokens   = chars / 4   (well-known approximation)
estimated_output_tokens  = clamp(estimated_input_tokens / 4, 200, 1500)
estimated_cost           = provider.Pricing(model).Cost(...)

Output tokens are floored at 200 and capped at 1500. The floor matters because underestimating output would systematically understate the bill for models whose output tokens are expensive.
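
A minimal sketch of this arithmetic follows. The `Pricing` class here is a stand-in, not CommitBrief's real type: its field names and the per-million-token unit are assumptions, as is the use of integer division.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    # USD per one million tokens (hypothetical fields and unit).
    input_per_mtok: float = 0.0
    output_per_mtok: float = 0.0

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        total = (input_tokens * self.input_per_mtok
                 + output_tokens * self.output_per_mtok)
        return total / 1_000_000


def estimate_tokens(prompt_chars: int) -> tuple[int, int]:
    """Apply the chars/4 rule, then clamp the output estimate to [200, 1500]."""
    input_tokens = prompt_chars // 4
    output_tokens = max(200, min(input_tokens // 4, 1500))
    return input_tokens, output_tokens
```

With these definitions, a 400-character prompt estimates to 100 input tokens and hits the 200-token output floor, while a 40,000-character prompt estimates to 10,000 input tokens and hits the 1500-token cap.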

Threshold

cost:
  warn_threshold_usd: 0.50    # default; <=0 disables

Bump it for scheduled jobs (a CI runner doing 100 reviews/day cares less about a single $1 review):

commitbrief config set cost.warn_threshold_usd 5.0

Behavior matrix

Estimate	Context	Outcome
`<= threshold`	any	Silent. Review proceeds.
`> threshold`	TTY, no `--no-cost-check`, no `--yes`	Prompt; default no → abort.
`> threshold`	TTY, `--yes`	Prompt still shows. `--yes` does not bypass.
`> threshold`	Non-TTY, no `--no-cost-check`	Abort.
any	`--no-cost-check`	Skip the preflight entirely.
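
The cost matrix can be read as the decision function sketched below. It is an illustration under stated assumptions: the function and parameter names and the outcome strings are hypothetical, and the `threshold_usd <= 0` branch follows the "<=0 disables" comment in the Threshold section.

```python
def cost_preflight(estimate_usd: float, threshold_usd: float, is_tty: bool,
                   no_cost_check: bool = False, yes: bool = False) -> str:
    """Return 'proceed', 'prompt', or 'abort' for the cost preflight."""
    # `yes` is deliberately unused: --yes does not bypass the preflight.
    if no_cost_check or threshold_usd <= 0:
        return "proceed"
    if estimate_usd <= threshold_usd:
        return "proceed"
    # Over the threshold: ask a human if there is one, otherwise stop.
    return "prompt" if is_tty else "abort"
```

An estimate of exactly the threshold proceeds silently, since the matrix uses `<=` for the silent case.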

Providers with zero pricing

Ollama, claude-cli, and gemini-cli return Pricing{} (all zero). The preflight short-circuits silently — estimated cost is always 0, never above the threshold. The verbose footer shows — instead of a dollar figure.

The `.commitbrief/` guard

Any diff touching files under .commitbrief/ (excluding the root COMMITBRIEF.md and .commitbriefignore) prompts before any LLM call. The rationale: .commitbrief/ files are usually user-specific (per-repo config, OUTPUT.md template) and committing them may break other developers’ configurations or leak API keys.

Context	Outcome
TTY	Prompt; `n` → abort, `y` → proceed.
Non-TTY, no `--yes`	Auto-abort.
Any, `--yes`	Proceed without prompting.

This is the one guard --yes still bypasses (since v0.9.1 the secret scanner and cost preflight have dedicated bypass flags).
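
One way to picture this guard is a path check followed by the three-row decision above. The sketch is hypothetical in its names and outcome strings, and it assumes changed paths are repo-relative with forward slashes.

```python
from pathlib import PurePosixPath


def guarded_paths(changed_paths):
    """Return the changed paths that live under the .commitbrief/ directory."""
    # Root-level COMMITBRIEF.md and .commitbriefignore have a different
    # first path component, so they are not picked up here.
    return [p for p in changed_paths
            if PurePosixPath(p).parts[:1] == (".commitbrief",)]


def dir_guard_outcome(changed_paths, is_tty: bool, yes: bool = False) -> str:
    """Return 'proceed', 'prompt', or 'abort' for the .commitbrief/ guard."""
    if not guarded_paths(changed_paths) or yes:
        return "proceed"
    return "prompt" if is_tty else "abort"
```

Unlike the other two guards, `yes=True` short-circuits to `"proceed"` here, matching the last row of the table.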

Cache

CommitBrief caches LLM responses on disk so re-running a review on an unchanged diff is essentially free.

Cache key

SHA-256 over the concatenation:

diff_text + system_prompt + provider_name + model + lang_code + schema_version

Change any one input and you get a fresh review:

Editing the diff (staging more files, amending the commit, etc.).
Editing COMMITBRIEF.md (system prompt differs).
Switching providers (--provider gemini).
Switching models (--model gpt-4o-mini).
Switching locale (--lang tr).
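
The key derivation can be sketched as below. The function name is hypothetical, and UTF-8 encoding and plain concatenation without separators are assumptions based on the formula above; the real tool may serialize its inputs differently.

```python
import hashlib


def cache_key(diff_text: str, system_prompt: str, provider: str,
              model: str, lang: str, schema_version: int) -> str:
    """SHA-256 hex digest over the concatenated cache-key inputs."""
    digest = hashlib.sha256()
    for part in (diff_text, system_prompt, provider, model, lang,
                 str(schema_version)):
        # Feeding parts one by one is equivalent to hashing their concatenation.
        digest.update(part.encode("utf-8"))
    return digest.hexdigest()
```

Because every input feeds the same hash, changing any single one (for instance the model name) yields a different key and therefore a cache miss.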

Location

<repo-root>/.commitbrief/cache/ — one JSON file per cached review. Per-repo: commitbrief cache clear in one repo does not touch another repo’s cache. Gitignored automatically.

TTL and disable

cache:
  enabled: true      # false skips reads + writes entirely
  ttl_days: 7        # 0 → DefaultTTL (7 days); cannot be negative

Per-invocation: --no-cache to force a fresh provider call.

Management commands

# Wipe everything for this repo.
commitbrief cache clear

# Bounded cleanup with defaults (keep newest 500 + last 7 days).
commitbrief cache prune

# Trim aggressively.
commitbrief cache prune --keep-last 100 --older-than 1d

# Per-provider/model scope.
commitbrief cache prune --provider anthropic --model claude-opus-4-7

commitbrief cache prune accepts whole-number durations with the units d / w / m (30 days) / y (365 days). Decimals, negatives, and the stdlib-style h/m/s units are all rejected, so m always means a 30-day month and never minutes.
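
A parser with these rules might look like the sketch below. It is not the tool's actual parser: the function name is hypothetical, and returning a whole number of days is an assumption.

```python
import re

# Days per unit, as listed above.
_UNIT_DAYS = {"d": 1, "w": 7, "m": 30, "y": 365}
# One or more ASCII digits followed by exactly one accepted unit.
_DURATION = re.compile(r"([0-9]+)([dwmy])")


def parse_older_than(value: str) -> int:
    """Convert a duration such as '2w' into a number of days."""
    match = _DURATION.fullmatch(value)
    if match is None:
        # Covers decimals ("1.5d"), negatives ("-1d"), and h/s units ("12h").
        raise ValueError(f"unsupported duration: {value!r}")
    return int(match.group(1)) * _UNIT_DAYS[match.group(2)]
```

Using `fullmatch` is what rejects anything beyond digits plus a single unit, so a value like `1d12h` fails instead of being partially parsed.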

On-disk format

{
  "version": 1,
  "created_at": "2026-05-27T18:29:13Z",
  "ttl": 604800,
  "key": {
    "diff_hash": "sha256:abc123...",
    "system_prompt_hash": "sha256:def456...",
    "provider": "anthropic",
    "model": "claude-opus-4-7",
    "lang": "en"
  },
  "result": {
    "content": "<LLM response>",
    "format": "json",
    "tokens": { "input": 2105, "output": 526, "cached": 0 }
  }
}

result.format is json on the happy path, markdown-fallback when the LLM produced unparseable JSON twice, or plain-text for CLI-tool-backed providers.
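
As an illustration of how such an entry could be checked for freshness, the sketch below reads `created_at` and `ttl` from the JSON. It assumes `ttl` is in seconds (604800 seconds is 7 days, matching the default `ttl_days`) and that an entry expires exactly at `created_at + ttl`; the function name is hypothetical.

```python
import json
from datetime import datetime, timedelta, timezone


def is_fresh(entry_json: str, now: datetime) -> bool:
    """Return True while the cache entry is still within its TTL."""
    entry = json.loads(entry_json)
    # Normalize the trailing "Z" so older Python versions can parse it.
    created = datetime.fromisoformat(entry["created_at"].replace("Z", "+00:00"))
    return now < created + timedelta(seconds=entry["ttl"])
```

For the sample entry above, a check on 2026-05-30 would report the entry as fresh, and a check on 2026-06-04 would report it as expired.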

Cache-hit reporting

On a hit, the verbose footer shows Saved: $X (the cost figure that would have been spent) and tokens are marked as (local cache hit). The pipeline still rebuilds the prompt (so the dry-run cache-key matches) but skips the provider call.
