# Documentation
Everything you need to get DiffFence running in your project.
## Quick Start

Get up and running in under a minute.

### Install

```sh
npm install -D difffence
```

### Set your API key

```sh
# OpenAI (default)
export OPENAI_API_KEY=sk-...

# or Anthropic
export ANTHROPIC_API_KEY=sk-ant-...
```

### Run on your feature branch

```sh
# From your feature branch with uncommitted or committed changes
npx difffence run
```
DiffFence will diff against your base branch (auto-detected), generate tests, run them on both branches, and show you what broke.
Run `npx difffence init` to generate a `.difffencerc.json` with your preferences. This is not required — DiffFence has sensible defaults.
## How It Works

DiffFence runs a 6-stage pipeline on every diff. Each stage feeds into the next.

### Intent-aware mutant generation

Unlike naive test generators, DiffFence uses a 3-phase approach:
- Infer intent — the LLM describes what the developer is trying to change
- Generate mutants — 3-5 plausible implementation mistakes for that intent
- Synthesize tests — tests that catch each mutant (pass on base, fail if the mutant were real)
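The three phases above can be sketched as typed stages. Everything below is illustrative: the type names and the `killsMutant` check are assumptions for the sketch, not DiffFence's actual internals.

```typescript
// Illustrative types for the 3-phase flow (names are assumptions).
interface Intent { summary: string; }
interface Mutant { id: number; description: string; }
interface CandidateTest { name: string; killsMutant: (m: Mutant) => boolean; }

// Phase 3 filter: keep only tests that catch at least one mutant.
// A "catch" means the test passes on the base implementation but
// would fail if the mutant's mistake were actually present.
function keepCatchingTests(tests: CandidateTest[], mutants: Mutant[]): CandidateTest[] {
  return tests.filter((t) => mutants.some((m) => t.killsMutant(m)));
}

// Toy example: one mutant, two candidate tests.
const mutants: Mutant[] = [{ id: 1, description: "off-by-one in loop bound" }];
const tests: CandidateTest[] = [
  { name: "covers boundary", killsMutant: (m) => m.id === 1 },
  { name: "covers happy path only", killsMutant: () => false },
];
const kept = keepCatchingTests(tests, mutants);
```

A candidate test that kills no mutant is discarded, which is what keeps the output focused on plausible regressions rather than generic coverage.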
## CLI Reference

### Commands
| Command | Description |
|---|---|
| `difffence run` | Run the full pipeline on the current diff |
| `difffence ci` | CI mode — structured output, exit codes, auto-post to PR |
| `difffence init` | Generate a `.difffencerc.json` config file |
| `difffence history` | List previous runs stored in the local database |
| `difffence show <run_id>` | View detailed results from a past run |
| `difffence feedback <catch_id> <rating>` | Rate a catch (`thumbs_up` or `thumbs_down`) |
| `difffence tune` | View auto-tuning recommendations based on feedback |
| `difffence usage` | Show LLM token usage and cost summary |
| `difffence usage pr` | Show per-PR usage breakdown |
| `difffence token create` | Create an API token for programmatic access |
| `difffence token list` | List existing API tokens |
| `difffence token revoke` | Revoke an API token |
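As a sketch of what `difffence usage` might be summing under the hood; the prices and field names below are placeholders for illustration, not DiffFence's real data model or actual provider pricing.

```typescript
// Hypothetical per-1M-token prices in USD (placeholder numbers, not real pricing).
const PRICE_PER_M = { input: 2.5, output: 10 };

interface UsageRow { inputTokens: number; outputTokens: number; }

// Roll per-call token counts up into a dollar estimate, the way a
// usage summary command might.
function estimateCost(rows: UsageRow[]): number {
  const input = rows.reduce((s, r) => s + r.inputTokens, 0);
  const output = rows.reduce((s, r) => s + r.outputTokens, 0);
  return (input * PRICE_PER_M.input + output * PRICE_PER_M.output) / 1_000_000;
}
```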
### Flags

| Flag | Description | Default |
|---|---|---|
| `-b, --base <branch>` | Base branch to diff against | auto-detect |
| `-p, --provider <name>` | LLM provider (`openai` or `anthropic`) | `openai` |
| `-m, --model <model>` | Model name | `gpt-4o` |
| `-f, --framework <name>` | Test framework (`jest`, `vitest`, `mocha`) | auto-detect |
| `-t, --threshold <n>` | Confidence threshold (0-1) | `0.6` |
| `-k, --api-key <key>` | API key (overrides env var) | — |
| `--export` | Export catching tests as files | off |
| `--pr-comment` | Generate PR comment markdown | off |
| `--post-to-pr <n>` | Auto-post results to GitHub PR #n | — |
| `--no-ensemble` | Single-LLM scoring instead of 3-vote ensemble | ensemble on |
| `--json` | Machine-readable JSON output | off |
| `--verbose` | Show detailed pipeline logs | off |
| `--output-format <fmt>` | `text`, `json`, or `github-actions` | `text` |
| `--fail-threshold <n>` | Min catches to fail CI (`ci` mode) | `1` |
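The `-t/--threshold` flag gates which catches are surfaced. A minimal sketch, assuming each catch carries a confidence score in [0, 1] (the `Catch` shape is an assumption):

```typescript
// A catch as the threshold filter might see it (shape is illustrative).
interface Catch { testName: string; confidence: number; }

// Keep only catches at or above the confidence threshold (default 0.6,
// matching the flag's documented default).
function surfacedCatches(catches: Catch[], threshold = 0.6): Catch[] {
  return catches.filter((c) => c.confidence >= threshold);
}
```

Raising the threshold trades recall for precision: fewer catches are surfaced, but each is more likely to be a real regression.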
## Configuration

Create `.difffencerc.json` in your repo root, or run `difffence init` to generate one.
```json
{
  "provider": {
    "name": "openai",
    "model": "gpt-4o",
    "apiKeyEnv": "OPENAI_API_KEY"
  },
  "analysis": {
    "baseBranch": "main",
    "maxFunctions": 20,
    "ignorePaths": ["dist/", "node_modules/", "*.test.*"]
  },
  "generation": {
    "framework": "jest",
    "maxCandidatesPerFunction": 5,
    "temperature": 0.3
  },
  "assessment": {
    "threshold": 0.6,
    "preset": "balanced"
  },
  "delivery": {
    "prComments": { "enabled": true },
    "testExport": { "enabled": false, "outputDir": "__catching_tests__" },
    "dashboard": { "enabled": true, "port": 4200 }
  }
}
```
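Loading this file likely amounts to merging your partial config over built-in defaults. A sketch of that merge for two sections; the shallow per-section strategy is an assumption, and the defaults below simply mirror the example config:

```typescript
// Built-in defaults (mirroring the example config above).
const DEFAULTS = {
  provider: { name: "openai", model: "gpt-4o", apiKeyEnv: "OPENAI_API_KEY" },
  assessment: { threshold: 0.6, preset: "balanced" },
};

type Config = typeof DEFAULTS;
type PartialConfig = Partial<{ [K in keyof Config]: Partial<Config[K]> }>;

// Merge a user's partial .difffencerc.json over the defaults,
// section by section, so unspecified fields keep their defaults.
function mergeConfig(user: PartialConfig): Config {
  return {
    provider: { ...DEFAULTS.provider, ...user.provider },
    assessment: { ...DEFAULTS.assessment, ...user.assessment },
  };
}
```

With this strategy, a file containing only `{ "provider": { "name": "anthropic" } }` still gets the default model and threshold.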
### Config fields

| Field | Description |
|---|---|
| `provider.name` | `openai` or `anthropic` |
| `provider.model` | Model to use (e.g. `gpt-4o`, `claude-sonnet-4-20250514`) |
| `provider.apiKeyEnv` | Name of the env var holding the API key |
| `analysis.baseBranch` | Branch to diff against (`main`, `master`, etc.) |
| `analysis.maxFunctions` | Max functions to analyze per run |
| `analysis.ignorePaths` | Glob patterns to exclude from analysis |
| `generation.framework` | Test framework: `jest`, `vitest`, or `mocha` |
| `generation.maxCandidatesPerFunction` | Max test candidates generated per function |
| `generation.temperature` | LLM temperature (lower = more deterministic) |
| `assessment.threshold` | Min confidence to surface a catch (0-1) |
| `assessment.preset` | `strict` (fewer, higher-quality catches) or `balanced` |
API key environment variables (`OPENAI_API_KEY`, etc.) are always respected.
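The resolution order implied by the flags table (`-k` beats the env var named by `provider.apiKeyEnv`) can be sketched as follows; this ordering is inferred from the flag's "overrides env var" description, not from documented internals:

```typescript
// Resolve the API key: an explicit --api-key flag wins, otherwise read
// the env var named in config. `env` is passed in explicitly here
// (it would be process.env in the real CLI).
function resolveApiKey(
  apiKeyEnv: string,
  flagKey: string | undefined,
  env: Record<string, string | undefined>,
): string | undefined {
  return flagKey ?? env[apiKeyEnv];
}
```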
## CI Integration

### GitHub Actions
```yaml
name: DiffFence
on: [pull_request]

jobs:
  difffence:
    runs-on: ubuntu-latest
    permissions:
      pull-requests: write
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0 # Full history needed
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
      - run: npm ci
      - name: Run DiffFence
        run: npx difffence ci --output-format github-actions
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```
### What CI mode does

- Auto-detects the PR branch and base branch from `GITHUB_REF`
- Runs the full 6-stage pipeline
- Posts results as PR review comments (when `GITHUB_TOKEN` is set)
- Uses GitHub Actions annotation format (`::error::`, `::warning::`)
- Exits with code 1 if regressions are found (configurable via `--fail-threshold`)
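On `pull_request` events, GitHub Actions sets `GITHUB_REF` to `refs/pull/<n>/merge`, so PR detection reduces to a parse like the one below. How DiffFence implements this internally isn't shown in these docs; this is the standard approach:

```typescript
// Extract the PR number from GITHUB_REF on pull_request events,
// where the ref has the form "refs/pull/<n>/merge".
function prNumberFromRef(ref: string): number | null {
  const m = /^refs\/pull\/(\d+)\/merge$/.exec(ref);
  return m ? Number(m[1]) : null;
}
```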
Set `fetch-depth: 0` in your checkout step: DiffFence needs full git history to diff between branches and create worktrees.
## Web Dashboard

DiffFence includes a web UI for running the pipeline visually and browsing results.

```sh
npx tsx web-ui/server.ts
# Opens at http://localhost:4200
```
The dashboard provides:
- Pipeline runner — enter a repo path and run the pipeline with real-time SSE streaming
- Run history — browse past runs, catches, and their verdicts
- Analytics — precision stats, coverage heatmap, run trends
- Feedback — rate catches directly from the UI
- Usage tracking — monitor LLM token usage and costs
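The real-time stream uses standard Server-Sent Events framing. Below is a minimal parser for the `data:` payloads, with the payload shape left opaque since the dashboard's event schema isn't documented here:

```typescript
// Parse an SSE stream chunk into its data payloads. SSE frames are
// separated by a blank line; each payload line starts with "data: ".
function parseSseData(chunk: string): string[] {
  return chunk
    .split("\n\n")
    .filter((frame) => frame.trim().length > 0)
    .map((frame) =>
      frame
        .split("\n")
        .filter((line) => line.startsWith("data: "))
        .map((line) => line.slice("data: ".length))
        .join("\n"),
    );
}
```

A browser client would normally use `EventSource` instead and receive each payload in a `message` event; the parser above just makes the wire format concrete.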
## Feedback & Tuning

DiffFence learns from your feedback. Rating catches improves future accuracy.

### Rating catches

```sh
# Rate a catch as useful
difffence feedback 42 thumbs_up

# Rate a catch as noise
difffence feedback 42 thumbs_down
```

### Auto-tuning

After enough feedback, DiffFence can recommend configuration changes:

```sh
difffence tune
```
This analyzes your feedback history and suggests:
- Threshold adjustments — raise or lower the confidence threshold
- Rule disabling — turn off specific RubFake rules that produce false positives in your codebase
- Framework-specific tweaks — adjustments based on your test runner patterns
Tuning state is stored locally in your `.difffence/difffence.db` SQLite database.
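A toy version of what a threshold recommendation could look like. The cutoffs below (30% noise rate raises the threshold, under 5% lowers it) are invented for illustration and are not DiffFence's actual tuning heuristics:

```typescript
// Suggest a new confidence threshold from feedback counts.
// Cutoffs and step sizes are illustrative, not DiffFence's real ones.
function recommendThreshold(current: number, thumbsUp: number, thumbsDown: number): number {
  const total = thumbsUp + thumbsDown;
  if (total === 0) return current; // no feedback yet: leave as-is
  const noiseRate = thumbsDown / total;
  if (noiseRate > 0.3) return Math.min(1, current + 0.1); // too noisy: be stricter
  if (noiseRate < 0.05) return Math.max(0, current - 0.05); // very clean: surface more
  return current;
}
```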
## API Tokens

Create tokens for programmatic access to DiffFence's local API (used by the web dashboard and CI integrations).

```sh
# Create a token
difffence token create --name "ci-bot" --scopes "read,write"

# List tokens (values are masked)
difffence token list

# Revoke a token
difffence token revoke --id 3
```
Tokens are stored locally in the SQLite database. They use the `df_` prefix and support scoped access (`read`, `write`).
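Generating such a token is straightforward. Only the `df_` prefix comes from the docs; the 32-byte length and hex encoding below are assumptions for the sketch:

```typescript
import { randomBytes } from "node:crypto";

// Generate a df_-prefixed token from 32 cryptographically random bytes,
// hex-encoded (length and encoding are illustrative assumptions).
function createToken(): string {
  return "df_" + randomBytes(32).toString("hex");
}
```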
## LLM Providers

### OpenAI (default)

```sh
export OPENAI_API_KEY=sk-...
difffence run -p openai -m gpt-4o
```

### Anthropic

```sh
export ANTHROPIC_API_KEY=sk-ant-...
difffence run -p anthropic -m claude-sonnet-4-20250514
```
Custom API endpoints can be configured via `provider.baseUrl` in your config file.