# Documentation
Everything you need to get DiffFence running in your project.
## Quick Start

Get up and running in under a minute.

### Install

```sh
npm install -D difffence
```

### Set your API key

```sh
# OpenAI (default)
export OPENAI_API_KEY=sk-...

# or Anthropic
export ANTHROPIC_API_KEY=sk-ant-...
```

### Run on your feature branch

```sh
# From your feature branch with uncommitted or committed changes
npx difffence run
```
DiffFence will diff against your base branch (auto-detected), generate tests, run them on both branches, and show you what broke.
Run `npx difffence init` to generate a `.difffencerc.json` with your preferences. This is not required — DiffFence has sensible defaults.
## How It Works

DiffFence runs a 6-stage pipeline on every diff. Each stage feeds into the next.

### Intent-aware mutant generation

Unlike naive test generators, DiffFence uses a 3-phase approach:
- Infer intent — the LLM describes what the developer is trying to change
- Generate mutants — 3-5 plausible implementation mistakes for that intent
- Synthesize tests — tests that catch each mutant (pass on base, fail if the mutant were real)
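The three phases above can be sketched as typed stages. Everything below is illustrative: the type names and the `killsMutant` check are assumptions for the sketch, not DiffFence's actual internals.

```typescript
// Illustrative types for the 3-phase flow (names are assumptions).
interface Intent { summary: string; }
interface Mutant { id: number; description: string; }
interface CandidateTest { name: string; killsMutant: (m: Mutant) => boolean; }

// Phase 3 filter: keep only tests that catch at least one mutant.
// A "catch" means the test passes on the base implementation but
// would fail if the mutant's mistake were actually present.
function keepCatchingTests(tests: CandidateTest[], mutants: Mutant[]): CandidateTest[] {
  return tests.filter((t) => mutants.some((m) => t.killsMutant(m)));
}

// Toy example: one mutant, two candidate tests.
const mutants: Mutant[] = [{ id: 1, description: "off-by-one in loop bound" }];
const tests: CandidateTest[] = [
  { name: "covers boundary", killsMutant: (m) => m.id === 1 },
  { name: "covers happy path only", killsMutant: () => false },
];
const kept = keepCatchingTests(tests, mutants);
```

A candidate test that kills no mutant is discarded, which is what keeps the output focused on plausible regressions rather than generic coverage.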
## CLI Reference

### Commands
| Command | Description |
|---|---|
| `difffence run` | Run the full pipeline on the current diff |
| `difffence ci` | CI mode — structured output, exit codes, auto-post to PR |
| `difffence init` | Generate a `.difffencerc.json` config file |
| `difffence history` | List previous runs stored in the local database |
| `difffence show <run_id>` | View detailed results from a past run |
| `difffence feedback <catch_id> <rating>` | Rate a catch (`thumbs_up` or `thumbs_down`) |
| `difffence tune` | View auto-tuning recommendations based on feedback |
| `difffence usage` | Show LLM token usage and cost summary |
| `difffence usage pr` | Show per-PR usage breakdown |
| `difffence token create` | Create an API token for programmatic access |
| `difffence token list` | List existing API tokens |
| `difffence token revoke` | Revoke an API token |
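As a sketch of what `difffence usage` might be summing under the hood; the prices and field names below are placeholders for illustration, not DiffFence's real data model or actual provider pricing.

```typescript
// Hypothetical per-1M-token prices in USD (placeholder numbers, not real pricing).
const PRICE_PER_M = { input: 2.5, output: 10 };

interface UsageRow { inputTokens: number; outputTokens: number; }

// Roll per-call token counts up into a dollar estimate, the way a
// usage summary command might.
function estimateCost(rows: UsageRow[]): number {
  const input = rows.reduce((s, r) => s + r.inputTokens, 0);
  const output = rows.reduce((s, r) => s + r.outputTokens, 0);
  return (input * PRICE_PER_M.input + output * PRICE_PER_M.output) / 1_000_000;
}
```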
### Flags

| Flag | Description | Default |
|---|---|---|
| `-b, --base <branch>` | Base branch to diff against | auto-detect |
| `-p, --provider <name>` | LLM provider (`openai` or `anthropic`) | `openai` |
| `-m, --model <model>` | Model name | `gpt-4o` |
| `-f, --framework <name>` | Test framework (`jest`, `vitest`, `mocha`) | auto-detect |
| `-t, --threshold <n>` | Confidence threshold (0-1) | `0.6` |
| `-k, --api-key <key>` | API key (overrides env var) | — |
| `--export` | Export catching tests as files | off |
| `--pr-comment` | Generate PR comment markdown | off |
| `--post-to-pr <n>` | Auto-post results to GitHub PR #n | — |
| `--no-ensemble` | Single-LLM scoring instead of 3-vote ensemble | ensemble on |
| `--json` | Machine-readable JSON output | off |
| `--verbose` | Show detailed pipeline logs | off |
| `--output-format <fmt>` | `text`, `json`, or `github-actions` | `text` |
| `--fail-threshold <n>` | Min catches to fail CI (`ci` mode) | `1` |
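The `-t/--threshold` flag gates which catches are surfaced. A minimal sketch, assuming each catch carries a confidence score in [0, 1] (the `Catch` shape is an assumption):

```typescript
// A catch as the threshold filter might see it (shape is illustrative).
interface Catch { testName: string; confidence: number; }

// Keep only catches at or above the confidence threshold (default 0.6,
// matching the flag's documented default).
function surfacedCatches(catches: Catch[], threshold = 0.6): Catch[] {
  return catches.filter((c) => c.confidence >= threshold);
}
```

Raising the threshold trades recall for precision: fewer catches are surfaced, but each is more likely to be a real regression.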
## Configuration

Create `.difffencerc.json` in your repo root, or run `difffence init` to generate one.
```json
{
  "provider": {
    "name": "openai",
    "model": "gpt-4o",
    "apiKeyEnv": "OPENAI_API_KEY"
  },
  "analysis": {
    "baseBranch": "main",
    "maxFunctions": 20,
    "ignorePaths": ["dist/", "node_modules/", "*.test.*"]
  },
  "generation": {
    "framework": "jest",
    "maxCandidatesPerFunction": 5,
    "temperature": 0.3
  },
  "assessment": {
    "threshold": 0.6,
    "preset": "balanced"
  },
  "delivery": {
    "prComments": { "enabled": true },
    "testExport": { "enabled": false, "outputDir": "__catching_tests__" },
    "dashboard": { "enabled": true, "port": 4200 }
  }
}
```
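Loading this file likely amounts to merging your partial config over built-in defaults. A sketch of that merge for two sections; the shallow per-section strategy is an assumption, and the defaults below simply mirror the example config:

```typescript
// Built-in defaults (mirroring the example config above).
const DEFAULTS = {
  provider: { name: "openai", model: "gpt-4o", apiKeyEnv: "OPENAI_API_KEY" },
  assessment: { threshold: 0.6, preset: "balanced" },
};

type Config = typeof DEFAULTS;
type PartialConfig = Partial<{ [K in keyof Config]: Partial<Config[K]> }>;

// Merge a user's partial .difffencerc.json over the defaults,
// section by section, so unspecified fields keep their defaults.
function mergeConfig(user: PartialConfig): Config {
  return {
    provider: { ...DEFAULTS.provider, ...user.provider },
    assessment: { ...DEFAULTS.assessment, ...user.assessment },
  };
}
```

With this strategy, a file containing only `{ "provider": { "name": "anthropic" } }` still gets the default model and threshold.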
### Config fields

| Field | Description |
|---|---|
| `provider.name` | `openai` or `anthropic` |
| `provider.model` | Model to use (e.g. `gpt-4o`, `claude-sonnet-4-20250514`) |
| `provider.apiKeyEnv` | Name of the env var holding the API key |
| `analysis.baseBranch` | Branch to diff against (`main`, `master`, etc.) |
| `analysis.maxFunctions` | Max functions to analyze per run |
| `analysis.ignorePaths` | Glob patterns to exclude from analysis |
| `generation.framework` | Test framework: `jest`, `vitest`, or `mocha` |
| `generation.maxCandidatesPerFunction` | Max test candidates generated per function |
| `generation.temperature` | LLM temperature (lower = more deterministic) |
| `assessment.threshold` | Min confidence to surface a catch (0-1) |
| `assessment.preset` | `strict` (fewer, higher-quality catches) or `balanced` |
API key environment variables (`OPENAI_API_KEY`, etc.) are always respected.
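The resolution order implied by the flags table (`-k` beats the env var named by `provider.apiKeyEnv`) can be sketched as follows; this ordering is inferred from the flag's "overrides env var" description, not from documented internals:

```typescript
// Resolve the API key: an explicit --api-key flag wins, otherwise read
// the env var named in config. `env` is passed in explicitly here
// (it would be process.env in the real CLI).
function resolveApiKey(
  apiKeyEnv: string,
  flagKey: string | undefined,
  env: Record<string, string | undefined>,
): string | undefined {
  return flagKey ?? env[apiKeyEnv];
}
```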
## CI Integration

### GitHub Actions
```yaml
name: DiffFence
on: [pull_request]

jobs:
  difffence:
    runs-on: ubuntu-latest
    permissions:
      pull-requests: write
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0 # Full history needed
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
      - run: npm ci
      - name: Run DiffFence
        run: npx difffence ci --output-format github-actions
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```
### What CI mode does

- Auto-detects the PR branch and base branch from `GITHUB_REF`
- Runs the full 6-stage pipeline
- Posts results as PR review comments (when `GITHUB_TOKEN` is set)
- Uses GitHub Actions annotation format (`::error::`, `::warning::`)
- Exits with code 1 if regressions are found (configurable via `--fail-threshold`)
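On `pull_request` events, GitHub Actions sets `GITHUB_REF` to `refs/pull/<n>/merge`, so PR detection reduces to a parse like the one below. How DiffFence implements this internally isn't shown in these docs; this is the standard approach:

```typescript
// Extract the PR number from GITHUB_REF on pull_request events,
// where the ref has the form "refs/pull/<n>/merge".
function prNumberFromRef(ref: string): number | null {
  const m = /^refs\/pull\/(\d+)\/merge$/.exec(ref);
  return m ? Number(m[1]) : null;
}
```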
Set `fetch-depth: 0` in your checkout step: DiffFence needs full git history to diff between branches and create worktrees.
## Web Dashboard

DiffFence includes a web UI for running the pipeline visually and browsing results.

```sh
npx tsx web-ui/server.ts
# Opens at http://localhost:4200
```
The dashboard provides:
- Pipeline runner — enter a repo path and run the pipeline with real-time SSE streaming
- Run history — browse past runs, catches, and their verdicts
- Analytics — precision stats, coverage heatmap, run trends
- Feedback — rate catches directly from the UI
- Usage tracking — monitor LLM token usage and costs
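The real-time stream uses standard Server-Sent Events framing. Below is a minimal parser for the `data:` payloads, with the payload shape left opaque since the dashboard's event schema isn't documented here:

```typescript
// Parse an SSE stream chunk into its data payloads. SSE frames are
// separated by a blank line; each payload line starts with "data: ".
function parseSseData(chunk: string): string[] {
  return chunk
    .split("\n\n")
    .filter((frame) => frame.trim().length > 0)
    .map((frame) =>
      frame
        .split("\n")
        .filter((line) => line.startsWith("data: "))
        .map((line) => line.slice("data: ".length))
        .join("\n"),
    );
}
```

A browser client would normally use `EventSource` instead and receive each payload in a `message` event; the parser above just makes the wire format concrete.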
## Feedback & Tuning

DiffFence learns from your feedback. Rating catches improves future accuracy.

### Rating catches

```sh
# Rate a catch as useful
difffence feedback 42 thumbs_up

# Rate a catch as noise
difffence feedback 42 thumbs_down
```

### Auto-tuning

After enough feedback, DiffFence can recommend configuration changes:

```sh
difffence tune
```
This analyzes your feedback history and suggests:
- Threshold adjustments — raise or lower the confidence threshold
- Rule disabling — turn off specific RubFake rules that produce false positives in your codebase
- Framework-specific tweaks — adjustments based on your test runner patterns
Tuning state is stored locally in your `.difffence/difffence.db` SQLite database.
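A toy version of what a threshold recommendation could look like. The cutoffs below (30% noise rate raises the threshold, under 5% lowers it) are invented for illustration and are not DiffFence's actual tuning heuristics:

```typescript
// Suggest a new confidence threshold from feedback counts.
// Cutoffs and step sizes are illustrative, not DiffFence's real ones.
function recommendThreshold(current: number, thumbsUp: number, thumbsDown: number): number {
  const total = thumbsUp + thumbsDown;
  if (total === 0) return current; // no feedback yet: leave as-is
  const noiseRate = thumbsDown / total;
  if (noiseRate > 0.3) return Math.min(1, current + 0.1); // too noisy: be stricter
  if (noiseRate < 0.05) return Math.max(0, current - 0.05); // very clean: surface more
  return current;
}
```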
## API Tokens

Create tokens for programmatic access to DiffFence's local API (used by the web dashboard and CI integrations).

```sh
# Create a token
difffence token create --name "ci-bot" --scopes "read,write"

# List tokens (values are masked)
difffence token list

# Revoke a token
difffence token revoke --id 3
```
Tokens are stored locally in the SQLite database. They use the `df_` prefix and support scoped access (`read`, `write`).
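Generating such a token is straightforward. Only the `df_` prefix comes from the docs; the 32-byte length and hex encoding below are assumptions for the sketch:

```typescript
import { randomBytes } from "node:crypto";

// Generate a df_-prefixed token from 32 cryptographically random bytes,
// hex-encoded (length and encoding are illustrative assumptions).
function createToken(): string {
  return "df_" + randomBytes(32).toString("hex");
}
```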
## LLM Providers

### OpenAI (default)

```sh
export OPENAI_API_KEY=sk-...
difffence run -p openai -m gpt-4o
```

### Anthropic

```sh
export ANTHROPIC_API_KEY=sk-ant-...
difffence run -p anthropic -m claude-sonnet-4-20250514
```
Custom API endpoints can be configured via `provider.baseUrl` in your config file.