LLM Configuration Guide
✅ Production Status: All 5 LLM providers are production-ready (Phase 2 Complete - Nov 9, 2025)
OpenAI API, Anthropic API, Claude CLI, Codex CLI, Ollama
All providers support retry logic, cost tracking, and health checks
GPU acceleration available for Ollama (NVIDIA, AMD, Apple Silicon)
HTTP connection pooling and model auto-download features included
This guide covers advanced LLM configuration features including configuration files, presets, and environment variable interpolation.
Note: LLM features are supported by both the `apply` and `analyze` commands with identical configuration options.
See Also: Main Configuration Guide for basic LLM setup and provider-specific documentation.
Configuration File Support
The resolver supports YAML and TOML configuration files for LLM settings. This allows you to:
Store non-sensitive configuration in version control
Share team-wide LLM settings
Manage complex configurations more easily
Use environment variable interpolation for secrets
YAML Configuration
Create a config.yaml file:
llm:
  enabled: true
  provider: anthropic
  model: claude-sonnet-4-5
  api_key: ${ANTHROPIC_API_KEY}  # Environment variable reference
  fallback_to_regex: true
  cache_enabled: true
  max_tokens: 2000
  confidence_threshold: 0.6  # Reject changes below 60% confidence
  cost_budget: 5.0  # Note: cost_budget is advisory and not currently enforced (see [Sub-Issue #225](../planning/ROADMAP.md))
Use with:
# With apply command
pr-resolve apply 123 --config config.yaml
# With analyze command
pr-resolve analyze --pr 123 --owner myorg --repo myrepo --config config.yaml
TOML Configuration
Create a config.toml file:
[llm]
enabled = true
provider = "openai"
model = "gpt-4o-mini"
api_key = "${OPENAI_API_KEY}" # Environment variable reference
fallback_to_regex = true
cache_enabled = true
max_tokens = 2000
confidence_threshold = 0.6 # Reject changes below 60% confidence
cost_budget = 5.0 # Note: cost_budget is advisory and not currently enforced.
# This field allows users to express intended spending limits and
# serves as a placeholder for future enforcement/alerts (see [Sub-Issue #225](../planning/ROADMAP.md)).
Use with:
pr-resolve apply 123 --config config.toml
Configuration File Schema

| Field | Type | Default | Description |
|---|---|---|---|
| `enabled` | boolean | | Enable LLM-powered features |
| `provider` | string | | Provider name (`openai`, `anthropic`, `claude-cli`, `codex-cli`, `ollama`) |
| `model` | string | provider-specific | Model identifier (e.g., `gpt-4o-mini`, `claude-sonnet-4-5`) |
| `api_key` | string | | Must use `${VAR}` environment variable syntax |
| `fallback_to_regex` | boolean | | Fall back to regex parsing if LLM fails |
| `cache_enabled` | boolean | | Enable response caching |
| `max_tokens` | integer | | Maximum tokens per LLM request |
| `confidence_threshold` | float | | Minimum LLM confidence (0.0-1.0) required to accept changes |
| `cost_budget` | float | | Cost budget (advisory only, not currently enforced); expresses an intended spending limit and is a placeholder for future enforcement/alerts (see Sub-Issue #225) |
| `base_url` | string | | Ollama server URL (Ollama only) |
| `effort` | string | | Effort level for the speed/cost vs accuracy tradeoff (`none`, `low`, `medium`, `high`) |
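To make the interplay between `confidence_threshold` and `fallback_to_regex` concrete, here is a minimal, illustrative Python sketch. `ParsedChange` and `accept_changes` are hypothetical names used only for this example, not part of the pr-resolve API.

```python
# Illustrative sketch only: how confidence_threshold and fallback_to_regex
# interact conceptually. Names here are hypothetical, not pr-resolve internals.
from dataclasses import dataclass


@dataclass
class ParsedChange:
    file: str
    patch: str
    confidence: float  # 0.0-1.0, reported by the LLM parser


def accept_changes(changes, confidence_threshold=0.6, fallback_to_regex=True):
    """Keep LLM-parsed changes at or above the threshold; optionally fall back."""
    accepted = [c for c in changes if c.confidence >= confidence_threshold]
    rejected = [c for c in changes if c.confidence < confidence_threshold]
    if rejected and fallback_to_regex:
        # Low-confidence comments would be re-parsed with the regex parser instead.
        print(f"Falling back to regex parsing for {len(rejected)} low-confidence changes")
    return accepted


changes = [ParsedChange("src/app.py", "...", 0.92), ParsedChange("README.md", "...", 0.41)]
print(accept_changes(changes))  # only the 0.92-confidence change is accepted
```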
LLM Effort Level
The `--llm-effort` option controls the speed/cost vs accuracy tradeoff for LLM providers that support extended reasoning capabilities.
Effort Levels

| Level | Description | Use Case |
|---|---|---|
| `none` | Fastest, minimal reasoning | Quick parsing, cost-sensitive |
| `low` | Light reasoning | Balanced speed/accuracy |
| `medium` | Moderate reasoning | Complex comments |
| `high` | Most thorough reasoning | Maximum accuracy, complex parsing |
Provider Support

| Provider | Parameter | Notes |
|---|---|---|
| OpenAI | | Supported on GPT-5.x models |
| Anthropic | | Supported on Claude Opus 4.5 |
| Ollama | Not supported | Uses standard inference |
| Claude CLI | Not supported | Uses standard inference |
| Codex CLI | Not supported | Uses standard inference |
Usage
CLI flag:
# Fast parsing (minimal reasoning)
pr-resolve apply 123 --llm-effort none
# Maximum accuracy (thorough reasoning)
pr-resolve apply 123 --llm-effort high
# With analyze command
pr-resolve analyze --pr 123 --owner myorg --repo myrepo --llm-effort medium
Environment variable:
export CR_LLM_EFFORT=medium
pr-resolve apply 123
Configuration file:
llm:
  enabled: true
  provider: anthropic
  model: claude-opus-4-5
  api_key: ${ANTHROPIC_API_KEY}
  effort: high  # Maximum reasoning for complex parsing
Cost Considerations
Higher effort levels typically increase:
Response latency (more reasoning time)
Token usage (reasoning tokens counted)
Per-request cost
For cost-sensitive deployments, use the `none` or `low` effort levels. Reserve `high` for complex parsing tasks where accuracy is critical.
LLM Presets
Presets provide zero-config LLM setup with sensible defaults for common use cases.
Available Presets

| Preset | Provider | Model | Status | Cost | Requires |
|---|---|---|---|---|---|
| `codex-cli-free` | Codex CLI | `codex` | ✅ Production | Free | GitHub Copilot subscription |
| `ollama-local` | Ollama | `qwen2.5-coder:7b` | ✅ Production + GPU | Free | Local Ollama + GPU (optional) |
| `claude-cli-sonnet` | Claude CLI | `claude-sonnet-4-5` | ✅ Production | Free | Claude subscription |
| `openai-api-mini` | OpenAI API | `gpt-4o-mini` | ✅ Production | ~$0.15/1M tokens | API key ($5 budget) |
| `anthropic-api-balanced` | Anthropic API | `claude-haiku-4` | ✅ Production | ~$0.25/1M tokens | API key ($5 budget) |
Using Presets
CLI-Based Presets (Free)
No API key required:
# GitHub Codex (requires Copilot subscription)
pr-resolve apply 123 --llm-preset codex-cli-free
pr-resolve analyze --pr 123 --owner myorg --repo myrepo --llm-preset codex-cli-free
# Local Ollama (requires ollama installation)
pr-resolve apply 123 --llm-preset ollama-local
pr-resolve analyze --pr 123 --owner myorg --repo myrepo --llm-preset ollama-local
# Claude CLI (requires Claude subscription)
pr-resolve apply 123 --llm-preset claude-cli-sonnet
pr-resolve analyze --pr 123 --owner myorg --repo myrepo --llm-preset claude-cli-sonnet
API-Based Presets (Paid)
These presets require an API key, supplied via an environment variable or CLI flag:
# OpenAI (low-cost)
export OPENAI_API_KEY="sk-..."
pr-resolve apply 123 --llm-preset openai-api-mini
# Anthropic (balanced, with caching)
export ANTHROPIC_API_KEY="sk-ant-..."
pr-resolve apply 123 --llm-preset anthropic-api-balanced
# Or pass API key via CLI flag
pr-resolve apply 123 --llm-preset openai-api-mini --llm-api-key sk-...
Available Presets (Provider-Specific)
The following LLM presets are available:
codex-cli-free: Free Codex CLI - Requires GitHub Copilot subscription
Provider: codex-cli
Model: codex
Requires API key: No
ollama-local: Local Ollama - Free, private, offline (recommended: qwen2.5-coder:7b)
Provider: ollama
Model: qwen2.5-coder:7b
Requires API key: No
claude-cli-sonnet: Claude CLI with Sonnet 4.5 - Requires Claude subscription
Provider: claude-cli
Model: claude-sonnet-4-5
Requires API key: No
openai-api-mini: OpenAI GPT-4o-mini - Low-cost API (requires API key)
Provider: openai
Model: gpt-4o-mini
Requires API key: Yes
Cost budget: $5.00
anthropic-api-balanced: Anthropic Claude Haiku 4 - Balanced cost/performance (requires API key)
Provider: anthropic
Model: claude-haiku-4
Requires API key: Yes
Cost budget: $5.00
Privacy Considerations
Different LLM providers have significantly different privacy characteristics. Understanding these differences is crucial for choosing the right provider for your use case.
Privacy Comparison

| Provider | LLM Vendor Exposure | GitHub API Required | Best For |
|---|---|---|---|
| Ollama | ✅ None (localhost) | ⚠️ Yes | Reducing third-party exposure, compliance |
| OpenAI | ❌ OpenAI (US) | ⚠️ Yes | Cost-effective, production |
| Anthropic | ❌ Anthropic (US) | ⚠️ Yes | Quality, caching benefits |
| Claude CLI | ❌ Anthropic (US) | ⚠️ Yes | Interactive, convenience |
| Codex CLI | ❌ GitHub/OpenAI | ⚠️ Yes | GitHub integration |
Note: All options require GitHub API access (internet required). The privacy difference is whether an LLM vendor also sees your review comments.
Ollama: Reduced Third-Party Exposure
When using Ollama (the `ollama-local` preset):
✅ Local LLM processing - Review comments processed locally (no LLM vendor)
✅ No LLM vendor exposure - OpenAI/Anthropic never see your comments
✅ Simpler compliance - One fewer data processor (no LLM vendor BAA/DPA)
✅ Zero LLM costs - Free after hardware investment
✅ No LLM API keys required - No credential management
⚠️ GitHub API required - Internet needed to fetch PR data (not offline/air-gapped)
Reality Check:
⚠️ Code is on GitHub (required for PR workflow)
⚠️ CodeRabbit has access (required for reviews)
✅ LLM vendor does NOT have access (eliminated)
Recommended for:
Reducing third-party LLM vendor exposure
Regulated industries wanting simpler compliance chain (GDPR, HIPAA, SOC2)
Organizations with policies against cloud LLM services
Cost-conscious usage (no per-request LLM fees)
Learn more:
Privacy Architecture - Detailed privacy analysis
Local LLM Operation Guide - Setup instructions
Privacy FAQ - Common privacy questions
API Providers: Convenience vs. Privacy Trade-off
When using API providers (OpenAI, Anthropic, etc.):
⚠️ Data transmitted to cloud - Review comments and code sent via HTTPS
⚠️ Third-party data policies - Subject to the provider's retention and usage policies
⚠️ Internet required - No offline operation
⚠️ Costs per request - Ongoing usage fees
⚠️ Compliance complexity - May require Data Processing Agreements (DPA) or Business Associate Agreements (BAA)
Acceptable for:
Open source / public code repositories
Organizations with enterprise LLM agreements
Use cases where privacy trade-off is acceptable
When highest model quality is required (GPT-4, Claude Opus)
Privacy safeguards:
✅ Data encrypted in transit (HTTPS/TLS)
✅ API keys never logged by pr-resolve
✅ Anthropic: No training on API data by default
✅ OpenAI: Can opt out of training data usage
Privacy Verification
Verify Ollama's local-only operation:
# Run privacy verification script
./scripts/verify_privacy.sh
# Expected output:
# ✅ Privacy Verification: PASSED
# ✅ No external network connections detected
# ✅ All Ollama traffic is localhost-only
The script monitors network traffic during Ollama inference and confirms no external LLM vendor connections are made.
Note: GitHub API connections will still appear (required for PR workflow).
See: Local LLM Operation Guide - Privacy Verification for details.
Making the Right Choice
Choose Ollama if you need:
✅ Reduced third-party exposure (eliminates the LLM vendor)
✅ Simpler compliance chain (one fewer data processor)
✅ Zero ongoing LLM costs
⚠️ NOT for: Offline/air-gapped operation (the GitHub API is still required)
Choose API providers if:
✅ The privacy trade-off is acceptable (public/open-source code)
✅ Enterprise agreements are in place (DPA, BAA)
✅ Highest model quality is the priority
✅ Budget is available for per-request fees
For detailed privacy analysis and compliance considerations, see the Privacy Architecture documentation.
Environment Variable Interpolation
Configuration files support ${VAR_NAME} syntax for injecting environment variables at runtime.
Syntax
llm:
  api_key: ${ANTHROPIC_API_KEY}
  model: ${LLM_MODEL:-claude-haiku-4}  # With default value (not yet supported)
Behavior
Found: The variable is replaced with its value
Not Found: The `${VAR_NAME}` placeholder remains and a warning is logged
Security: Only `${VAR}` syntax is allowed for API keys in config files
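As a rough illustration of this behavior (not the resolver's actual implementation), the following Python sketch substitutes `${VAR}` placeholders from the environment and leaves missing ones untouched with a warning. The `interpolate` helper is a hypothetical name for this example.

```python
# Minimal sketch of the documented interpolation behavior: found variables are
# substituted, missing ones are left as-is with a warning.
import logging
import os
import re

logger = logging.getLogger(__name__)
_VAR_PATTERN = re.compile(r"\$\{([A-Z0-9_]+)\}")


def interpolate(value: str) -> str:
    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name in os.environ:
            return os.environ[name]
        logger.warning("Environment variable %s not set; leaving placeholder", name)
        return match.group(0)  # keep ${VAR_NAME} untouched

    return _VAR_PATTERN.sub(replace, value)


os.environ["ANTHROPIC_API_KEY"] = "sk-ant-..."
print(interpolate("${ANTHROPIC_API_KEY}"))  # -> sk-ant-...
print(interpolate("${MISSING_VAR}"))        # -> ${MISSING_VAR} (warning logged)
```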
Examples
Basic Interpolation
llm:
  provider: ${LLM_PROVIDER}
  model: ${LLM_MODEL}
  api_key: ${OPENAI_API_KEY}
export LLM_PROVIDER="openai"
export LLM_MODEL="gpt-4o-mini"
export OPENAI_API_KEY="sk-..."
pr-resolve apply 123 --config config.yaml
Multiple Variables
[llm]
provider = "${PROVIDER}"
api_key = "${API_KEY}"
[llm.ollama]
base_url = "${OLLAMA_URL}"
Nested Structures
llm:
  enabled: true
  provider: anthropic
  api_key: ${ANTHROPIC_API_KEY}
  cache:
    enabled: ${CACHE_ENABLED}
    ttl: ${CACHE_TTL}
Configuration Precedence
Configuration sources are applied in this order (highest to lowest priority):
1. CLI Flags - Command-line arguments (`--llm-provider openai`)
2. Environment Variables - `CR_LLM_*` variables
3. Configuration File - YAML/TOML file (`--config config.yaml`)
4. LLM Presets - Preset selected via the `--llm-preset` flag
5. Default Values - Built-in defaults
Example: Layering Configuration
# Start with preset
export LLM_PRESET="openai-api-mini"
# Override with env vars
export CR_LLM_MODEL="gpt-4"
export CR_LLM_MAX_TOKENS=4000
# Override with CLI flags
pr-resolve apply 123 \
--llm-preset openai-api-mini \
--llm-model gpt-4o \
--llm-api-key sk-...
# Result
# - provider: openai (from preset)
# - model: gpt-4o (CLI flag overrides env var)
# - api_key: sk-... (CLI flag)
# - max_tokens: 4000 (env var)
# - cost_budget: 5.0 (preset default)
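Conceptually, the layering above behaves like a merge from lowest to highest priority. The sketch below is illustrative only; `resolve_config` is a hypothetical function, not the actual resolver code.

```python
# Conceptual sketch of the documented precedence order (CLI > env > config
# file > preset > defaults): lower-priority sources only fill in missing keys.
def resolve_config(cli: dict, env: dict, config_file: dict, preset: dict, defaults: dict) -> dict:
    resolved = dict(defaults)
    # Apply sources from lowest to highest priority so higher ones win.
    for source in (preset, config_file, env, cli):
        resolved.update({k: v for k, v in source.items() if v is not None})
    return resolved


effective = resolve_config(
    cli={"model": "gpt-4o"},
    env={"max_tokens": 4000},
    config_file={},
    preset={"provider": "openai", "model": "gpt-4o-mini", "cost_budget": 5.0},
    defaults={"fallback_to_regex": True},
)
print(effective)
# {'fallback_to_regex': True, 'provider': 'openai', 'model': 'gpt-4o',
#  'cost_budget': 5.0, 'max_tokens': 4000}
```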
Precedence Table

| Setting | CLI Flag | Env Var | Config File | Preset | Default |
|---|---|---|---|---|---|
| Priority | 1 (highest) | 2 | 3 | 4 | 5 (lowest) |
| Scope | Single run | Session | Project | Quick setup | Fallback |
| Use Case | Testing, overrides | Personal settings | Team config | Zero-config | Sensible defaults |
API Key Security
Security Rules
Never commit API keys to version control
API keys MUST use environment variables
Config files MUST use `${VAR}` syntax for API keys
Direct API keys in config files are rejected
Valid Configuration
✅ Allowed - Environment variable reference:

llm:
  api_key: ${ANTHROPIC_API_KEY}  # ✅ Valid

[llm]
api_key = "${OPENAI_API_KEY}"  # ✅ Valid

❌ Rejected - Direct API key:

llm:
  api_key: sk-ant-real-key-12345  # ❌ REJECTED

[llm]
api_key = "sk-openai-real-key"  # ❌ REJECTED
Error Message
When a real API key is detected in a config file:
ConfigError: SECURITY: API keys must NOT be stored in configuration files (config.yaml).
Use environment variables: CR_LLM_API_KEY or ${OPENAI_API_KEY}.
Example: api_key: ${ANTHROPIC_API_KEY}
Supported environment variables:
* CR_LLM_API_KEY (generic)
* OPENAI_API_KEY (OpenAI)
* ANTHROPIC_API_KEY (Anthropic)
Best Practices

1. Use a `.env` file for local development:

   # .env (add to .gitignore)
   OPENAI_API_KEY=sk-...
   ANTHROPIC_API_KEY=sk-ant-...

2. Reference it in the config file:

   llm:
     api_key: ${OPENAI_API_KEY}

3. Load the environment variables before running:

   source .env
   pr-resolve apply 123 --config config.yaml
Ollama Auto-Download Feature
The Ollama provider supports automatic model downloading for streamlined setup.
Quick Setup with Scripts
Use the automated setup scripts for the easiest Ollama installation:
# 1. Install and setup Ollama
./scripts/setup_ollama.sh
# 2. Download recommended model
./scripts/download_ollama_models.sh
# 3. Use with pr-resolve
pr-resolve apply 123 --llm-preset ollama-local
See the Ollama Setup Guide for comprehensive documentation.
Auto-Download via Python API
Enable automatic model downloads in Python code:
from review_bot_automator.llm.providers.ollama import OllamaProvider
# Auto-download enabled - model will be downloaded if not available
provider = OllamaProvider(
    model="qwen2.5-coder:7b",
    auto_download=True,  # Downloads the model automatically (may take several minutes)
)

# Get model recommendations
models = OllamaProvider.list_recommended_models()
for model in models:
    print(f"{model['name']}: {model['description']}")
Benefits:
No manual `ollama pull` required
Automated CI/CD setup
Seamless model switching
Note: Auto-download is not currently exposed via CLI flags. Use the interactive scripts or a manual `ollama pull` for CLI usage.
Examples
Example 1: Free Local Setup (Ollama)
Quick Setup (Recommended):
# Automated setup
./scripts/setup_ollama.sh
./scripts/download_ollama_models.sh
# Use preset
pr-resolve apply 123 --llm-preset ollama-local
Manual Setup:
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh
# Pull model
ollama pull qwen2.5-coder:7b
Option A: Preset:
pr-resolve apply 123 --llm-preset ollama-local
Option B: Config File:
# config.yaml
llm:
  enabled: true
  provider: ollama
  model: qwen2.5-coder:7b
pr-resolve apply 123 --config config.yaml
See Ollama Setup Guide for detailed installation instructions, model recommendations, and troubleshooting.
Example 2: Paid API Setup (OpenAI)
config.yaml:
llm:
  enabled: true
  provider: openai
  model: gpt-4o-mini
  api_key: ${OPENAI_API_KEY}
  cost_budget: 5.0
  cache_enabled: true
  fallback_to_regex: true
.env:
OPENAI_API_KEY=sk-...
Usage:
source .env
pr-resolve apply 123 --config config.yaml
Example 3: Team Configuration
team-config.yaml (committed to repo):
llm:
  enabled: true
  provider: anthropic
  model: claude-haiku-4
  api_key: ${ANTHROPIC_API_KEY}  # Each dev sets their own key
  fallback_to_regex: true
  cache_enabled: true
  max_tokens: 2000
  cost_budget: 10.0
Each developer:
# Set personal API key
export ANTHROPIC_API_KEY="sk-ant-..."
# Use team config
pr-resolve apply 123 --config team-config.yaml
Example 4: Override Preset Settings
# Start with preset, override specific settings
pr-resolve apply 123 \
--llm-preset openai-api-mini \
--llm-model gpt-4 \
--llm-max-tokens 4000 \
--llm-cost-budget 10.0
Example 5: Multi-Environment Setup
dev.yaml:
llm:
  enabled: true
  provider: ollama
  model: qwen2.5-coder:7b
staging.yaml:
llm:
  enabled: true
  provider: anthropic
  model: claude-haiku-4
  api_key: ${STAGING_API_KEY}
  cost_budget: 5.0
prod.yaml:
llm:
  enabled: true
  provider: anthropic
  model: claude-sonnet-4-5
  api_key: ${PROD_API_KEY}
  cost_budget: 20.0
Usage:
# Development
pr-resolve apply 123 --config dev.yaml
# Staging
export STAGING_API_KEY="sk-ant-staging-..."
pr-resolve apply 123 --config staging.yaml
# Production
export PROD_API_KEY="sk-ant-prod-..."
pr-resolve apply 123 --config prod.yaml
Retry & Resilience Configuration
Rate Limit Retry (Phase 5)
Configure automatic retry behavior for rate limit and transient errors:

| Setting | Default | Description |
|---|---|---|
| `retry_on_rate_limit` | | Enable retry on rate limit errors |
| `retry_max_attempts` | | Maximum retry attempts (>=1) |
| `retry_base_delay` | | Base delay in seconds for exponential backoff |
Example YAML configuration:
llm:
  retry_on_rate_limit: true
  retry_max_attempts: 5
  retry_base_delay: 3.0
Exponential backoff formula:
delay = base_delay * 2^attempt + random_jitter
For example, with `retry_base_delay: 2.0` (exponent starting at 0 for the first retry):
Attempt 1: ~2s delay
Attempt 2: ~4s delay
Attempt 3: ~8s delay
A sketch of this backoff logic follows.
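The Python sketch below illustrates this backoff schedule, assuming the exponent starts at 0 for the first retry. `backoff_delay` and `call_with_retry` are hypothetical names for this example, not the provider implementation.

```python
# Sketch of exponential backoff with jitter as described above; the real retry
# loop lives inside the providers.
import random
import time


def backoff_delay(attempt: int, base_delay: float = 2.0, max_jitter: float = 0.5) -> float:
    """Delay before retry number `attempt` (1-indexed)."""
    return base_delay * (2 ** (attempt - 1)) + random.uniform(0, max_jitter)


def call_with_retry(request, max_attempts: int = 3, base_delay: float = 2.0):
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except RuntimeError:  # stand-in for a rate-limit/transient error type
            if attempt == max_attempts:
                raise
            delay = backoff_delay(attempt, base_delay)
            print(f"Attempt {attempt} failed; retrying in {delay:.1f}s")
            time.sleep(delay)
```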
Circuit Breaker
Prevents cascading failures by temporarily disabling failing providers:

| Setting | Default | Description |
|---|---|---|
| | | Enable circuit breaker pattern |
| | | Consecutive failures before circuit opens |
| | | Seconds before attempting recovery |
Circuit breaker states:
CLOSED (normal): Requests pass through
OPEN (failing): Requests fail immediately without calling provider
HALF_OPEN (recovery): Single test request to check if provider recovered
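The state machine can be sketched roughly as follows. `CircuitBreaker` here is an illustrative stand-in, not the library's actual class or API, and the thresholds are example values.

```python
# Minimal circuit-breaker sketch matching the states described above
# (CLOSED -> OPEN after N consecutive failures, HALF_OPEN after a recovery timeout).
import time


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 5, recovery_timeout: float = 60.0):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.failures = 0
        self.state = "CLOSED"
        self.opened_at = 0.0

    def call(self, provider_request):
        if self.state == "OPEN":
            if time.monotonic() - self.opened_at < self.recovery_timeout:
                raise RuntimeError("Circuit open: provider temporarily disabled")
            self.state = "HALF_OPEN"  # allow a single test request
        try:
            result = provider_request()
        except Exception:
            self.failures += 1
            if self.state == "HALF_OPEN" or self.failures >= self.failure_threshold:
                self.state = "OPEN"
                self.opened_at = time.monotonic()
            raise
        self.failures = 0
        self.state = "CLOSED"
        return result
```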
Cache Warming
Pre-populate the cache for cold start optimization:
from review_bot_automator.llm.cache.prompt_cache import PromptCache
cache = PromptCache()
entries = [
    {
        "prompt": "Parse this CodeRabbit comment...",
        "provider": "anthropic",
        "model": "claude-sonnet-4-5",
        "response": "..."
    },
    # ... more entries
]
loaded, skipped = cache.warm_cache(entries)
print(f"Loaded {loaded} entries, skipped {skipped}")
Benefits:
Eliminates cold start latency
O(n) bulk loading (optimized, no per-entry eviction checks)
Skips duplicates automatically
Thread-safe for concurrent access
Via CachingProvider:
from review_bot_automator.llm.providers.caching_provider import CachingProvider
cached_provider = CachingProvider(base_provider)
loaded, skipped = cached_provider.warm_up(entries)
Troubleshooting
Environment Variable Not Interpolated
Symptom: Config shows ${VAR_NAME} instead of value.
Cause: Environment variable not set.
Solution:
# Check if variable is set
echo $ANTHROPIC_API_KEY
# Set the variable
export ANTHROPIC_API_KEY="sk-ant-..."
# Verify
pr-resolve apply 123 --config config.yaml --dry-run
API Key Rejected in Config File
Error:
ConfigError: SECURITY: API keys must NOT be stored in configuration files
Cause: Real API key in config file instead of ${VAR} syntax.
Solution:
# ❌ Wrong
llm:
  api_key: sk-ant-real-key

# ✅ Correct
llm:
  api_key: ${ANTHROPIC_API_KEY}
Preset Not Found
Error:
ConfigError: Unknown preset 'invalid-preset'
Solution: List available presets:
pr-resolve config show-presets
Configuration Not Applied
Symptom: Settings from config file ignored.
Cause: CLI flags or environment variables have higher precedence.
Solution: Check precedence order:
Remove conflicting CLI flags
Unset conflicting environment variables (`unset CR_LLM_PROVIDER`)
Verify the config file syntax (`--config config.yaml --dry-run`)
LLM Still Disabled After Configuration
Cause: API-based provider without API key.
Solution:
# Check configuration
pr-resolve config show
# Ensure API key is set
export OPENAI_API_KEY="sk-..."
# Or use CLI-based preset (no API key needed)
pr-resolve apply 123 --llm-preset codex-cli-free
Ollama Connection Failed
Error:
LLMProviderError: Failed to connect to Ollama at http://localhost:11434
Solution:
# Check Ollama is running
ollama list
# Start Ollama if needed
ollama serve
# Or specify custom URL
export OLLAMA_BASE_URL="http://ollama-server:11434"
pr-resolve apply 123 --config config.yaml
Performance Considerations
Choosing the Right Provider
LLM providers differ in latency, cost, accuracy, and privacy. Consider these factors when choosing a provider:
For Speed-Critical Applications:
OpenAI and Anthropic APIs typically offer the lowest latency (1-3s mean)
Best for real-time workflows and interactive use cases
For Cost-Sensitive Deployments:
Ollama (local) has zero per-request cost but requires hardware
OpenAI's `gpt-4o-mini` offers a good balance of cost and performance
Anthropic with prompt caching can reduce costs by 50-90%
For Privacy-First Requirements:
Ollama eliminates LLM vendor exposure (no OpenAI/Anthropic)
Simplifies compliance for HIPAA, GDPR (one fewer data processor)
Note: GitHub/CodeRabbit still have access (required)
Trade-off: Higher latency, especially on CPU-only systems
For High-Volume Production:
Anthropic with prompt caching (50-90% cost reduction on repeated prompts)
Connection pooling and retry logic built-in for all providers
Performance Benchmarking
Comprehensive performance benchmarks comparing all providers are available in the Performance Benchmarks document. The benchmarks measure:
Latency: Mean, median, P95, P99 response times
Throughput: Requests per second
Accuracy: Parsing success rates vs ground truth
Cost: Per-request and monthly estimates at scale
Run your own benchmarks:
# Benchmark all providers (requires API keys)
python scripts/benchmark_llm.py --iterations 100
# Benchmark specific providers
python scripts/benchmark_llm.py --providers ollama openai --iterations 50
# Custom dataset
python scripts/benchmark_llm.py --dataset my_comments.json --output my_report.md
See python scripts/benchmark_llm.py --help for all options.
See Also
LLM Provider Guide - Provider comparison and selection guide
Circuit Breaker - Resilience pattern for handling provider failures
Metrics Guide - Understanding LLM metrics and export options
Cost Estimation - Pre-run cost estimation and budget configuration
Confidence Threshold - Tuning LLM confidence for accuracy/coverage balance
Performance Benchmarks - Detailed performance comparison of all providers
Ollama Setup Guide - Comprehensive Ollama installation and setup guide
Main Configuration Guide - Basic LLM setup and provider documentation
Getting Started Guide - Quick start with LLM features
Troubleshooting - Common issues and solutions
API Reference - Configuration API documentation
Security Architecture - Security best practices