LLM Configuration Guide
✅ Production Status: All 5 LLM providers are production-ready (Phase 2 Complete - Nov 9, 2025)
OpenAI API, Anthropic API, Claude CLI, Codex CLI, Ollama
All providers support retry logic, cost tracking, and health checks
GPU acceleration available for Ollama (NVIDIA, AMD, Apple Silicon)
HTTP connection pooling and model auto-download features included
This guide covers advanced LLM configuration features including configuration files, presets, and environment variable interpolation.
Note: LLM features are supported by both the `apply` and `analyze` commands with identical configuration options.
See Also: Main Configuration Guide for basic LLM setup and provider-specific documentation.
Configuration File Support
The resolver supports YAML and TOML configuration files for LLM settings. This allows you to:
Store non-sensitive configuration in version control
Share team-wide LLM settings
Manage complex configurations more easily
Use environment variable interpolation for secrets
YAML Configuration
Create a config.yaml file:
llm:
  enabled: true
  provider: anthropic
  model: claude-sonnet-4-5
  api_key: ${ANTHROPIC_API_KEY}  # Environment variable reference
  fallback_to_regex: true
  cache_enabled: true
  max_tokens: 2000
  confidence_threshold: 0.6  # Reject changes below 60% confidence
  cost_budget: 5.0  # Note: cost_budget is advisory and not currently enforced (see [Sub-Issue #225](../planning/ROADMAP.md))
Use with:
# With apply command
pr-resolve apply 123 --config config.yaml
# With analyze command
pr-resolve analyze --pr 123 --owner myorg --repo myrepo --config config.yaml
TOML Configuration
Create a config.toml file:
[llm]
enabled = true
provider = "openai"
model = "gpt-4o-mini"
api_key = "${OPENAI_API_KEY}" # Environment variable reference
fallback_to_regex = true
cache_enabled = true
max_tokens = 2000
confidence_threshold = 0.6 # Reject changes below 60% confidence
cost_budget = 5.0 # Note: cost_budget is advisory and not currently enforced.
# This field allows users to express intended spending limits and
# serves as a placeholder for future enforcement/alerts (see [Sub-Issue #225](../planning/ROADMAP.md)).
Use with:
pr-resolve apply 123 --config config.toml
Configuration File Schema

| Field | Type | Default | Description |
|---|---|---|---|
| `enabled` | boolean | | Enable LLM-powered features |
| `provider` | string | | Provider name (`openai`, `anthropic`, `claude-cli`, `codex-cli`, `ollama`) |
| `model` | string | provider-specific | Model identifier (e.g., `gpt-4o-mini`, `claude-sonnet-4-5`) |
| `api_key` | string | | Must use `${VAR}` environment variable syntax |
| `fallback_to_regex` | boolean | | Fall back to regex parsing if LLM fails |
| `cache_enabled` | boolean | | Enable response caching |
| `max_tokens` | integer | | Maximum tokens per LLM request |
| `confidence_threshold` | float | | Minimum LLM confidence (0.0-1.0) required to accept changes |
| `cost_budget` | float | | Cost budget (advisory only, not currently enforced); expresses an intended spending limit and is a placeholder for future enforcement/alerts (see Sub-Issue #225) |
| `base_url` | string | | Ollama server URL (Ollama only) |
| `effort` | string | | Effort level for the speed/cost vs accuracy tradeoff (`none`, `low`, `medium`, `high`) |
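To make the interplay between `confidence_threshold` and `fallback_to_regex` concrete, here is a minimal, illustrative Python sketch. `ParsedChange` and `accept_changes` are hypothetical names used only for this example, not part of the pr-resolve API.

```python
# Illustrative sketch only: how confidence_threshold and fallback_to_regex
# interact conceptually. Names here are hypothetical, not pr-resolve internals.
from dataclasses import dataclass


@dataclass
class ParsedChange:
    file: str
    patch: str
    confidence: float  # 0.0-1.0, reported by the LLM parser


def accept_changes(changes, confidence_threshold=0.6, fallback_to_regex=True):
    """Keep LLM-parsed changes at or above the threshold; optionally fall back."""
    accepted = [c for c in changes if c.confidence >= confidence_threshold]
    rejected = [c for c in changes if c.confidence < confidence_threshold]
    if rejected and fallback_to_regex:
        # Low-confidence comments would be re-parsed with the regex parser instead.
        print(f"Falling back to regex parsing for {len(rejected)} low-confidence changes")
    return accepted


changes = [ParsedChange("src/app.py", "...", 0.92), ParsedChange("README.md", "...", 0.41)]
print(accept_changes(changes))  # only the 0.92-confidence change is accepted
```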
LLM Effort Level
The `--llm-effort` option controls the speed/cost vs accuracy tradeoff for LLM providers that support extended reasoning capabilities.
Effort Levels

| Level | Description | Use Case |
|---|---|---|
| `none` | Fastest, minimal reasoning | Quick parsing, cost-sensitive |
| `low` | Light reasoning | Balanced speed/accuracy |
| `medium` | Moderate reasoning | Complex comments |
| `high` | Most thorough reasoning | Maximum accuracy, complex parsing |
Provider Support

| Provider | Parameter | Notes |
|---|---|---|
| OpenAI | | Supported on GPT-5.x models |
| Anthropic | | Supported on Claude Opus 4.5 |
| Ollama | Not supported | Uses standard inference |
| Claude CLI | Not supported | Uses standard inference |
| Codex CLI | Not supported | Uses standard inference |
Usage
CLI flag:
# Fast parsing (minimal reasoning)
pr-resolve apply 123 --llm-effort none
# Maximum accuracy (thorough reasoning)
pr-resolve apply 123 --llm-effort high
# With analyze command
pr-resolve analyze --pr 123 --owner myorg --repo myrepo --llm-effort medium
Environment variable:
export CR_LLM_EFFORT=medium
pr-resolve apply 123
Configuration file:
llm:
  enabled: true
  provider: anthropic
  model: claude-opus-4-5
  api_key: ${ANTHROPIC_API_KEY}
  effort: high  # Maximum reasoning for complex parsing
Cost Considerations
Higher effort levels typically increase:
Response latency (more reasoning time)
Token usage (reasoning tokens counted)
Per-request cost
For cost-sensitive deployments, use the `none` or `low` effort levels. Reserve `high` for complex parsing tasks where accuracy is critical.
LLM Presets
Presets provide zero-config LLM setup with sensible defaults for common use cases.
Available Presets

| Preset | Provider | Model | Status | Cost | Requires |
|---|---|---|---|---|---|
| `codex-cli-free` | Codex CLI | `codex` | ✅ Production | Free | GitHub Copilot subscription |
| `ollama-local` | Ollama | `qwen2.5-coder:7b` | ✅ Production + GPU | Free | Local Ollama + GPU (optional) |
| `claude-cli-sonnet` | Claude CLI | `claude-sonnet-4-5` | ✅ Production | Free | Claude subscription |
| `openai-api-mini` | OpenAI API | `gpt-4o-mini` | ✅ Production | ~$0.15/1M tokens | API key ($5 budget) |
| `anthropic-api-balanced` | Anthropic API | `claude-haiku-4` | ✅ Production | ~$0.25/1M tokens | API key ($5 budget) |
Using Presets
CLI-Based Presets (Free)
No API key required:
# GitHub Codex (requires Copilot subscription)
pr-resolve apply 123 --llm-preset codex-cli-free
pr-resolve analyze --pr 123 --owner myorg --repo myrepo --llm-preset codex-cli-free
# Local Ollama (requires ollama installation)
pr-resolve apply 123 --llm-preset ollama-local
pr-resolve analyze --pr 123 --owner myorg --repo myrepo --llm-preset ollama-local
# Claude CLI (requires Claude subscription)
pr-resolve apply 123 --llm-preset claude-cli-sonnet
pr-resolve analyze --pr 123 --owner myorg --repo myrepo --llm-preset claude-cli-sonnet
API-Based Presets (Paid)
These presets require an API key, supplied via an environment variable or CLI flag:
# OpenAI (low-cost)
export OPENAI_API_KEY="sk-..."
pr-resolve apply 123 --llm-preset openai-api-mini
# Anthropic (balanced, with caching)
export ANTHROPIC_API_KEY="sk-ant-..."
pr-resolve apply 123 --llm-preset anthropic-api-balanced
# Or pass API key via CLI flag
pr-resolve apply 123 --llm-preset openai-api-mini --llm-api-key sk-...
Available Presets (Provider-Specific)
The following LLM presets are available:
codex-cli-free: Free Codex CLI - Requires GitHub Copilot subscription
Provider: codex-cli
Model: codex
Requires API key: No
ollama-local: Local Ollama - Free, private, offline (recommended: qwen2.5-coder:7b)
Provider: ollama
Model: qwen2.5-coder:7b
Requires API key: No
claude-cli-sonnet: Claude CLI with Sonnet 4.5 - Requires Claude subscription
Provider: claude-cli
Model: claude-sonnet-4-5
Requires API key: No
openai-api-mini: OpenAI GPT-4o-mini - Low-cost API (requires API key)
Provider: openai
Model: gpt-4o-mini
Requires API key: Yes
Cost budget: $5.00
anthropic-api-balanced: Anthropic Claude Haiku 4 - Balanced cost/performance (requires API key)
Provider: anthropic
Model: claude-haiku-4
Requires API key: Yes
Cost budget: $5.00
Privacy Considerations
Different LLM providers have significantly different privacy characteristics. Understanding these differences is crucial for choosing the right provider for your use case.
Privacy Comparison

| Provider | LLM Vendor Exposure | GitHub API Required | Best For |
|---|---|---|---|
| Ollama | ✅ None (localhost) | ⚠️ Yes | Reducing third-party exposure, compliance |
| OpenAI | ❌ OpenAI (US) | ⚠️ Yes | Cost-effective, production |
| Anthropic | ❌ Anthropic (US) | ⚠️ Yes | Quality, caching benefits |
| Claude CLI | ❌ Anthropic (US) | ⚠️ Yes | Interactive, convenience |
| Codex CLI | ❌ GitHub/OpenAI | ⚠️ Yes | GitHub integration |
Note: All options require GitHub API access (internet required). The privacy difference is whether an LLM vendor also sees your review comments.
Ollama: Reduced Third-Party Exposure
When using Ollama (the `ollama-local` preset):
✅ Local LLM processing - Review comments processed locally (no LLM vendor)
✅ No LLM vendor exposure - OpenAI/Anthropic never see your comments
✅ Simpler compliance - One fewer data processor (no LLM vendor BAA/DPA)
✅ Zero LLM costs - Free after hardware investment
✅ No LLM API keys required - No credential management
⚠️ GitHub API required - Internet needed to fetch PR data (not offline/air-gapped)
Reality Check:
⚠️ Code is on GitHub (required for PR workflow)
⚠️ CodeRabbit has access (required for reviews)
✅ LLM vendor does NOT have access (eliminated)
Recommended for:
Reducing third-party LLM vendor exposure
Regulated industries wanting simpler compliance chain (GDPR, HIPAA, SOC2)
Organizations with policies against cloud LLM services
Cost-conscious usage (no per-request LLM fees)
Learn more:
Privacy Architecture - Detailed privacy analysis
Local LLM Operation Guide - Setup instructions
Privacy FAQ - Common privacy questions
API Providers: Convenience vs. Privacy Trade-off
When using API providers (OpenAI, Anthropic, etc.):
⚠️ Data transmitted to cloud - Review comments and code sent via HTTPS
⚠️ Third-party data policies - Subject to the provider's retention and usage policies
⚠️ Internet required - No offline operation
⚠️ Costs per request - Ongoing usage fees
⚠️ Compliance complexity - May require Data Processing Agreements (DPA) or Business Associate Agreements (BAA)
Acceptable for:
Open source / public code repositories
Organizations with enterprise LLM agreements
Use cases where privacy trade-off is acceptable
When highest model quality is required (GPT-4, Claude Opus)
Privacy safeguards:
✅ Data encrypted in transit (HTTPS/TLS)
✅ API keys never logged by pr-resolve
✅ Anthropic: No training on API data by default
✅ OpenAI: Can opt out of training data usage
Privacy Verification
Verify Ollama's local-only operation:
# Run privacy verification script
./scripts/verify_privacy.sh
# Expected output:
# ✅ Privacy Verification: PASSED
# ✅ No external network connections detected
# ✅ All Ollama traffic is localhost-only
The script monitors network traffic during Ollama inference and confirms no external LLM vendor connections are made.
Note: GitHub API connections will still appear (required for PR workflow).
See: Local LLM Operation Guide - Privacy Verification for details.
Making the Right Choice
Choose Ollama if you need:
✅ Reduced third-party exposure (eliminates the LLM vendor)
✅ Simpler compliance chain (one fewer data processor)
✅ Zero ongoing LLM costs
⚠️ NOT for: Offline/air-gapped operation (the GitHub API is still required)
Choose API providers if:
✅ The privacy trade-off is acceptable (public/open-source code)
✅ Enterprise agreements are in place (DPA, BAA)
✅ Highest model quality is the priority
✅ Budget is available for per-request fees
For detailed privacy analysis and compliance considerations, see the Privacy Architecture documentation.
Environment Variable Interpolation
Configuration files support ${VAR_NAME} syntax for injecting environment variables at runtime.
Syntax
llm:
  api_key: ${ANTHROPIC_API_KEY}
  model: ${LLM_MODEL:-claude-haiku-4}  # With default value (not yet supported)
Behavior
Found: The variable is replaced with its value
Not Found: The `${VAR_NAME}` placeholder remains and a warning is logged
Security: Only `${VAR}` syntax is allowed for API keys in config files
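As a rough illustration of this behavior (not the resolver's actual implementation), the following Python sketch substitutes `${VAR}` placeholders from the environment and leaves missing ones untouched with a warning. The `interpolate` helper is a hypothetical name for this example.

```python
# Minimal sketch of the documented interpolation behavior: found variables are
# substituted, missing ones are left as-is with a warning.
import logging
import os
import re

logger = logging.getLogger(__name__)
_VAR_PATTERN = re.compile(r"\$\{([A-Z0-9_]+)\}")


def interpolate(value: str) -> str:
    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name in os.environ:
            return os.environ[name]
        logger.warning("Environment variable %s not set; leaving placeholder", name)
        return match.group(0)  # keep ${VAR_NAME} untouched

    return _VAR_PATTERN.sub(replace, value)


os.environ["ANTHROPIC_API_KEY"] = "sk-ant-..."
print(interpolate("${ANTHROPIC_API_KEY}"))  # -> sk-ant-...
print(interpolate("${MISSING_VAR}"))        # -> ${MISSING_VAR} (warning logged)
```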
Examples
Basic Interpolation
llm:
  provider: ${LLM_PROVIDER}
  model: ${LLM_MODEL}
  api_key: ${OPENAI_API_KEY}
export LLM_PROVIDER="openai"
export LLM_MODEL="gpt-4o-mini"
export OPENAI_API_KEY="sk-..."
pr-resolve apply 123 --config config.yaml
Multiple Variables
[llm]
provider = "${PROVIDER}"
api_key = "${API_KEY}"
[llm.ollama]
base_url = "${OLLAMA_URL}"
Nested Structures
llm:
  enabled: true
  provider: anthropic
  api_key: ${ANTHROPIC_API_KEY}
  cache:
    enabled: ${CACHE_ENABLED}
    ttl: ${CACHE_TTL}
Configuration Precedence
Configuration sources are applied in this order (highest to lowest priority):
1. CLI Flags - Command-line arguments (`--llm-provider openai`)
2. Environment Variables - `CR_LLM_*` variables
3. Configuration File - YAML/TOML file (`--config config.yaml`)
4. LLM Presets - Preset selected via the `--llm-preset` flag
5. Default Values - Built-in defaults
Example: Layering Configuration
# Start with preset
export LLM_PRESET="openai-api-mini"
# Override with env vars
export CR_LLM_MODEL="gpt-4"
export CR_LLM_MAX_TOKENS=4000
# Override with CLI flags
pr-resolve apply 123 \
--llm-preset openai-api-mini \
--llm-model gpt-4o \
--llm-api-key sk-...
# Result
# - provider: openai (from preset)
# - model: gpt-4o (CLI flag overrides env var)
# - api_key: sk-... (CLI flag)
# - max_tokens: 4000 (env var)
# - cost_budget: 5.0 (preset default)
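Conceptually, the layering above behaves like a merge from lowest to highest priority. The sketch below is illustrative only; `resolve_config` is a hypothetical function, not the actual resolver code.

```python
# Conceptual sketch of the documented precedence order (CLI > env > config
# file > preset > defaults): lower-priority sources only fill in missing keys.
def resolve_config(cli: dict, env: dict, config_file: dict, preset: dict, defaults: dict) -> dict:
    resolved = dict(defaults)
    # Apply sources from lowest to highest priority so higher ones win.
    for source in (preset, config_file, env, cli):
        resolved.update({k: v for k, v in source.items() if v is not None})
    return resolved


effective = resolve_config(
    cli={"model": "gpt-4o"},
    env={"max_tokens": 4000},
    config_file={},
    preset={"provider": "openai", "model": "gpt-4o-mini", "cost_budget": 5.0},
    defaults={"fallback_to_regex": True},
)
print(effective)
# {'fallback_to_regex': True, 'provider': 'openai', 'model': 'gpt-4o',
#  'cost_budget': 5.0, 'max_tokens': 4000}
```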
Precedence Table

| Setting | CLI Flag | Env Var | Config File | Preset | Default |
|---|---|---|---|---|---|
| Priority | 1 (highest) | 2 | 3 | 4 | 5 (lowest) |
| Scope | Single run | Session | Project | Quick setup | Fallback |
| Use Case | Testing, overrides | Personal settings | Team config | Zero-config | Sensible defaults |
API Key Security
Security Rules
Never commit API keys to version control
API keys MUST use environment variables
Config files MUST use `${VAR}` syntax for API keys
Direct API keys in config files are rejected
Valid Configuration
✅ Allowed - Environment variable reference:

llm:
  api_key: ${ANTHROPIC_API_KEY}  # ✅ Valid

[llm]
api_key = "${OPENAI_API_KEY}"  # ✅ Valid

❌ Rejected - Direct API key:

llm:
  api_key: sk-ant-real-key-12345  # ❌ REJECTED

[llm]
api_key = "sk-openai-real-key"  # ❌ REJECTED
Error Message
When a real API key is detected in a config file:
ConfigError: SECURITY: API keys must NOT be stored in configuration files (config.yaml).
Use environment variables: CR_LLM_API_KEY or ${OPENAI_API_KEY}.
Example: api_key: ${ANTHROPIC_API_KEY}
Supported environment variables:
* CR_LLM_API_KEY (generic)
* OPENAI_API_KEY (OpenAI)
* ANTHROPIC_API_KEY (Anthropic)
Best Practices

1. Use a `.env` file for local development:

   # .env (add to .gitignore)
   OPENAI_API_KEY=sk-...
   ANTHROPIC_API_KEY=sk-ant-...

2. Reference it in the config file:

   llm:
     api_key: ${OPENAI_API_KEY}

3. Load the environment variables before running:

   source .env
   pr-resolve apply 123 --config config.yaml
Ollama Auto-Download Feature
The Ollama provider supports automatic model downloading for streamlined setup.
Quick Setup with Scripts
Use the automated setup scripts for the easiest Ollama installation:
# 1. Install and setup Ollama
./scripts/setup_ollama.sh
# 2. Download recommended model
./scripts/download_ollama_models.sh
# 3. Use with pr-resolve
pr-resolve apply 123 --llm-preset ollama-local
See the Ollama Setup Guide for comprehensive documentation.
Auto-Download via Python API
Enable automatic model downloads in Python code:
from review_bot_automator.llm.providers.ollama import OllamaProvider
# Auto-download enabled - model will be downloaded if not available
provider = OllamaProvider(
    model="qwen2.5-coder:7b",
    auto_download=True,  # Downloads the model automatically (may take several minutes)
)

# Get model recommendations
models = OllamaProvider.list_recommended_models()
for model in models:
    print(f"{model['name']}: {model['description']}")
Benefits:
No manual `ollama pull` required
Automated CI/CD setup
Seamless model switching
Note: Auto-download is not currently exposed via CLI flags. Use the interactive scripts or a manual `ollama pull` for CLI usage.
Examples
Example 1: Free Local Setup (Ollama)
Quick Setup (Recommended):
# Automated setup
./scripts/setup_ollama.sh
./scripts/download_ollama_models.sh
# Use preset
pr-resolve apply 123 --llm-preset ollama-local
Manual Setup:
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh
# Pull model
ollama pull qwen2.5-coder:7b
Option A: Preset:
pr-resolve apply 123 --llm-preset ollama-local
Option B: Config File:
# config.yaml
llm:
  enabled: true
  provider: ollama
  model: qwen2.5-coder:7b
pr-resolve apply 123 --config config.yaml
See Ollama Setup Guide for detailed installation instructions, model recommendations, and troubleshooting.
Example 2: Paid API Setup (OpenAI)
config.yaml:
llm:
  enabled: true
  provider: openai
  model: gpt-4o-mini
  api_key: ${OPENAI_API_KEY}
  cost_budget: 5.0
  cache_enabled: true
  fallback_to_regex: true
.env:
OPENAI_API_KEY=sk-...
Usage:
source .env
pr-resolve apply 123 --config config.yaml
Example 3: Team Configuration
team-config.yaml (committed to repo):
llm:
  enabled: true
  provider: anthropic
  model: claude-haiku-4
  api_key: ${ANTHROPIC_API_KEY}  # Each dev sets their own key
  fallback_to_regex: true
  cache_enabled: true
  max_tokens: 2000
  cost_budget: 10.0
Each developer:
# Set personal API key
export ANTHROPIC_API_KEY="sk-ant-..."
# Use team config
pr-resolve apply 123 --config team-config.yaml
Example 4: Override Preset Settings
# Start with preset, override specific settings
pr-resolve apply 123 \
--llm-preset openai-api-mini \
--llm-model gpt-4 \
--llm-max-tokens 4000 \
--llm-cost-budget 10.0
Example 5: Multi-Environment Setup
dev.yaml:
llm:
  enabled: true
  provider: ollama
  model: qwen2.5-coder:7b
staging.yaml:
llm:
  enabled: true
  provider: anthropic
  model: claude-haiku-4
  api_key: ${STAGING_API_KEY}
  cost_budget: 5.0
prod.yaml:
llm:
  enabled: true
  provider: anthropic
  model: claude-sonnet-4-5
  api_key: ${PROD_API_KEY}
  cost_budget: 20.0
Usage:
# Development
pr-resolve apply 123 --config dev.yaml
# Staging
export STAGING_API_KEY="sk-ant-staging-..."
pr-resolve apply 123 --config staging.yaml
# Production
export PROD_API_KEY="sk-ant-prod-..."
pr-resolve apply 123 --config prod.yaml
Retry & Resilience Configuration
Rate Limit Retry (Phase 5)
Configure automatic retry behavior for rate limit and transient errors:

| Setting | Default | Description |
|---|---|---|
| `retry_on_rate_limit` | | Enable retry on rate limit errors |
| `retry_max_attempts` | | Maximum retry attempts (>=1) |
| `retry_base_delay` | | Base delay in seconds for exponential backoff |
Example YAML configuration:
llm:
  retry_on_rate_limit: true
  retry_max_attempts: 5
  retry_base_delay: 3.0
Exponential backoff formula:
delay = base_delay * 2^attempt + random_jitter
For example, with `retry_base_delay: 2.0` (exponent starting at 0 for the first retry):
Attempt 1: ~2s delay
Attempt 2: ~4s delay
Attempt 3: ~8s delay
A sketch of this backoff logic follows.
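The Python sketch below illustrates this backoff schedule, assuming the exponent starts at 0 for the first retry. `backoff_delay` and `call_with_retry` are hypothetical names for this example, not the provider implementation.

```python
# Sketch of exponential backoff with jitter as described above; the real retry
# loop lives inside the providers.
import random
import time


def backoff_delay(attempt: int, base_delay: float = 2.0, max_jitter: float = 0.5) -> float:
    """Delay before retry number `attempt` (1-indexed)."""
    return base_delay * (2 ** (attempt - 1)) + random.uniform(0, max_jitter)


def call_with_retry(request, max_attempts: int = 3, base_delay: float = 2.0):
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except RuntimeError:  # stand-in for a rate-limit/transient error type
            if attempt == max_attempts:
                raise
            delay = backoff_delay(attempt, base_delay)
            print(f"Attempt {attempt} failed; retrying in {delay:.1f}s")
            time.sleep(delay)
```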
Circuit Breaker
Prevents cascading failures by temporarily disabling failing providers:

| Setting | Default | Description |
|---|---|---|
| | | Enable circuit breaker pattern |
| | | Consecutive failures before circuit opens |
| | | Seconds before attempting recovery |
Circuit breaker states:
CLOSED (normal): Requests pass through
OPEN (failing): Requests fail immediately without calling provider
HALF_OPEN (recovery): Single test request to check if provider recovered
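The state machine can be sketched roughly as follows. `CircuitBreaker` here is an illustrative stand-in, not the library's actual class or API, and the thresholds are example values.

```python
# Minimal circuit-breaker sketch matching the states described above
# (CLOSED -> OPEN after N consecutive failures, HALF_OPEN after a recovery timeout).
import time


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 5, recovery_timeout: float = 60.0):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.failures = 0
        self.state = "CLOSED"
        self.opened_at = 0.0

    def call(self, provider_request):
        if self.state == "OPEN":
            if time.monotonic() - self.opened_at < self.recovery_timeout:
                raise RuntimeError("Circuit open: provider temporarily disabled")
            self.state = "HALF_OPEN"  # allow a single test request
        try:
            result = provider_request()
        except Exception:
            self.failures += 1
            if self.state == "HALF_OPEN" or self.failures >= self.failure_threshold:
                self.state = "OPEN"
                self.opened_at = time.monotonic()
            raise
        self.failures = 0
        self.state = "CLOSED"
        return result
```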
Cache Warming
Pre-populate the cache for cold start optimization:
from review_bot_automator.llm.cache.prompt_cache import PromptCache
cache = PromptCache()
entries = [
    {
        "prompt": "Parse this CodeRabbit comment...",
        "provider": "anthropic",
        "model": "claude-sonnet-4-5",
        "response": "..."
    },
    # ... more entries
]
loaded, skipped = cache.warm_cache(entries)
print(f"Loaded {loaded} entries, skipped {skipped}")
Benefits:
Eliminates cold start latency
O(n) bulk loading (optimized, no per-entry eviction checks)
Skips duplicates automatically
Thread-safe for concurrent access
Via CachingProvider:
from review_bot_automator.llm.providers.caching_provider import CachingProvider
cached_provider = CachingProvider(base_provider)
loaded, skipped = cached_provider.warm_up(entries)
Troubleshooting
Environment Variable Not Interpolated
Symptom: Config shows ${VAR_NAME} instead of value.
Cause: Environment variable not set.
Solution:
# Check if variable is set
echo $ANTHROPIC_API_KEY
# Set the variable
export ANTHROPIC_API_KEY="sk-ant-..."
# Verify
pr-resolve apply 123 --config config.yaml --dry-run
API Key Rejected in Config File
Error:
ConfigError: SECURITY: API keys must NOT be stored in configuration files
Cause: Real API key in config file instead of ${VAR} syntax.
Solution:
# ❌ Wrong
llm:
  api_key: sk-ant-real-key

# ✅ Correct
llm:
  api_key: ${ANTHROPIC_API_KEY}
Preset Not Found
Error:
ConfigError: Unknown preset 'invalid-preset'
Solution: List available presets:
pr-resolve config show-presets
Configuration Not Applied
Symptom: Settings from config file ignored.
Cause: CLI flags or environment variables have higher precedence.
Solution: Check precedence order:
Remove conflicting CLI flags
Unset conflicting environment variables (`unset CR_LLM_PROVIDER`)
Verify the config file syntax (`--config config.yaml --dry-run`)
LLM Still Disabled After Configuration
Cause: API-based provider without API key.
Solution:
# Check configuration
pr-resolve config show
# Ensure API key is set
export OPENAI_API_KEY="sk-..."
# Or use CLI-based preset (no API key needed)
pr-resolve apply 123 --llm-preset codex-cli-free
Ollama Connection Failed
Error:
LLMProviderError: Failed to connect to Ollama at http://localhost:11434
Solution:
# Check Ollama is running
ollama list
# Start Ollama if needed
ollama serve
# Or specify custom URL
export OLLAMA_BASE_URL="http://ollama-server:11434"
pr-resolve apply 123 --config config.yaml
Performance Considerations
Choosing the Right Provider
LLM providers differ in latency, cost, accuracy, and privacy. Consider these factors when choosing a provider:
For Speed-Critical Applications:
OpenAI and Anthropic APIs typically offer the lowest latency (1-3s mean)
Best for real-time workflows and interactive use cases
For Cost-Sensitive Deployments:
Ollama (local) has zero per-request cost but requires hardware
OpenAI's `gpt-4o-mini` offers a good balance of cost and performance
Anthropic with prompt caching can reduce costs by 50-90%
For Privacy-First Requirements:
Ollama eliminates LLM vendor exposure (no OpenAI/Anthropic)
Simplifies compliance for HIPAA, GDPR (one fewer data processor)
Note: GitHub/CodeRabbit still have access (required)
Trade-off: Higher latency, especially on CPU-only systems
For High-Volume Production:
Anthropic with prompt caching (50-90% cost reduction on repeated prompts)
Connection pooling and retry logic built-in for all providers
Performance Benchmarking
Comprehensive performance benchmarks comparing all providers are available in the Performance Benchmarks document. The benchmarks measure:
Latency: Mean, median, P95, P99 response times
Throughput: Requests per second
Accuracy: Parsing success rates vs ground truth
Cost: Per-request and monthly estimates at scale
Run your own benchmarks:
# Benchmark all providers (requires API keys)
python scripts/benchmark_llm.py --iterations 100
# Benchmark specific providers
python scripts/benchmark_llm.py --providers ollama openai --iterations 50
# Custom dataset
python scripts/benchmark_llm.py --dataset my_comments.json --output my_report.md
See python scripts/benchmark_llm.py --help for all options.
See Also
LLM Provider Guide - Provider comparison and selection guide
Circuit Breaker - Resilience pattern for handling provider failures
Metrics Guide - Understanding LLM metrics and export options
Cost Estimation - Pre-run cost estimation and budget configuration
Confidence Threshold - Tuning LLM confidence for accuracy/coverage balance
Performance Benchmarks - Detailed performance comparison of all providers
Ollama Setup Guide - Comprehensive Ollama installation and setup guide
Main Configuration Guide - Basic LLM setup and provider documentation
Getting Started Guide - Quick start with LLM features
Troubleshooting - Common issues and solutions
API Reference - Configuration API documentation
Security Architecture - Security best practices