# LLM Configuration Guide

> **✅ Production Status**: All 5 LLM providers are production-ready (Phase 2 Complete - Nov 9, 2025)
>
> * OpenAI API, Anthropic API, Claude CLI, Codex CLI, Ollama
> * All providers support retry logic, cost tracking, and health checks
> * GPU acceleration available for Ollama (NVIDIA, AMD, Apple Silicon)
> * HTTP connection pooling and model auto-download features included

This guide covers advanced LLM configuration features including configuration files, presets, and environment variable interpolation.

> **Note**: LLM features are supported by both `apply` and `analyze` commands with identical configuration options.

> **See Also**: [Main Configuration Guide](configuration.md#llm-provider-configuration) for basic LLM setup and provider-specific documentation.

## Table of Contents

* [Configuration File Support](#configuration-file-support)
* [LLM Effort Level](#llm-effort-level)
* [LLM Presets](#llm-presets)
* [Environment Variable Interpolation](#environment-variable-interpolation)
* [Configuration Precedence](#configuration-precedence)
* [API Key Security](#api-key-security)
* [Examples](#examples)
* [Troubleshooting](#troubleshooting)

## Configuration File Support

The resolver supports YAML and TOML configuration files for LLM settings. This allows you to:

* Store non-sensitive configuration in version control
* Share team-wide LLM settings
* Manage complex configurations more easily
* Use environment variable interpolation for secrets

### YAML Configuration

Create a `config.yaml` file:

```yaml
llm:
  enabled: true
  provider: anthropic
  model: claude-sonnet-4-5
  api_key: ${ANTHROPIC_API_KEY}  # Environment variable reference
  fallback_to_regex: true
  cache_enabled: true
  max_tokens: 2000
  confidence_threshold: 0.6  # Reject changes below 60% confidence
  cost_budget: 5.0
  # Note: cost_budget is advisory and not currently enforced (see [Sub-Issue #225](../planning/ROADMAP.md))
```

Use with:

```bash
# With apply command
pr-resolve apply 123 --config config.yaml

# With analyze command
pr-resolve analyze --pr 123 --owner myorg --repo myrepo --config config.yaml
```

### TOML Configuration

Create a `config.toml` file:

```toml
[llm]
enabled = true
provider = "openai"
model = "gpt-4o-mini"
api_key = "${OPENAI_API_KEY}"  # Environment variable reference
fallback_to_regex = true
cache_enabled = true
max_tokens = 2000
confidence_threshold = 0.6  # Reject changes below 60% confidence
cost_budget = 5.0
# Note: cost_budget is advisory and not currently enforced.
# This field allows users to express intended spending limits and
# serves as a placeholder for future enforcement/alerts (see [Sub-Issue #225](../planning/ROADMAP.md)).
```

Use with:

```bash
pr-resolve apply 123 --config config.toml
```
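As a rough illustration of how such a file maps onto the schema below, here is a minimal sketch of reading the `llm` section with standard Python libraries (PyYAML and `tomllib`). This is only an illustration; the resolver's actual loader is not shown here and may differ.

```python
# Illustrative only: reading LLM settings from YAML or TOML (actual loader may differ).
import tomllib  # Python 3.11+; TOML parsing from the standard library

import yaml  # PyYAML, for the YAML variant


def load_llm_config(path: str) -> dict:
    """Return the `llm` section of a YAML or TOML config file."""
    if path.endswith((".yaml", ".yml")):
        with open(path, "r", encoding="utf-8") as f:
            data = yaml.safe_load(f) or {}
    elif path.endswith(".toml"):
        with open(path, "rb") as f:
            data = tomllib.load(f)
    else:
        raise ValueError(f"Unsupported config format: {path}")
    return data.get("llm", {})


# Example: inspect the settings that would be layered into the precedence chain.
print(load_llm_config("config.yaml"))
```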
### Configuration File Schema

| Field | Type | Default | Description |
| ----- | ---- | ------- | ----------- |
| `llm.enabled` | boolean | `false` | Enable LLM-powered features |
| `llm.provider` | string | `claude-cli` | Provider name (`claude-cli`, `codex-cli`, `ollama`, `openai`, `anthropic`) |
| `llm.model` | string | provider-specific | Model identifier (e.g., `claude-sonnet-4-5`, `gpt-4o-mini`) |
| `llm.api_key` | string | `null` | **Must use `${VAR}` syntax** - direct keys are rejected |
| `llm.fallback_to_regex` | boolean | `true` | Fall back to regex parsing if LLM fails |
| `llm.cache_enabled` | boolean | `true` | Enable response caching |
| `llm.max_tokens` | integer | `2000` | Maximum tokens per LLM request |
| `llm.confidence_threshold` | float | `0.5` | Minimum LLM confidence (0.0-1.0) required to accept changes |
| `llm.cost_budget` | float | `null` | Cost budget configuration (advisory only, not currently enforced). This field allows users to express intended spending limits and serves as a placeholder for future enforcement/alerts (see [Sub-Issue #225](../planning/ROADMAP.md)). |
| `llm.ollama_base_url` | string | `http://localhost:11434` | Ollama server URL (Ollama only) |
| `llm.effort` | string | `null` | Effort level for speed/cost vs accuracy tradeoff (`none`, `low`, `medium`, `high`) |

## LLM Effort Level

The `--llm-effort` option controls the speed/cost vs accuracy tradeoff for LLM providers that support extended reasoning capabilities.

### Effort Levels

| Level | Description | Use Case |
| ----- | ----------- | -------- |
| `none` | Fastest, minimal reasoning | Quick parsing, cost-sensitive |
| `low` | Light reasoning | Balanced speed/accuracy |
| `medium` | Moderate reasoning | Complex comments |
| `high` | Most thorough reasoning | Maximum accuracy, complex parsing |

### Provider Support

| Provider | Parameter | Notes |
| -------- | --------- | ----- |
| **OpenAI** | `reasoning_effort` | Supported on GPT-5.x models |
| **Anthropic** | `effort` | Supported on Claude Opus 4.5 |
| **Ollama** | Not supported | Uses standard inference |
| **Claude CLI** | Not supported | Uses standard inference |
| **Codex CLI** | Not supported | Uses standard inference |

### Usage

**CLI flag:**

```bash
# Fast parsing (minimal reasoning)
pr-resolve apply 123 --llm-effort none

# Maximum accuracy (thorough reasoning)
pr-resolve apply 123 --llm-effort high

# With analyze command
pr-resolve analyze --pr 123 --owner myorg --repo myrepo --llm-effort medium
```

**Environment variable:**

```bash
export CR_LLM_EFFORT=medium
pr-resolve apply 123
```

**Configuration file:**

```yaml
llm:
  enabled: true
  provider: anthropic
  model: claude-opus-4-5
  api_key: ${ANTHROPIC_API_KEY}
  effort: high  # Maximum reasoning for complex parsing
```

### Cost Considerations

Higher effort levels typically increase:

* Response latency (more reasoning time)
* Token usage (reasoning tokens counted)
* Per-request cost

For cost-sensitive deployments, use `none` or `low` effort levels. Reserve `high` for complex parsing tasks where accuracy is critical.

## LLM Presets

Presets provide zero-config LLM setup with sensible defaults for common use cases.
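Conceptually, a preset is just a named bundle of provider defaults that sits below CLI flags, environment variables, and config files in the precedence order described later. The sketch below is purely illustrative (the dictionary structure and `expand_preset` helper are hypothetical), with values drawn from the preset table that follows.

```python
# Illustrative only: presets as named bundles of defaults (hypothetical structure).
PRESETS = {
    "openai-api-mini": {
        "provider": "openai",
        "model": "gpt-4o-mini",
        "requires_api_key": True,
        "cost_budget": 5.0,
    },
    "ollama-local": {
        "provider": "ollama",
        "model": "qwen2.5-coder:7b",
        "requires_api_key": False,
    },
}


def expand_preset(name: str, overrides: dict) -> dict:
    """Start from the preset's defaults, then apply explicit overrides on top."""
    config = dict(PRESETS[name])  # copy so the preset table is not mutated
    config.update(overrides)      # CLI flags / env vars win over preset defaults
    return config


# Example: preset defaults with a model override, mirroring `--llm-model gpt-4o`.
print(expand_preset("openai-api-mini", {"model": "gpt-4o"}))
```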
### Available Presets

| Preset | Provider | Model | Status | Cost | Requires |
| ------ | -------- | ----- | ------ | ---- | -------- |
| `codex-cli-free` | Codex CLI | `codex` | ✅ Production | Free | GitHub Copilot subscription |
| `ollama-local` | Ollama | `qwen2.5-coder:7b` | ✅ Production + GPU | Free | Local Ollama + GPU (optional) |
| `claude-cli-sonnet` | Claude CLI | `claude-sonnet-4-5` | ✅ Production | Free | Claude subscription |
| `openai-api-mini` | OpenAI API | `gpt-4o-mini` | ✅ Production | ~$0.15/1M tokens | API key ($5 budget) |
| `anthropic-api-balanced` | Anthropic API | `claude-haiku-4` | ✅ Production | ~$0.25/1M tokens | API key ($5 budget) |

### Using Presets

#### CLI-Based Presets (Free)

No API key required:

```bash
# GitHub Codex (requires Copilot subscription)
pr-resolve apply 123 --llm-preset codex-cli-free
pr-resolve analyze --pr 123 --owner myorg --repo myrepo --llm-preset codex-cli-free

# Local Ollama (requires ollama installation)
pr-resolve apply 123 --llm-preset ollama-local
pr-resolve analyze --pr 123 --owner myorg --repo myrepo --llm-preset ollama-local

# Claude CLI (requires Claude subscription)
pr-resolve apply 123 --llm-preset claude-cli-sonnet
pr-resolve analyze --pr 123 --owner myorg --repo myrepo --llm-preset claude-cli-sonnet
```

#### API-Based Presets (Paid)

These presets require an API key, provided via environment variable or CLI flag:

```bash
# OpenAI (low-cost)
export OPENAI_API_KEY="sk-..."
pr-resolve apply 123 --llm-preset openai-api-mini

# Anthropic (balanced, with caching)
export ANTHROPIC_API_KEY="sk-ant-..."
pr-resolve apply 123 --llm-preset anthropic-api-balanced

# Or pass API key via CLI flag
pr-resolve apply 123 --llm-preset openai-api-mini --llm-api-key sk-...
```

### Available Presets (Provider-Specific)

The following LLM presets are available:

1. **codex-cli-free**: Free Codex CLI - Requires GitHub Copilot subscription
   * Provider: codex-cli
   * Model: codex
   * Requires API key: No
2. **ollama-local**: Local Ollama - Free, private, offline (recommended: qwen2.5-coder:7b)
   * Provider: ollama
   * Model: qwen2.5-coder:7b
   * Requires API key: No
3. **claude-cli-sonnet**: Claude CLI with Sonnet 4.5 - Requires Claude subscription
   * Provider: claude-cli
   * Model: claude-sonnet-4-5
   * Requires API key: No
4. **openai-api-mini**: OpenAI GPT-4o-mini - Low-cost API (requires API key)
   * Provider: openai
   * Model: gpt-4o-mini
   * Requires API key: Yes
   * Cost budget: $5.00
5. **anthropic-api-balanced**: Anthropic Claude Haiku 4 - Balanced cost/performance (requires API key)
   * Provider: anthropic
   * Model: claude-haiku-4
   * Requires API key: Yes
   * Cost budget: $5.00

## Privacy Considerations

Different LLM providers have significantly different privacy characteristics. Understanding these differences is crucial for choosing the right provider for your use case.

### Privacy Comparison

| Provider | LLM Vendor Exposure | GitHub API Required | Best For |
| -------- | ------------------- | ------------------- | -------- |
| **Ollama** | ✅ **None** (localhost) | ⚠️ Yes | Reducing third-party exposure, compliance |
| **OpenAI** | ❌ OpenAI (US) | ⚠️ Yes | Cost-effective, production |
| **Anthropic** | ❌ Anthropic (US) | ⚠️ Yes | Quality, caching benefits |
| **Claude CLI** | ❌ Anthropic (US) | ⚠️ Yes | Interactive, convenience |
| **Codex CLI** | ❌ GitHub/OpenAI | ⚠️ Yes | GitHub integration |

**Note**: All options require GitHub API access (internet required). The privacy difference is whether an LLM vendor also sees your review comments.
### Ollama: Reduced Third-Party Exposure 🔒

**When using Ollama** (`ollama-local` preset):

* ✅ **Local LLM processing** - Review comments processed locally (no LLM vendor)
* ✅ **No LLM vendor exposure** - OpenAI/Anthropic never see your comments
* ✅ **Simpler compliance** - One fewer data processor (no LLM vendor BAA/DPA)
* ✅ **Zero LLM costs** - Free after hardware investment
* ✅ **No LLM API keys required** - No credential management
* ⚠️ **GitHub API required** - Internet needed to fetch PR data (not offline/air-gapped)

**Reality Check**:

* ⚠️ Code is on GitHub (required for PR workflow)
* ⚠️ CodeRabbit has access (required for reviews)
* ✅ LLM vendor does NOT have access (eliminated)

**Recommended for**:

* Reducing third-party LLM vendor exposure
* Regulated industries wanting a simpler compliance chain (GDPR, HIPAA, SOC2)
* Organizations with policies against cloud LLM services
* Cost-conscious usage (no per-request LLM fees)

**Learn more**:

* [Privacy Architecture](privacy-architecture.md) - Detailed privacy analysis
* [Local LLM Operation Guide](local-llm-operation-guide.md) - Setup instructions
* [Privacy FAQ](privacy-faq.md) - Common privacy questions

### API Providers: Convenience vs. Privacy Trade-off

**When using API providers** (OpenAI, Anthropic, etc.):

* ⚠️ **Data transmitted to cloud** - Review comments and code sent via HTTPS
* ⚠️ **Third-party data policies** - Subject to provider's retention and usage policies
* ⚠️ **Internet required** - No offline operation
* ⚠️ **Costs per request** - Ongoing usage fees
* ⚠️ **Compliance complexity** - May require Data Processing Agreements (DPA), Business Associate Agreements (BAA)

**Acceptable for**:

* Open source / public code repositories
* Organizations with enterprise LLM agreements
* Use cases where the privacy trade-off is acceptable
* When highest model quality is required (GPT-4, Claude Opus)

**Privacy safeguards**:

* ✅ Data encrypted in transit (HTTPS/TLS)
* ✅ API keys never logged by pr-resolve
* ✅ Anthropic: No training on API data by default
* ✅ OpenAI: Can opt out of training data usage

### Privacy Verification

Verify Ollama's local-only operation:

```bash
# Run privacy verification script
./scripts/verify_privacy.sh

# Expected output
# ✅ Privacy Verification: PASSED
# ✅ No external network connections detected
# ✅ All Ollama traffic is localhost-only
```

The script monitors network traffic during Ollama inference and confirms no external LLM vendor connections are made.

**Note**: GitHub API connections will still appear (required for PR workflow).

**See**: [Local LLM Operation Guide - Privacy Verification](local-llm-operation-guide.md#privacy-verification) for details.

### Making the Right Choice

**Choose Ollama if you need**:

* ✅ Reduced third-party exposure (eliminate LLM vendor)
* ✅ Simpler compliance chain (one fewer data processor)
* ✅ Zero ongoing LLM costs
* ⚠️ **NOT for**: Offline/air-gapped operation (requires GitHub API)

**Choose API providers if**:

* ✅ Privacy trade-off is acceptable (public/open-source code)
* ✅ Enterprise agreements are in place (DPA, BAA)
* ✅ Highest model quality is priority
* ✅ Budget is available for per-request fees

For detailed privacy analysis and compliance considerations, see the [Privacy Architecture](privacy-architecture.md) documentation.

## Environment Variable Interpolation

Configuration files support `${VAR_NAME}` syntax for injecting environment variables at runtime.
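For intuition, the documented behavior (substitute when the variable is set, otherwise keep the placeholder and log a warning) can be sketched in a few lines of Python. This is an illustration of the mechanism, not the project's actual implementation.

```python
# Illustrative only: ${VAR} interpolation mirroring the documented behavior
# (found -> substituted; not found -> placeholder kept and a warning logged).
import logging
import os
import re

logger = logging.getLogger(__name__)
_PATTERN = re.compile(r"\$\{(\w+)\}")


def interpolate(value: str) -> str:
    def _replace(match: re.Match) -> str:
        name = match.group(1)
        if name in os.environ:
            return os.environ[name]
        logger.warning("Environment variable %s not set; leaving placeholder", name)
        return match.group(0)  # keep ${VAR} unchanged

    return _PATTERN.sub(_replace, value)


os.environ["ANTHROPIC_API_KEY"] = "sk-ant-example"
print(interpolate("${ANTHROPIC_API_KEY}"))  # -> sk-ant-example
print(interpolate("${MISSING_VAR}"))        # -> ${MISSING_VAR} (with warning)
```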
### Syntax

```yaml
llm:
  api_key: ${ANTHROPIC_API_KEY}
  model: ${LLM_MODEL:-claude-haiku-4}  # With default value (not yet supported)
```

### Behavior

* **Found**: Variable is replaced with its value
* **Not Found**: Placeholder remains (`${VAR_NAME}`) with warning logged
* **Security**: Only `${VAR}` syntax is allowed for API keys in config files

### Examples

#### Basic Interpolation

```yaml
llm:
  provider: ${LLM_PROVIDER}
  model: ${LLM_MODEL}
  api_key: ${OPENAI_API_KEY}
```

```bash
export LLM_PROVIDER="openai"
export LLM_MODEL="gpt-4o-mini"
export OPENAI_API_KEY="sk-..."

pr-resolve apply 123 --config config.yaml
```

#### Multiple Variables

```toml
[llm]
provider = "${PROVIDER}"
api_key = "${API_KEY}"

[llm.ollama]
base_url = "${OLLAMA_URL}"
```

#### Nested Structures

```yaml
llm:
  enabled: true
  provider: anthropic
  api_key: ${ANTHROPIC_API_KEY}
  cache:
    enabled: ${CACHE_ENABLED}
    ttl: ${CACHE_TTL}
```

## Configuration Precedence

Configuration sources are applied in this order (highest to lowest priority):

1. **CLI Flags** - Command-line arguments (`--llm-provider openai`)
2. **Environment Variables** - `CR_LLM_*` variables
3. **Configuration File** - YAML/TOML file (`--config config.yaml`)
4. **LLM Presets** - Preset via `--llm-preset` flag
5. **Default Values** - Built-in defaults

### Example: Layering Configuration

```bash
# Start with preset
export LLM_PRESET="openai-api-mini"

# Override with env vars
export CR_LLM_MODEL="gpt-4"
export CR_LLM_MAX_TOKENS=4000

# Override with CLI flags
pr-resolve apply 123 \
  --llm-preset openai-api-mini \
  --llm-model gpt-4o \
  --llm-api-key sk-...

# Result
# - provider: openai (from preset)
# - model: gpt-4o (CLI flag overrides env var)
# - api_key: sk-... (CLI flag)
# - max_tokens: 4000 (env var)
# - cost_budget: 5.0 (preset default)
```

### Precedence Table

| Setting | CLI Flag | Env Var | Config File | Preset | Default |
| ------- | -------- | ------- | ----------- | ------ | ------- |
| **Priority** | 1 (highest) | 2 | 3 | 4 | 5 (lowest) |
| **Scope** | Single run | Session | Project | Quick setup | Fallback |
| **Use Case** | Testing, overrides | Personal settings | Team config | Zero-config | Sensible defaults |

## API Key Security

### Security Rules

1. **Never commit API keys to version control**
2. **API keys MUST use environment variables**
3. **Config files MUST use `${VAR}` syntax for API keys**
4. **Direct API keys in config files are rejected**

### Valid Configuration

✅ **Allowed** - Environment variable reference:

```yaml
llm:
  api_key: ${ANTHROPIC_API_KEY}  # ✅ Valid
```

```toml
[llm]
api_key = "${OPENAI_API_KEY}"  # ✅ Valid
```

❌ **Rejected** - Direct API key:

```yaml
llm:
  api_key: sk-ant-real-key-12345  # ❌ REJECTED
```

```toml
[llm]
api_key = "sk-openai-real-key"  # ❌ REJECTED
```

### Error Message

When a real API key is detected in a config file:

```text
ConfigError: SECURITY: API keys must NOT be stored in configuration files (config.yaml).
Use environment variables: CR_LLM_API_KEY or ${OPENAI_API_KEY}.
Example: api_key: ${ANTHROPIC_API_KEY}

Supported environment variables:
* CR_LLM_API_KEY (generic)
* OPENAI_API_KEY (OpenAI)
* ANTHROPIC_API_KEY (Anthropic)
```

### Best Practices

1. **Use `.env` file for local development**:

   ```bash
   # .env (add to .gitignore)
   OPENAI_API_KEY=sk-...
   ANTHROPIC_API_KEY=sk-ant-...
   ```

2. **Reference in config file**:

   ```yaml
   llm:
     api_key: ${OPENAI_API_KEY}
   ```
3. **Load environment variables**:

   ```bash
   source .env
   pr-resolve apply 123 --config config.yaml
   ```

## Ollama Auto-Download Feature

The Ollama provider supports automatic model downloading for streamlined setup.

### Quick Setup with Scripts

Use the automated setup scripts for the easiest Ollama installation:

```bash
# 1. Install and setup Ollama
./scripts/setup_ollama.sh

# 2. Download recommended model
./scripts/download_ollama_models.sh

# 3. Use with pr-resolve
pr-resolve apply 123 --llm-preset ollama-local
```

See the [Ollama Setup Guide](ollama-setup.md) for comprehensive documentation.

### Auto-Download via Python API

Enable automatic model downloads in Python code:

```python
from review_bot_automator.llm.providers.ollama import OllamaProvider

# Auto-download enabled - model will be downloaded if not available
provider = OllamaProvider(
    model="qwen2.5-coder:7b",
    auto_download=True  # Downloads model automatically (may take several minutes)
)

# Get model recommendations
models = OllamaProvider.list_recommended_models()
for model in models:
    print(f"{model['name']}: {model['description']}")
```

**Benefits**:

* No manual `ollama pull` required
* Automated CI/CD setup
* Seamless model switching

**Note**: Auto-download is not currently exposed via CLI flags. Use the interactive scripts or manual `ollama pull` for CLI usage.

## Examples

### Example 1: Free Local Setup (Ollama)

**Quick Setup (Recommended)**:

```bash
# Automated setup
./scripts/setup_ollama.sh
./scripts/download_ollama_models.sh

# Use preset
pr-resolve apply 123 --llm-preset ollama-local
```

**Manual Setup**:

```bash
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Pull model
ollama pull qwen2.5-coder:7b
```

**Option A: Preset**:

```bash
pr-resolve apply 123 --llm-preset ollama-local
```

**Option B: Config File**:

```yaml
# config.yaml
llm:
  enabled: true
  provider: ollama
  model: qwen2.5-coder:7b
```

```bash
pr-resolve apply 123 --config config.yaml
```

See [Ollama Setup Guide](ollama-setup.md) for detailed installation instructions, model recommendations, and troubleshooting.

### Example 2: Paid API Setup (OpenAI)

**config.yaml**:

```yaml
llm:
  enabled: true
  provider: openai
  model: gpt-4o-mini
  api_key: ${OPENAI_API_KEY}
  cost_budget: 5.0
  cache_enabled: true
  fallback_to_regex: true
```

**.env**:

```bash
OPENAI_API_KEY=sk-...
```

**Usage**:

```bash
source .env
pr-resolve apply 123 --config config.yaml
```

### Example 3: Team Configuration

**team-config.yaml** (committed to repo):

```yaml
llm:
  enabled: true
  provider: anthropic
  model: claude-haiku-4
  api_key: ${ANTHROPIC_API_KEY}  # Each dev sets their own key
  fallback_to_regex: true
  cache_enabled: true
  max_tokens: 2000
  cost_budget: 10.0
```

**Each developer**:

```bash
# Set personal API key
export ANTHROPIC_API_KEY="sk-ant-..."
# Use team config
pr-resolve apply 123 --config team-config.yaml
```

### Example 4: Override Preset Settings

```bash
# Start with preset, override specific settings
pr-resolve apply 123 \
  --llm-preset openai-api-mini \
  --llm-model gpt-4 \
  --llm-max-tokens 4000 \
  --llm-cost-budget 10.0
```

### Example 5: Multi-Environment Setup

**dev.yaml**:

```yaml
llm:
  enabled: true
  provider: ollama
  model: qwen2.5-coder:7b
```

**staging.yaml**:

```yaml
llm:
  enabled: true
  provider: anthropic
  model: claude-haiku-4
  api_key: ${STAGING_API_KEY}
  cost_budget: 5.0
```

**prod.yaml**:

```yaml
llm:
  enabled: true
  provider: anthropic
  model: claude-sonnet-4-5
  api_key: ${PROD_API_KEY}
  cost_budget: 20.0
```

**Usage**:

```bash
# Development
pr-resolve apply 123 --config dev.yaml

# Staging
export STAGING_API_KEY="sk-ant-staging-..."
pr-resolve apply 123 --config staging.yaml

# Production
export PROD_API_KEY="sk-ant-prod-..."
pr-resolve apply 123 --config prod.yaml
```

## Retry & Resilience Configuration

### Rate Limit Retry (Phase 5)

Configure automatic retry behavior for rate limit and transient errors:

| Variable | Default | Description |
|----------|---------|-------------|
| `CR_LLM_RETRY_ON_RATE_LIMIT` | `true` | Enable retry on rate limit errors |
| `CR_LLM_RETRY_MAX_ATTEMPTS` | `3` | Maximum retry attempts (>=1) |
| `CR_LLM_RETRY_BASE_DELAY` | `2.0` | Base delay in seconds for exponential backoff |

**Example YAML configuration:**

```yaml
llm:
  retry_on_rate_limit: true
  retry_max_attempts: 5
  retry_base_delay: 3.0
```

**Exponential backoff formula:**

```text
delay = base_delay * 2^attempt + random_jitter
```

For example, with `retry_base_delay: 2.0`:

* Attempt 1: ~2s delay
* Attempt 2: ~4s delay
* Attempt 3: ~8s delay

### Circuit Breaker

Prevents cascading failures by temporarily disabling failing providers:

| Variable | Default | Description |
|----------|---------|-------------|
| `CR_LLM_CIRCUIT_BREAKER_ENABLED` | `true` | Enable circuit breaker pattern |
| `CR_LLM_CIRCUIT_BREAKER_THRESHOLD` | `5` | Consecutive failures before circuit opens |
| `CR_LLM_CIRCUIT_BREAKER_COOLDOWN` | `60.0` | Seconds before attempting recovery |

**Circuit breaker states:**

1. **CLOSED** (normal): Requests pass through
2. **OPEN** (failing): Requests fail immediately without calling provider
3. **HALF_OPEN** (recovery): Single test request to check if provider recovered
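For intuition, the three states can be sketched as a small wrapper around a provider call. This is an illustrative sketch of the pattern only (the threshold and cooldown values mirror the defaults above); it is not the project's implementation.

```python
# Illustrative only: minimal circuit breaker with CLOSED / OPEN / HALF_OPEN behavior.
import time


class CircuitBreaker:
    def __init__(self, threshold: int = 5, cooldown: float = 60.0):
        self.threshold = threshold        # consecutive failures before opening
        self.cooldown = cooldown          # seconds to wait before a recovery probe
        self.failures = 0
        self.opened_at: float | None = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.cooldown:
                # OPEN: fail fast without calling the provider
                raise RuntimeError("Circuit OPEN: provider temporarily disabled")
            # HALF_OPEN: cooldown elapsed, allow a single test request through
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()  # trip to OPEN
            raise
        self.failures = 0
        self.opened_at = None                       # back to CLOSED
        return result
```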
## Cache Warming

Pre-populate the cache for cold start optimization:

```python
from review_bot_automator.llm.cache.prompt_cache import PromptCache

cache = PromptCache()
entries = [
    {
        "prompt": "Parse this CodeRabbit comment...",
        "provider": "anthropic",
        "model": "claude-sonnet-4-5",
        "response": "..."
    },
    # ... more entries
]

loaded, skipped = cache.warm_cache(entries)
print(f"Loaded {loaded} entries, skipped {skipped}")
```

**Benefits:**

* Eliminates cold start latency
* O(n) bulk loading (optimized, no per-entry eviction checks)
* Skips duplicates automatically
* Thread-safe for concurrent access

**Via CachingProvider:**

```python
from review_bot_automator.llm.providers.caching_provider import CachingProvider

cached_provider = CachingProvider(base_provider)
loaded, skipped = cached_provider.warm_up(entries)
```

## Troubleshooting

### Environment Variable Not Interpolated

**Symptom**: Config shows `${VAR_NAME}` instead of value.

**Cause**: Environment variable not set.

**Solution**:

```bash
# Check if variable is set
echo $ANTHROPIC_API_KEY

# Set the variable
export ANTHROPIC_API_KEY="sk-ant-..."

# Verify
pr-resolve apply 123 --config config.yaml --dry-run
```

### API Key Rejected in Config File

**Error**:

```text
ConfigError: SECURITY: API keys must NOT be stored in configuration files
```

**Cause**: Real API key in config file instead of `${VAR}` syntax.

**Solution**:

```yaml
# ❌ Wrong
llm:
  api_key: sk-ant-real-key

# ✅ Correct
llm:
  api_key: ${ANTHROPIC_API_KEY}
```

### Preset Not Found

**Error**:

```text
ConfigError: Unknown preset 'invalid-preset'
```

**Solution**: List available presets:

```bash
pr-resolve config show-presets
```

### Configuration Not Applied

**Symptom**: Settings from config file ignored.

**Cause**: CLI flags or environment variables have higher precedence.

**Solution**: Check precedence order:

1. Remove conflicting CLI flags
2. Unset conflicting environment variables (`unset CR_LLM_PROVIDER`)
3. Verify config file syntax (`--config config.yaml --dry-run`)

### LLM Still Disabled After Configuration

**Cause**: API-based provider without API key.

**Solution**:

```bash
# Check configuration
pr-resolve config show

# Ensure API key is set
export OPENAI_API_KEY="sk-..."

# Or use CLI-based preset (no API key needed)
pr-resolve apply 123 --llm-preset codex-cli-free
```

### Ollama Connection Failed

**Error**:

```text
LLMProviderError: Failed to connect to Ollama at http://localhost:11434
```

**Solution**:

```bash
# Check Ollama is running
ollama list

# Start Ollama if needed
ollama serve

# Or specify custom URL
export OLLAMA_BASE_URL="http://ollama-server:11434"
pr-resolve apply 123 --config config.yaml
```

## Performance Considerations

### Choosing the Right Provider

Different LLM providers have different characteristics in terms of latency, cost, accuracy, and privacy. Consider these factors when choosing a provider:

**For Speed-Critical Applications:**

* OpenAI and Anthropic APIs typically offer the lowest latency (1-3s mean)
* Best for real-time workflows and interactive use cases

**For Cost-Sensitive Deployments:**

* Ollama (local) has zero per-request cost but requires hardware
* OpenAI's gpt-4o-mini offers good balance of cost and performance
* Anthropic with prompt caching can reduce costs by 50-90%

**For Privacy-First Requirements:**

* Ollama eliminates LLM vendor exposure (no OpenAI/Anthropic)
* Simplifies compliance for HIPAA, GDPR (one fewer data processor)
* Note: GitHub/CodeRabbit still have access (required)
* Trade-off: Higher latency, especially on CPU-only systems

**For High-Volume Production:**

* Anthropic with prompt caching (50-90% cost reduction on repeated prompts)
* Connection pooling and retry logic built-in for all providers

### Performance Benchmarking

Comprehensive performance benchmarks comparing all providers are available in the [Performance Benchmarks](performance-benchmarks.md) document. The benchmarks measure:

* **Latency**: Mean, median, P95, P99 response times
* **Throughput**: Requests per second
* **Accuracy**: Parsing success rates vs ground truth
* **Cost**: Per-request and monthly estimates at scale

**Run your own benchmarks:**

```bash
# Benchmark all providers (requires API keys)
python scripts/benchmark_llm.py --iterations 100

# Benchmark specific providers
python scripts/benchmark_llm.py --providers ollama openai --iterations 50

# Custom dataset
python scripts/benchmark_llm.py --dataset my_comments.json --output my_report.md
```

See `python scripts/benchmark_llm.py --help` for all options.
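As a rough illustration of the cost dimension, the back-of-the-envelope calculation below uses the ~$0.15 per 1M tokens figure from the preset table and the default `max_tokens` of 2000; the request volume is an assumed example, and real token counts and pricing will vary.

```python
# Illustrative only: rough monthly cost estimate from per-1M-token pricing.
price_per_million_tokens = 0.15    # approx. gpt-4o-mini rate from the preset table
tokens_per_request = 2_000         # matches the default max_tokens setting
requests_per_day = 500             # assumed volume for this example

monthly_tokens = tokens_per_request * requests_per_day * 30
monthly_cost = monthly_tokens / 1_000_000 * price_per_million_tokens
print(f"~${monthly_cost:.2f}/month")  # ~$4.50/month under these assumptions
```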
## See Also

* [LLM Provider Guide](llm-provider-guide.md) - Provider comparison and selection guide
* [Circuit Breaker](circuit-breaker.md) - Resilience pattern for handling provider failures
* [Metrics Guide](metrics-guide.md) - Understanding LLM metrics and export options
* [Cost Estimation](cost-estimation.md) - Pre-run cost estimation and budget configuration
* [Confidence Threshold](confidence-threshold.md) - Tuning LLM confidence for accuracy/coverage balance
* [Performance Benchmarks](performance-benchmarks.md) - Detailed performance comparison of all providers
* [Ollama Setup Guide](ollama-setup.md) - Comprehensive Ollama installation and setup guide
* [Main Configuration Guide](configuration.md) - Basic LLM setup and provider documentation
* [Getting Started Guide](getting-started.md) - Quick start with LLM features
* [Troubleshooting](troubleshooting.md) - Common issues and solutions
* [API Reference](api-reference.md) - Configuration API documentation
* [Security Architecture](security-architecture.md) - Security best practices