# Cost Estimation Guide

This guide helps you estimate and control LLM costs before running the conflict resolver.

## Cost Overview

LLM costs depend on:

* **Provider**: API-based providers charge per token
* **Model**: Larger models cost more
* **Comment count**: More comments = more API calls
* **Cache hit rate**: Cached responses are free or cheaper

## Provider Cost Comparison

### Per-Comment Estimates (Typical)

| Provider | Model | Cost/Comment | Notes |
|----------|-------|--------------|-------|
| **Ollama** | qwen2.5-coder:7b | $0.0000 | Free (local) |
| **Claude CLI** | claude-sonnet-4-5 | $0.0000 | Subscription-based |
| **Codex CLI** | codex | $0.0000 | Subscription-based |
| **OpenAI API** | gpt-5-nano | ~$0.0001 | Cheapest API option |
| **OpenAI API** | gpt-5-mini | ~$0.0003 | Best value (Nov 2025) |
| **OpenAI API** | gpt-4o-mini | ~$0.0002 | Low-cost API |
| **OpenAI API** | gpt-5.1 | ~$0.0015 | Latest flagship |
| **Anthropic API** | claude-haiku-4-5 | ~$0.0008 | Low-cost API |
| **Anthropic API** | claude-sonnet-4-5 | ~$0.0030 | Balanced |
| **Anthropic API** | claude-opus-4-5 | ~$0.0050 | Flagship (67% cheaper than 4.1!) |

### Monthly Cost Projections

Assuming 100 PRs/month with 20 comments each (2,000 comments):

| Provider | Model | Monthly Cost |
|----------|-------|--------------|
| Ollama | qwen2.5-coder:7b | $0.00 |
| Claude CLI | claude-sonnet-4-5 | $0.00 (subscription) |
| OpenAI API | gpt-5-nano | ~$0.20 |
| OpenAI API | gpt-4o-mini | ~$0.40 |
| OpenAI API | gpt-5-mini | ~$0.60 |
| Anthropic API | claude-haiku-4-5 | ~$1.60 |
| OpenAI API | gpt-5.1 | ~$3.00 |
| Anthropic API | claude-sonnet-4-5 | ~$6.00 |
| Anthropic API | claude-opus-4-5 | ~$10.00 |

## Pre-Run Cost Estimation

### 1. Estimate Comment Count

```bash
# Count comments in a PR using GitHub CLI
gh pr view 123 --json comments --jq '.comments | length'
```

### 2. Calculate Estimated Cost

```text
estimated_cost = comment_count × cost_per_comment × (1 - expected_cache_hit_rate)
```

Example for 50 comments with Anthropic Haiku:

```text
estimated_cost = 50 × $0.0008 × (1 - 0.30) = $0.028
```

### 3. Use Dry-Run Mode

Run with `--mode dry-run` to analyze without making API calls:

```bash
pr-resolve apply 123 --mode dry-run --llm-enabled
```

This shows the comment count without incurring costs.

## Budget Configuration

### Setting a Budget

Prevent cost overruns with `CR_LLM_COST_BUDGET`:

```bash
# Limit to $5 per run
export CR_LLM_COST_BUDGET=5.0
```

Or in config:

```yaml
llm:
  cost_budget: 5.0
```

### Budget Status

The cost tracker has three states:

| Status | Description | Action |
|--------|-------------|--------|
| `OK` | Under 80% of budget | Continue normally |
| `WARNING` | 80-99% of budget | Log warning, continue |
| `EXCEEDED` | 100%+ of budget | Block new requests |

### Warning Threshold

Customize when warnings appear:

```yaml
llm:
  cost_budget: 5.0
  # Warn at 70% instead of the default 80%
  # (currently configured in code, not yet exposed as a config option)
```

## Cost Optimization Strategies

### 1. Enable Prompt Caching

Cache identical prompts to avoid redundant API calls:

```yaml
llm:
  cache_enabled: true  # Default: true
```

**Impact**: 30-50% cost reduction is typical.

### 2. Use Cost-Effective Models

For most CodeRabbit comments, smaller models work well:

```yaml
# Anthropic: Use Haiku instead of Sonnet
llm:
  provider: anthropic
  model: claude-haiku-4-5

# OpenAI: Use GPT-4o-mini instead of GPT-4o
llm:
  provider: openai
  model: gpt-4o-mini
```
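To gauge what such a switch saves before committing to it, you can plug the per-comment rates from the comparison table into the estimation formula above. Below is a minimal sketch; the rates are this guide's rough estimates rather than live pricing, so verify them against your provider's current price list:

```python
# Rough monthly savings from a model switch, using this guide's
# approximate per-comment rates (not live provider pricing).
COST_PER_COMMENT = {
    "claude-sonnet-4-5": 0.0030,
    "claude-haiku-4-5": 0.0008,
}

def monthly_cost(model: str, comments: int, cache_hit_rate: float = 0.0) -> float:
    """estimated_cost = comment_count * cost_per_comment * (1 - cache_hit_rate)"""
    return comments * COST_PER_COMMENT[model] * (1 - cache_hit_rate)

# 100 PRs/month with 20 comments each, as in the projections above
before = monthly_cost("claude-sonnet-4-5", 2000)
after = monthly_cost("claude-haiku-4-5", 2000)
print(f"Sonnet: ${before:.2f}, Haiku: ${after:.2f}, saved: ${before - after:.2f}")
```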
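Running this reproduces the projection table: roughly $6.00/month on Sonnet versus $1.60 on Haiku, about $4.40 saved before any cache hits. Pass a nonzero `cache_hit_rate` (e.g., `0.30`) to discount for expected cache savings.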
### 3. Use Free Providers

For development or cost-sensitive environments:

```yaml
# Local Ollama (completely free)
llm:
  provider: ollama
  model: qwen2.5-coder:7b

# Claude CLI (subscription, no per-use cost)
llm:
  provider: claude-cli
  model: claude-sonnet-4-5
```

### 4. Increase Confidence Threshold

Reject low-confidence parses to reduce follow-up calls:

```yaml
llm:
  confidence_threshold: 0.7  # Default: 0.5
```

### 5. Limit Parallel Workers

Reduce concurrent API calls:

```yaml
parallel:
  enabled: true
  max_workers: 2  # Default: 4
```

## Monitoring Costs

### Real-Time Tracking

Enable metrics to see costs after each run:

```bash
pr-resolve apply 123 --llm-enabled --show-metrics
```

Output includes:

```text
Total cost: $0.0523
Cache hit rate: 35.7%
```

### Export for Analysis

```bash
pr-resolve apply 123 --llm-enabled --show-metrics --metrics-output costs.json
```

Review in JSON:

```json
{
  "summary": {
    "total_cost": 0.0523,
    "cost_per_comment": 0.00124,
    "cache_savings": 0.0187
  }
}
```

## Cost Alerts

When the budget reaches the warning threshold (80% by default):

```text
WARNING: LLM cost budget warning: $4.12 of $5.00 used (82.4%)
```

When the budget is exceeded:

```text
ERROR: LLM cost budget exceeded: $5.23 of $5.00 used
```

New requests are blocked until the run completes.

## Recommended Budgets

| Use Case | Budget | Rationale |
|----------|--------|-----------|
| Development/Testing | $1.00 | Low risk while iterating |
| CI/CD per-PR | $2.00 | Typical PR costs < $0.50 |
| Large Monorepo | $10.00 | Higher comment counts |
| Batch Processing | $50.00 | Multiple PRs in sequence |

## Cost Tracking API

For programmatic cost tracking:

```python
from review_bot_automator.llm.cost_tracker import CostTracker, CostStatus

# Initialize with budget
tracker = CostTracker(budget=5.00, warning_threshold=0.8)

# Before each API call
if tracker.should_block_request():
    raise RuntimeError("Budget exceeded")

# After API call
status = tracker.add_cost(0.0012)
if status == CostStatus.WARNING:
    print(tracker.get_warning_message())
elif status == CostStatus.EXCEEDED:
    print("Budget exceeded!")

# Check current state
print(f"Remaining: ${tracker.remaining_budget:.4f}")
print(f"Utilization: {tracker.budget_utilization * 100:.1f}%")
```

## See Also

* [LLM Configuration](llm-configuration.md) - Full configuration reference
* [Metrics Guide](metrics-guide.md) - Understanding metrics output
* [Performance Tuning](performance-tuning.md) - Optimizing performance and cost