The AI content system uses four LLM providers in a priority cascade. If the primary provider is down or slow, the worker automatically falls back to the next available provider. Circuit breakers prevent repeated calls to failing providers.
Why This Exists
On April 15, 2026, Anthropic experienced a 1h 32m outage (89.6% availability over 30 days at that point). With a single-provider dependency, all AI content generation would have stopped. The multi-provider fallback keeps content generation running even during extended outages.
Provider Priority Order
| Priority | Provider | Model | Notes |
|---|---|---|---|
| 1 | Anthropic | claude-sonnet-4-20250514 | Primary — best quality for compliance narratives |
| 2 | OpenAI | gpt-4.1 | First fallback — strong general performance |
| 3 | Google | gemini-2.5-pro | Second fallback |
| 4 | Groq | llama-3.3-70b-versatile | Last resort — fastest, lowest cost, open-source model |
OpenAI, Google, and Groq API keys are configured but their quotas may need activation if they haven’t been used recently. Check Operations for how to verify provider health.
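The priority cascade can be sketched roughly as follows. This is a minimal illustration, not the worker's actual code: the `ProviderHealth` shape is a camelCase stand-in for a row of ai_provider_health, and `eligibleProviders`, `generateWithFallback`, and the injected `call` function are hypothetical names introduced here.

```typescript
// Hypothetical in-memory view of one ai_provider_health row (field names are assumptions).
type CircuitState = "closed" | "open" | "half_open";

interface ProviderHealth {
  providerKey: string;
  priority: number; // 1 = primary, 4 = last resort
  isEnabled: boolean;
  circuitState: CircuitState;
}

// Keep enabled providers whose circuit is not open, lowest priority number first.
// Note: this sketch does not enforce the single-probe limit of a half-open circuit.
function eligibleProviders(rows: ProviderHealth[]): ProviderHealth[] {
  return rows
    .filter((r) => r.isEnabled && r.circuitState !== "open")
    .sort((a, b) => a.priority - b.priority);
}

// Try each eligible provider in turn; the first successful call wins.
// `call` is an assumed adapter that sends the prompt to the named provider.
async function generateWithFallback(
  rows: ProviderHealth[],
  call: (providerKey: string, prompt: string) => Promise<string>,
  prompt: string,
): Promise<{ providerKey: string; text: string }> {
  const errors: string[] = [];
  for (const p of eligibleProviders(rows)) {
    try {
      const text = await call(p.providerKey, prompt);
      return { providerKey: p.providerKey, text };
    } catch (err) {
      // Remember the failure and fall through to the next provider.
      errors.push(`${p.providerKey}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`No provider succeeded: ${errors.join("; ")}`);
}
```

In this sketch a provider with an open circuit is never called, so a request that would have gone to the primary moves straight to the next entry in the table above.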
Circuit Breaker
Each provider has an independent circuit breaker tracked in the ai_provider_health table.
States
| State | Behavior |
|---|---|
| Closed | Normal operation — provider receives requests |
| Open | Provider is skipped — too many recent failures |
| Half-open | One probe request allowed to test if provider has recovered |
Thresholds
| Parameter | Value |
|---|---|
| Failures to open | 3 consecutive failures |
| Open duration | 5 minutes |
| Half-open probe | 1 request after open period expires |
| Recovery | 1 successful probe closes the circuit |
Flow
When a circuit opens, the ai-alert-circuit-open cron sends a Slack notification (if the webhook is configured). The alert is deduplicated via ai_alert_log to prevent spam.
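The states and thresholds above can be read as a small state machine. The sketch below is an illustration under assumptions: the `Breaker` shape and function names are hypothetical, state is held in memory rather than in ai_provider_health, and reopening the circuit after a failed probe is an assumed behavior that the thresholds table does not spell out.

```typescript
const FAILURES_TO_OPEN = 3; // consecutive failures before the circuit opens
const OPEN_DURATION_MS = 5 * 60 * 1000; // 5 minutes

type BreakerState = "closed" | "open" | "half_open";

interface Breaker {
  state: BreakerState;
  consecutiveFailures: number;
  openedAt: number | null; // epoch milliseconds, set while open
}

function newBreaker(): Breaker {
  return { state: "closed", consecutiveFailures: 0, openedAt: null };
}

// Decide whether a request may be sent at time `now`.
// Moves open -> half_open once the open period has expired.
function allowRequest(b: Breaker, now: number): boolean {
  if (b.state === "closed") return true;
  if (b.state === "open" && b.openedAt !== null && now - b.openedAt >= OPEN_DURATION_MS) {
    b.state = "half_open"; // exactly one probe is let through
    return true;
  }
  return false; // still open, or a probe is already in flight
}

// One successful call (including a probe) closes the circuit.
function recordSuccess(b: Breaker): void {
  b.state = "closed";
  b.consecutiveFailures = 0;
  b.openedAt = null;
}

// Count the failure; open on the third in a row, or immediately if a probe fails (assumption).
function recordFailure(b: Breaker, now: number): void {
  b.consecutiveFailures += 1;
  if (b.state === "half_open" || b.consecutiveFailures >= FAILURES_TO_OPEN) {
    b.state = "open";
    b.openedAt = now;
  }
}
```

A caller would check `allowRequest` before each provider call and then report the outcome with `recordSuccess` or `recordFailure`.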
Adding API Keys
API keys are stored as Supabase Edge Function secrets, not in the database.
Open Supabase Dashboard
Navigate to your project’s Project Settings > Edge Functions > Manage Secrets.
Add or update secrets
Set the following keys as needed:
| Secret Name | Provider |
|---|---|
| ANTHROPIC_API_KEY | Anthropic |
| OPENAI_API_KEY | OpenAI |
| GOOGLE_AI_API_KEY | Google |
| GROQ_API_KEY | Groq |
Verify
After adding a key, check provider health:

    SELECT provider_key, is_enabled, circuit_state, consecutive_failures
    FROM ai_provider_health
    ORDER BY priority;
Cost Comparison
Per-generation costs vary significantly by provider. These are approximate costs for a typical gap narrative (~1,500 input tokens, ~800 output tokens).
| Provider | Model | Input $/1M | Output $/1M | Est. Cost/Narrative |
|---|---|---|---|---|
| Anthropic | claude-sonnet-4-20250514 | $3.00 | $15.00 | ~$0.016 |
| OpenAI | gpt-4.1 | $2.00 | $8.00 | ~$0.009 |
| Google | gemini-2.5-pro | $1.25 | $5.00 | ~$0.006 |
| Groq | llama-3.3-70b-versatile | $0.59 | $0.79 | ~$0.002 |
Anthropic is the default because it produces the best compliance-specific narratives. Fallback providers produce acceptable but slightly less tailored output.
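The per-narrative estimates follow directly from the per-million-token prices. As a worked example, the sketch below recomputes them for the typical narrative of about 1,500 input and 800 output tokens. The `PRICING` map copies the rates from the table above; the `estimateCostUsd` helper is a hypothetical name used only for illustration.

```typescript
interface Pricing {
  inputPerMillion: number; // USD per 1M input tokens
  outputPerMillion: number; // USD per 1M output tokens
}

// Rates copied from the cost comparison table.
const PRICING: Record<string, Pricing> = {
  "claude-sonnet-4-20250514": { inputPerMillion: 3.0, outputPerMillion: 15.0 },
  "gpt-4.1": { inputPerMillion: 2.0, outputPerMillion: 8.0 },
  "gemini-2.5-pro": { inputPerMillion: 1.25, outputPerMillion: 5.0 },
  "llama-3.3-70b-versatile": { inputPerMillion: 0.59, outputPerMillion: 0.79 },
};

// cost = (input tokens x input rate + output tokens x output rate) / 1,000,000
function estimateCostUsd(model: string, inputTokens: number, outputTokens: number): number {
  const p = PRICING[model];
  if (p === undefined) throw new Error(`Unknown model: ${model}`);
  return (inputTokens * p.inputPerMillion + outputTokens * p.outputPerMillion) / 1e6;
}

// Print the estimate for each model at the typical narrative size.
for (const model of Object.keys(PRICING)) {
  console.log(model, estimateCostUsd(model, 1500, 800).toFixed(4));
}
```

For claude-sonnet-4-20250514 this works out to (1,500 × 3.00 + 800 × 15.00) / 1,000,000 = $0.0165, in line with the approximate figure in the table.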
Full Model Pricing Reference
| Provider | Model | Input $/1M | Output $/1M |
|---|---|---|---|
| Anthropic | claude-sonnet-4-20250514 | $3.00 | $15.00 |
| Anthropic | claude-opus-4-7 | $15.00 | $75.00 |
| Anthropic | claude-haiku-4-5-20251001 | $0.80 | $4.00 |
| OpenAI | gpt-4.1 | $2.00 | $8.00 |
| OpenAI | gpt-4.1-mini | $0.40 | $1.60 |
| Google | gemini-2.5-pro | $1.25 | $5.00 |
| Google | gemini-2.5-flash | $0.30 | $2.50 |
| Groq | llama-3.3-70b-versatile | $0.59 | $0.79 |
Monitoring
Dashboard Views
| View | What It Shows |
|---|---|
| v_ai_queue_health | Queue depth, jobs by status, avg processing time |
| v_ai_provider_usage | Per-provider call counts, success rates, costs |
Slack Alerts
Two cron jobs send Slack alerts when configured:
ai-alert-circuit-open (every 5 min) — fires when any provider circuit opens
ai-alert-budget-warnings (hourly) — fires when a tenant reaches 80% of monthly budget
Configure the Slack webhook in Edge Function secrets as SLACK_WEBHOOK_URL.
The Slack webhook URL is not yet configured in production. Alerts will be silently skipped until it is set. This is optional — the system operates normally without it.
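The alert behavior described in this section (the 80% budget threshold, deduplication, and silent skipping when no webhook is set) could look something like the sketch below. All names here are hypothetical: a `Set` stands in for ai_alert_log, the alert key format is invented for illustration, and `send` is an injected function so the example performs no network calls.

```typescript
const BUDGET_WARNING_RATIO = 0.8; // warn at 80% of the monthly budget

// True once a tenant's spend reaches the warning threshold.
function reachedBudgetWarning(spentUsd: number, monthlyBudgetUsd: number): boolean {
  return monthlyBudgetUsd > 0 && spentUsd / monthlyBudgetUsd >= BUDGET_WARNING_RATIO;
}

type AlertResult = "sent" | "skipped_no_webhook" | "skipped_duplicate";

// Send an alert at most once per key, and only when a webhook URL is configured.
function dispatchAlert(
  webhookUrl: string | undefined, // value of SLACK_WEBHOOK_URL, if set
  alertKey: string, // assumed dedup key, e.g. "circuit_open:<provider>"
  alreadyLogged: Set<string>, // stand-in for ai_alert_log
  send: (url: string, text: string) => void,
  text: string,
): AlertResult {
  if (!webhookUrl) return "skipped_no_webhook"; // silently skipped, nothing is recorded
  if (alreadyLogged.has(alertKey)) return "skipped_duplicate";
  send(webhookUrl, text);
  alreadyLogged.add(alertKey);
  return "sent";
}
```

Skipping before the dedup check means an alert that was dropped for lack of a webhook is not recorded as sent, so it can still go out once the secret is configured. Whether the production cron behaves this way is not stated in this document.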