The AI content system uses four LLM providers in a priority cascade. If the primary provider is down or slow, the worker automatically falls back to the next available provider. Circuit breakers prevent repeated calls to failing providers.
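The cascade described above can be sketched as a loop over providers in priority order. This is a minimal illustration, not the worker's actual code; the `Provider` shape and function names are assumptions made for the example.

```typescript
// Sketch of the priority cascade: try each provider in order, skip any
// whose circuit breaker is open, and fall through to the next on failure.
type Provider = {
  name: string;
  circuitOpen: boolean;                 // would be read from ai_provider_health
  generate: (prompt: string) => string; // throws on provider failure
};

function generateWithFallback(providers: Provider[], prompt: string): string {
  for (const p of providers) {
    if (p.circuitOpen) continue; // circuit breaker: don't call a failing provider
    try {
      return p.generate(prompt);
    } catch {
      // fall through to the next provider in priority order
    }
  }
  throw new Error("All providers failed or are circuit-open");
}
```

In production the worker would also record each failure against that provider's circuit breaker; that bookkeeping is omitted here for brevity.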

Why This Exists

On April 15, 2026, Anthropic experienced a 1h 32m outage (89.6% availability over 30 days at that point). With single-provider dependency, all AI content generation would have stopped. The multi-provider fallback ensures content generation continues even during extended outages.

Provider Priority Order

| Priority | Provider | Model | Notes |
|---|---|---|---|
| 1 | Anthropic | claude-sonnet-4-20250514 | Primary — best quality for compliance narratives |
| 2 | OpenAI | gpt-4.1 | First fallback — strong general performance |
| 3 | Google | gemini-2.5-pro | Second fallback |
| 4 | Groq | llama-3.3-70b-versatile | Last resort — fastest, lowest cost, open-source model |
OpenAI, Google, and Groq API keys are configured, but their quotas may need activation if the accounts have not been used recently. See Operations for how to verify provider health.

Circuit Breaker

Each provider has an independent circuit breaker tracked in the ai_provider_health table.

States

| State | Behavior |
|---|---|
| Closed | Normal operation — provider receives requests |
| Open | Provider is skipped — too many recent failures |
| Half-open | One probe request is allowed to test whether the provider has recovered |

Thresholds

| Parameter | Value |
|---|---|
| Failures to open | 3 consecutive failures |
| Open duration | 5 minutes |
| Half-open probe | 1 request after the open period expires |
| Recovery | 1 successful probe closes the circuit |

Flow

When a circuit opens, the ai-alert-circuit-open cron sends a Slack notification (if webhook is configured). The alert is deduplicated via ai_alert_log to prevent spam.

Adding API Keys

API keys are stored as Supabase Edge Function secrets, not in the database.
1. Open the Supabase Dashboard

   Navigate to your project's Project Settings > Edge Functions > Manage Secrets.

2. Add or update secrets

   Set the following keys as needed:

   | Secret Name | Provider |
   |---|---|
   | ANTHROPIC_API_KEY | Anthropic |
   | OPENAI_API_KEY | OpenAI |
   | GOOGLE_AI_API_KEY | Google |
   | GROQ_API_KEY | Groq |

3. Verify

   After adding a key, check provider health:

   ```sql
   SELECT provider_key, is_enabled, circuit_state, consecutive_failures
   FROM ai_provider_health
   ORDER BY priority;
   ```

Cost Comparison

Per-generation costs vary significantly by provider. These are approximate costs for a typical gap narrative (~1,500 input tokens, ~800 output tokens).
| Provider | Model | Input $/1M | Output $/1M | Est. Cost/Narrative |
|---|---|---|---|---|
| Anthropic | claude-sonnet-4-20250514 | $3.00 | $15.00 | ~$0.016 |
| OpenAI | gpt-4.1 | $2.00 | $8.00 | ~$0.009 |
| Google | gemini-2.5-pro | $1.25 | $5.00 | ~$0.006 |
| Groq | llama-3.3-70b-versatile | $0.59 | $0.79 | ~$0.002 |
Anthropic is the default because it produces the best compliance-specific narratives. Fallback providers produce acceptable but slightly less tailored output.
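The per-narrative estimates follow directly from the per-million-token rates. A small worked example (the function name is illustrative, not part of the system):

```typescript
// Est. Cost/Narrative = tokens / 1,000,000 * rate, summed over input and output.
function estimateCost(
  inputTokens: number,
  outputTokens: number,
  inputPerM: number,
  outputPerM: number,
): number {
  return (inputTokens / 1e6) * inputPerM + (outputTokens / 1e6) * outputPerM;
}

// Typical gap narrative: ~1,500 input tokens, ~800 output tokens.
estimateCost(1500, 800, 3.0, 15.0);  // Anthropic: ~$0.0165
estimateCost(1500, 800, 0.59, 0.79); // Groq: ~$0.0015
```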

Full Model Pricing Reference

| Provider | Model | Input $/1M | Output $/1M |
|---|---|---|---|
| Anthropic | claude-sonnet-4-20250514 | $3.00 | $15.00 |
| Anthropic | claude-opus-4-7 | $15.00 | $75.00 |
| Anthropic | claude-haiku-4-5-20251001 | $0.80 | $4.00 |
| OpenAI | gpt-4.1 | $2.00 | $8.00 |
| OpenAI | gpt-4.1-mini | $0.40 | $1.60 |
| Google | gemini-2.5-pro | $1.25 | $5.00 |
| Google | gemini-2.5-flash | $0.30 | $2.50 |
| Groq | llama-3.3-70b-versatile | $0.59 | $0.79 |

Monitoring

Dashboard Views

| View | What It Shows |
|---|---|
| v_ai_queue_health | Queue depth, jobs by status, avg processing time |
| v_ai_provider_usage | Per-provider call counts, success rates, costs |

Slack Alerts

Two cron jobs send Slack alerts when configured:
  • ai-alert-circuit-open (every 5 min) — fires when any provider circuit opens
  • ai-alert-budget-warnings (hourly) — fires when a tenant reaches 80% of monthly budget
Configure the Slack webhook in Edge Function secrets as SLACK_WEBHOOK_URL.
The Slack webhook URL is not yet configured in production. Alerts will be silently skipped until it is set. This is optional — the system operates normally without it.
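A sketch of the skip-when-unconfigured behavior. The payload shape is the standard Slack incoming-webhook format (`{ "text": ... }`), but the message text and function names are illustrative; in an Edge Function the URL would come from the `SLACK_WEBHOOK_URL` secret rather than a parameter.

```typescript
// Build the standard Slack incoming-webhook payload for a circuit-open alert.
function buildCircuitOpenPayload(provider: string): { text: string } {
  return { text: `AI provider circuit OPEN: ${provider}` };
}

// Returns false (and sends nothing) when the webhook URL is not configured,
// matching the documented "silently skipped" behavior.
async function sendCircuitOpenAlert(
  provider: string,
  webhookUrl?: string,
): Promise<boolean> {
  if (!webhookUrl) return false; // SLACK_WEBHOOK_URL unset: skip silently
  await fetch(webhookUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildCircuitOpenPayload(provider)),
  });
  return true;
}
```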