
Architecture

The integration engine is designed to be provider-agnostic. Each provider follows the same lifecycle: connect, sync entities, detect findings, materialize evidence.
OAuth Connect → trigger-sync → sync-{provider} → Entities → Findings → Evidence

The engine has four components:
  • Provider Registry: integration_providers and integration_provider_display define the available providers.
  • Connections: integration_connections holds one connection per tenant per provider and references its OAuth tokens in Vault.
  • Sync Engine: Edge Functions pull data from the provider and upsert entities and findings.
  • Evidence: rpc_materialize_integration_evidence converts findings to evidence.

Provider Registry

integration_providers (
  id            uuid PRIMARY KEY,
  slug          text UNIQUE,   -- 'azure-ad', 'google-workspace', 'aws'
  name          text,
  category      text,          -- 'identity', 'cloud', 'endpoint'
  is_active     boolean
)

integration_provider_display (
  provider_id   uuid REFERENCES integration_providers,
  icon_url      text,
  description   text,
  setup_docs_url text
)
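
To illustrate how the two registry tables fit together, the sketch below mirrors them as TypeScript types and joins them in memory. The type and function names (ProviderRow, ProviderDisplayRow, listActiveProviders) are hypothetical, and the nullability of the display columns is an assumption, since the schema above does not state it.

```typescript
// Hypothetical row shapes mirroring the two tables above.
interface ProviderRow {
  id: string;
  slug: string; // e.g. 'azure-ad', 'google-workspace', 'aws'
  name: string;
  category: string; // e.g. 'identity', 'cloud', 'endpoint'
  is_active: boolean;
}

// Assumption: display columns may be null.
interface ProviderDisplayRow {
  provider_id: string;
  icon_url: string | null;
  description: string | null;
  setup_docs_url: string | null;
}

type ProviderCard = ProviderRow & { display: ProviderDisplayRow | null };

// Keep only active providers and attach their display metadata, if any.
function listActiveProviders(
  providers: ProviderRow[],
  display: ProviderDisplayRow[],
): ProviderCard[] {
  const byProviderId = new Map<string, ProviderDisplayRow>();
  for (const row of display) byProviderId.set(row.provider_id, row);

  return providers
    .filter((p) => p.is_active)
    .map((p) => ({ ...p, display: byProviderId.get(p.id) ?? null }));
}
```

A UI such as the Integration Hub could render one card per returned item, falling back to the provider name when display is null.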

OAuth Connect Flow

  1. User initiates connection. The Integration Hub UI redirects the user to the provider’s OAuth consent screen with the appropriate scopes.
  2. Callback processes token. The redirect URI receives the authorization code, exchanges it for tokens, and stores them securely.
  3. Tokens stored in Vault. OAuth tokens are encrypted using Supabase Vault via rpc_integration_set_secret(). The integration_connections table stores a vault reference, not raw tokens.
  4. Initial sync triggered. After successful connection, trigger-sync is called to perform the first data pull.

The INTEGRATION_STATE_SECRET environment variable is required to sign and verify the OAuth state parameter, preventing CSRF attacks during the connect flow.
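
One common way to sign and verify a state parameter is an HMAC over a small payload. The sketch below shows that technique using Node's crypto module. It is not the engine's actual implementation: the helper names (createState, verifyState), the payload fields, the "payload.signature" format, and the ten-minute expiry are all assumptions made for illustration.

```typescript
import { createHmac, randomBytes, timingSafeEqual } from "node:crypto";

// Hypothetical payload; the real state contents are not documented here.
interface StatePayload {
  tenantId: string;
  providerSlug: string;
  nonce: string;
  issuedAt: number; // milliseconds since epoch
}

function sign(data: string, secret: string): string {
  return createHmac("sha256", secret).update(data).digest("base64url");
}

// Build "<base64url payload>.<base64url HMAC>" for use as the OAuth state value.
function createState(tenantId: string, providerSlug: string, secret: string): string {
  const payload: StatePayload = {
    tenantId,
    providerSlug,
    nonce: randomBytes(16).toString("hex"),
    issuedAt: Date.now(),
  };
  const body = Buffer.from(JSON.stringify(payload)).toString("base64url");
  return `${body}.${sign(body, secret)}`;
}

// Return the payload if the signature matches and the state is fresh, else null.
function verifyState(
  state: string,
  secret: string,
  maxAgeMs: number = 10 * 60 * 1000, // assumed expiry window
): StatePayload | null {
  const parts = state.split(".");
  if (parts.length !== 2) return null;
  const [body, signature] = parts;

  const expected = Buffer.from(sign(body, secret));
  const actual = Buffer.from(signature);
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (expected.length !== actual.length || !timingSafeEqual(expected, actual)) return null;

  const payload = JSON.parse(Buffer.from(body, "base64url").toString("utf8")) as StatePayload;
  return Date.now() - payload.issuedAt <= maxAgeMs ? payload : null;
}
```

In this sketch the secret argument would be the value of INTEGRATION_STATE_SECRET, read from the function's environment. The callback would call verifyState before exchanging the authorization code and reject the request when it returns null.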

Sync Edge Functions

trigger-sync (Dispatcher)

The trigger-sync function is the orchestrator. It:
  1. Looks up the connection and provider
  2. Retrieves decrypted OAuth tokens from Vault
  3. Dispatches to the appropriate provider-specific function
  4. Creates an integration_sync_jobs record to track progress
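
The four steps above can be sketched as a small dispatcher. This is a simplified model, not the actual trigger-sync source: database and Vault access are replaced by injected functions, the returned object stands in for the integration_sync_jobs record, and every name (dispatchSync, DispatchDeps, ProviderSync) is hypothetical.

```typescript
// Hypothetical shapes used only for this sketch.
type SyncResult = { entitiesSynced: number; findingsCount: number };
type ProviderSync = (connectionId: string, accessToken: string) => Promise<SyncResult>;

interface SyncJob {
  connectionId: string;
  status: "pending" | "running" | "completed" | "failed";
  entitiesSynced?: number;
  findingsCount?: number;
  errorMessage?: string;
}

interface DispatchDeps {
  lookupProviderSlug(connectionId: string): Promise<string>; // step 1
  getAccessToken(connectionId: string): Promise<string>; // step 2 (Vault in the real flow)
  handlers: Record<string, ProviderSync>; // step 3: slug -> sync function
}

async function dispatchSync(connectionId: string, deps: DispatchDeps): Promise<SyncJob> {
  // Step 4: the job object tracks progress and is returned in its final state.
  const job: SyncJob = { connectionId, status: "running" };
  try {
    const slug = await deps.lookupProviderSlug(connectionId);
    const token = await deps.getAccessToken(connectionId);
    const handler = deps.handlers[slug];
    if (!handler) throw new Error(`No sync function registered for provider '${slug}'`);
    const result = await handler(connectionId, token);
    return { ...job, status: "completed", ...result };
  } catch (err) {
    const errorMessage = err instanceof Error ? err.message : String(err);
    return { ...job, status: "failed", errorMessage };
  }
}
```

Keeping the slug-to-handler map in one place means an unregistered provider fails with a clear message rather than silently doing nothing.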

Provider Functions

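Each provider has its own sync function. For example, the function for the azure-ad provider syncs the following from Microsoft Entra ID (Azure AD):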
  • Users and group memberships
  • Conditional Access policies
  • MFA registration status
  • Security defaults configuration

Entity Upsert Pattern

All providers follow the same upsert pattern for storing synced data:
// supabase-js returns errors instead of throwing, so check the result explicitly
const { error } = await supabase.from('integration_entities').upsert(
  {
    connection_id,
    entity_type: 'user',
    external_id: user.id,
    display_name: user.displayName,
    raw_data: user,           // Full API response stored as JSONB
    last_synced_at: new Date().toISOString(),
  },
  { onConflict: 'connection_id,entity_type,external_id' }
);
if (error) throw error;
Always use upsert with the unique constraint on (connection_id, entity_type, external_id) to avoid duplicate records on reconnect or re-sync.
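
For larger syncs, the same pattern can be applied in batches. The sketch below is one possible approach under stated assumptions: EntityTable is a hypothetical minimal interface standing in for the object returned by supabase.from('integration_entities'), and the batch size of 500 is arbitrary. Rows are de-duplicated by the conflict key first, because PostgreSQL rejects an ON CONFLICT DO UPDATE statement that would touch the same row twice.

```typescript
interface EntityRow {
  connection_id: string;
  entity_type: string;
  external_id: string;
  display_name: string;
  raw_data: unknown;
  last_synced_at: string;
}

// Hypothetical minimal view of the table client, so the sketch runs without supabase-js.
interface EntityTable {
  upsert(
    rows: EntityRow[],
    options: { onConflict: string },
  ): PromiseLike<{ error: { message: string } | null }>;
}

// Keep the last row seen for each (connection_id, entity_type, external_id) key.
function dedupeByKey(rows: EntityRow[]): EntityRow[] {
  const byKey = new Map<string, EntityRow>();
  for (const row of rows) {
    byKey.set(`${row.connection_id}|${row.entity_type}|${row.external_id}`, row);
  }
  return Array.from(byKey.values());
}

function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) out.push(items.slice(i, i + size));
  return out;
}

// Upsert rows in batches and return how many rows were written.
async function upsertEntities(
  table: EntityTable,
  rows: EntityRow[],
  batchSize: number = 500,
): Promise<number> {
  let written = 0;
  for (const batch of chunk(dedupeByKey(rows), batchSize)) {
    const { error } = await table.upsert(batch, {
      onConflict: "connection_id,entity_type,external_id",
    });
    if (error) throw new Error(`Entity upsert failed: ${error.message}`);
    written += batch.length;
  }
  return written;
}
```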

Finding Detection

After entity sync, each provider runs finding detection logic:
// Example: detect users without MFA
const usersWithoutMfa = entities.filter(e =>
  e.entity_type === 'user' && !e.raw_data.mfaRegistered
);

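for (const user of usersWithoutMfa) {
  const { error } = await supabase.from('integration_findings').upsert(
    {
      connection_id,
      finding_type: 'user_no_mfa',
      severity: 'high',
      entity_id: user.id,
      detail: { userName: user.display_name },
    },
    // Assumes a unique constraint on these columns; without onConflict,
    // upsert matches on the primary key only, so each re-sync would insert a new row
    { onConflict: 'connection_id,finding_type,entity_id' }
  );
  if (error) throw error;
}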
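
Detection logic is easier to test when it is kept separate from database writes. The sketch below expresses the MFA check as a pure function that returns finding rows, plus a runner that combines several detectors. The Detector type, the function names, and the severity values other than 'high' are assumptions for illustration.

```typescript
interface SyncedEntity {
  id: string;
  entity_type: string;
  display_name: string;
  raw_data: Record<string, unknown>;
}

interface FindingRow {
  connection_id: string;
  finding_type: string;
  severity: "low" | "medium" | "high" | "critical"; // only 'high' appears in the source
  entity_id: string;
  detail: Record<string, unknown>;
}

// A detector inspects synced entities and returns zero or more findings.
type Detector = (connectionId: string, entities: SyncedEntity[]) => FindingRow[];

const detectUsersWithoutMfa: Detector = (connectionId, entities) =>
  entities
    .filter((e) => e.entity_type === "user" && !e.raw_data.mfaRegistered)
    .map((e) => ({
      connection_id: connectionId,
      finding_type: "user_no_mfa",
      severity: "high" as const,
      entity_id: e.id,
      detail: { userName: e.display_name },
    }));

// Run every detector and collect the findings into one list.
function runDetectors(
  connectionId: string,
  entities: SyncedEntity[],
  detectors: Detector[],
): FindingRow[] {
  const findings: FindingRow[] = [];
  for (const detect of detectors) findings.push(...detect(connectionId, entities));
  return findings;
}
```

The resulting rows can then be written with a single upsert call, and each detector can be unit tested without a database.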

Evidence Materialization

After sync completes, evidence is materialized:
SELECT rpc_materialize_integration_evidence(p_tenant_id := $1);
This converts integration findings into integration_evidence_items and links them to relevant controls via control_evidence_links.
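
From an Edge Function, the same RPC could be invoked through the client's rpc method rather than raw SQL. The sketch below assumes a client with an rpc(name, args) method shaped like the one in supabase-js. RpcClient and materializeEvidence are hypothetical names, and the return value of the function is not documented here, so it is passed through untyped.

```typescript
// Hypothetical minimal view of the client, so the sketch runs without supabase-js.
interface RpcClient {
  rpc(
    fn: string,
    args: Record<string, unknown>,
  ): PromiseLike<{ data: unknown; error: { message: string } | null }>;
}

// Call the materialization RPC for one tenant and surface any error.
async function materializeEvidence(client: RpcClient, tenantId: string): Promise<unknown> {
  const { data, error } = await client.rpc("rpc_materialize_integration_evidence", {
    p_tenant_id: tenantId,
  });
  if (error) throw new Error(`Evidence materialization failed: ${error.message}`);
  return data;
}
```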

Adding a New Provider

For step-by-step instructions on adding a new integration provider, see the internal guide at docs/adding-new-integration.md in the repository. The high-level steps are:
  1. Insert a row in integration_providers and integration_provider_display
  2. Create a new Edge Function sync-{provider-slug}
  3. Implement the entity sync and finding detection logic
  4. Add OAuth configuration (client ID, scopes, redirect URI)
  5. Register the provider in the trigger-sync dispatcher
  6. Add finding-to-control mappings for evidence materialization

Job Queue

Sync operations are tracked in integration_sync_jobs:
integration_sync_jobs (
  id              uuid PRIMARY KEY,
  connection_id   uuid REFERENCES integration_connections,
  status          text,  -- 'pending', 'running', 'completed', 'failed'
  started_at      timestamptz,
  completed_at    timestamptz,
  entities_synced int,
  findings_count  int,
  error_message   text
)
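
The status column implies a simple job lifecycle. The sketch below models it as a small state machine over a row shaped like the table above. The allowed transitions (pending to running, running to completed or failed) are an assumption, since the schema lists the status values but not the permitted moves, and the helper names are hypothetical.

```typescript
type JobStatus = "pending" | "running" | "completed" | "failed";

interface SyncJobRow {
  id: string;
  connection_id: string;
  status: JobStatus;
  started_at: string | null;
  completed_at: string | null;
  entities_synced: number | null;
  findings_count: number | null;
  error_message: string | null;
}

// Assumed transition rules; completed and failed are treated as terminal.
const NEXT: Record<JobStatus, JobStatus[]> = {
  pending: ["running"],
  running: ["completed", "failed"],
  completed: [],
  failed: [],
};

// Return an updated copy of the job, or throw if the move is not allowed.
function transition(job: SyncJobRow, to: JobStatus, patch: Partial<SyncJobRow>): SyncJobRow {
  if (NEXT[job.status].indexOf(to) === -1) {
    throw new Error(`Invalid job transition: ${job.status} -> ${to}`);
  }
  return { ...job, ...patch, status: to };
}

function startJob(job: SyncJobRow, now: Date = new Date()): SyncJobRow {
  return transition(job, "running", { started_at: now.toISOString() });
}

function completeJob(
  job: SyncJobRow,
  entitiesSynced: number,
  findingsCount: number,
  now: Date = new Date(),
): SyncJobRow {
  return transition(job, "completed", {
    completed_at: now.toISOString(),
    entities_synced: entitiesSynced,
    findings_count: findingsCount,
  });
}

function failJob(job: SyncJobRow, message: string, now: Date = new Date()): SyncJobRow {
  return transition(job, "failed", { completed_at: now.toISOString(), error_message: message });
}
```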