Data Flow

This document describes the end-to-end data pipeline that transforms raw access data from source systems into decay scores, review decisions, and automated remediations. All inter-service communication flows through Apache Kafka (KRaft mode).


End-to-End Sequence

The following sequence diagram traces a single access event from ingestion through remediation.

```mermaid
sequenceDiagram
    autonumber
    participant SRC as Source System<br/>(e.g. Entra ID)
    participant CON as Connector
    participant K1 as Kafka<br/>verity.events.raw.{platform}
    participant IW as Ingest Worker
    participant K2 as Kafka<br/>verity.events.normalised
    participant NE as Normalise Engine
    participant PG as PostgreSQL<br/>(TimescaleDB)
    participant DE as Decay Engine
    participant K3 as Kafka<br/>verity.scores.updated
    participant RG as Review Generator
    participant K4 as Kafka<br/>verity.reviews.created
    participant WE as Workflow Engine<br/>(Temporal)
    participant K5 as Kafka<br/>verity.reviews.decided
    participant RE as Remediation Executor
    participant K6 as Kafka<br/>verity.remediations.completed
    participant AW as Audit Writer
    participant CH as ClickHouse

    SRC->>CON: Poll / webhook
    CON->>K1: Produce raw event
    K1->>IW: Consume raw event
    IW->>IW: Validate & deduplicate
    IW->>K2: Produce validated event
    K2->>NE: Consume validated event
    NE->>NE: Map to canonical model
    NE->>PG: Upsert Principal, Asset, AccessGrant, AccessEvent
    NE->>K2: Produce normalised event
    K2->>DE: Consume normalised event
    DE->>PG: Read grants & peer groups
    DE->>DE: Compute 6-factor decay score
    DE->>PG: Write access_scores
    DE->>K3: Produce score update
    K3->>RG: Consume score update
    RG->>RG: Evaluate policy thresholds
    RG->>PG: Create review_packet
    RG->>K4: Produce review created
    K4->>WE: Consume review created
    WE->>WE: Start Temporal workflow
    WE->>WE: Assign → Notify → Await decision
    WE->>K5: Produce review decided
    K5->>RE: Consume decided review
    RE->>SRC: Execute remediation action
    RE->>PG: Update grant status
    RE->>K6: Produce remediation completed

    Note over AW,CH: All services also emit to verity.audit.all
    K6->>AW: Consume audit events
    AW->>CH: Append to verity_audit
```

Kafka Topic Catalogue

All topics use the `verity.` prefix and follow the naming convention `verity.<domain>.<event-type>[.<qualifier>]`.

Raw Event Topics

| Topic | Producer | Consumer(s) | Key | Partitions | Description |
|---|---|---|---|---|---|
| `verity.events.raw.azure` | connector-azure-ad | ingest-worker | `tenant_id` | 6 | Raw users, groups, roles, and sign-in logs from Entra ID. |
| `verity.events.raw.snowflake` | connector-snowflake | ingest-worker | `account_id` | 6 | Raw grants and query history from Snowflake. |
| `verity.events.raw.databricks` | connector-databricks | ingest-worker | `workspace_id` | 6 | Raw permissions and audit logs from Databricks. |
| `verity.events.raw.fabric` | connector-fabric | ingest-worker | `workspace_id` | 6 | Raw workspace items and permissions from Fabric. |

Pipeline Topics

| Topic | Producer | Consumer(s) | Key | Partitions | Description |
|---|---|---|---|---|---|
| `verity.events.normalised` | normalise-engine | decay-engine | `principal_id` | 12 | Canonical AccessEvent records after identity resolution and asset classification. |
| `verity.scores.updated` | decay-engine | review-generator, dashboard-ui | `grant_id` | 12 | Score change events emitted after every scoring run. |
| `verity.reviews.created` | review-generator | workflow-engine | `review_id` | 6 | New review packets awaiting assignment. |
| `verity.reviews.decided` | workflow-engine | remediation-executor | `review_id` | 6 | Reviews with an approved decision (revoke, downgrade, or maintain). |
| `verity.remediations.completed` | remediation-executor | audit-writer | `grant_id` | 6 | Confirmation of remediation actions executed in source systems. |

Infrastructure Topics

| Topic | Producer | Consumer(s) | Key | Partitions | Description |
|---|---|---|---|---|---|
| `verity.audit.all` | All services | audit-writer | `entity_id` | 12 | Unified audit stream: every significant action across the platform. |
| `verity.identity.resolve` | normalise-engine | normalise-engine | `email` | 6 | Identity-resolution requests for cross-platform principal matching. |
| `verity.asset.classify` | normalise-engine | normalise-engine | `asset_id` | 6 | Asset-classification requests for sensitivity and data-classification tagging. |

Topic Configuration Defaults

| Setting | Value | Rationale |
|---|---|---|
| `replication.factor` | 3 | Durability across broker failures. |
| `min.insync.replicas` | 2 | Strong consistency guarantee. |
| `retention.ms` | 604 800 000 (7 days) | Pipeline topics; sufficient for replay. |
| `retention.ms` (audit) | 220 752 000 000 (7 years) | `verity.audit.all`; regulatory retention. |
| `cleanup.policy` | delete | Default; compacted topics noted individually. |
| `compression.type` | zstd | Best compression ratio for JSON payloads. |
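As a sanity check on the retention values, the durations convert as follows (a throwaway snippet; the constant names are illustrative). Note that the 7-year figure assumes 365-day years:

```python
DAY_MS = 24 * 60 * 60 * 1000  # milliseconds per day

PIPELINE_RETENTION_MS = 7 * DAY_MS     # 7 days
AUDIT_RETENTION_MS = 7 * 365 * DAY_MS  # 7 years, ignoring leap days

assert PIPELINE_RETENTION_MS == 604_800_000
assert AUDIT_RETENTION_MS == 220_752_000_000
```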

Data Lifecycle

Access Events

| Phase | Store | Retention | Details |
|---|---|---|---|
| Raw ingestion | Kafka (`verity.events.raw.*`) | 7 days | Replay buffer; retained for re-processing. |
| Normalised | PostgreSQL `access_events` (hypertable) | 7 years | 1-day chunks; compressed after 7 days. |
| Audit copy | ClickHouse `verity_audit` | 7 years | Partitioned monthly; TTL-enforced. |

Access Scores

| Phase | Store | Retention | Details |
|---|---|---|---|
| Score history | PostgreSQL `access_scores` (hypertable) | 1 year | 7-day chunks; used for trend analysis. |
| Latest scores | PostgreSQL `access_scores_latest` (continuous aggregate) | Indefinite | Materialised view; refreshed continuously. |
| Score events | Kafka (`verity.scores.updated`) | 7 days | Consumed by review-generator and dashboard. |

Reviews

| Phase | Store | Retention | Details |
|---|---|---|---|
| Active reviews | PostgreSQL `review_packets` | Indefinite | Until closed; archived after 1 year. |
| Decisions | PostgreSQL `review_decisions` | 7 years | Regulatory requirement: who decided what. |
| Workflow state | Temporal | Configurable | Temporal manages its own retention. |

Access Review Workflow

The review lifecycle is implemented as a Temporal workflow in the workflow-engine service. The following state machine describes all possible transitions.

```mermaid
stateDiagram-v2
    [*] --> CREATED: Review packet generated
    CREATED --> ASSIGNED: Reviewer matched by policy
    ASSIGNED --> NOTIFIED: Email / Slack notification sent

    NOTIFIED --> DECIDED: Reviewer submits decision
    NOTIFIED --> ESCALATED: SLA breached (no response)

    ESCALATED --> DECIDED: Escalated reviewer decides

    DECIDED --> REMEDIATED: Decision = Revoke or Downgrade
    DECIDED --> CLOSED: Decision = Maintain

    REMEDIATED --> CLOSED: Remediation confirmed

    CLOSED --> [*]

    state DECIDED {
        [*] --> Revoke
        [*] --> Downgrade
        [*] --> Maintain
    }
```
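The transition table implied by the diagram can be captured directly. The following is a minimal sketch; the `TRANSITIONS` map and `can_transition` helper are illustrative, not the workflow-engine implementation:

```python
# Legal transitions from the state diagram above.
TRANSITIONS: dict[str, set[str]] = {
    "CREATED":    {"ASSIGNED"},
    "ASSIGNED":   {"NOTIFIED"},
    "NOTIFIED":   {"DECIDED", "ESCALATED"},
    "ESCALATED":  {"DECIDED"},
    "DECIDED":    {"REMEDIATED", "CLOSED"},  # Revoke/Downgrade vs Maintain
    "REMEDIATED": {"CLOSED"},
    "CLOSED":     set(),                     # terminal
}

def can_transition(current: str, target: str) -> bool:
    """True if the state machine permits moving from `current` to `target`."""
    return target in TRANSITIONS.get(current, set())
```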

Workflow States

| State | Description | SLA | Escalation |
|---|---|---|---|
| CREATED | Review packet generated by review-generator when score exceeds threshold. | | |
| ASSIGNED | Reviewer identified by ownership rules or round-robin assignment. | | |
| NOTIFIED | Notification sent via configured channel (email, Slack, Teams). | | |
| DECIDED | Reviewer has submitted a decision: Revoke, Downgrade, or Maintain. | 5 business days | Escalates to manager after SLA breach. |
| ESCALATED | Original reviewer did not respond within SLA; escalated to next-level approver. | 3 business days | Auto-revoke if second SLA breached. |
| REMEDIATED | Remediation action executed in the source system. | 1 hour | Alert on failure. |
| CLOSED | Terminal state: review complete, all actions confirmed. | | |

Temporal Workflow Details

| Property | Value |
|---|---|
| Task Queue | `verity-reviews` |
| Workflow ID | `review-{review_id}` |
| Execution Timeout | 30 days |
| Run Timeout | 30 days |
| Retry Policy | 3 attempts, 60 s initial interval, 2.0 backoff coefficient |
| Signal: `submit_decision` | Reviewer submits a decision (revoke / downgrade / maintain + justification). |
| Signal: `reassign` | Admin reassigns the review to a different reviewer. |
| Query: `get_status` | Returns current workflow state and metadata. |
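The retry policy works out to back-off intervals of 60 s and then 120 s between the three attempts; a quick check (the helper is illustrative):

```python
def retry_intervals(max_attempts: int, initial_s: float, coefficient: float) -> list[float]:
    """Waits between successive attempts; the first attempt runs immediately."""
    return [initial_s * coefficient**i for i in range(max_attempts - 1)]

retry_intervals(3, 60, 2.0)  # [60.0, 120.0]
```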

Error Handling & Dead-Letter Queues

Every consumer implements a retry-then-DLQ pattern:

```mermaid
flowchart LR
    A[Kafka Topic] --> B{Consumer}
    B -->|Success| C[Process & Commit]
    B -->|Transient Error| D[Retry — exponential backoff<br/>max 3 attempts]
    D -->|Recovered| C
    D -->|Exhausted| E[Dead-Letter Topic<br/>verity.dlq.{original-topic}]
    E --> F[Manual Investigation]
```

| Setting | Value |
|---|---|
| Max retries | 3 |
| Initial backoff | 1 second |
| Max backoff | 30 seconds |
| Backoff multiplier | 2.0 |
| DLQ topic naming | `verity.dlq.{original-topic-name}` |

Dead-letter messages retain the original headers (`x-original-topic`, `x-retry-count`, `x-error-message`) for investigation and replay.
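Put together, the pattern looks roughly like the sketch below. Here `handler` and `produce` stand in for real message processing and a Kafka producer, sleeps are noted in comments but skipped so the sketch stays side-effect free, and it assumes `{original-topic-name}` means the topic name without its `verity.` prefix:

```python
def process_with_dlq(message: dict, handler, produce, topic: str,
                     max_retries: int = 3, initial_backoff: float = 1.0,
                     max_backoff: float = 30.0, multiplier: float = 2.0):
    """Retry-then-DLQ consumer sketch matching the flowchart above."""
    last_error = ""
    for attempt in range(1, max_retries + 1):
        try:
            return handler(message)  # success: caller commits the offset
        except Exception as exc:
            last_error = str(exc)
            # A real consumer would sleep:
            # min(initial_backoff * multiplier ** (attempt - 1), max_backoff)
    # Retries exhausted: forward to the dead-letter topic with diagnostic headers.
    dlq_topic = "verity.dlq." + topic.removeprefix("verity.")
    produce(dlq_topic, message, {
        "x-original-topic": topic,
        "x-retry-count": str(max_retries),
        "x-error-message": last_error,
    })
    return None
```

Under these assumptions, a handler that keeps failing on a `verity.scores.updated` message routes it to `verity.dlq.scores.updated` with the original topic and last error captured in the headers, ready for manual investigation and replay.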