Decay Engine

Path: services/analytics/decay-engine/ · Type: Worker

The Decay Engine is the core analytical service of the Verity platform. It computes an Access Decay Score for every principal–asset grant from six factors: four weighted factors (recency, trend, organisational context, and peer behaviour) and two multipliers (data sensitivity and review freshness).

Architecture

graph LR
    PG[(PostgreSQL)] --> DE[Decay Engine]
    CH[(ClickHouse)] --> DE
    DE -->|verity.scores.updated| K{{Kafka}}
    DE --> PG

The Decay Engine runs as a scheduled batch job. Each cycle:

  1. Reads all active grants from PostgreSQL.
  2. Queries activity and audit data from PostgreSQL and ClickHouse.
  3. Computes scores using Polars lazy evaluation.
  4. Writes scores back to PostgreSQL.
  5. Publishes score-change events to Kafka.

Scoring Model

Factor Definitions

The Access Decay Score is composed of six factors (F1–F6), each measuring a distinct dimension of access relevance.

F1 — Recency (weight: 30%)

Exponential decay based on inactivity:

$$ F_1 = e^{-\frac{\text{days_inactive}}{\text{half_life_days}}} $$

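| Parameter | Default | Description |
|---|---|---|
| half_life_days | 90 | Decay time constant in days. After this many inactive days F1 falls to 1/e ≈ 0.368. Despite the name, this is not a true half-life: with this formula F1 halves after half_life_days × ln 2, about 62 days at the default. |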

Examples:

| Days Inactive | F1 Value |
|---|---|
| 0 | 1.000 |
| 30 | 0.717 |
| 90 | 0.368 |
| 180 | 0.135 |
| 365 | 0.017 |
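As a minimal sketch, the F1 formula can be written as a small Python function and evaluated at the inactivity periods listed above. The function name `recency_factor` is hypothetical and not taken from the service's code.

```python
import math


def recency_factor(days_inactive: float, half_life_days: float = 90.0) -> float:
    """F1 = exp(-days_inactive / half_life_days)."""
    return math.exp(-days_inactive / half_life_days)


# Evaluate F1 for the inactivity periods used in the examples table.
for days in (0, 30, 90, 180, 365):
    print(f"{days:>3} days inactive -> F1 = {recency_factor(days):.3f}")
```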

F2 — Trend (weight: 20%)

Activity trend comparing recent vs. prior period:

$$ F_2 = \frac{\text{events_last_90d}}{\text{events_prior_90d}} $$

  • Capped at [0.0, 2.0] to prevent outlier inflation.
  • If events_prior_90d = 0, F2 defaults to 1.0 (neutral).
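The ratio, cap, and zero-prior default described above could be sketched as follows. The function name `trend_factor` is an illustrative assumption.

```python
def trend_factor(events_last_90d: int, events_prior_90d: int) -> float:
    """F2: recent activity relative to the prior period, capped to [0.0, 2.0]."""
    if events_prior_90d == 0:
        # No prior activity to compare against: fall back to the neutral value.
        return 1.0
    ratio = events_last_90d / events_prior_90d
    return max(0.0, min(2.0, ratio))
```

For example, 5 recent events against 10 prior events gives 0.5, while 30 recent events against 10 prior events is capped at 2.0.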

F3 — Organisational Context (weight: 20%)

Captures structural changes that indicate access may no longer be needed:

| Condition | Multiplier |
|---|---|
| Team / department changed | 0.60× |
| Project ended | 0.50× |
| No change | 1.00× |

When multiple conditions apply, the lowest multiplier is used.
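The "lowest multiplier wins" rule can be illustrated with a short sketch. The condition keys (`team_changed`, `project_ended`) and the function name are hypothetical; only the multiplier values come from the table above.

```python
from typing import Iterable

# Multipliers from the table above; the keys are illustrative names.
ORG_MULTIPLIERS = {
    "team_changed": 0.60,
    "project_ended": 0.50,
}


def org_context_factor(conditions: Iterable[str]) -> float:
    """F3: lowest multiplier among the applicable conditions, 1.00 if none apply."""
    applicable = [ORG_MULTIPLIERS[c] for c in conditions if c in ORG_MULTIPLIERS]
    return min(applicable, default=1.00)
```

A principal who changed team and whose project also ended would therefore get 0.50, not 0.60 × 0.50.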

F4 — Sensitivity Multiplier

Scales the score by asset sensitivity. More sensitive assets get a lower multiplier, so their scores fall faster:

| Sensitivity Level | Multiplier |
|---|---|
| PII | 0.70 |
| FINANCIAL | 0.75 |
| CONFIDENTIAL | 0.85 |
| INTERNAL | 0.95 |
| PUBLIC | 1.00 |
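One way to express this table in code is a plain lookup. In this sketch the function name is hypothetical, and raising an error for an unknown level is an assumption; the source does not say how unclassified assets are handled.

```python
# Multipliers copied from the sensitivity table above.
SENSITIVITY_MULTIPLIERS = {
    "PII": 0.70,
    "FINANCIAL": 0.75,
    "CONFIDENTIAL": 0.85,
    "INTERNAL": 0.95,
    "PUBLIC": 1.00,
}


def sensitivity_multiplier(level: str) -> float:
    """F4: multiplier for an asset's sensitivity level (case-insensitive)."""
    try:
        return SENSITIVITY_MULTIPLIERS[level.upper()]
    except KeyError:
        # ASSUMPTION: unknown levels are rejected rather than given a default.
        raise ValueError(f"unknown sensitivity level: {level!r}") from None
```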

F5 — Peer Deviation (weight: 10%)

Compares the principal's activity to the 80th percentile of peers with the same role and grant:

$$ F_5 = \frac{\text{principal_activity}}{\text{peer_p80_activity}} $$

  • Capped at [0.0, 2.0].
  • Peers are defined as principals with the same role accessing the same asset.
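The sketch below shows one possible reading of F5. Several details are assumptions not stated in the source: the percentile uses the nearest-rank method, and an empty peer group or a zero 80th percentile falls back to the neutral value 1.0 (mirroring the F2 convention). The function names are hypothetical.

```python
import math
from typing import Sequence


def percentile_nearest_rank(values: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile of a non-empty sequence (pct is an integer 1..100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def peer_deviation_factor(
    principal_activity: float, peer_activity: Sequence[float]
) -> float:
    """F5: principal activity relative to the peer p80, capped to [0.0, 2.0]."""
    if not peer_activity:
        return 1.0  # ASSUMPTION: no peers -> neutral
    p80 = percentile_nearest_rank(peer_activity, 80)
    if p80 == 0:
        return 1.0  # ASSUMPTION: inactive peer group -> neutral
    return max(0.0, min(2.0, principal_activity / p80))
```

With ten peers whose activity counts are 0, 2, 4, ..., 18, the nearest-rank p80 is 14, so a principal with 7 events scores 0.5.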

F6 — Review Recency

Boost or penalty based on the last human review:

| Last Review | Multiplier |
|---|---|
| ≤ 30 days ago | 1.10 |
| ≤ 90 days ago | 1.05 |
| > 90 days ago | 0.95 |
| Never reviewed | 0.90 |
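The review bands above translate into a simple threshold check. In this sketch the function name is hypothetical, and representing "never reviewed" as `None` is an assumption about the input.

```python
from typing import Optional


def review_multiplier(days_since_review: Optional[int]) -> float:
    """F6: boost or penalty based on the age of the last human review."""
    if days_since_review is None:
        return 0.90  # never reviewed
    if days_since_review <= 30:
        return 1.10
    if days_since_review <= 90:
        return 1.05
    return 0.95
```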

Score Formula

Step 1 — Compute base score from the weighted factors:

$$ \text{base} = \frac{F_1 \times 0.30 + F_2 \times 0.20 + F_3 \times 0.20 + F_5 \times 0.10}{0.80} $$

Step 2 — Apply multipliers and scale:

$$ \text{score} = \text{base} \times F_4 \times F_6 \times 100 $$

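The final score is an integer in the range 0–100. Because F2 and F5 can each reach 2.0 and F6 can reach 1.10, the raw product can exceed 100 (up to about 151), so the result presumably has to be rounded and clamped to this range; the exact rounding rule is not stated here.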
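The two steps can be combined into one function, shown here as a sketch rather than the engine's actual code. Rounding to the nearest integer and clamping to 0–100 are assumptions made so that the result stays in the documented range; the function name is hypothetical.

```python
def decay_score(
    f1: float, f2: float, f3: float, f4: float, f5: float, f6: float
) -> int:
    """Combine the six factors into an integer Access Decay Score."""
    # Step 1: weighted base score, normalised by the total weight (0.80).
    base = (f1 * 0.30 + f2 * 0.20 + f3 * 0.20 + f5 * 0.10) / 0.80
    # Step 2: apply the sensitivity and review multipliers, scale to 0-100.
    raw = base * f4 * f6 * 100
    # ASSUMPTION: round to the nearest integer and clamp to [0, 100].
    return max(0, min(100, round(raw)))


# 90 days inactive (F1 = 0.368), halved activity (F2 = 0.5), no org change
# (F3 = 1.0), PII asset (F4 = 0.70), half of peer p80 (F5 = 0.5), last
# review more than 90 days ago (F6 = 0.95).
print(decay_score(0.368, 0.5, 1.0, 0.70, 0.5, 0.95))  # 38
```

In the worked example the base is 0.4604 / 0.80 = 0.5755, and 0.5755 × 0.70 × 0.95 × 100 ≈ 38.3, which rounds to 38.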

flowchart LR
    F1[F1: Recency\n30%] --> BASE
    F2[F2: Trend\n20%] --> BASE
    F3[F3: Org Context\n20%] --> BASE
    F5[F5: Peer Deviation\n10%] --> BASE[Base Score\nweighted sum / 0.80]
    BASE --> MUL[Apply Multipliers]
    F4[F4: Sensitivity] --> MUL
    F6[F6: Review Recency] --> MUL
    MUL --> SCORE["score × 100\n(0–100)"]

Risk Levels & SLAs

| Score Range | Risk Level | SLA (hours) | SLA (human-readable) |
|---|---|---|---|
| 0–20 | CRITICAL | 48 | 2 days |
| 21–40 | HIGH | 168 | 7 days |
| 41–60 | MEDIUM | 720 | 30 days |
| 61–80 | LOW | 2,160 | 90 days |
| 81–100 | No action | | Access considered healthy |

Grants scoring ≤ 80 trigger the review workflow. Grants scoring > 80 are considered healthy and do not generate review packets.
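The bands above can be expressed as a mapping from score to risk level and SLA. This is a minimal sketch: the function name and the use of `None` for "no SLA" are illustrative assumptions.

```python
from typing import Optional, Tuple


def risk_level(score: int) -> Tuple[str, Optional[int]]:
    """Return (risk level, SLA in hours) for a score; the SLA is None if healthy."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    if score <= 20:
        return ("CRITICAL", 48)
    if score <= 40:
        return ("HIGH", 168)
    if score <= 60:
        return ("MEDIUM", 720)
    if score <= 80:
        return ("LOW", 2160)
    return ("No action", None)


def needs_review(score: int) -> bool:
    """Grants scoring 80 or below trigger the review workflow."""
    return risk_level(score)[1] is not None
```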


Batch Computation

The engine processes all active grants in a single batch using Polars with lazy evaluation for memory efficiency:

sequenceDiagram
    participant Scheduler
    participant Engine as Decay Engine
    participant PG as PostgreSQL
    participant CH as ClickHouse
    participant Kafka

    Scheduler->>Engine: Trigger batch run
    Engine->>PG: Load active grants
    Engine->>PG: Load principal metadata
    Engine->>CH: Query activity aggregates
    Engine->>Engine: Compute scores (Polars lazy)
    Engine->>PG: Bulk upsert scores
    Engine->>Kafka: Publish verity.scores.updated
    Engine->>Scheduler: Batch complete

Performance characteristics:

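  • Lazy evaluation lets Polars optimise the whole query plan before execution (for example, reading only the columns and rows that are needed), which keeps peak memory lower than eager evaluation. Memory use still depends on the data volume and the operations involved.
  • Work is partitioned by principal so that partitions can be processed in parallel (see DECAY_PARALLELISM).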
  • Typical throughput: ~100k grants/minute on a 4-vCPU worker.

Configuration

| Variable | Required | Default | Description |
|---|---|---|---|
| DECAY_DATABASE_URL | Yes | | PostgreSQL connection string |
| DECAY_CLICKHOUSE_URL | Yes | | ClickHouse HTTP URL |
| DECAY_KAFKA_BOOTSTRAP | Yes | | Kafka bootstrap servers |
| DECAY_HALF_LIFE_DAYS | No | 90 | Exponential decay half-life |
| DECAY_BATCH_SCHEDULE | No | `0 2 * * *` | Cron schedule (default: 2 AM daily) |
| DECAY_PARALLELISM | No | 4 | Number of parallel partitions |
| DECAY_LOG_LEVEL | No | INFO | Python log level |
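A worker might read these variables along the following lines. This is a sketch only: `DecayConfig` and `load_config` are hypothetical names, and failing fast when a required variable is missing is an assumption about the desired behaviour.

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED = ("DECAY_DATABASE_URL", "DECAY_CLICKHOUSE_URL", "DECAY_KAFKA_BOOTSTRAP")


@dataclass(frozen=True)
class DecayConfig:
    database_url: str
    clickhouse_url: str
    kafka_bootstrap: str
    half_life_days: float = 90.0
    batch_schedule: str = "0 2 * * *"
    parallelism: int = 4
    log_level: str = "INFO"


def load_config(env: Mapping[str, str] = os.environ) -> DecayConfig:
    """Build a DecayConfig from environment variables, applying the defaults."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(f"missing required variables: {', '.join(missing)}")
    return DecayConfig(
        database_url=env["DECAY_DATABASE_URL"],
        clickhouse_url=env["DECAY_CLICKHOUSE_URL"],
        kafka_bootstrap=env["DECAY_KAFKA_BOOTSTRAP"],
        half_life_days=float(env.get("DECAY_HALF_LIFE_DAYS", "90")),
        batch_schedule=env.get("DECAY_BATCH_SCHEDULE", "0 2 * * *"),
        parallelism=int(env.get("DECAY_PARALLELISM", "4")),
        log_level=env.get("DECAY_LOG_LEVEL", "INFO"),
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests.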

Observability

| Metric | Type | Description |
|---|---|---|
| decay_batch_duration_seconds | Histogram | Total batch computation time |
| decay_grants_scored_total | Counter | Number of grants scored per batch |
| decay_scores_by_risk_level | Gauge | Current grant count per risk level |
| decay_score_distribution | Histogram | Score value distribution |
| decay_kafka_publish_errors_total | Counter | Failed Kafka publishes |