Decay Engine¶
Path: `services/analytics/decay-engine/` · Type: Worker
The Decay Engine is the core analytical service of the Verity platform. It computes an Access Decay Score for every principal–asset grant by combining six factors that capture recency, trend, organisational context, data sensitivity, peer behaviour, and review freshness. Four of the factors are weighted and summed; the other two (sensitivity and review recency) are applied as multipliers.
Architecture¶
```mermaid
graph LR
    PG[(PostgreSQL)] --> DE[Decay Engine]
    CH[(ClickHouse)] --> DE
    DE -->|verity.scores.updated| K{{Kafka}}
    DE --> PG
```
The Decay Engine runs as a scheduled batch job. Each cycle:
- Reads all active grants from PostgreSQL.
- Queries activity and audit data from PostgreSQL and ClickHouse.
- Computes scores using Polars lazy evaluation.
- Writes scores back to PostgreSQL.
- Publishes score-change events to Kafka.
Scoring Model¶
Factor Definitions¶
The Access Decay Score is composed of six factors (F1–F6), each measuring a distinct dimension of access relevance.
F1 — Recency (weight: 30%)¶
Exponential decay based on inactivity:
$$ F_1 = e^{-\frac{\text{days_inactive}}{\text{half_life_days}}} $$
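| Parameter | Default | Description |
|---|---|---|
| `half_life_days` | 90 | Decay time constant in days. With the formula above, F1 falls to 1/e (≈ 0.368) after this many days and reaches 0.5 after `half_life_days × ln 2` (≈ 62 days at the default), so despite its name the parameter is not a strict half-life. |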
Examples:
| Days Inactive | F1 Value |
|---|---|
| 0 | 1.000 |
| 30 | 0.717 |
| 90 | 0.368 |
| 180 | 0.135 |
| 365 | 0.017 |
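The example values can be reproduced with a few lines of Python. This is a minimal sketch of the F1 formula as written above, not the engine's actual code; the function name `f1_recency` and the guard against negative inactivity are assumptions made for illustration.

```python
import math


def f1_recency(days_inactive: float, half_life_days: float = 90.0) -> float:
    """Return F1 = exp(-days_inactive / half_life_days)."""
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    # Assumption: negative inactivity (e.g. clock skew) is treated as zero.
    return math.exp(-max(days_inactive, 0.0) / half_life_days)


# Print F1 for the inactivity periods listed in the examples table.
for days in (0, 30, 90, 180, 365):
    print(days, round(f1_recency(days), 3))
```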
F2 — Trend (weight: 20%)¶
Activity trend comparing recent vs. prior period:
$$ F_2 = \frac{\text{events_last_90d}}{\text{events_prior_90d}} $$
- Capped at `[0.0, 2.0]` to prevent outlier inflation.
- If `events_prior_90d = 0`, F2 defaults to `1.0` (neutral).
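The ratio, its cap, and the zero-denominator rule fit in one small function. The sketch below follows the rules stated above; the function name `f2_trend` is hypothetical.

```python
def f2_trend(events_last_90d: int, events_prior_90d: int) -> float:
    """Return the F2 trend ratio, capped to [0.0, 2.0]."""
    if events_prior_90d == 0:
        # No prior activity to compare against: neutral value.
        return 1.0
    ratio = events_last_90d / events_prior_90d
    return min(max(ratio, 0.0), 2.0)
```

For example, 5 events in the last 90 days against 10 in the prior period gives 0.5, while 30 against 10 is capped at 2.0 rather than 3.0.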
F3 — Organisational Context (weight: 20%)¶
Captures structural changes that indicate access may no longer be needed:
| Condition | Multiplier |
|---|---|
| Team / department changed | 0.60× |
| Project ended | 0.50× |
| No change | 1.00× |
When multiple conditions apply, the lowest multiplier is used.
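The "lowest multiplier wins" rule can be sketched as a lookup followed by `min`. The condition keys `team_changed` and `project_ended` are hypothetical labels for the rows of the table, not identifiers taken from the engine.

```python
# Multipliers from the table above; the keys are illustrative names only.
F3_MULTIPLIERS = {
    "team_changed": 0.60,
    "project_ended": 0.50,
}


def f3_org_context(conditions) -> float:
    """Return the lowest multiplier among the active conditions, or 1.00 if none apply."""
    return min((F3_MULTIPLIERS[c] for c in conditions), default=1.00)
```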
F4 — Sensitivity Multiplier¶
Scales the score based on asset sensitivity — more sensitive data decays faster:
| Sensitivity Level | Multiplier |
|---|---|
| `PII` | 0.70 |
| `FINANCIAL` | 0.75 |
| `CONFIDENTIAL` | 0.85 |
| `INTERNAL` | 0.95 |
| `PUBLIC` | 1.00 |
F5 — Peer Deviation (weight: 10%)¶
Compares the principal's activity to the 80th percentile of peers with the same role and grant:
$$ F_5 = \frac{\text{principal_activity}}{\text{peer_p80_activity}} $$
- Capped at `[0.0, 2.0]`.
- Peers are defined as principals with the same role accessing the same asset.
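A rough sketch of the F5 calculation follows. The text does not say how the 80th percentile is computed or what happens when the peer group is empty or entirely inactive, so the nearest-rank method and the neutral `1.0` fallback below are assumptions, as are the function names.

```python
def percentile_nearest_rank(values, pct: int) -> float:
    """Nearest-rank percentile (assumed method): smallest value with at least pct% at or below it."""
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = max(1, -(-pct * len(ordered) // 100))
    return ordered[rank - 1]


def f5_peer_deviation(principal_activity: float, peer_activity) -> float:
    """Return principal activity relative to the peer p80, capped to [0.0, 2.0]."""
    if not peer_activity:
        return 1.0  # Assumption: no peers means a neutral value.
    p80 = percentile_nearest_rank(peer_activity, 80)
    if p80 == 0:
        return 1.0  # Assumption: neutral when peers are inactive, mirroring F2.
    return min(max(principal_activity / p80, 0.0), 2.0)
```

With peer activity counts of 2, 4, 6, 8 and 10, the nearest-rank p80 is 8, so a principal with 4 events gets F5 = 0.5.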
F6 — Review Recency¶
Boost or penalty based on the last human review:
| Last Review | Multiplier |
|---|---|
| ≤ 30 days ago | 1.10 |
| 31–90 days ago | 1.05 |
| > 90 days ago | 0.95 |
| Never reviewed | 0.90 |
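The table maps directly onto a short chain of comparisons. In this sketch the function name is hypothetical, and `None` is used as an assumed representation of "never reviewed".

```python
from typing import Optional


def f6_review_recency(days_since_review: Optional[int]) -> float:
    """Return the F6 multiplier for the time since the last human review."""
    if days_since_review is None:
        return 0.90  # Never reviewed.
    if days_since_review <= 30:
        return 1.10
    if days_since_review <= 90:
        return 1.05
    return 0.95
```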
Score Formula¶
Step 1 — Compute base score from the weighted factors:
$$ \text{base} = \frac{F_1 \times 0.30 + F_2 \times 0.20 + F_3 \times 0.20 + F_5 \times 0.10}{0.80} $$
Step 2 — Apply multipliers and scale:
$$ \text{score} = \text{base} \times F_4 \times F_6 \times 100 $$
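The final score is an integer in the range 0–100. Because F2 and F5 can reach 2.0 and F6 can reach 1.10, the raw product can exceed 100 (base alone can reach 1.375), so the result has to be rounded and clamped to stay within this range.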
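The two steps can be checked with a small worked example. This is a sketch under stated assumptions: the function name `decay_score` is hypothetical, and rounding to the nearest integer followed by clamping to 0–100 is an assumed way of producing the integer result, since the formulas themselves do not specify it.

```python
def decay_score(f1: float, f2: float, f3: float, f4: float, f5: float, f6: float) -> int:
    """Combine the six factors into an integer score between 0 and 100."""
    # Step 1: weighted sum of F1, F2, F3, F5, normalised by the total weight (0.80).
    base = (f1 * 0.30 + f2 * 0.20 + f3 * 0.20 + f5 * 0.10) / 0.80
    # Step 2: apply the sensitivity (F4) and review (F6) multipliers and scale.
    raw = base * f4 * f6 * 100
    # Assumption: round to the nearest integer, then clamp to the documented range.
    return min(max(round(raw), 0), 100)


# PII grant, 90 days inactive, activity halved, no org change,
# half the peer p80, last reviewed more than 90 days ago.
print(decay_score(f1=0.368, f2=0.5, f3=1.0, f4=0.70, f5=0.5, f6=0.95))
```

For these inputs, base = 0.4604 / 0.80 = 0.5755 and the raw score is 0.5755 × 0.70 × 0.95 × 100 ≈ 38.27, which rounds to 38.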
```mermaid
flowchart LR
    F1[F1: Recency\n30%] --> BASE
    F2[F2: Trend\n20%] --> BASE
    F3[F3: Org Context\n20%] --> BASE
    F5[F5: Peer Deviation\n10%] --> BASE[Base Score\nweighted sum / 0.80]
    BASE --> MUL[Apply Multipliers]
    F4[F4: Sensitivity] --> MUL
    F6[F6: Review Recency] --> MUL
    MUL --> SCORE["score × 100\n(0–100)"]
```
Risk Levels & SLAs¶
| Score Range | Risk Level | SLA (hours) | SLA (human-readable) |
|---|---|---|---|
| 0–20 | CRITICAL | 48 | 2 days |
| 21–40 | HIGH | 168 | 7 days |
| 41–60 | MEDIUM | 720 | 30 days |
| 61–80 | LOW | 2,160 | 90 days |
| 81–100 | No action | — | Access considered healthy |
Grants scoring ≤ 80 trigger the review workflow. Grants scoring > 80 are considered healthy and do not generate review packets.
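The score bands translate into a simple ordered lookup. The sketch below mirrors the table; the names `RiskBand`, `classify`, and `needs_review`, and the `NO_ACTION` label for the healthy band, are hypothetical.

```python
from typing import NamedTuple, Optional


class RiskBand(NamedTuple):
    upper: int                 # Inclusive upper bound of the score range.
    level: str
    sla_hours: Optional[int]   # None when no review is required.


# Bands from the table above, ordered by upper bound.
RISK_BANDS = [
    RiskBand(20, "CRITICAL", 48),
    RiskBand(40, "HIGH", 168),
    RiskBand(60, "MEDIUM", 720),
    RiskBand(80, "LOW", 2160),
    RiskBand(100, "NO_ACTION", None),
]


def classify(score: int) -> RiskBand:
    """Return the first band whose upper bound is at or above the score."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    return next(band for band in RISK_BANDS if score <= band.upper)


def needs_review(score: int) -> bool:
    """Grants scoring 80 or below trigger the review workflow."""
    return score <= 80
```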
Batch Computation¶
The engine processes all active grants in a single batch using Polars with lazy evaluation for memory efficiency:
```mermaid
sequenceDiagram
    participant Scheduler
    participant Engine as Decay Engine
    participant PG as PostgreSQL
    participant CH as ClickHouse
    participant Kafka
    Scheduler->>Engine: Trigger batch run
    Engine->>PG: Load active grants
    Engine->>PG: Load principal metadata
    Engine->>CH: Query activity aggregates
    Engine->>Engine: Compute scores (Polars lazy)
    Engine->>PG: Bulk upsert scores
    Engine->>Kafka: Publish verity.scores.updated
    Engine->>Scheduler: Batch complete
```
Performance characteristics:
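- Lazy evaluation lets Polars optimise the whole query plan (for example projection and predicate pushdown) before executing it, which avoids materialising unneeded columns and intermediate frames and keeps peak memory low as the grant count grows.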
- Partitioned by principal for parallelism.
- Typical throughput: ~100k grants/minute on a 4-vCPU worker.
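One way to partition by principal is to hash the principal identifier into a fixed number of buckets, so that all grants for the same principal land in the same partition. The sketch below illustrates that idea only; the hash-based assignment, the function name, and the `principal_id` and `asset_id` field names are assumptions, not the engine's documented behaviour.

```python
import zlib
from collections import defaultdict


def partition_by_principal(grants, parallelism: int = 4) -> dict:
    """Group grants into at most `parallelism` partitions keyed by a hash of the principal."""
    partitions = defaultdict(list)
    for grant in grants:
        # crc32 is stable across processes, unlike Python's built-in hash() for strings.
        key = zlib.crc32(grant["principal_id"].encode("utf-8")) % parallelism
        partitions[key].append(grant)
    return dict(partitions)


grants = [
    {"principal_id": "alice", "asset_id": "a1"},
    {"principal_id": "bob", "asset_id": "a1"},
    {"principal_id": "alice", "asset_id": "a2"},
]
parts = partition_by_principal(grants, parallelism=4)
```

Keeping a principal's grants together means per-principal inputs only need to be loaded once per partition.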
Configuration¶
| Variable | Required | Default | Description |
|---|---|---|---|
| `DECAY_DATABASE_URL` | Yes | — | PostgreSQL connection string |
| `DECAY_CLICKHOUSE_URL` | Yes | — | ClickHouse HTTP URL |
| `DECAY_KAFKA_BOOTSTRAP` | Yes | — | Kafka bootstrap servers |
| `DECAY_HALF_LIFE_DAYS` | No | `90` | Exponential decay half-life |
| `DECAY_BATCH_SCHEDULE` | No | `0 2 * * *` | Cron schedule (default: 2 AM daily) |
| `DECAY_PARALLELISM` | No | `4` | Number of parallel partitions |
| `DECAY_LOG_LEVEL` | No | `INFO` | Python log level |
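The variables above could be read into a typed settings object along the following lines. This is an illustrative sketch, not the service's actual loader: the `DecayConfig` class and `load_config` function are hypothetical, and the connection strings in the usage example are placeholders.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class DecayConfig:
    database_url: str
    clickhouse_url: str
    kafka_bootstrap: str
    half_life_days: float = 90.0
    batch_schedule: str = "0 2 * * *"
    parallelism: int = 4
    log_level: str = "INFO"


REQUIRED = ("DECAY_DATABASE_URL", "DECAY_CLICKHOUSE_URL", "DECAY_KAFKA_BOOTSTRAP")


def load_config(env: Mapping[str, str] = os.environ) -> DecayConfig:
    """Build a DecayConfig from environment variables, applying the documented defaults."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise KeyError("missing required variables: " + ", ".join(missing))
    return DecayConfig(
        database_url=env["DECAY_DATABASE_URL"],
        clickhouse_url=env["DECAY_CLICKHOUSE_URL"],
        kafka_bootstrap=env["DECAY_KAFKA_BOOTSTRAP"],
        half_life_days=float(env.get("DECAY_HALF_LIFE_DAYS", "90")),
        batch_schedule=env.get("DECAY_BATCH_SCHEDULE", "0 2 * * *"),
        parallelism=int(env.get("DECAY_PARALLELISM", "4")),
        log_level=env.get("DECAY_LOG_LEVEL", "INFO"),
    )


# Placeholder values for illustration only.
cfg = load_config({
    "DECAY_DATABASE_URL": "postgresql://localhost/verity",
    "DECAY_CLICKHOUSE_URL": "http://localhost:8123",
    "DECAY_KAFKA_BOOTSTRAP": "localhost:9092",
})
```

Passing the environment as a mapping keeps the loader easy to test without modifying the process environment.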
Observability¶
| Metric | Type | Description |
|---|---|---|
| `decay_batch_duration_seconds` | Histogram | Total batch computation time |
| `decay_grants_scored_total` | Counter | Number of grants scored per batch |
| `decay_scores_by_risk_level` | Gauge | Current grant count per risk level |
| `decay_score_distribution` | Histogram | Score value distribution |
| `decay_kafka_publish_errors_total` | Counter | Failed Kafka publishes |