Operations Guide

This section covers the day-to-day operation of the Verity platform in staging and production environments.

Overview

graph LR
    subgraph Observe
        PROM["Prometheus<br/>Metrics"]
        GRAFANA["Grafana<br/>Dashboards"]
        LOGS["Log<br/>Aggregation"]
        ALERTS["Alert<br/>Manager"]
    end

    subgraph Respond
        RUNBOOKS["Runbooks"]
        TROUBLESHOOT["Troubleshooting<br/>Guide"]
    end

    subgraph Act
        RESTART["Service<br/>Restart"]
        SCALE["Scale<br/>Workers"]
        MAINTAIN["Database<br/>Maintenance"]
    end

    PROM --> GRAFANA
    PROM --> ALERTS
    ALERTS --> RUNBOOKS
    RUNBOOKS --> RESTART & SCALE & MAINTAIN
    LOGS --> TROUBLESHOOT
    TROUBLESHOOT --> RESTART & SCALE & MAINTAIN

    style PROM fill:#7c3aed,color:#fff
    style GRAFANA fill:#7c3aed,color:#fff
    style ALERTS fill:#ef4444,color:#fff
    style RUNBOOKS fill:#f59e0b,color:#000

Key Operational Areas

| Area | Guide | Description |
|------|-------|-------------|
| Monitoring | Monitoring & Alerting | Prometheus metrics, alerting rules, Grafana dashboards |
| Runbooks | Operational Runbooks | Step-by-step procedures for common operational tasks |
| Troubleshooting | Troubleshooting | Common errors, debugging techniques, and solutions |

Health Check Endpoints

All Verity services expose health check endpoints:

| Endpoint | Purpose |
|----------|---------|
| /health | Basic liveness check |
| /health/ready | Readiness check (includes dependency health) |
| /v1/metrics | Prometheus metrics endpoint |
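The endpoints above can be probed from a shell. The following is a minimal sketch, not part of the Verity tooling: the function names `health_urls`, `probe`, and `check_service` are hypothetical, the base URL is whatever address you can reach the service on (for example through a port-forward), and it assumes a healthy endpoint answers with HTTP 200.

```shell
# Sketch: probe the documented health endpoints of one service.
# Assumption: a healthy endpoint responds with HTTP 200.

# Print the full URL of each documented endpoint, one per line.
health_urls() (
  base="${1%/}"   # drop one trailing slash, if present
  for path in /health /health/ready /v1/metrics; do
    printf '%s%s\n' "$base" "$path"
  done
)

# Request one URL and print "<status> <url>".
# curl reports 000 when no HTTP response was received.
probe() (
  code=$(curl -s -o /dev/null -m 5 -w '%{http_code}' "$1") || true
  printf '%s %s\n' "${code:-000}" "$1"
)

# Probe every endpoint; exit status is non-zero if any status is not 200.
check_service() (
  rc=0
  for url in $(health_urls "$1"); do
    line=$(probe "$url")
    printf '%s\n' "$line"
    [ "${line%% *}" = "200" ] || rc=1
  done
  exit "$rc"
)
```

For example, `check_service http://localhost:8080` prints one status line per endpoint, where `localhost:8080` stands in for an address of your choosing.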

Quick Commands

# Check pod status
kubectl get pods -n verity -o wide

# View recent logs for a service
kubectl logs -n verity -l app.kubernetes.io/component=api-gateway --tail=100

# Check which Prometheus alerts are firing
kubectl port-forward -n monitoring svc/prometheus 9090:9090
# While the port-forward is running, open http://localhost:9090/alerts

# Check Kafka consumer lag
# Note: $KAFKA_BOOTSTRAP_SERVERS is expanded by your local shell, not inside the pod
kubectl exec -n verity -it deploy/verity-ingestion -- \
  kafka-consumer-groups.sh --bootstrap-server "$KAFKA_BOOTSTRAP_SERVERS" \
  --group decay-engine --describe

# Scale a service
kubectl scale -n verity deployment/verity-api-gateway --replicas=4
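
The consumer lag command prints one row per partition, so a single total can be easier to compare against an alert threshold. The sketch below is one possible way to do that; `sum_lag` is a hypothetical helper name, and it assumes the output contains a header row with a column named `LAG`, as `kafka-consumer-groups.sh --describe` normally prints.

```shell
# Sketch: sum the LAG column of `kafka-consumer-groups.sh --describe` output
# read from stdin. The column is located by its header name rather than by
# position, since the column layout can differ between Kafka versions.
sum_lag() {
  awk '
    # Until the header row is seen, look for a field named LAG.
    col == 0 { for (i = 1; i <= NF; i++) if ($i == "LAG") col = i; next }
    # Data rows: add numeric lag values; "-" (no committed offset) is skipped.
    $col ~ /^[0-9]+$/ { total += $col }
    END { print total + 0 }
  '
}

# Usage (-it is omitted so the piped output is not sent through a TTY):
#   kubectl exec -n verity deploy/verity-ingestion -- \
#     kafka-consumer-groups.sh --bootstrap-server "$KAFKA_BOOTSTRAP_SERVERS" \
#     --group decay-engine --describe | sum_lag
```

Partitions without a committed offset show `-` in the `LAG` column and are left out of the total, so a result of 0 can mean either no lag or no committed offsets.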