Platform Connectors

Path: services/connectors/ · Type: Worker

Connectors are the data-ingestion edge of Verity. Each connector extracts identity, access, and activity data from a specific platform and publishes raw events to Kafka. All connectors extend BaseConnector and share a common lifecycle.

Connector Architecture

graph LR
    subgraph Source Systems
        AAD[Azure AD]
        SF[Snowflake]
        DB[Databricks]
        FB[Microsoft Fabric]
    end

    subgraph Connectors
        C1[connector-azure-ad]
        C2[connector-snowflake]
        C3[connector-databricks]
        C4[connector-fabric]
    end

    AAD --> C1
    SF --> C2
    DB --> C3
    FB --> C4

    C1 -->|verity.events.raw.azure_ad| K{{Kafka}}
    C2 -->|verity.events.raw.snowflake| K
    C3 -->|verity.events.raw.databricks| K
    C4 -->|verity.events.raw.fabric| K

BaseConnector Contract

Every connector implements the BaseConnector abstract class:

class BaseConnector(ABC):
    @abstractmethod
    async def connect(self) -> None: ...

    @abstractmethod
    async def extract(self) -> AsyncIterator[RawEvent]: ...

    @abstractmethod
    async def disconnect(self) -> None: ...
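
As a rough illustration of the contract, the sketch below implements a toy connector and drives it through one connect, extract, disconnect cycle. `RawEvent` is a hypothetical stand-in dataclass (its real schema is not shown on this page), and `InMemoryConnector` and `run_once` are illustrative names, not part of Verity. `extract` is written as an async generator, which is one common way to satisfy an `AsyncIterator` return type.

```python
import asyncio
from abc import ABC, abstractmethod
from collections.abc import AsyncIterator
from dataclasses import dataclass


@dataclass
class RawEvent:
    # Hypothetical stand-in; the real RawEvent schema is not shown here.
    platform: str
    payload: dict


class BaseConnector(ABC):
    @abstractmethod
    async def connect(self) -> None: ...

    @abstractmethod
    async def extract(self) -> AsyncIterator[RawEvent]: ...

    @abstractmethod
    async def disconnect(self) -> None: ...


class InMemoryConnector(BaseConnector):
    """Illustrative connector that reads from a list instead of a real platform."""

    def __init__(self, records: list[dict]) -> None:
        self._records = records
        self.connected = False

    async def connect(self) -> None:
        self.connected = True

    async def extract(self) -> AsyncIterator[RawEvent]:
        # Async generator: each record becomes one RawEvent.
        for record in self._records:
            yield RawEvent(platform="example", payload=record)

    async def disconnect(self) -> None:
        self.connected = False


async def run_once(connector: BaseConnector) -> list[RawEvent]:
    """Drive the connect -> extract -> disconnect lifecycle once."""
    await connector.connect()
    try:
        return [event async for event in connector.extract()]
    finally:
        # Always release the connection, even if extraction fails.
        await connector.disconnect()
```

Calling `asyncio.run(run_once(InMemoryConnector([{"id": 1}])))` returns the collected events. A real connector would publish them to Kafka rather than hold them in memory.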

Shared behaviour:

  • Automatic retry with exponential back-off on transient failures.
  • Publishes to verity.events.raw.{platform} Kafka topics.
  • Emits Prometheus metrics: connector_events_total, connector_errors_total, connector_extract_duration_seconds.
  • Graceful shutdown on SIGTERM.
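
The retry behaviour could look something like the sketch below. This page does not document the policy BaseConnector actually uses, so `TransientError`, `retry_with_backoff`, the attempt limit, the delays, and the use of jitter are all assumptions chosen for illustration.

```python
import asyncio
import random
from collections.abc import Awaitable, Callable
from typing import TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. timeouts, throttling)."""


async def retry_with_backoff(
    operation: Callable[[], Awaitable[T]],
    *,
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    """Retry `operation` on TransientError, doubling the delay cap each attempt."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Full jitter: sleep a random amount up to the exponential cap.
            await sleep(random.uniform(0, cap))
    raise AssertionError("unreachable when max_attempts >= 1")
```

The `sleep` parameter is injectable so the helper can be tested without waiting. Non-transient exceptions propagate immediately and are not retried.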

Azure AD

Path: services/connectors/connector-azure-ad/

Extracts identity and access data from Microsoft Entra ID (Azure AD) using the Microsoft Graph API.

Data Extracted

| Entity | Graph Endpoint | Description |
| --- | --- | --- |
| Users | /v1.0/users | All user principals with profile attributes |
| Groups | /v1.0/groups | Security & Microsoft 365 groups with membership |
| App registrations | /v1.0/applications | OAuth app permissions and API scopes |
| Service principals | /v1.0/servicePrincipals | Enterprise app identities |
| Directory roles | /v1.0/directoryRoles | Role assignments (Global Admin, etc.) |

Configuration

| Variable | Required | Description |
| --- | --- | --- |
| AZURE_AD_TENANT_ID | Yes | Azure AD tenant ID |
| AZURE_AD_CLIENT_ID | Yes | App registration client ID |
| AZURE_AD_CLIENT_SECRET | Yes | App registration client secret |
| AZURE_AD_GRAPH_SCOPES | No | Graph API scopes (default: .default) |
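
One way these variables might be loaded and validated at startup is sketched below. The `AzureADConfig` class and its `from_env` method are hypothetical; only the variable names and the `.default` fallback come from the table above.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass

REQUIRED_VARS = ("AZURE_AD_TENANT_ID", "AZURE_AD_CLIENT_ID", "AZURE_AD_CLIENT_SECRET")


@dataclass(frozen=True)
class AzureADConfig:
    tenant_id: str
    client_id: str
    client_secret: str
    graph_scopes: str = ".default"

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "AzureADConfig":
        # Fail fast with one message that lists every missing variable.
        missing = [name for name in REQUIRED_VARS if not env.get(name)]
        if missing:
            raise ValueError(f"Missing required variables: {', '.join(missing)}")
        return cls(
            tenant_id=env["AZURE_AD_TENANT_ID"],
            client_id=env["AZURE_AD_CLIENT_ID"],
            client_secret=env["AZURE_AD_CLIENT_SECRET"],
            graph_scopes=env.get("AZURE_AD_GRAPH_SCOPES", ".default"),
        )
```

Passing a plain dictionary instead of `os.environ` keeps the loader easy to test.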

Snowflake

Path: services/connectors/connector-snowflake/

Extracts grants and usage data from Snowflake by querying the ACCOUNT_USAGE schema.

Data Extracted

| View / Table | Description |
| --- | --- |
| ACCOUNT_USAGE.GRANTS_TO_USERS | Role grants assigned to users |
| ACCOUNT_USAGE.GRANTS_TO_ROLES | Role-to-role hierarchy |
| ACCOUNT_USAGE.GRANTS_ON_OBJECTS | Object-level privileges (tables, schemas, warehouses) |
| ACCOUNT_USAGE.QUERY_HISTORY | Query execution history for activity analysis |
| ACCOUNT_USAGE.LOGIN_HISTORY | Login events for recency tracking |
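
Queries against these views might look roughly like the sketch below. The column names are the ones Snowflake documents for `GRANTS_TO_USERS` and `LOGIN_HISTORY`. The watermark-based filter, the `logins_since` helper, and the `%s` placeholder (pyformat style) are assumptions, not a description of what the connector actually runs.

```python
from datetime import datetime, timezone

# Only grants that have not been revoked (deleted_on is NULL while a grant is active).
ACTIVE_USER_GRANTS_SQL = """
SELECT grantee_name, role, granted_by, created_on
FROM snowflake.account_usage.grants_to_users
WHERE deleted_on IS NULL
""".strip()

# Login events newer than a caller-supplied timestamp (hypothetical watermark).
LOGINS_SINCE_SQL = """
SELECT user_name, event_timestamp, is_success
FROM snowflake.account_usage.login_history
WHERE event_timestamp > %s
ORDER BY event_timestamp
""".strip()


def logins_since(watermark: datetime) -> tuple[str, tuple[datetime]]:
    """Return (sql, params) for login events newer than `watermark`."""
    if watermark.tzinfo is None:
        # Assumption: naive timestamps are treated as UTC.
        watermark = watermark.replace(tzinfo=timezone.utc)
    return LOGINS_SINCE_SQL, (watermark,)
```

Returning the SQL and its parameters separately leaves value binding to the database driver rather than string formatting.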

Configuration

| Variable | Required | Description |
| --- | --- | --- |
| SNOWFLAKE_ACCOUNT | Yes | Snowflake account identifier |
| SNOWFLAKE_USER | Yes | Service account username |
| SNOWFLAKE_PASSWORD | Yes | Service account password |
| SNOWFLAKE_WAREHOUSE | No | Warehouse to use for queries |
| SNOWFLAKE_ROLE | No | Role to assume (default: ACCOUNTADMIN) |

Databricks

Path: services/connectors/connector-databricks/

Extracts permissions and audit data from Databricks workspaces via the Unity Catalog and workspace APIs.

Data Extracted

| Source | Description |
| --- | --- |
| Unity Catalog — Grants | Table, schema, and catalog-level grants |
| Unity Catalog — Principals | Users, groups, and service principals |
| Workspace permissions | Notebook, cluster, and job-level permissions |
| Audit logs | Access and admin events from the audit log API |

Configuration

| Variable | Required | Description |
| --- | --- | --- |
| DATABRICKS_HOST | Yes | Workspace URL (e.g., https://adb-xxx.azuredatabricks.net) |
| DATABRICKS_TOKEN | Yes | Personal access token or service principal token |
| DATABRICKS_ACCOUNT_ID | No | Account-level ID (for account-level APIs) |
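
To show how `DATABRICKS_HOST` and `DATABRICKS_TOKEN` fit together, the sketch below builds (but does not send) an authenticated request. Databricks REST APIs accept the token as a bearer token, and `/api/2.1/unity-catalog/catalogs` is the Unity Catalog endpoint for listing catalogs. The `build_request` helper is hypothetical, and the connector may well use an SDK or a different HTTP client.

```python
from urllib.request import Request

# Unity Catalog endpoint for listing catalogs.
LIST_CATALOGS = "/api/2.1/unity-catalog/catalogs"


def build_request(host: str, token: str, path: str) -> Request:
    """Build (but do not send) an authenticated GET request."""
    # Normalise slashes so a trailing "/" on the host does not double up.
    url = host.rstrip("/") + "/" + path.lstrip("/")
    return Request(url, headers={"Authorization": f"Bearer {token}"}, method="GET")
```

For example, `build_request("https://adb-xxx.azuredatabricks.net", token, LIST_CATALOGS)` yields a request object that an HTTP client could then execute.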

Fabric

Path: services/connectors/connector-fabric/

Extracts permissions from Microsoft Fabric workspaces, lakehouses, and warehouses via the Fabric REST API.

Data Extracted

| Resource | Description |
| --- | --- |
| Workspaces | Workspace metadata and role assignments |
| Lakehouses | Lakehouse-level permissions and sharing |
| Warehouses | SQL analytics endpoint permissions |
| Datasets / Semantic models | Dataset-level access control |

Configuration

| Variable | Required | Description |
| --- | --- | --- |
| FABRIC_TENANT_ID | Yes | Azure AD tenant ID |
| FABRIC_CLIENT_ID | Yes | App registration client ID |
| FABRIC_CLIENT_SECRET | Yes | App registration client secret |
| FABRIC_WORKSPACE_IDS | No | Comma-separated workspace IDs (default: all) |
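
The comma-separated `FABRIC_WORKSPACE_IDS` value could be parsed as in the sketch below. The helper name `parse_workspace_ids` and the choice of `None` to represent "all workspaces" are assumptions made for illustration.

```python
from typing import Optional


def parse_workspace_ids(raw: Optional[str]) -> Optional[list[str]]:
    """Return the configured workspace IDs, or None to mean "all workspaces"."""
    if raw is None:
        return None
    seen: dict[str, None] = {}
    for part in raw.split(","):
        workspace_id = part.strip()
        if workspace_id:
            # dict preserves insertion order, so duplicates are dropped in place.
            seen.setdefault(workspace_id, None)
    # An empty or whitespace-only value falls back to "all workspaces".
    return list(seen) or None
```

Treating an unset variable and an empty string the same way avoids accidentally extracting nothing when the variable is present but blank.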

Adding a New Connector

  1. Create a new directory under services/connectors/connector-{platform}/.
  2. Implement the BaseConnector interface.
  3. Define a Kafka topic: verity.events.raw.{platform}.
  4. Register the connector in the service catalog and Docker Compose.
  5. Add connector-specific configuration with the platform name as prefix.
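
The naming conventions in these steps can be derived mechanically from the platform name. The sketch below reproduces the pattern used by the existing connectors (for example `connector-azure-ad`, `verity.events.raw.azure_ad`, and the `AZURE_AD_` prefix). `ConnectorNames` and `names_for` are hypothetical helpers, not part of the codebase.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectorNames:
    directory: str
    topic: str
    env_prefix: str


def names_for(platform: str) -> ConnectorNames:
    """Derive conventional names from a platform name such as "azure-ad"."""
    # Split on anything that is not a lowercase letter or digit.
    words = [w for w in re.split(r"[^a-z0-9]+", platform.lower()) if w]
    if not words:
        raise ValueError("platform name must contain letters or digits")
    return ConnectorNames(
        directory=f"services/connectors/connector-{'-'.join(words)}/",
        topic=f"verity.events.raw.{'_'.join(words)}",
        env_prefix="_".join(words).upper() + "_",
    )
```

Directories use hyphens while topics and environment variables use underscores, so deriving all three from one input helps keep a new connector consistent with the existing four.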

sequenceDiagram
    participant Scheduler
    participant Connector
    participant Source as Source System
    participant Kafka

    Scheduler->>Connector: Trigger extraction
    Connector->>Source: Authenticate (OAuth / token)
    Source-->>Connector: Auth token
    loop Paginated extraction
        Connector->>Source: Fetch page
        Source-->>Connector: Data page
        Connector->>Kafka: Publish RawEvent batch
    end
    Connector->>Scheduler: Extraction complete
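
The sequence above can be sketched as a small loop: authenticate once, then fetch and publish one page at a time until the source reports no further pages. `FakeSource`, `InMemoryPublisher`, and `run_extraction` are hypothetical stand-ins. A real connector would call the platform API and a Kafka producer instead.

```python
import asyncio
from typing import Optional


class FakeSource:
    """Hypothetical paginated source backed by a list of pages."""

    def __init__(self, pages: list[list[dict]]) -> None:
        self._pages = pages

    async def authenticate(self) -> str:
        return "fake-token"

    async def fetch_page(self, token: str, cursor: int) -> tuple[list[dict], Optional[int]]:
        if cursor >= len(self._pages):
            return [], None
        next_cursor = cursor + 1 if cursor + 1 < len(self._pages) else None
        return self._pages[cursor], next_cursor


class InMemoryPublisher:
    """Hypothetical stand-in for a Kafka producer; records batches in a list."""

    def __init__(self) -> None:
        self.batches: list[tuple[str, list[dict]]] = []

    async def publish(self, topic: str, batch: list[dict]) -> None:
        self.batches.append((topic, batch))


async def run_extraction(source: FakeSource, publisher: InMemoryPublisher, topic: str) -> int:
    """Authenticate, then publish one batch per page; return the record count."""
    token = await source.authenticate()
    cursor: Optional[int] = 0
    total = 0
    while cursor is not None:
        records, cursor = await source.fetch_page(token, cursor)
        if records:
            # Publish page by page so memory use stays bounded by the page size.
            await publisher.publish(topic, records)
            total += len(records)
    return total
```

Publishing each page as soon as it arrives, rather than collecting everything first, mirrors the loop in the diagram.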