Platform Connectors
Path: services/connectors/ · Type: Worker
Connectors are the data-ingestion edge of Verity. Each connector extracts identity, access, and activity data from a specific platform and publishes raw events to Kafka. All connectors extend BaseConnector and share a common lifecycle.
Connector Architecture
graph LR
subgraph Source Systems
AAD[Azure AD]
SF[Snowflake]
DB[Databricks]
FB[Microsoft Fabric]
end
subgraph Connectors
C1[connector-azure-ad]
C2[connector-snowflake]
C3[connector-databricks]
C4[connector-fabric]
end
AAD --> C1
SF --> C2
DB --> C3
FB --> C4
C1 -->|verity.events.raw.azure_ad| K{{Kafka}}
C2 -->|verity.events.raw.snowflake| K
C3 -->|verity.events.raw.databricks| K
C4 -->|verity.events.raw.fabric| K
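The topic names in the diagram all follow one pattern, verity.events.raw.{platform}. As a minimal sketch, a helper could derive the topic from a platform slug. The function name `raw_topic` and the slug validation rule are illustrative assumptions, not part of Verity.

```python
import re

# Pattern taken from the diagram: verity.events.raw.{platform}
TOPIC_TEMPLATE = "verity.events.raw.{platform}"

# Assumption: platform slugs are lowercase letters, digits, and underscores.
_SLUG = re.compile(r"^[a-z][a-z0-9_]*$")


def raw_topic(platform: str) -> str:
    """Return the raw-event topic for a platform slug such as 'azure_ad'."""
    if not _SLUG.match(platform):
        raise ValueError(f"invalid platform slug: {platform!r}")
    return TOPIC_TEMPLATE.format(platform=platform)


if __name__ == "__main__":
    for slug in ("azure_ad", "snowflake", "databricks", "fabric"):
        print(raw_topic(slug))
```

Rejecting unexpected slugs early keeps a typo from silently creating a new topic.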
BaseConnector Contract
Every connector implements the BaseConnector abstract class:
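from abc import ABC, abstractmethod
from typing import AsyncIterator

# RawEvent is the event type that connectors publish; it is defined elsewhere.
class BaseConnector(ABC):
    @abstractmethod
    async def connect(self) -> None: ...

    @abstractmethod
    async def extract(self) -> AsyncIterator[RawEvent]: ...

    @abstractmethod
    async def disconnect(self) -> None: ...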
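To make the contract concrete, here is a minimal sketch of a subclass and one connect, extract, disconnect cycle. `RawEvent` is a hypothetical stand-in because its real definition is not shown here, and `InMemoryConnector` and `run_once` are illustrative names rather than Verity code.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import AsyncIterator


@dataclass
class RawEvent:  # hypothetical stand-in for the real event type
    platform: str
    payload: dict


class BaseConnector(ABC):  # mirrors the contract shown above
    @abstractmethod
    async def connect(self) -> None: ...

    @abstractmethod
    async def extract(self) -> AsyncIterator[RawEvent]: ...

    @abstractmethod
    async def disconnect(self) -> None: ...


class InMemoryConnector(BaseConnector):
    """Hypothetical connector that replays a fixed list of records."""

    def __init__(self, records: list) -> None:
        self._records = records
        self.connected = False

    async def connect(self) -> None:
        self.connected = True

    async def extract(self) -> AsyncIterator[RawEvent]:
        # An async generator: each record becomes one RawEvent.
        for record in self._records:
            yield RawEvent(platform="in_memory", payload=record)

    async def disconnect(self) -> None:
        self.connected = False


async def run_once(connector: BaseConnector) -> list:
    """Drive one connect -> extract -> disconnect cycle."""
    await connector.connect()
    try:
        return [event async for event in connector.extract()]
    finally:
        # Always release the connection, even if extraction fails.
        await connector.disconnect()


if __name__ == "__main__":
    print(asyncio.run(run_once(InMemoryConnector([{"id": 1}, {"id": 2}]))))
```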
Shared behaviour:
- Automatic retry with exponential back-off on transient failures.
- Publishes to verity.events.raw.{platform} Kafka topics.
- Emits Prometheus metrics: connector_events_total, connector_errors_total, connector_extract_duration_seconds.
- Graceful shutdown on SIGTERM.
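The retry behaviour listed above can be sketched as follows. This is one common way to implement exponential back-off, not necessarily what BaseConnector does: `TransientError`, the default delays, and the use of full jitter are all assumptions made for illustration.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


async def retry_with_backoff(
    operation: Callable[[], Awaitable[T]],
    *,
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    """Retry `operation` on TransientError, doubling the delay cap each time."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return await operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Full jitter: sleep a random time up to the cap so that
            # several connectors do not retry in lockstep.
            await sleep(random.uniform(0, cap))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper testable without real waiting.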
Azure AD
Path: services/connectors/connector-azure-ad/
Extracts identity and access data from Microsoft Entra ID (Azure AD) using the Microsoft Graph API.
Data Extracted
| Entity | Graph Endpoint | Description |
|---|---|---|
| Users | /v1.0/users | All user principals with profile attributes |
| Groups | /v1.0/groups | Security & Microsoft 365 groups with membership |
| App registrations | /v1.0/applications | OAuth app permissions and API scopes |
| Service principals | /v1.0/servicePrincipals | Enterprise app identities |
| Directory roles | /v1.0/directoryRoles | Role assignments (Global Admin, etc.) |
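Microsoft Graph returns large collections in pages and links each page to the next through the `@odata.nextLink` property. The sketch below walks one of the endpoints from the table until no next link remains. The `fetch_json` callable is a hypothetical stand-in for an authenticated HTTP GET, and `iter_graph_collection` is an illustrative name, not the connector's own function.

```python
from typing import Callable, Iterator, Optional

GRAPH_BASE = "https://graph.microsoft.com"

# Endpoints listed in the table above.
ENDPOINTS = {
    "users": "/v1.0/users",
    "groups": "/v1.0/groups",
    "applications": "/v1.0/applications",
    "service_principals": "/v1.0/servicePrincipals",
    "directory_roles": "/v1.0/directoryRoles",
}


def iter_graph_collection(
    fetch_json: Callable[[str], dict], path: str
) -> Iterator[dict]:
    """Yield every item of a Graph collection, following @odata.nextLink."""
    url: Optional[str] = GRAPH_BASE + path
    while url:
        page = fetch_json(url)  # hypothetical: performs the authenticated GET
        yield from page.get("value", [])
        # The next link is absent on the last page, which ends the loop.
        url = page.get("@odata.nextLink")
```

Injecting `fetch_json` keeps authentication and HTTP details out of the paging logic.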
Configuration
| Variable | Required | Description |
|---|---|---|
| AZURE_AD_TENANT_ID | Yes | Azure AD tenant ID |
| AZURE_AD_CLIENT_ID | Yes | App registration client ID |
| AZURE_AD_CLIENT_SECRET | Yes | App registration client secret |
| AZURE_AD_GRAPH_SCOPES | No | Graph API scopes (default: .default) |
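As a rough sketch of how these variables could feed an OAuth 2.0 client-credentials token request against the Microsoft identity platform, the code below builds the request without sending it. `AzureADConfig`, `load_config`, and `token_request` are hypothetical names. Expanding a bare `.default` to `https://graph.microsoft.com/.default` is an assumption about how the connector interprets its default.

```python
from dataclasses import dataclass
from typing import Mapping

GRAPH_RESOURCE = "https://graph.microsoft.com"
REQUIRED = ("AZURE_AD_TENANT_ID", "AZURE_AD_CLIENT_ID", "AZURE_AD_CLIENT_SECRET")


@dataclass(frozen=True)
class AzureADConfig:  # hypothetical container for the variables above
    tenant_id: str
    client_id: str
    client_secret: str
    scopes: str


def load_config(env: Mapping[str, str]) -> AzureADConfig:
    """Read the connector variables, failing fast on missing required ones."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("missing required variables: " + ", ".join(missing))
    return AzureADConfig(
        tenant_id=env["AZURE_AD_TENANT_ID"],
        client_id=env["AZURE_AD_CLIENT_ID"],
        client_secret=env["AZURE_AD_CLIENT_SECRET"],
        scopes=env.get("AZURE_AD_GRAPH_SCOPES", ".default"),
    )


def token_request(config: AzureADConfig) -> tuple:
    """Return (url, form fields) for a client-credentials token request."""
    scope = config.scopes
    if scope.startswith("."):
        # Assumption: a bare ".default" means the Graph resource's default scope.
        scope = f"{GRAPH_RESOURCE}/{scope}"
    url = f"https://login.microsoftonline.com/{config.tenant_id}/oauth2/v2.0/token"
    form = {
        "grant_type": "client_credentials",
        "client_id": config.client_id,
        "client_secret": config.client_secret,
        "scope": scope,
    }
    return url, form
```

In a real deployment, pass `os.environ` as `env` and POST the form fields URL-encoded.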
Snowflake
Path: services/connectors/connector-snowflake/
Extracts grants and usage data from Snowflake by querying the ACCOUNT_USAGE schema.
Data Extracted
| View / Table | Description |
|---|---|
| ACCOUNT_USAGE.GRANTS_TO_USERS | Role grants assigned to users |
| ACCOUNT_USAGE.GRANTS_TO_ROLES | Role-to-role hierarchy |
| ACCOUNT_USAGE.GRANTS_ON_OBJECTS | Object-level privileges (tables, schemas, warehouses) |
| ACCOUNT_USAGE.QUERY_HISTORY | Query execution history for activity analysis |
| ACCOUNT_USAGE.LOGIN_HISTORY | Login events for recency tracking |
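A sketch of how queries against these views might be built. It assumes the views are read through the shared SNOWFLAKE database, that grant views are read as full snapshots, and that the two history views are read incrementally by timestamp (START_TIME for QUERY_HISTORY, EVENT_TIMESTAMP for LOGIN_HISTORY). `build_query` is a hypothetical helper, and the `%s` placeholder assumes a driver configured for pyformat-style binding.

```python
from datetime import datetime, timezone
from typing import Optional

# Views from the table above, mapped to the timestamp column used for
# incremental reads (None means the view is read as a full snapshot).
VIEWS = {
    "GRANTS_TO_USERS": None,
    "GRANTS_TO_ROLES": None,
    "GRANTS_ON_OBJECTS": None,
    "QUERY_HISTORY": "START_TIME",
    "LOGIN_HISTORY": "EVENT_TIMESTAMP",
}


def build_query(view: str, since: Optional[datetime] = None) -> tuple:
    """Return (sql, params) for one ACCOUNT_USAGE view."""
    if view not in VIEWS:
        # Only known view names are ever interpolated into SQL.
        raise ValueError(f"unknown view: {view}")
    sql = f"SELECT * FROM SNOWFLAKE.ACCOUNT_USAGE.{view}"
    ts_column = VIEWS[view]
    if ts_column and since is not None:
        # The cutoff is passed as a bind parameter, never formatted in.
        return f"{sql} WHERE {ts_column} > %s ORDER BY {ts_column}", (since,)
    return sql, ()


if __name__ == "__main__":
    cutoff = datetime(2024, 1, 1, tzinfo=timezone.utc)
    print(build_query("LOGIN_HISTORY", cutoff))
    print(build_query("GRANTS_TO_USERS"))
```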
Configuration
| Variable | Required | Description |
|---|---|---|
| SNOWFLAKE_ACCOUNT | Yes | Snowflake account identifier |
| SNOWFLAKE_USER | Yes | Service account username |
| SNOWFLAKE_PASSWORD | Yes | Service account password |
| SNOWFLAKE_WAREHOUSE | No | Warehouse to use for queries |
| SNOWFLAKE_ROLE | No | Role to assume (default: ACCOUNTADMIN) |
Databricks
Path: services/connectors/connector-databricks/
Extracts permissions and audit data from Databricks workspaces via the Unity Catalog and workspace APIs.
Data Extracted
| Source | Description |
|---|---|
| Unity Catalog — Grants | Table, schema, and catalog-level grants |
| Unity Catalog — Principals | Users, groups, and service principals |
| Workspace permissions | Notebook, cluster, and job-level permissions |
| Audit logs | Access and admin events from the audit log API |
Configuration
| Variable | Required | Description |
|---|---|---|
| DATABRICKS_HOST | Yes | Workspace URL (e.g., https://adb-xxx.azuredatabricks.net) |
| DATABRICKS_TOKEN | Yes | Personal access token or service principal token |
| DATABRICKS_ACCOUNT_ID | No | Account-level ID (for account-level APIs) |
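Databricks REST APIs accept the token as a bearer credential in the Authorization header. The sketch below builds, but does not send, a request from the two required variables. `build_request` is a hypothetical helper. The Unity Catalog path in the usage example is only an illustration, since the exact endpoints the connector calls are not stated here.

```python
from typing import Mapping
from urllib.request import Request


def build_request(path: str, env: Mapping[str, str]) -> Request:
    """Build an authenticated GET request for a workspace API path."""
    host = env["DATABRICKS_HOST"].rstrip("/")  # tolerate a trailing slash
    token = env["DATABRICKS_TOKEN"]
    if not path.startswith("/"):
        raise ValueError("path must start with '/'")
    return Request(
        host + path,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
    )


if __name__ == "__main__":
    example_env = {
        "DATABRICKS_HOST": "https://adb-xxx.azuredatabricks.net/",
        "DATABRICKS_TOKEN": "example-token",  # placeholder, not a real token
    }
    request = build_request("/api/2.1/unity-catalog/catalogs", example_env)
    print(request.full_url)
```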
Fabric
Path: services/connectors/connector-fabric/
Extracts permissions from Microsoft Fabric workspaces, lakehouses, and warehouses via the Fabric REST API.
Data Extracted
| Resource | Description |
|---|---|
| Workspaces | Workspace metadata and role assignments |
| Lakehouses | Lakehouse-level permissions and sharing |
| Warehouses | SQL analytics endpoint permissions |
| Datasets / Semantic models | Dataset-level access control |
Configuration
| Variable | Required | Description |
|---|---|---|
| FABRIC_TENANT_ID | Yes | Azure AD tenant ID |
| FABRIC_CLIENT_ID | Yes | App registration client ID |
| FABRIC_CLIENT_SECRET | Yes | App registration client secret |
| FABRIC_WORKSPACE_IDS | No | Comma-separated workspace IDs (default: all) |
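FABRIC_WORKSPACE_IDS is the only list-valued setting here, so a small sketch of parsing it may help. Returning `None` to mean "all workspaces", along with trimming whitespace and dropping duplicates, are assumptions about reasonable behaviour. `parse_workspace_ids` is a hypothetical helper.

```python
from typing import Optional


def parse_workspace_ids(raw: Optional[str]) -> Optional[list]:
    """Parse a comma-separated ID list; None means 'extract all workspaces'."""
    if raw is None or not raw.strip():
        return None
    ordered = {}
    for part in raw.split(","):
        workspace_id = part.strip()
        if workspace_id:
            # A dict keeps first-seen order while dropping duplicates.
            ordered.setdefault(workspace_id, None)
    return list(ordered) or None


if __name__ == "__main__":
    print(parse_workspace_ids("ws-1, ws-2,ws-1"))
    print(parse_workspace_ids(""))
```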
Adding a New Connector
- Create a new directory under services/connectors/connector-{platform}/.
- Implement the BaseConnector interface.
- Define a Kafka topic: verity.events.raw.{platform}.
- Register the connector in the service catalog and Docker Compose.
- Add connector-specific configuration with the platform name as prefix.
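The prefix convention from the last step can be sketched as a generic loader that collects every environment variable starting with the upper-cased platform name. `load_connector_config` and its `required` parameter are hypothetical, and the platform `acme` in the usage example is made up.

```python
from typing import Iterable, Mapping


def load_connector_config(
    platform: str,
    env: Mapping[str, str],
    required: Iterable[str] = (),
) -> dict:
    """Collect PLATFORM_* variables into a dict keyed by lower-cased suffix."""
    prefix = platform.upper() + "_"
    config = {
        key[len(prefix):].lower(): value
        for key, value in env.items()
        if key.startswith(prefix)
    }
    # Report missing settings by their full variable names.
    missing = [prefix + name.upper() for name in required if not config.get(name)]
    if missing:
        raise ValueError("missing required variables: " + ", ".join(missing))
    return config


if __name__ == "__main__":
    example_env = {"ACME_HOST": "https://example.invalid", "ACME_TOKEN": "t", "PATH": "/bin"}
    print(load_connector_config("acme", example_env, required=("host", "token")))
```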
sequenceDiagram
participant Scheduler
participant Connector
participant Source as Source System
participant Kafka
Scheduler->>Connector: Trigger extraction
Connector->>Source: Authenticate (OAuth / token)
Source-->>Connector: Auth token
loop Paginated extraction
Connector->>Source: Fetch page
Source-->>Connector: Data page
Connector->>Kafka: Publish RawEvent batch
end
Connector->>Scheduler: Extraction complete
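The paginated loop in the sequence diagram can be sketched as below: fetch a page, publish it as one batch, and repeat until the source reports no further page. The page shape (`items` and `next`) and the `fetch_page` and `publish_batch` callables are hypothetical stand-ins for the source-system client and the Kafka producer.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable, Optional

# Hypothetical page shape: {"items": [...], "next": <token or None>}
FetchPage = Callable[[Optional[str]], Awaitable[dict]]
PublishBatch = Callable[[list], Awaitable[None]]


async def paginate(fetch_page: FetchPage) -> AsyncIterator[list]:
    """Yield the items of each page until no continuation token remains."""
    token: Optional[str] = None
    while True:
        page = await fetch_page(token)
        yield page["items"]
        token = page.get("next")
        if token is None:
            break


async def run_extraction(fetch_page: FetchPage, publish_batch: PublishBatch) -> int:
    """Publish one batch per non-empty page; return the number of events."""
    published = 0
    async for items in paginate(fetch_page):
        if items:
            await publish_batch(items)
            published += len(items)
    return published
```

Publishing page by page keeps memory use bounded by the page size rather than the full extraction.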