Databricks
Schema-driven source documentation.
DATABRICKS · 43 fields · 2 examples
Commonly Asked Questions
Assistant knowledge mapped to this source type from assistant_knowledge.json.
Required
Fields required for a valid configuration payload under `config.required`.
| Path | Type | Required | Description | Default | Constraints |
|---|---|---|---|---|---|
| required | object | Yes | — | — | — |
Masked
Sensitive fields under `config.masked` (secrets/credentials).
| Path | Type | Required | Description | Default | Constraints |
|---|---|---|---|---|---|
| masked | object | Yes | — | — | — |
Optional
Optional configuration fields under `config.optional`.
| Path | Type | Required | Description | Default | Constraints |
|---|---|---|---|---|---|
| optional | object | No | — | — | no extra properties |
| optional.connection | object | No | Databricks API and SQL statement execution tuning options. | — | no extra properties |
| optional.connection.max_statement_polls | integer | No | Maximum polling attempts when waiting for SQL statement completion | 30 | min 1, max 120 |
| optional.connection.statement_timeout_seconds | integer | No | Maximum wait timeout for SQL statement execution | 60 | min 5, max 600 |
| optional.connection.timeout_seconds | integer | No | HTTP timeout for Databricks API calls | 30 | min 5, max 300 |
| optional.extraction | object | No | Databricks Unity Catalog extraction feature flags. | — | no extra properties |
| optional.extraction.include_column_lineage | boolean | No | Attempt to fetch column-level lineage metadata | false | — |
| optional.extraction.include_notebooks | boolean | No | Extract workspace notebook metadata as additional assets | false | — |
| optional.extraction.include_pipelines | boolean | No | Extract Delta Live Tables pipeline metadata as additional assets | false | — |
| optional.extraction.include_table_lineage | boolean | No | Include table-level lineage links between Unity Catalog tables | true | — |
| optional.scope | object | No | Databricks Unity Catalog scope filters. | — | no extra properties |
| optional.scope.exclude_catalogs | array | No | Catalog denylist (exact catalog names) | [] | — |
| optional.scope.exclude_catalogs[] | string | No | — | — | — |
| optional.scope.exclude_schemas | array | No | Schema denylist. Accepted forms: schema or catalog.schema | ["information_schema"] | — |
| optional.scope.exclude_schemas[] | string | No | — | — | — |
| optional.scope.include_catalogs | array | No | Optional catalog allowlist (exact catalog names) | — | — |
| optional.scope.include_catalogs[] | string | No | — | — | — |
| optional.scope.include_hive_metastore | boolean | No | Include hive_metastore catalog in extraction | false | — |
| optional.scope.include_schemas | array | No | Optional schema allowlist. Accepted forms: schema or catalog.schema | — | — |
| optional.scope.include_schemas[] | string | No | — | — | — |
| optional.scope.include_tables | array | No | Optional table allowlist. Accepted forms: table, schema.table, or catalog.schema.table | — | — |
| optional.scope.include_tables[] | string | No | — | — | — |
| optional.scope.table_limit_per_schema | integer | No | Optional cap on number of Unity Catalog tables extracted per schema | — | min 1 |
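The scope lists above accept both a bare name ("schema") and a qualified form ("catalog.schema"). As an illustrative sketch only (not the connector's actual implementation), the matching rule implied by those accepted forms could look like this, with the default `exclude_schemas` of `["information_schema"]` applied when no explicit denylist is given:

```python
def schema_matches(entry: str, catalog: str, schema: str) -> bool:
    """Match one filter entry against a catalog/schema pair.

    A bare "schema" entry matches that schema in any catalog;
    a "catalog.schema" entry must match both parts exactly.
    """
    if "." in entry:
        cat, sch = entry.split(".", 1)
        return cat == catalog and sch == schema
    return entry == schema


def schema_in_scope(scope: dict, catalog: str, schema: str) -> bool:
    """Apply exclude_schemas (deny wins), then include_schemas if present."""
    exclude = scope.get("exclude_schemas", ["information_schema"])
    if any(schema_matches(e, catalog, schema) for e in exclude):
        return False
    include = scope.get("include_schemas")
    if include is not None:
        return any(schema_matches(e, catalog, schema) for e in include)
    return True


scope = {"include_schemas": ["main.sales", "analytics"]}
print(schema_in_scope(scope, "main", "sales"))               # True (qualified match)
print(schema_in_scope(scope, "finance", "analytics"))        # True (bare form, any catalog)
print(schema_in_scope(scope, "main", "information_schema"))  # False (default denylist)
```

The same two-part/three-part splitting would extend naturally to `include_tables`, which additionally accepts `catalog.schema.table`.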
Examples
Reference payloads generated from shared source examples JSON.
Databricks Unity Catalog with PAT token
Extract Unity Catalog tables and lineage with PAT authentication and optional notebook/pipeline metadata
Schedule
{
"enabled": true,
"preset": "nightly",
"cron": "26 0 * * *",
"timezone": "UTC"
}
Config Payload
{
"type": "DATABRICKS",
"required": {
"auth_mode": "PAT_TOKEN",
"workspace_url": "https://adb-3018287583848948.8.azuredatabricks.net",
"warehouse_id": "85a0db1067b31560"
},
"masked": {
"token": "dapi533087dfbc1a9b17eaa95bbe01440726-2"
},
"optional": {
"scope": {
"include_catalogs": [
"main"
]
},
"extraction": {
"include_table_lineage": true,
"include_column_lineage": false,
"include_notebooks": true,
"include_pipelines": true
}
},
"sampling": {
"strategy": "RANDOM",
"limit": 20,
"max_columns": 20,
"max_cell_chars": 512
}
}
Databricks Unity Catalog with service principal
Use service principal auth for scheduled Databricks ingestion with recency-focused sampling
Schedule
{
"enabled": true,
"preset": "weekday_business",
"cron": "8 15 * * 1-5",
"timezone": "UTC"
}
Config Payload
{
"type": "DATABRICKS",
"required": {
"auth_mode": "SERVICE_PRINCIPAL",
"workspace_url": "https://adb-3018287583848948.8.azuredatabricks.net",
"warehouse_id": "85a0db1067b31560",
"client_id": "service-principal-client-id"
},
"masked": {
"client_secret": "service-principal-client-secret"
},
"optional": {
"scope": {
"include_catalogs": [
"main",
"finance"
],
"include_hive_metastore": false
},
"extraction": {
"include_table_lineage": true,
"include_column_lineage": true,
"include_notebooks": false,
"include_pipelines": false
}
},
"sampling": {
"strategy": "LATEST",
"limit": 30,
"order_by_column": "updated_at",
"fallback_to_random": true
}
}
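The two examples differ only in their auth shape: the PAT_TOKEN payload carries `masked.token`, while the SERVICE_PRINCIPAL payload carries `required.client_id` plus `masked.client_secret`. A hypothetical client-side sanity check, inferred from the field tables and these two payloads (actual server-side validation may differ), could pre-flight a payload like so:

```python
# Integer bounds taken from the optional.connection constraints table above.
CONNECTION_BOUNDS = {
    "max_statement_polls": (1, 120),
    "statement_timeout_seconds": (5, 600),
    "timeout_seconds": (5, 300),
}


def check_payload(payload: dict) -> list:
    """Return a list of human-readable problems; empty means the payload looks sane."""
    errors = []
    required = payload.get("required", {})
    masked = payload.get("masked", {})

    for field in ("auth_mode", "workspace_url", "warehouse_id"):
        if field not in required:
            errors.append(f"required.{field} is missing")

    mode = required.get("auth_mode")
    if mode == "PAT_TOKEN" and "token" not in masked:
        errors.append("masked.token is required for PAT_TOKEN")
    elif mode == "SERVICE_PRINCIPAL":
        if "client_id" not in required:
            errors.append("required.client_id is required for SERVICE_PRINCIPAL")
        if "client_secret" not in masked:
            errors.append("masked.client_secret is required for SERVICE_PRINCIPAL")

    conn = payload.get("optional", {}).get("connection", {})
    for key, (lo, hi) in CONNECTION_BOUNDS.items():
        if key in conn and not lo <= conn[key] <= hi:
            errors.append(f"optional.connection.{key} must be in [{lo}, {hi}]")
    return errors


payload = {
    "type": "DATABRICKS",
    "required": {
        "auth_mode": "PAT_TOKEN",
        "workspace_url": "https://example.cloud.databricks.com",  # placeholder value
        "warehouse_id": "abc123",
    },
    "masked": {"token": "dapi-example-token"},
    "optional": {"connection": {"timeout_seconds": 900}},  # out of the 5..300 range
}
print(check_payload(payload))  # flags the out-of-range timeout
```

Note that `auth_mode` values beyond the two shown here are not covered; the sketch only encodes what these examples demonstrate.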