Hive
Schema-driven source documentation.
HIVE · 38 fields · 2 examples
Commonly Asked Questions
Assistant knowledge mapped to this source type from assistant_knowledge.json.
Required
Fields required for a valid configuration payload under `config.required`.
| Path | Type | Required | Description | Default | Constraints |
|---|---|---|---|---|---|
| required | object | Yes | — | — | no extra properties |
| required.host | string | Yes | Hive host endpoint | localhost | — |
| required.port | integer | Yes | Hive TCP port | 10000 | min 1, max 65535 |
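The constraints in the table above can be checked client-side before a payload is submitted. A minimal sketch, assuming only the field names and constraints documented here (the helper itself is hypothetical; the service performs its own validation):

```python
def validate_required(required: dict) -> list[str]:
    """Check a `config.required` block against the documented constraints.

    Hypothetical client-side helper, not part of the schema.
    """
    errors = []
    host = required.get("host")
    if not isinstance(host, str) or not host:
        errors.append("required.host must be a non-empty string")
    port = required.get("port")
    # Constraint from the table: integer, min 1, max 65535.
    if not isinstance(port, int) or not (1 <= port <= 65535):
        errors.append("required.port must be an integer between 1 and 65535")
    return errors
```

For example, `validate_required({"host": "localhost", "port": 10000})` returns an empty list, while an out-of-range port produces an error message.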
Masked
Sensitive fields under `config.masked` (secrets/credentials).
| Path | Type | Required | Description | Default | Constraints |
|---|---|---|---|---|---|
| masked | object | Yes | — | — | no extra properties |
| masked.password | string | Yes | Hive login password | — | — |
| masked.username | string | Yes | Hive login username | — | — |
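Because everything under `config.masked` is a credential, it should never reach logs or debug output. A minimal redaction sketch (the function and placeholder value are assumptions, not part of the schema; real deployments typically rely on the platform's own secret handling):

```python
def redact_masked(config: dict) -> dict:
    """Return a copy of a config payload with all `masked` values replaced.

    Illustrative only; shallow-copies the payload and swaps the `masked`
    block for one whose values are all "***".
    """
    redacted = dict(config)
    if isinstance(redacted.get("masked"), dict):
        redacted["masked"] = {key: "***" for key in redacted["masked"]}
    return redacted
```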
Optional
Optional configuration fields under `config.optional`.
| Path | Type | Required | Description | Default | Constraints |
|---|---|---|---|---|---|
| optional | object | No | — | — | no extra properties |
| optional.connection | object | No | Hive connection transport and authentication options. | — | no extra properties |
| optional.connection.connect_args | object | No | Additional PyHive connection arguments (e.g. auth, kerberos_service_name, http_path). | {} | — |
| optional.connection.scheme | enum | No | Hive transport and driver scheme. Allowed values: hive, hive+http, hive+https, sparksql, databricks+pyhive | — | — |
| optional.scope | object | No | Hive database and object selection scope. | — | no extra properties |
| optional.scope.database | string | No | Single Hive database to scan (optional when include_all_databases is true) | — | — |
| optional.scope.exclude_databases | array | No | Database denylist (exact database names) | ["information_schema","sys"] | — |
| optional.scope.exclude_databases[] | string | No | — | — | — |
| optional.scope.include_all_databases | boolean | No | Scan all visible Hive databases except excluded system databases | false | — |
| optional.scope.include_objects | array | No | Optional object allowlist. Accepted forms: table or database.table | — | — |
| optional.scope.include_objects[] | string | No | — | — | — |
| optional.scope.include_tables | boolean | No | Include table assets in extraction | true | — |
| optional.scope.include_views | boolean | No | Include view assets in extraction | true | — |
| optional.scope.table_limit | integer | No | Optional cap on number of table/view assets extracted per database | — | min 1 |
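The `include_objects` allowlist mixes bare table names with `database.table` forms. A small sketch of how such entries might be normalized into explicit pairs (the helper and its defaulting behavior are assumptions; the schema only documents the two accepted forms):

```python
def normalize_include_objects(entries: list[str], default_database: str) -> list[tuple[str, str]]:
    """Expand allowlist entries into (database, table) pairs.

    A bare `table` entry is assumed to belong to the scoped database;
    a `database.table` entry names its database explicitly.
    """
    pairs = []
    for entry in entries:
        if "." in entry:
            database, table = entry.split(".", 1)
        else:
            database, table = default_database, entry
        pairs.append((database, table))
    return pairs
```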
Examples
Reference payloads generated from the shared source-examples JSON.
Scan Hive default database via LDAP
Extract Hive tables and views from one database using LDAP authentication settings
Schedule
{
  "enabled": true,
  "preset": "weekday_morning",
  "cron": "21 7 * * 1-5",
  "timezone": "UTC"
}
Config Payload
{
  "type": "HIVE",
  "required": {
    "host": "some-test-company.example.com",
    "port": 10000
  },
  "masked": {
    "username": "hive",
    "password": "hive"
  },
  "optional": {
    "connection": {
      "scheme": "hive",
      "connect_args": {
        "auth": "LDAP"
      }
    },
    "scope": {
      "database": "default",
      "include_tables": true,
      "include_views": true
    }
  },
  "sampling": {
    "strategy": "RANDOM",
    "limit": 25,
    "max_columns": 20,
    "max_cell_chars": 512
  }
}
Scan all Hive databases with latest sampling
Enumerate visible Hive databases and prioritize latest records when possible
Schedule
{
  "enabled": true,
  "preset": "nightly",
  "cron": "44 2 * * *",
  "timezone": "UTC"
}
Config Payload
{
  "type": "HIVE",
  "required": {
    "host": "some-test-company.example.com",
    "port": 10000
  },
  "masked": {
    "username": "hive",
    "password": "hive"
  },
  "optional": {
    "connection": {
      "scheme": "hive+https"
    },
    "scope": {
      "include_all_databases": true,
      "exclude_databases": [
        "information_schema"
      ],
      "include_tables": true,
      "include_views": false
    }
  },
  "sampling": {
    "strategy": "LATEST",
    "limit": 30,
    "order_by_column": "updated_at",
    "fallback_to_random": true
  }
}
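Tying the pieces together: the `scheme`, `required`, and `masked` fields are enough to form a PyHive-style SQLAlchemy URL. A sketch, assuming the connector assembles connection URLs this way (the schema does not confirm this; `connect_args` are passed separately and omitted here):

```python
from urllib.parse import quote


def build_hive_url(config: dict) -> str:
    """Assemble a `scheme://user:password@host:port/database` URL from a payload.

    Sketch only; defaults for scheme ("hive") and database ("default")
    are assumptions, not documented behavior.
    """
    required = config["required"]
    masked = config["masked"]
    optional = config.get("optional", {})
    scheme = optional.get("connection", {}).get("scheme", "hive")
    database = optional.get("scope", {}).get("database", "default")
    # URL-encode credentials so special characters survive the URL form.
    user = quote(masked["username"], safe="")
    password = quote(masked["password"], safe="")
    return f"{scheme}://{user}:{password}@{required['host']}:{required['port']}/{database}"
```

Applied to the first example payload above, this yields `hive://hive:hive@some-test-company.example.com:10000/default`.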