
Hive

Schema-driven source documentation.

HIVE · 38 fields · 2 examples
Commonly Asked Questions
Assistant knowledge mapped to this source type from assistant_knowledge.json.

Required
Fields required for a valid configuration payload under `config.required`.
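| Path | Type | Required | Description | Default | Constraints |
| --- | --- | --- | --- | --- | --- |
| `required` | object | Yes | | | no extra properties |
| `required.host` | string | Yes | Hive host endpoint | `localhost` | |
| `required.port` | integer | Yes | Hive TCP port | `10000` | min 1, max 65535 |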
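
The constraints on the `required` block can be checked before a payload is submitted. The following is a minimal sketch, not the product's own validator: `validate_required` is a hypothetical helper, and the only rules it applies are the ones listed for `required`, `required.host`, and `required.port`.

```python
def validate_required(required: dict) -> dict:
    """Check a `required` block against the documented constraints (illustrative only)."""
    allowed = {"host", "port"}

    # "no extra properties" on the `required` object
    extra = set(required) - allowed
    if extra:
        raise ValueError(f"unexpected properties: {sorted(extra)}")

    # both fields are marked Required: Yes
    missing = allowed - set(required)
    if missing:
        raise ValueError(f"missing properties: {sorted(missing)}")

    host, port = required["host"], required["port"]
    if not isinstance(host, str):
        raise ValueError("host must be a string")

    # bool is a subclass of int in Python, so reject it explicitly
    if isinstance(port, bool) or not isinstance(port, int):
        raise ValueError("port must be an integer")
    if not 1 <= port <= 65535:
        raise ValueError("port must be between 1 and 65535")

    return {"host": host, "port": port}
```

For example, `validate_required({"host": "localhost", "port": 10000})` returns the block unchanged, while a port of `0` or an unknown key raises `ValueError`.
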
Masked
Sensitive fields under `config.masked` (secrets/credentials).
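| Path | Type | Required | Description | Default | Constraints |
| --- | --- | --- | --- | --- | --- |
| `masked` | object | Yes | | | no extra properties |
| `masked.password` | string | Yes | Hive login password | | |
| `masked.username` | string | Yes | Hive login username | | |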
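
Because everything under `masked` is a credential, it is worth redacting that block before a payload is logged or shared. The sketch below shows one way to do so; `redact_masked` and the `***` placeholder are hypothetical and are not part of the documented configuration.

```python
import copy


def redact_masked(payload: dict, placeholder: str = "***") -> dict:
    """Return a copy of the payload with every value under `masked` replaced."""
    redacted = copy.deepcopy(payload)  # leave the caller's payload untouched
    masked = redacted.get("masked")
    if isinstance(masked, dict):
        for key in masked:
            masked[key] = placeholder
    return redacted
```

The original payload is not modified, so it can still be used for the actual connection after the redacted copy has been logged.
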
Optional
Optional configuration fields under `config.optional`.
| Path | Type | Required | Description | Default | Constraints |
| --- | --- | --- | --- | --- | --- |
| `optional` | object | No | | | no extra properties |
| `optional.connection` | object | No | Hive connection transport and authentication options. | | no extra properties |
| `optional.connection.connect_args` | object | No | Additional PyHive connection arguments (e.g. `auth`, `kerberos_service_name`, `http_path`). | `{}` | |
| `optional.connection.scheme` | enum | No | Hive transport and driver scheme | | Allowed values: `hive`, `hive+http`, `hive+https`, `sparksql`, `databricks+pyhive` |
| `optional.scope` | object | No | Hive database and object selection scope. | | no extra properties |
| `optional.scope.database` | string | No | Single Hive database to scan (optional when `include_all_databases` is true) | | |
| `optional.scope.exclude_databases` | array | No | Database denylist (exact database names) | `["information_schema","sys"]` | |
| `optional.scope.exclude_databases[]` | string | No | | | |
| `optional.scope.include_all_databases` | boolean | No | Scan all visible Hive databases except excluded system databases | `false` | |
| `optional.scope.include_objects` | array | No | Optional object allowlist. Accepted forms: `table` or `database.table` | | |
| `optional.scope.include_objects[]` | string | No | | | |
| `optional.scope.include_tables` | boolean | No | Include table assets in extraction | `true` | |
| `optional.scope.include_views` | boolean | No | Include view assets in extraction | `true` | |
| `optional.scope.table_limit` | integer | No | Optional cap on number of table/view assets extracted per database | | min 1 |
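
The `optional.scope` fields interact: `include_all_databases` switches between a single database and every visible one, `exclude_databases` filters the latter, and `include_objects` entries may or may not carry a database prefix. The sketch below is one plausible reading of those descriptions, not the connector's actual logic. `resolve_databases` and `parse_object_ref` are hypothetical helpers, and two behaviors are assumptions: the denylist is applied only when scanning all databases, and an empty list is returned when neither `database` nor `include_all_databases` is set.

```python
from __future__ import annotations

# Documented default for optional.scope.exclude_databases
DEFAULT_EXCLUDES = ["information_schema", "sys"]


def resolve_databases(scope: dict, visible: list[str]) -> list[str]:
    """Pick the databases to scan from the list of visible ones."""
    excluded = set(scope.get("exclude_databases", DEFAULT_EXCLUDES))
    if scope.get("include_all_databases", False):
        # Exact-name matching, as the denylist description states
        return [db for db in visible if db not in excluded]
    database = scope.get("database")
    # Assumption: with no database configured there is nothing to scan
    return [database] if database else []


def parse_object_ref(ref: str, default_database: str) -> tuple[str, str]:
    """Split an include_objects entry of the form `table` or `database.table`."""
    database, sep, table = ref.rpartition(".")
    if sep:
        return database, table
    return default_database, ref
```

Note that supplying `exclude_databases` replaces the default list in this sketch, so a payload that lists only `information_schema` would no longer exclude `sys`.
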
Examples
Reference payloads generated from shared source examples JSON.
Scan Hive default database via LDAP
Extract Hive tables and views from one database using LDAP authentication settings

Schedule

{
  "enabled": true,
  "preset": "weekday_morning",
  "cron": "21 7 * * 1-5",
  "timezone": "UTC"
}
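
Read as a standard five-field cron expression (minute, hour, day of month, month, day of week), `21 7 * * 1-5` fires at 07:21 on Monday through Friday. The sketch below checks a timestamp against such an expression. It assumes the scheduler uses conventional cron semantics with Sunday as day 0, which this page does not confirm, and it handles only `*`, single numbers, ranges, and comma lists. `cron_matches` is a hypothetical helper.

```python
from datetime import datetime


def _field_matches(field: str, value: int) -> bool:
    """Match one cron field made of `*`, numbers, ranges (a-b), or comma lists."""
    for part in field.split(","):
        if part == "*":
            return True
        if "-" in part:
            low, high = part.split("-")
            if int(low) <= value <= int(high):
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expr: str, moment: datetime) -> bool:
    """Return True if `moment` falls on a minute selected by `expr`.

    Simplification: all five fields must match; step values (`*/5`) and
    names (`MON`) are not supported.
    """
    minute, hour, day_of_month, month, day_of_week = expr.split()
    return (
        _field_matches(minute, moment.minute)
        and _field_matches(hour, moment.hour)
        and _field_matches(day_of_month, moment.day)
        and _field_matches(month, moment.month)
        # isoweekday() is Monday=1..Sunday=7; cron uses Sunday=0
        and _field_matches(day_of_week, moment.isoweekday() % 7)
    )
```

The `moment` passed in should already be expressed in the schedule's `timezone` (UTC in this example).
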

Config Payload

{
  "type": "HIVE",
  "required": {
    "host": "some-test-company.example.com",
    "port": 10000
  },
  "masked": {
    "username": "hive",
    "password": "hive"
  },
  "optional": {
    "connection": {
      "scheme": "hive",
      "connect_args": {
        "auth": "LDAP"
      }
    },
    "scope": {
      "database": "default",
      "include_tables": true,
      "include_views": true
    }
  },
  "sampling": {
    "strategy": "RANDOM",
    "limit": 25,
    "max_columns": 20,
    "max_cell_chars": 512
  }
}
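
The allowed `scheme` values look like SQLAlchemy dialect names, so one way to picture how the payload above maps to a connection is as a SQLAlchemy-style URL. This is an assumption about the connector, not something the page states. `build_url` is a hypothetical helper, and the fallback scheme of `hive` is also assumed, since no default is documented.

```python
from urllib.parse import quote


def build_url(payload: dict) -> str:
    """Assemble a SQLAlchemy-style URL from a HIVE config payload (illustrative)."""
    required = payload["required"]
    masked = payload["masked"]
    optional = payload.get("optional", {})

    # Assumed fallback: the schema lists no default for `scheme`
    scheme = optional.get("connection", {}).get("scheme", "hive")
    database = optional.get("scope", {}).get("database", "")

    # Percent-encode credentials so characters like "@" or ":" stay unambiguous
    user = quote(masked["username"], safe="")
    password = quote(masked["password"], safe="")

    # connect_args (e.g. {"auth": "LDAP"}) are not part of the URL; they would
    # be handed to the driver separately.
    return f"{scheme}://{user}:{password}@{required['host']}:{required['port']}/{database}"
```

For the payload above this produces `hive://hive:hive@some-test-company.example.com:10000/default`.
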
Scan all Hive databases with latest sampling
Enumerate visible Hive databases and prioritize latest records when possible

Schedule

{
  "enabled": true,
  "preset": "nightly",
  "cron": "44 2 * * *",
  "timezone": "UTC"
}

Config Payload

{
  "type": "HIVE",
  "required": {
    "host": "some-test-company.example.com",
    "port": 10000
  },
  "masked": {
    "username": "hive",
    "password": "hive"
  },
  "optional": {
    "connection": {
      "scheme": "hive+https"
    },
    "scope": {
      "include_all_databases": true,
      "exclude_databases": [
        "information_schema"
      ],
      "include_tables": true,
      "include_views": false
    }
  },
  "sampling": {
    "strategy": "LATEST",
    "limit": 30,
    "order_by_column": "updated_at",
    "fallback_to_random": true
  }
}
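
The `LATEST` strategy with `order_by_column` and `fallback_to_random` suggests a two-step choice: order by the named column when the table has it, otherwise fall back to a random sample. The sketch below illustrates that reading. `build_sample_query` is a hypothetical helper, the generated HiveQL is an assumption about how sampling might be expressed, and a real implementation would need to validate identifiers before interpolating them.

```python
from __future__ import annotations


def build_sample_query(table: str, sampling: dict, columns: list[str]) -> str:
    """Build a sampling query for one table (illustrative, not the product's SQL)."""
    limit = int(sampling["limit"])
    strategy = sampling.get("strategy", "RANDOM")

    if strategy == "LATEST":
        order_col = sampling.get("order_by_column")
        if order_col in columns:
            # Newest rows first, based on the configured column
            return f"SELECT * FROM {table} ORDER BY `{order_col}` DESC LIMIT {limit}"
        if not sampling.get("fallback_to_random", False):
            raise ValueError(f"{table} has no column {order_col!r} to order by")
        # Otherwise fall through to the random sample below
    elif strategy != "RANDOM":
        raise ValueError(f"unsupported strategy: {strategy}")

    return f"SELECT * FROM {table} ORDER BY rand() LIMIT {limit}"
```

With the sampling block above, a table that has an `updated_at` column gets the ordered query, and a table without one gets the random query instead of an error.
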