Azure Blob Storage

Schema-driven source documentation.

AZURE_BLOB_STORAGE: 42 fields, 1 example
Commonly Asked Questions
Assistant knowledge mapped to this source type from assistant_knowledge.json.

Required
Fields required for a valid configuration payload under `config.required`.

| Path | Type | Required | Description | Default | Constraints |
| --- | --- | --- | --- | --- | --- |
| required | object | Yes | | | no extra properties |
| required.account_url | string | Yes | Azure Blob account URL (for example, `https://<account>.blob.core.windows.net`) | | format uri |
| required.container | string | Yes | Azure Blob container name | | |

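As an illustration of how the two required fields fit together, the sketch below joins them into a container endpoint URL. The helper name `container_url` is hypothetical, and reading "format uri" as "absolute URL with a scheme and a host" is an assumption, not a statement about how the schema validator actually behaves.

```python
from urllib.parse import urlparse


def container_url(account_url: str, container: str) -> str:
    """Join required.account_url and required.container into one URL.

    Hypothetical helper for illustration only.
    """
    parsed = urlparse(account_url)
    # Assumption: "format uri" means an absolute URL with scheme and host.
    if not parsed.scheme or not parsed.netloc:
        raise ValueError(f"account_url is not an absolute URI: {account_url!r}")
    if not container:
        raise ValueError("container must be a non-empty string")
    # Strip a trailing slash so the result never contains "//" in the path.
    return f"{account_url.rstrip('/')}/{container}"


if __name__ == "__main__":
    print(container_url("https://acmestorage.blob.core.windows.net", "finance-archive"))
```
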
Masked
Sensitive fields under `config.masked` (secrets/credentials).

| Path | Type | Required | Description | Default | Constraints |
| --- | --- | --- | --- | --- | --- |
| masked | object | No | Optional Azure credentials. Leave empty to use managed identity/default credential chain. | | no extra properties |
| masked.azure_account_key | string | No | Azure storage account key | | |
| masked.azure_client_id | string | No | Azure Entra client ID (service principal auth) | | |
| masked.azure_client_secret | string | No | Azure Entra client secret (service principal auth) | | |
| masked.azure_connection_string | string | No | Azure storage connection string (takes precedence over other auth fields) | | |
| masked.azure_sas_token | string | No | Azure SAS token | | |
| masked.azure_tenant_id | string | No | Azure Entra tenant ID (service principal auth) | | |

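The masked fields describe several mutually exclusive ways to authenticate. A minimal sketch of how a client might choose among them follows. Only two rules come from the table: the connection string takes precedence, and an empty `masked` object falls back to the managed identity/default credential chain. The order among the remaining fields, the function name `pick_auth_method`, and the returned labels are assumptions made for illustration.

```python
from typing import Optional

# Service principal auth needs all three of these fields together.
SERVICE_PRINCIPAL_FIELDS = (
    "azure_tenant_id",
    "azure_client_id",
    "azure_client_secret",
)


def pick_auth_method(masked: Optional[dict]) -> str:
    """Return a label for the auth method implied by config.masked."""
    masked = masked or {}
    # Documented: the connection string wins over every other auth field.
    if masked.get("azure_connection_string"):
        return "connection_string"
    # Assumption: the order of the next three checks is not documented.
    if masked.get("azure_account_key"):
        return "account_key"
    if masked.get("azure_sas_token"):
        return "sas_token"
    if all(masked.get(name) for name in SERVICE_PRINCIPAL_FIELDS):
        return "service_principal"
    # Documented: leaving masked empty uses the default credential chain.
    return "default_credential_chain"


if __name__ == "__main__":
    print(pick_auth_method({"azure_sas_token": "<token>"}))
```
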
Optional
Optional configuration fields under `config.optional`.

| Path | Type | Required | Description | Default | Constraints |
| --- | --- | --- | --- | --- | --- |
| optional | object | No | | | no extra properties |
| optional.connection | object | No | | | no extra properties |
| optional.connection.max_keys_per_page | integer | No | Maximum blobs requested per list page | 200 | min 1, max 1000 |
| optional.connection.max_object_bytes | integer | No | Maximum bytes downloaded per blob for MIME detection and text extraction | 5242880 | min 1024, max 52428800 |
| optional.connection.request_timeout_seconds | number | No | Network timeout in seconds for list/download operations | 30 | min 1, max 300 |
| optional.scope | object | No | Object scope and filtering controls. | | no extra properties |
| optional.scope.exclude_extensions | array | No | Optional extension denylist | | |
| optional.scope.exclude_extensions[] | string | No | | | |
| optional.scope.include_content_preview | boolean | No | Download object bytes to infer MIME and extract detector-ready text previews | true | |
| optional.scope.include_empty_objects | boolean | No | Include zero-byte objects in extraction results | false | |
| optional.scope.include_extensions | array | No | Optional extension allowlist (for example, .pdf, .csv, .parquet) | | |
| optional.scope.include_extensions[] | string | No | | | |
| optional.scope.include_object_metadata | boolean | No | Attach provider metadata (etag, size, content-type hints, timestamps) to asset checksums | true | |
| optional.scope.prefix | string | No | Object key prefix filter (for example, exports/2026/) | | |

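To make the `optional.scope` filters concrete, here is a rough sketch of how they might combine for a single blob. The function `in_scope` is hypothetical. Two behaviours in it are assumptions that the table does not state: extensions are compared case-insensitively, and the denylist wins when an extension appears in both lists.

```python
import posixpath


def in_scope(key: str, size: int, scope: dict) -> bool:
    """Decide whether a blob passes the optional.scope filters."""
    prefix = scope.get("prefix", "")
    if prefix and not key.startswith(prefix):
        return False
    # Documented default: zero-byte objects are skipped unless enabled.
    if size == 0 and not scope.get("include_empty_objects", False):
        return False
    # Blob keys use "/" separators, so posixpath is the right splitter.
    ext = posixpath.splitext(key)[1].lower()
    include = [e.lower() for e in scope.get("include_extensions", [])]
    exclude = [e.lower() for e in scope.get("exclude_extensions", [])]
    # An empty allowlist means "allow every extension".
    if include and ext not in include:
        return False
    # Assumption: the denylist takes priority over the allowlist.
    if ext in exclude:
        return False
    return True


if __name__ == "__main__":
    scope = {"prefix": "2026/", "exclude_extensions": [".png", ".jpg"]}
    for key, size in [("2026/q1/report.pdf", 1024), ("2026/logo.png", 2048)]:
        print(key, in_scope(key, size, scope))
```
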
Examples
Reference payloads generated from shared source examples JSON.
Azure Blob container validation scan
Validate Azure Blob extraction with a small random sample and connection string authentication.

Schedule

{
  "enabled": true,
  "preset": "weekly",
  "cron": "17 3 * * 0",
  "timezone": "UTC"
}
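
Read as a standard five-field cron expression, `17 3 * * 0` means minute 17, hour 3, any day of the month, any month, day-of-week 0 (Sunday). That is 03:17 every Sunday in the configured timezone, which matches the `weekly` preset. The sketch below checks a timestamp against such an expression. It assumes the scheduler follows standard cron semantics, handles only plain numbers and `*` (no ranges, lists, or steps), and uses the hypothetical name `cron_matches`.

```python
from datetime import datetime, timezone


def cron_matches(expr: str, when: datetime) -> bool:
    """Check a datetime against a five-field cron expression.

    Supports only literal numbers and "*" in each field.
    """
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError("expected five cron fields")
    # cron counts Sunday as 0; datetime.weekday() counts Monday as 0.
    cron_weekday = (when.weekday() + 1) % 7
    actual = (when.minute, when.hour, when.day, when.month, cron_weekday)
    return all(f == "*" or int(f) == value for f, value in zip(fields, actual))


if __name__ == "__main__":
    # 2026-01-04 is a Sunday.
    sunday = datetime(2026, 1, 4, 3, 17, tzinfo=timezone.utc)
    print(cron_matches("17 3 * * 0", sunday))
```
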

Config Payload

{
  "type": "AZURE_BLOB_STORAGE",
  "required": {
    "account_url": "https://acmestorage.blob.core.windows.net",
    "container": "finance-archive"
  },
  "masked": {
    "azure_connection_string": "DefaultEndpointsProtocol=https;AccountName=acmestorage;AccountKey=<key>;EndpointSuffix=core.windows.net"
  },
  "optional": {
    "scope": {
      "prefix": "2026/",
      "exclude_extensions": [
        ".png",
        ".jpg"
      ],
      "include_empty_objects": false
    },
    "connection": {
      "request_timeout_seconds": 45,
      "max_keys_per_page": 250
    }
  },
  "sampling": {
    "strategy": "RANDOM",
    "limit": 25
  }
}
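
A payload like the one above can be checked against the defaults and bounds listed for `optional.connection` before it is submitted. The following sketch fills in documented defaults and rejects out-of-range or unknown keys. The names `CONNECTION_LIMITS` and `resolve_connection` are hypothetical, and the real service may validate differently.

```python
# name -> (default, minimum, maximum), copied from the Optional table.
CONNECTION_LIMITS = {
    "max_keys_per_page": (200, 1, 1000),
    "max_object_bytes": (5242880, 1024, 52428800),
    "request_timeout_seconds": (30, 1, 300),
}


def resolve_connection(payload: dict) -> dict:
    """Return optional.connection with defaults applied and bounds checked."""
    given = payload.get("optional", {}).get("connection", {})
    # The schema marks this object as "no extra properties".
    unknown = sorted(set(given) - set(CONNECTION_LIMITS))
    if unknown:
        raise ValueError(f"unknown connection fields: {unknown}")
    resolved = {}
    for name, (default, low, high) in CONNECTION_LIMITS.items():
        value = given.get(name, default)
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside [{low}, {high}]")
        resolved[name] = value
    return resolved


if __name__ == "__main__":
    example = {
        "optional": {
            "connection": {"request_timeout_seconds": 45, "max_keys_per_page": 250}
        }
    }
    print(resolve_connection(example))
```

With the example payload, the two supplied values are kept and `max_object_bytes` falls back to its documented default of 5242880.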