Custom
Schema-driven detector documentation.
CUSTOM · active · P0 · 62 params · 18 examples
Detector Metadata
Capability catalog entry from `all_detectors.json`.
Categories
CLASSIFICATION, COMPLIANCE
Supported Asset Types
TXT, TABLE, URL, IMAGE
Recommended Model
mDeBERTa-v3 + SetFit + GLiNER
Notes
User-defined detector that supports ruleset, few-shot classification, and entity extraction methods.
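As a rough illustration of the three methods, the sketch below builds one minimal configuration per method as a Python dict. Only the key names, the allowed `method` values, and the defaults for `hypothesis_template` and `entity.model` come from the parameter table in this document; every value (detector keys, names, patterns, labels, example text) is hypothetical, and the helpers `config_problems` and `nli_hypotheses` are illustrative names, not part of the product. Filling the template with the label name is how zero-shot NLI templates are commonly used; whether this detector fills it with the label name or the label description is an assumption.

```python
ALLOWED_METHODS = {"RULESET", "CLASSIFIER", "ENTITY"}
REQUIRED_KEYS = ("custom_detector_key", "name", "method")

# All values below are made up for illustration; only the key names are documented.
ruleset_detector = {
    "custom_detector_key": "contract-ids",
    "name": "Contract IDs",
    "method": "RULESET",
    "ruleset": {
        "regex_rules": [
            {"id": "r1", "name": "Contract number",
             "pattern": r"\bCN-\d{6}\b", "flags": "i", "severity": "medium"},
        ],
        "keyword_rules": [
            {"id": "k1", "name": "Confidentiality terms",
             "keywords": ["confidential", "vertraulich"],
             "case_sensitive": False, "severity": "low"},
        ],
    },
}

classifier_detector = {
    "custom_detector_key": "hr-complaints",
    "name": "HR complaints",
    "method": "CLASSIFIER",
    "classifier": {
        "labels": [{"id": "complaint", "name": "a workplace complaint"}],
        "hypothesis_template": "This text contains {}.",
        "training_examples": [
            {"text": "I want to report my manager.", "label": "complaint",
             "accepted": True, "source": "editor"},
        ],
    },
}

entity_detector = {
    "custom_detector_key": "project-names",
    "name": "Project code names",
    "method": "ENTITY",
    "entity": {"entity_labels": ["project code name"],
               "model": "urchade/gliner_multi-v2.1"},
}


def config_problems(config: dict) -> list[str]:
    """Check only what the table states: three required keys and the method enum."""
    problems = [f"missing {key}" for key in REQUIRED_KEYS if key not in config]
    if config.get("method") not in ALLOWED_METHODS:
        problems.append("method must be RULESET, CLASSIFIER, or ENTITY")
    return problems


def nli_hypotheses(config: dict) -> list[str]:
    """Fill the hypothesis template once per label name (assumed behavior)."""
    clf = config["classifier"]
    template = clf.get("hypothesis_template", "This text contains {}.")
    return [template.format(label["name"]) for label in clf["labels"]]
```

Calling `nli_hypotheses(classifier_detector)` shows why label names that read naturally inside the template ("a workplace complaint") are easier to reason about than bare identifiers.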
Parameters
Configuration parameters for the Custom detector. Shared from `CustomDetectorConfig`.
| Parameter | Type | Required | Description | Default | Constraints |
|---|---|---|---|---|---|
| enabled_patterns | array \| null | No | List of enabled pattern names | null | — |
| severity_threshold | enum \| null | No | Minimum severity to report | null | — |
| confidence_threshold | number | No | Minimum confidence to report (0-1) | 0.7 | min 0, max 1 |
| max_findings | integer \| null | No | Maximum number of findings to return | null | — |
| custom_detector_key | string | Yes | Stable key used to identify one custom detector instance | — | — |
| name | string | Yes | User-facing name of custom detector | — | — |
| description | string | No | — | — | — |
| method | enum | Yes | Execution method for custom detector logic. Allowed values: RULESET, CLASSIFIER, ENTITY | RULESET | — |
| languages | array | No | — | ["de","en"] | — |
| languages[] | string | No | — | — | — |
| ruleset | object | No | — | — | no extra properties |
| ruleset.regex_rules | array | No | — | [] | — |
| ruleset.regex_rules[] | object | No | — | — | no extra properties |
| ruleset.regex_rules[].id | string | Yes | Stable ID for this regex rule | — | — |
| ruleset.regex_rules[].name | string | Yes | Display name for this regex rule | — | — |
| ruleset.regex_rules[].pattern | string | Yes | Regular expression pattern | — | — |
| ruleset.regex_rules[].flags | string | No | Regex flags (for example i, m, s) | — | — |
| ruleset.regex_rules[].severity | enum | No | Severity level of the finding. Allowed values: critical, high, medium, low, info | — | — |
| ruleset.keyword_rules | array | No | — | [] | — |
| ruleset.keyword_rules[] | object | No | — | — | no extra properties |
| ruleset.keyword_rules[].id | string | Yes | Stable ID for this keyword rule | — | — |
| ruleset.keyword_rules[].name | string | Yes | Display name for this keyword rule | — | — |
| ruleset.keyword_rules[].keywords | array | Yes | Keyword set to match | — | min items 1 |
| ruleset.keyword_rules[].keywords[] | string | Yes | — | — | — |
| ruleset.keyword_rules[].case_sensitive | boolean | No | Whether keyword matching is case-sensitive | false | — |
| ruleset.keyword_rules[].severity | enum | No | Severity level of the finding. Allowed values: critical, high, medium, low, info | — | — |
| classifier | object | No | — | — | no extra properties |
| classifier.labels | array | No | — | [] | — |
| classifier.labels[] | object | No | — | — | no extra properties |
| classifier.labels[].id | string | Yes | — | — | — |
| classifier.labels[].name | string | Yes | — | — | — |
| classifier.labels[].description | string | No | — | — | — |
| classifier.zero_shot_model | string | No | — | MoritzLaurer/mDeBERTa-v3-base-mnli-xnli | — |
| classifier.hypothesis_template | string | No | — | This text contains {}. | — |
| classifier.training_examples | array | No | — | [] | — |
| classifier.training_examples[] | object | No | — | — | no extra properties |
| classifier.training_examples[].text | string | Yes | — | — | — |
| classifier.training_examples[].label | string | Yes | — | — | — |
| classifier.training_examples[].accepted | boolean | No | — | true | — |
| classifier.training_examples[].source | string | No | Origin of this example (editor/feedback/import) | editor | — |
| classifier.min_examples_per_label | integer | No | — | 8 | min 1 |
| classifier.setfit_model | string | No | — | sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 | — |
| entity | object | No | — | — | no extra properties |
| entity.entity_labels | array | No | — | [] | — |
| entity.entity_labels[] | string | No | — | — | — |
| entity.model | string | No | — | urchade/gliner_multi-v2.1 | — |
| extractor | object | No | Optional structured extraction — runs when detector fires | — | no extra properties |
| extractor.enabled | boolean | No | — | true | — |
| extractor.fields | array | Yes | — | — | min items 1 |
| extractor.fields[] | object | Yes | One output field in the extraction schema | — | no extra properties |
| extractor.fields[].name | string | Yes | Output field name — becomes a key in extracted_data JSON | — | — |
| extractor.fields[].description | string | No | Human-readable hint for what this field captures | — | — |
| extractor.fields[].type | enum | No | Allowed values: string, number, boolean, list[string], list[number] | string | — |
| extractor.fields[].entity_label | string | No | GLiNER entity label (ENTITY and CLASSIFIER methods) | — | — |
| extractor.fields[].regex_pattern | string | No | Regex with one named capture group (?P<value>...) for RULESET method | — | — |
| extractor.fields[].regex_flags | string | No | Regex flags: i=case-insensitive, m=multiline, s=dotall | i | — |
| extractor.fields[].aggregate | enum | No | How to aggregate multiple matches. Allowed values: first, last, list, join, count | list | — |
| extractor.fields[].join_separator | string | No | — | , | — |
| extractor.fields[].min_confidence | number | No | Minimum GLiNER confidence for this field | 0.4 | min 0, max 1 |
| extractor.fields[].required | boolean | No | If true, skip saving extraction when this field is empty | false | — |
| extractor.gliner_model | string | No | — | urchade/gliner_multi-v2.1 | — |
| extractor.content_limit | integer | No | Number of characters of content to pass to the extractor (the classifier's matched_content is only 320 characters) | 4000 | min 320, max 8192 |
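The extractor rows above can be read together as a small pipeline: truncate the content to `content_limit`, apply each field's `regex_pattern` with its `regex_flags`, collect the named group `value`, and reduce the matches according to `aggregate`. The following is a minimal sketch of that reading for the RULESET case, not the detector's actual implementation. The function names, the sample field, and the sample text are hypothetical, and the `", "` join separator is an assumption (the table shows the default as a comma).

```python
import re

# Documented flag letters mapped to Python's re flags.
FLAG_MAP = {"i": re.IGNORECASE, "m": re.MULTILINE, "s": re.DOTALL}
AGGREGATE_MODES = {"first", "last", "list", "join", "count"}


def compile_flags(flags: str) -> int:
    """Combine flag letters such as "im" into one re flag value."""
    value = 0
    for letter in flags:
        value |= FLAG_MAP[letter]
    return value


def aggregate(matches: list, mode: str = "list", join_separator: str = ", "):
    """Reduce all matches for one field according to the aggregate mode."""
    if mode not in AGGREGATE_MODES:
        raise ValueError(f"unknown aggregate mode: {mode}")
    if mode == "count":
        return len(matches)
    if mode == "list":
        return list(matches)
    if mode == "join":
        return join_separator.join(matches)
    if not matches:
        return None  # assumed result of first/last when nothing matched
    return matches[0] if mode == "first" else matches[-1]


def extract_field(content: str, field: dict, content_limit: int = 4000):
    """Apply one regex-based field to the first content_limit characters."""
    text = content[:content_limit]
    pattern = re.compile(field["regex_pattern"],
                         compile_flags(field.get("regex_flags", "i")))
    values = [match.group("value") for match in pattern.finditer(text)]
    return aggregate(values,
                     field.get("aggregate", "list"),
                     field.get("join_separator", ", "))


# Hypothetical field and text, used only to exercise the sketch.
invoice_field = {
    "name": "invoice_ids",
    "regex_pattern": r"invoice\s+(?P<value>[A-Z]{2}-\d{4})",
}
sample = "Invoice AB-1234 was paid. INVOICE CD-5678 is open."

print(extract_field(sample, invoice_field))
print(extract_field(sample, {**invoice_field, "aggregate": "count"}))
```

With the default `i` flag both spellings of "invoice" match, so the `list` mode collects both IDs. A field that should hold a single value would set `aggregate` to `first` or `last` instead.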