Skip to main content

CREATE VALIDATION

Description

Creates a new Data Quality Validation to be used in Stream Captures. When you add a Data Quality Validation to a Stream Capture, records that fail the Validation will be sent to a quarantine table.

Note that the SQL statement does not end with ;

Syntax

CREATE VALIDATION <name>
TYPE = { 'SCHEMA' | 'TIMESTAMP' }
WITH 'key1' = 'value1', 'key2' = 'value2' ....

Example

CREATE VALIDATION validate_schema
TYPE = 'SCHEMA'
WITH 'schema.target_schema_name' = 'kafka_schema'

Required parameters

  • name: Identifier for the Validation
    • Must be unique for your account
    • Must start with an alphabetic character and cannot contain spaces or special characters unless the entire identifier string is enclosed in double quotes (e.g. "Schema Validation")
    • Case-sensitive
  • TYPE: Specifies the type of Validation to create

Special parameters

Include special parameters after WITH as type String.

Schema

  • schema.target_schema_name: Specify the expected target schema for your records. Schemas are configured in the Source.
    • If your Schema Registry uses Schema Contexts, use the format: ':.context.subcontext:schemaName'. Example: ':.prod-us:schema1'

Timestamp

  • timestamp.field: Specify the timestamp field to validate.
  • timestamp.format: Specify the expected format of values in the timestamp field. Valid values are below:
    "yyyy-MM-dd",
    "yyyy-MM-dd hh:mm:ss",
    "yyyy-MM-dd HH:mm:ss",
    "yyyy-MM-dd hh:mm:ss.SSSSSS",
    "yyyy-MM-dd HH:mm:ss.SSSSSS",
    "yyyy-MM-dd hh:mm:ss.SSS",
    "yyyy-MM-dd HH:mm:ss.SSS",
    "yyyy-MM-dd'T'hh:mm:ss.SSSZ",
    "yyyy-MM-dd'T'HH:mm:ss.SSSZ",
    "yyyy-MM-dd'T'HH:mm:ss.SSSXXX",
    "yyyy-MM-dd'T'hh:mm:ss.SSSSSSZ",
    "yyyy-MM-dd'T'HH:mm:ss.SSSSSSX",
    "yyyy-MM-dd'T'HH:mm:ss.SSSSSSZ",
    "yyyy-MM-dd'T'hh:mm:ssZ",
    "yyyy-MM-dd'T'HH:mm:ssZ",
    "EPOCH_SECONDS",
    "epoch_seconds",
    "EPOCH_MILLIS",
    "epoch_millis",
    "EPOCH_MICROS",
    "epoch_micros"
  • timestamp.expected_range.type: Specify the type of expected range for values in the timestamp field. Valid values are: 'relative', 'specific', 'no_range'.

Relative timestamp filter

  • timestamp.expected_range.relative.filter: Specify the type of relative filter to use. Valid values are 'is_not_future' and 'is_in_last'.
"Is in last" timestamp filter
  • timestamp.expected_range.relative.is_in_last.quantity: Specify the quantity for the expected timestamp range.
  • timestamp.expected_range.relative.is_in_last.unit: Specify the unit for the expected timestamp range. Valid values are 'hours', 'days', 'months', 'years'.

Specific timestamp filter

  • timestamp.expected_range.specific.specific.filter: Specify the operator to filter on. Valid values are 'LT', 'LEQ', 'GT', 'GEQ'.
  • timestamp.expected_range.specific.specific.date: Specify the date to filter on. Expected format is yyyy-MM-dd.