Mask Data
Description
Masks one or more fields for privacy by hashing or truncating their values. Supports the following masking types:
- Hash: Hashes all values using xxHash64.
- Truncate: Sets all values to NULL.
Parameters
- Fields to Mask: Specify a comma-separated list of fields to mask.
- Masking Type: Specify the type of masking to apply.
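As a rough illustration of how the two parameters interact, the sketch below parses a comma-separated field list and applies either masking type to a single record. The function names (`parse_fields`, `stand_in_hash64`, `mask_record`) are hypothetical and not part of the product. Because xxHash64 is not in the Python standard library, the sketch substitutes BLAKE2b truncated to 8 bytes as a stand-in 64-bit hash, so its hash values will not match real pipeline output. Skipping missing fields, leaving existing nulls untouched, and rendering the hash of a string field as 16 hex characters are assumptions as well.

```python
import hashlib


def parse_fields(fields_to_mask):
    """Split the comma-separated 'Fields to Mask' value into field names."""
    return [name.strip() for name in fields_to_mask.split(",") if name.strip()]


def stand_in_hash64(value):
    """Return a 64-bit integer hash of the value.

    Stand-in only: BLAKE2b truncated to 8 bytes, NOT xxHash64.
    """
    digest = hashlib.blake2b(str(value).encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def mask_record(record, fields_to_mask, masking_type):
    """Return a copy of the record with the listed fields masked."""
    masked = dict(record)
    for field in parse_fields(fields_to_mask):
        if field not in masked:
            continue  # assumption: fields absent from the record are skipped
        if masking_type == "Truncate":
            masked[field] = None  # Truncate sets the value to NULL
        elif masking_type == "Hash":
            if masked[field] is not None:  # assumption: nulls stay null
                # assumption: string fields receive the hash as 16 hex chars
                masked[field] = format(stand_in_hash64(masked[field]), "016x")
        else:
            raise ValueError("unknown masking type: " + repr(masking_type))
    return masked


record = {"name": "Bill", "ssn": "123-45-6789"}
print(mask_record(record, "ssn", "Truncate"))
print(mask_record(record, "ssn", "Hash"))
```

The original record is left unmodified, so the same input can be masked with both types for comparison.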
Input Requirements
The input schema may be in any format.
Expected Output
Masked fields keep the same types as in the input and become nullable.
Usage Notes
- Hashing produces a 64-bit hash value. If a masked field's type cannot hold that value, the pipeline throws an error. Only hash fields whose types can hold a 64-bit value, such as String, Long, or Double.
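The type constraint in the note above can be checked before a pipeline runs. The following is a minimal sketch, assuming the schema is available as a plain mapping from field name to type name. That representation, the `check_hashable` function, and the `HASHABLE_TYPES` set are hypothetical. The set contains only the three types the note names, and the real list of compatible types may be longer.

```python
# Types the usage note names as able to hold a 64-bit value.
# Assumption: the note says "such as", so this set may be incomplete.
HASHABLE_TYPES = {"String", "Long", "Double"}


def check_hashable(schema, fields):
    """Raise TypeError if any field to be hashed has an unsupported type.

    `schema` is a hypothetical dict of field name -> type name.
    """
    for field in fields:
        field_type = schema[field]
        if field_type not in HASHABLE_TYPES:
            raise TypeError(
                "cannot hash field %r: type %s may not hold a 64-bit value"
                % (field, field_type)
            )


# A 64-bit hash can exceed the largest signed 32-bit integer,
# which is why a 32-bit field type would be rejected.
MAX_HASH64 = (1 << 64) - 1
MAX_INT32 = (1 << 31) - 1
print(MAX_HASH64 > MAX_INT32)

check_hashable({"name": "String", "ssn": "String"}, ["ssn"])  # passes silently
```

Running a check like this up front surfaces an incompatible field as a clear message rather than as a pipeline error at run time.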
Examples
Truncate
// Input
// Fields to Mask = "ssn"
// Masking Type = Truncate
{
"name": "Bill",
"ssn": "123-45-6789"
}
// Output
{
"name": "Bill",
"ssn": null
}
Hash
// Input
// Fields to Mask = "ssn"
// Masking Type = Hash
{
"name": "Bill",
"ssn": "123-45-6789"
}
// Output
{
"name": "Bill",
"ssn": "f52b16bf"
}