All Redshift and Snowflake models: Fix varchar length for pseudonymized fields #121

@agnes-kiss

Description

When the PII pseudonymization enrichment is enabled and run, the length of the target fields may change depending on the hashing algorithm used. The complete list of fields that may be hashed can be found here. If the character length defined in the models' table definitions does not match the incoming data, in particular when the source data is longer than the target column, the commit steps may fail to run.

One possible solution is to increase the length of every potentially impacted column to 128 characters, the longest possible value (which would result from SHA-512 being used), wherever the length currently defined in the model is shorter than that. Based on this criterion, domain_userid and session_id seem to be the impacted fields (Redshift and Snowflake, web and mobile models).
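
To illustrate where the 128-character figure comes from, here is a minimal Python sketch that computes the digest length for a few common hashing algorithms and flags columns that would be too short. It assumes the hashed values are stored as hex-encoded strings. The algorithm list and the column lengths in `DEFINED_LENGTHS` are hypothetical placeholders chosen for illustration, not values taken from the models or from the enrichment configuration.

```python
import hashlib

# Assumption: these are the algorithms of interest; the enrichment may
# support others that hashlib does not guarantee to provide.
ALGORITHMS = ["md5", "sha1", "sha256", "sha384", "sha512"]


def hex_digest_length(algorithm: str) -> int:
    """Return the length of a hex-encoded digest for the given algorithm."""
    return len(hashlib.new(algorithm, b"example").hexdigest())


# Digest lengths are computed rather than hard-coded.
DIGEST_LENGTHS = {name: hex_digest_length(name) for name in ALGORITHMS}

# The column must fit the longest digest that could be produced.
REQUIRED_LENGTH = max(DIGEST_LENGTHS.values())

# Hypothetical column definitions (placeholder values, not from the models).
DEFINED_LENGTHS = {"domain_userid": 36, "session_id": 36, "user_id": 255}


def columns_too_short(defined: dict, required: int) -> list:
    """List the columns whose defined length cannot hold `required` characters."""
    return sorted(col for col, length in defined.items() if length < required)


if __name__ == "__main__":
    for name, length in DIGEST_LENGTHS.items():
        print(f"{name}: {length} hex characters")
    print("Required length:", REQUIRED_LENGTH)
    print("Columns to widen:", columns_too_short(DEFINED_LENGTHS, REQUIRED_LENGTH))
```

The same comparison could be run against the actual column lengths from each model's table definitions to confirm which fields need to be widened.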
