Data Security: JSON & Dynamic Data Masking (Part 2 of 2)

Sunny Malik
3 min readDec 9, 2021

Working with Semi-Structured dataset often required securing sensitive data elements which are part of our payload

For example, an API integration comprised of employee records would contain sensitive values like SSN, Bonuses paid out, promotion details, Salary etc. These attributes can be part of a complex hierarchical structure thereby adding to the complexity of the solution we discussed in Part 1 of this blog.

To overcome this challenge lets discuss another take on securing these attributes using Snowflake UDF which not only secure these sensitive attributes but also would traverse through the complex hierarchical structure (as applicable) of your semi-structured dataset.

Let’s get started…

Create a table to host your semi-structured payload. Notice the structure of these two records added to the JSON_TABLE is different (flexible schema) simulating the real world scenario when optional columns may not be populated for every record in the upstream system

Verify the data is loaded into the target table

Now lets create a User defined function (UDF) which would include your choice of encryption/masking logic to be applied on the sensitive attributes

The UDF as described above have 2 components

  • Encryption Logic, defined by the user
  • A recursive function which traverses the semi-structured data hierarchically and applies the encryption logic as defined on the attributes

Upon executing the query to verify the data transformation using the function, we can see that the encryption/masking logic got applied on all the attributes of the payload(at all levels of the hierarchy).

But what if we need to secure only a selective few attributes in our dataset.

We’ve actually got the next level, we’re going to take this up a notch!

We added an additional input argument “mask_list” to the function (optionally can be driven through a metadata table storing the column list per integration which needs to be encrypted/masked) to hide only selective elements from the payload.

Let’s verify the transformation again (selecting only “name” and “p_name”) elements.

A quick comparison of the two versions of the UDF we just created —

  • Version 1-Masks all the elements
  • Version 2 -Takes in the column list and masks only those elements

Finally let’s create a masking policy which leverages this newly created UDF and mask the required sensitive elements for all roles except ACCOUNTADMIN role and attach the policy to our table JSON_TABLE.

Any subsequent queries on this table would now hide the details for sensitive elements as listed in the policy definition

………………It’s Snowing on a SunnyDay…!!! ……………….

--

--