Masking a data file with a billion records

A customer asked for help with masking personal data in their databases.

The case

Under the GDPR (AVG), personal data that is not needed should be masked wherever possible. Companies and organizations are themselves responsible for demonstrating that they comply with privacy legislation. One customer asked us to help them do exactly that in their databases.

The challenge in this case was to mask a very large data file from the customer, containing a billion records. The customer did not want every column masked: some columns could remain untouched, as long as the data could no longer be traced back to the person in question. The customer also wanted the data to remain consistent, so that the test environment would still be usable.

Our solution

We demonstrated the power of our data masking tool by masking a table with more than a billion records. We adjusted the configuration for the customer so that only the selected columns were masked, as requested. We also anonymized the data, so that it cannot be restored to the original values. The data masking tool can be applied to (almost) any database. And because it can mask a table with a billion records, it can handle 99.9% of all tables.
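
The three requirements in this case (leave some columns alone, keep masked values consistent, and make the result irreversible) can be illustrated with a minimal Python sketch. This is an illustration only, not the tool's actual implementation: the function names, the sample columns, and the choice of keyed hashing (HMAC-SHA256) are all assumptions made for the example. Rows are processed one at a time through a generator, so memory use does not grow with the number of records.

```python
import hashlib
import hmac
import secrets
from typing import Dict, Iterable, Iterator, Set


def mask_value(value: str, key: bytes, length: int = 32) -> str:
    """Replace a value with a keyed hash (hypothetical helper).

    The same input and key always give the same output, which keeps
    masked data consistent across rows and tables.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


def mask_rows(
    rows: Iterable[Dict[str, str]],
    masked_columns: Set[str],
    key: bytes,
) -> Iterator[Dict[str, str]]:
    """Yield rows one by one, masking only the configured columns."""
    for row in rows:
        yield {
            col: mask_value(val, key) if col in masked_columns else val
            for col, val in row.items()
        }


# Random key for this run (assumption: it is discarded afterwards).
key = secrets.token_bytes(32)

# Made-up sample data; a real run would stream rows from the database.
rows = [
    {"id": "1", "name": "Alice Jansen", "city": "Utrecht"},
    {"id": "2", "name": "Bob de Vries", "city": "Leiden"},
    {"id": "3", "name": "Alice Jansen", "city": "Utrecht"},
]

# Configuration: only "name" is masked, "id" and "city" stay as they are.
masked = list(mask_rows(rows, {"name"}, key))
for row in masked:
    print(row)
```

In this sketch, rows 1 and 3 end up with the same masked name, so relationships between records survive and the test environment stays usable. Discarding the key after the run is what prevents the masked values from being recomputed from guessed inputs.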