Data mask: testing with anonymised production data

Testing with masked data

To test an application, you want data that matches reality as closely as possible. But a lot of software processes personal data, and you would rather not expose that data while testing (new features of) an application. This is why we developed the Enshore Data Mask: testing with masked data gives the most realistic results, without the risk of leaking sensitive data.

Applicable to any database

Because the tool can mask a table with a billion records, it can handle 99.9% of all tables in terms of size. The data masking tool is therefore applicable to (almost) any database. In the billion-record project we chose to fully anonymise the data, but the tool can also pseudonymise data. In both cases the test data remains usable; with pseudonyms, however, the test data can still be traced back to the original data. We use this function in Logspect, an application that allows organisations to better ensure the privacy of their customer and patient data.

Masking a billion records

The Enshore Datamasker was developed for an organisation that works with large datasets on a daily basis. The challenge was to mask a dataset with a billion records. Only the columns containing personal data needed to be masked. Since the data did not need to be traced back to the original data, we chose to anonymise it completely.
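As an illustration of masking only the personal-data columns of a very large table, the sketch below streams rows in fixed-size batches so that memory use stays bounded. It is a minimal example under assumptions, not the Enshore implementation: the function name `mask_rows`, the `maskers` mapping and the batch size are all hypothetical.

```python
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, str]


def mask_rows(
    rows: Iterable[Row],
    maskers: Dict[str, Callable[[str], str]],  # hypothetical: column name -> masking function
    batch_size: int = 10_000,                  # assumed batch size, tune per database
) -> Iterator[List[Row]]:
    """Yield batches of rows in which only the columns listed in `maskers` are masked."""
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield [
            {col: (maskers[col](val) if col in maskers else val) for col, val in row.items()}
            for row in batch
        ]


# Usage: only the "name" column is masked; "id" passes through unchanged.
sample = [{"id": "1", "name": "Alice"}, {"id": "2", "name": "Bob"}]
for masked_batch in mask_rows(sample, {"name": lambda v: "X" * len(v)}, batch_size=1):
    print(masked_batch)
```

Processing in batches means the full table never has to be held in memory at once, which matters at the scale of a billion records.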

Anonymise

Once personal data has been anonymised, the masked data cannot be restored to the original data. This is necessary in situations where testers and developers do not have permission to view personal data. The GDPR (known in Dutch as the AVG) states that in these situations, organisations are only compliant if they use anonymised testing environments.
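One possible way to make masking irreversible is to replace each value with random characters and to store no mapping at all. The sketch below assumes this approach; the function name `anonymise` is hypothetical and this is not necessarily how the Enshore tool works. It keeps the length and the letter/digit pattern of the value so that the test data still looks realistic.

```python
import secrets
import string


def anonymise(value: str) -> str:
    """Replace every letter and digit with a random one of the same kind.

    No lookup table is kept, so the original value cannot be recovered
    from the result.
    """
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(secrets.choice(string.digits))
        elif ch.isalpha():
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(secrets.choice(pool))
        else:
            out.append(ch)  # keep separators such as spaces and hyphens
    return "".join(out)


print(anonymise("Jan-12 Smith"))  # random output with the same shape as the input
```

Note that preserving the shape of a value still reveals its length and format; whether that is acceptable depends on the data in question.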

Pseudonymisation

Sometimes it is desirable to be able to trace data back to its origin after it has been masked. Pseudonymisation makes this possible by replacing personal data with pseudonyms, which can later be converted back to the actual personal data.
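The difference with anonymisation can be sketched with a lookup table that maps each original value to a random pseudonym and back. This is a minimal illustration under assumptions: the class `Pseudonymiser`, its method names and the `P-` prefix are hypothetical, and a real system would have to store the mapping securely and separately from the test data.

```python
import secrets
from typing import Dict


class Pseudonymiser:
    """Replace values with random pseudonyms and keep a mapping to reverse them."""

    def __init__(self) -> None:
        self._forward: Dict[str, str] = {}  # original value -> pseudonym
        self._reverse: Dict[str, str] = {}  # pseudonym -> original value

    def pseudonymise(self, value: str) -> str:
        if value not in self._forward:
            token = "P-" + secrets.token_hex(8)
            while token in self._reverse:  # regenerate in the unlikely case of a collision
                token = "P-" + secrets.token_hex(8)
            self._forward[value] = token
            self._reverse[token] = value
        return self._forward[value]

    def restore(self, token: str) -> str:
        """Return the original value for a pseudonym (raises KeyError if unknown)."""
        return self._reverse[token]


p = Pseudonymiser()
alias = p.pseudonymise("Alice")
print(alias, "->", p.restore(alias))
```

Because the same input always receives the same pseudonym, relations between records stay intact in the test data, and anyone holding the mapping can trace a pseudonym back to the original value.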