As almost everyone rushes to update their privacy policies and reconfirm your consent, the “other” copies of your data have begun to creep into the discussion and pique the interest of those looking to test the limits of GDPR and companies’ readiness.
Wait, what other copies?! You know, all those secondary copies of data that you keep:
- The copies sitting in your array/cloud snapshot(s)
- Those copies that you replicated to “site B” for disaster recovery
- The sandbox copies you gave to the developers for them to play with and test against
- That copy you put on your laptop because you were going to work on it outside of the office
- Those copies you made of the data for your regular hourly/daily/weekly/yearly backup
- The copies you have sitting on tapes in a vault because you haven’t defined anything more than “keep it all forever” in your retention policy
You’ve never bothered tracking all of these backup and redundant copies before, because the risk was low and storage was cheap, but times have changed. Or have they?
The GDPR police
What happens when the General Data Protection Regulation (GDPR) police come knocking on your door? Well, there aren’t any actual GDPR police, but there are regulators with audit powers, and with the ability to fine you tucked firmly in their back pockets. So, when someone asks you to forget them, the real challenge begins:
- Do you know where that person’s data is? It could be sitting not just in that one “primary” location happily flipping bits in the data center, but also in any and every one of those copies.
- What now? Can you “see” all these copies? Can you find the specific data in the copies? Can you delete those specific items without compromising your overall recovery capabilities or other compliance/regulatory mandates?
- What happens if you DO manage to delete the data but the forgotten data gets “remembered” by a restore or other recovery event?
The Information Commissioner’s Office in the UK (a member of the European Data Protection Board, EDPB) has indicated that if data can be shown to be beyond normal use, as it is in a backup, then organizations should consider removing it from backups to be a disproportionate response to an erasure request. Of course, the organization must have a documented process, with safeguards, to ensure that the erasure is carried out and that the data in question is not recovered for active processing again.
This does, however, lead to other considerations, including:
- What if you have to “re-forget” personal data after a restore? Doing so could lengthen your recovery SLAs
- What if your backup retention time plus the time it takes to forget is less than the time you need to operate the whole “forget” process?
- What if automating the expiration and clean-up of dev and test data isn’t enough?
- How do you manage unstructured data and laptops, which probably account for 70-80 percent of business data and will still contain large amounts of personal data/PII?
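The “re-forget” concern at the top of that list can be made concrete with a small sketch. It assumes you keep a ledger of the subjects you have already erased and replay it against anything a restore brings back before that data returns to active use. The `ErasureLedger` class, the `subject_id` field and the record layout are all hypothetical, for illustration only; they are not taken from any particular backup product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ErasureLedger:
    """Hypothetical ledger of data subjects who have asked to be forgotten."""

    # Maps subject_id -> time the erasure request was recorded.
    forgotten: dict = field(default_factory=dict)

    def record_request(self, subject_id, when=None):
        """Log an erasure request so it can be replayed after any restore."""
        self.forgotten[subject_id] = when or datetime.now(timezone.utc)

    def scrub(self, restored_records):
        """Split restored rows into those safe to reload and those to drop."""
        kept, dropped = [], []
        for row in restored_records:
            if row["subject_id"] in self.forgotten:
                dropped.append(row)  # already forgotten: must not come back
            else:
                kept.append(row)
        return kept, dropped


# Example: a restore brings back a customer who was erased after the backup ran.
ledger = ErasureLedger()
ledger.record_request("cust-102")

restored = [
    {"subject_id": "cust-101", "email": "a@example.com"},
    {"subject_id": "cust-102", "email": "b@example.com"},
]
kept, dropped = ledger.scrub(restored)
```

Running the scrub step is extra work on every restore, which is exactly why it can eat into a recovery SLA. The ledger also has to hold some identifier for each forgotten person, so what it stores, and for how long, is a decision to document in its own right.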
Backup software that integrates with service desk software will help you to manage and record many of these actions, to ensure that those “other” copies stay within compliance. But a few additional first steps to help with this include:
- Processing forget requests quickly
- Not just automating but also anonymising dev and test data
- Expanding data management outside of just applications to other sources of unstructured data
- Shortening backup retention, combined with content-driven archiving and content indexing
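As a rough illustration of the dev and test step in that list, the sketch below replaces direct identifiers with keyed hashes before a copy is handed to developers. The column names and the `mask_record` helper are hypothetical. Note too that keyed hashing is strictly pseudonymisation rather than anonymisation, so whether it goes far enough for a given data set is a question for your legal team, not something this code settles.

```python
import hashlib
import hmac
import secrets

# Hypothetical column names; adjust to whatever your schema actually holds.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}


def mask_record(record, key):
    """Return a copy of the record with direct identifiers replaced by tokens."""
    masked = {}
    for column, value in record.items():
        if column in DIRECT_IDENTIFIERS:
            digest = hmac.new(key, str(value).encode("utf-8"), hashlib.sha256)
            # The same input and key always yield the same token,
            # so joins between test tables keep working.
            masked[column] = digest.hexdigest()[:16]
        else:
            masked[column] = value
    return masked


# Generate a key for this export and keep it out of the dev/test environment.
key = secrets.token_bytes(32)
row = {"name": "Jane Doe", "email": "jane@example.com", "plan": "premium"}
masked = mask_record(row, key)
```

Because the tokens are deterministic for a given key, test data stays internally consistent while the original values never leave production.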
GDPR gives backup even more purpose
There is software that can actually delete backup data if required, even single files on tape. Combining the management and storage of backups and archives with content-based search also dramatically strengthens your position, both for forgetting people and for data transfers.
However, GDPR means that decisions on deleting from backups, on backup retention and on how you manage data recovery scenarios require very careful consideration and regular review. If your backup software allows you to find the data in question and can perform a deletion from backup copies, this also has to be balanced against a legal decision on whether that data should even be deleted at all.
This is because criminal investigations and many other laws and regulations take precedence over GDPR, so backups could actually provide backup for a new reason.
Ultimately, managing your primary data better will reduce your backup and retention-related risks, and only by profiling what you have, and setting policies accordingly, can you improve your position.
Nigel Tozer is solutions marketing director for EMEA at Commvault