The Biden administration’s push for stronger cloud security, more states enacting privacy laws, new compliance mandates: these are all reminders of ongoing efforts to harness, secure and leverage the data being generated.

And there is a lot of it: By 2025, according to one set of estimates, 463 exabytes of data will be created every day. To get a sense of scale, remember that the ‘cloud’ housed 4.4 zettabytes (ZB) of data in 2019. By 2020, it was 44 ZB; by 2025, it will hold 200 ZB.

However, volume can’t be the determining factor in setting data governance policies. The best way to use good data effectively, and to secure sensitive data, is to dispose of the rest through disciplined data minimization. In sum, less is more.

This might sound like heresy, but it isn’t. Data is indeed vital, but that’s only true of some data, often the most sensitive information hackers crave.

For example, the United Nations’ Archives and Records Management Section reports that only 5 percent to 10 percent of the UN’s official records have permanent value; the rest is superfluous, irrelevant, and even dangerous to possess. Data minimization is the discipline of limiting collection and storage to only the data required for operations, regulations and legal proceedings. Just about everything else gets deleted, deliberately and strategically.

Of course, minimization has its own complications. Customer data, for example, carries byzantine challenges, but there are still strong reasons to minimize it. Data privacy mandates are cropping up all over, and most encourage minimization; New York’s SHIELD Act cites measures such as “disposal of data.” All of this makes minimization both a best practice and good business.

So why is this still such a challenge? One reason: Most organizations rely on employees to manage their own data, and even the most diligent professional might not have the skills, time, or inclination to regularly check, store and delete it.

Consider your own corporate mailbox: How many unnecessary messages are there, how far back do they go, and how many carry an attachment (and maybe the same attachment multiple times)? Now add other channels into the mix—different email types, Zoom recordings, Microsoft Teams conferences, etc. The admin keeps warning of size limits, but what’s a gig or two between servers?

In sum, much of the data within a company’s digital boundaries is undiluted garbage, and risky garbage at that. For example, even when data is on an employee’s personal device, it’s subject to restrictions and/or relevant to legal proceedings. Data Subject Access Requests (DSARs) from consumers have tight timeframes and harsh penalties for non-compliance. And of course, a data breach can be devastating when it extracts information the organization didn’t know it had. And that’s just a sampling.

From our own experience and research, we fully understand that the road to minimization is not always a straight line. However, there are strategies and technologies now available to identify and store the right data, and dispose of the rest.

The first steps on the road to effective minimization are critical:

  • Establish that all corporate data must be actively managed, even when employees control it 
  • Build an alliance with records management and compliance teams, who will be instrumental in permitting and perhaps scheduling regular data disposal
  • Before deleting anything, create a comprehensive guide to all applicable records management and data privacy regulations, policies, and guidelines; PII mandates alone are hugely complex
  • Enforce all policies around data sync and data capture (remember, employees routinely keep relevant data on a variety of devices)
  • Automate everything: indexing, capture, classification, enrichment, categorization, retention/disposition processes, approval workflows, and user access management are all cumbersome otherwise.
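
To make the automation step concrete, here is a minimal sketch of a retention/disposition check in Python. It is an illustration only, not a description of any particular product: the category names, the retention periods, and the Record and disposition names are all hypothetical, and real schedules would have to come from the records management and compliance teams mentioned above.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Hypothetical retention schedule: category -> days to keep.
# These values are placeholders, not legal or regulatory guidance.
RETENTION_DAYS = {
    "contract": 7 * 365,
    "customer_pii": 2 * 365,
    "meeting_recording": 90,
}


@dataclass
class Record:
    record_id: str
    category: Optional[str]  # None means the record was never classified
    created: date
    legal_hold: bool = False


def disposition(record: Record, today: date) -> str:
    """Return 'retain', 'dispose', or 'review' for a single record."""
    if record.legal_hold:
        # Data tied to legal proceedings is kept regardless of age.
        return "retain"
    if record.category not in RETENTION_DAYS:
        # Unclassified or unknown data is routed to a person, not deleted.
        return "review"
    expires = record.created + timedelta(days=RETENTION_DAYS[record.category])
    return "dispose" if today >= expires else "retain"
```

In this sketch, deletion is never the default for data the system does not recognize: anything without a known category is flagged for review, and a legal hold overrides the schedule. An approval workflow could then act on the "dispose" results rather than deleting them immediately.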

Again, in the digital era, it can seem counter-intuitive to focus on data minimization. But that’s the point—when the data is defined more by quality than quantity, and stored on purpose rather than by default, it delivers real value. The rest needs to go.