Businesses today have a huge opportunity to use data in new ways, but they must also consider what data they keep and how they use it to avoid potential legal issues. Even with the rise of generative AI, organizations have a responsibility not only to protect data, especially personal data, but also to strategically manage and delete obsolete information that poses greater risk than business value.
Forrester predicts a doubling of unstructured data in 2024, driven in part by AI. However, as the data landscape evolves and the costs of breaches and privacy violations rise, organizations need to think critically about how to create effective and robust data retention and deletion strategies.
Data explosion and increasing breach costs
As the volume of data grows, so do the costs of data breaches and privacy violations. Ransomware criminals have compromised highly sensitive medical and government databases, with hacks of Australian courts, a Kentucky healthcare company, 23andMe, and large corporations such as Infosys, Boeing, and security provider Okta. The costs of these breaches are also rising. IBM found that the average total cost of a breach was $4.45 million in 2023, an increase of 15% compared to 2020.
To manage data effectively, organizations must create policies for deleting old data. With generative AI, executives may ask whether anything should be deleted at all, given future opportunities. But the longer a company stores data, the more opportunities there are for data breaches and for fines for violating privacy laws. The first step to minimizing this risk is to take a comprehensive look at how your company uses data, as well as the nuanced considerations and tangible benefits of your data retention strategy.
Why delete old data?
Organizations are often forced to delete outdated data due to core legal requirements in data protection law. Regulations require that personal data be retained only for as long as necessary, leading companies to develop retention policies with different lengths for different business areas. Deleting old data not only reduces legal liability, but also reduces storage costs.
Identifying old data
The best way to identify which data is obsolete and which adds ongoing business value is to start with a data map. This provides an overview of the source and type of incoming data, the fields it contains, and the system or server where the data will be stored. A comprehensive data map allows businesses to understand where personal data resides, what types of personal data are processed, what types of protected or special category data are processed, the intended purposes of the processing, the geographical location of the processing, and the applicable systems.
Meaningful data inventory and classification is the foundation of a strong privacy program and helps provide the data lineage needed to understand how data flows through a company's systems.
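To make the idea of a data map concrete, here is a minimal Python sketch of what one entry might record and how the map could be queried. The class name, field names, and sample systems are all hypothetical illustrations, not a standard schema or a description of any particular product.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataMapEntry:
    """One row of a hypothetical data map (all field names are assumptions)."""
    source: str                     # where the data comes from
    system: str                     # system or server where it is stored
    fields: List[str]               # fields the dataset contains
    personal_data_types: List[str]  # empty if no personal data is processed
    special_category: bool          # protected or special category data present
    purpose: str                    # intended purpose of the processing
    region: str                     # geographical location of the processing


def systems_with_personal_data(data_map):
    """Return the sorted names of systems that hold any personal data."""
    return sorted({e.system for e in data_map if e.personal_data_types})


# Illustrative entries only.
data_map = [
    DataMapEntry("web signup form", "crm_db", ["name", "email"],
                 ["contact details"], False, "account management", "EU"),
    DataMapEntry("clinic intake", "records_db", ["name", "diagnosis"],
                 ["contact details", "health data"], True, "patient care", "US"),
    DataMapEntry("server metrics", "metrics_store", ["cpu_load"],
                 [], False, "capacity planning", "US"),
]

print(systems_with_personal_data(data_map))
```

Even a simple structure like this lets legal and technical teams answer the same questions from one source: which systems hold personal data, which hold special category data, and where that processing takes place.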
Once a company maps its corpus of data, legal and technical teams can work with business stakeholders to understand how valuable certain data is, what regulatory restrictions apply to storing that data, and the potential impact if that data is compromised or kept longer than necessary.
Most business people are naturally reluctant to remove anything, especially when technology is changing rapidly. Conversations about deletion and retention should focus on what's most useful to your business. As an example, imagine a data analytics team at a financial institution that wants to train a loan eligibility model on as much data as possible. Unfortunately, that approach runs counter to the intent of data protection and privacy law.
The reality is that data from 20 years ago may not accurately assess today's consumers, given the significant changes in interest rates, lending practices, and consumers' individual circumstances. The company may be better off focusing on other sources of recent data, such as updated credit reports, to determine an accurate risk score.
The current commercial real estate market really highlights this challenge. Many risk prediction models were trained on pre-pandemic data, before the systematic shift to online shopping and remote work. To reduce the chance of inaccurate predictions, talk to business stakeholders about how data ages and loses value over time, and which data best reflects today's world.
Processing of old data: identification, deletion or anonymization
Determining how long to retain data starts with legal obligations to maintain financial records and sector-specific regulations on transactions involving personal data. Check legal statutes of limitations to determine how long to keep data in case you need to defend against potential lawsuits. Keep only the data necessary for that defense, such as transaction logs or evidence of user consent, rather than all personal data about individual users.
When it's time to erase low-value information, you can delete data manually based on the retention period that the retention schedule defines for each data type. Automating the process with purge policies improves reliability. It is also possible to use an anonymization process to remove identifiable personal data or to use completely anonymized data, but this adds new challenges.
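As a rough illustration of an automated purge policy, the Python sketch below checks each record against a retention schedule keyed by data type. The data types, retention periods, and record layout are hypothetical placeholders, not legal guidance; real periods come from your own legal and regulatory review.

```python
from datetime import date, timedelta

# Hypothetical retention schedule: data type -> retention period in days.
RETENTION_DAYS = {
    "transaction_log": 7 * 365,
    "marketing_lead": 2 * 365,
}


def is_expired(data_type, created_on, today):
    """True when a record has outlived the retention period for its data type."""
    days = RETENTION_DAYS.get(data_type)
    if days is None:
        # Types missing from the schedule are kept for review, never deleted.
        return False
    return today - created_on > timedelta(days=days)


def purge(records, today):
    """Split records into (kept, purged) according to the retention schedule."""
    kept, purged = [], []
    for record in records:
        expired = is_expired(record["type"], record["created_on"], today)
        (purged if expired else kept).append(record)
    return kept, purged
```

Treating unknown data types as "keep and review" is a deliberate choice in this sketch: it reflects the point that deleted data cannot be recovered, so a gap in the schedule should never trigger a deletion.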
Truly anonymized data usually falls under an exception in data protection law, but doing this correctly requires removing so much value that little usable data may remain. Anonymization requires removing not only unique direct identifiers such as SSN and name, but also indirect identifiers such as a customer's IP address. For example, to meet the HIPAA Safe Harbor standard, your organization must remove a list of 18 identifiers. Organizations may want to try this approach to maintain the performance of their analytics or AI models. However, it is important to discuss the benefits and drawbacks with your stakeholders first.
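The trade-off is easy to see in a small Python sketch that strips both direct and indirect identifiers from a record. The field names and identifier sets here are hypothetical and deliberately short; they are not the HIPAA Safe Harbor list, and removing fields this way does not by itself guarantee that data is anonymous.

```python
# Hypothetical identifier sets for illustration only.
DIRECT_IDENTIFIERS = {"name", "ssn", "email"}
INDIRECT_IDENTIFIERS = {"ip_address", "zip_code", "birth_date"}


def strip_identifiers(record):
    """Return a copy of the record without direct or indirect identifiers."""
    blocked = DIRECT_IDENTIFIERS | INDIRECT_IDENTIFIERS
    return {key: value for key, value in record.items() if key not in blocked}


customer = {
    "name": "Example Person",
    "ssn": "000-00-0000",
    "ip_address": "192.0.2.10",
    "zip_code": "00000",
    "birth_date": "1980-01-01",
    "loan_amount": 25000,
    "approved": True,
}

print(strip_identifiers(customer))
```

Of the seven fields in the sample record, only two survive. That loss of detail is the cost stakeholders need to weigh against the legal benefits of anonymization.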
Avoiding common pitfalls
The biggest mistake companies make when dealing with old data is rushing the process and skipping detailed conversations. Project owners must resist the urge to rush work and recognize that appropriate feedback from multiple groups is essential. Companies should work with legal, privacy, and security teams, as well as business leaders, to get feedback on the data they need to retain. You should also avoid retention policies and schedules that inadvertently delete data that your company needs. It's easy to shorten retention periods and reduce the amount of personal data you keep over time, but once it's gone, it's gone, so measure twice and cut once.
As outlined above, dealing with stale data requires several considerations, including basic data mapping and lineage, defining retention criteria, and deciding how to implement these policies efficiently. Solving the complex issue of data deletion requires an informed and strategic approach. By understanding the legal, cybersecurity, and financial implications, organizations can develop robust data retention strategies that not only comply with regulations but also effectively protect their digital assets.
Seth Batey is data protection officer and senior managing privacy counsel at Fivetran.