Using AutoClassification to Identify & Shield Records from Destruction
A large, global oil & gas corporation was already familiar with Valora’s PowerHouse™ AutoClassification technology, having deployed it across their organization for numerous use cases in Legal, Records and Information Governance. Their newest situation involved hundreds of TBs of data stored in cloud-based Microsoft Office 365 Online Locations, such as OneDrive (data repository and storage), SharePoint (collaboration and storage) and Exchange (email). Their issue? A mandate from the CIO’s office instituting a company-wide Non-Record Disposal activity in which all content that was a) not explicitly classified as a Record or under Legal Hold and b) not utilized in the last 3 years would be automatically and permanently deleted as of a certain date. Naturally, this caused considerable panic amongst a Legal team used to storage without classification. Never before had such a mandate been given, and the resulting chaos was acute!
For a large organization with numerous offices all over the world, the greatest challenge was the impending “D-Day,” only 90 days away! On that day the automatic sweep would “blindly” wipe out years of collected files, rightly or wrongly, based solely on whether each file was under Legal Hold or officially declared a Record. Given the volume of data, and the fact that everyone in the Legal department already had a day job, they needed a robust, reliable solution that would do this critical work for them, accurately and quickly.
Valora put together a comprehensive solution built on the client’s already-installed PowerHouse AutoClassification system, letting the robust chassis support multiple configurations simultaneously. The new Records analysis solution was custom-configured to identify likely Records based on key attributes and then map them to the Global Retention Schedule. For all other files, PowerHouse determined Document Type and a host of rich metadata tags, notably the last access or usage date and whether the file was in danger of being erased.
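The disposal rule from the CIO mandate, and the "in danger of being erased" flag derived from it, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not PowerHouse's actual logic: the `FileInfo` fields, the `is_at_risk` function, the sample paths and the 3-year window approximated as 3 × 365 days are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumption: "not utilized in the last 3 years" approximated as 3 * 365 days.
UNUSED_WINDOW = timedelta(days=3 * 365)


@dataclass
class FileInfo:
    """Hypothetical per-file metadata; field names are illustrative only."""
    path: str
    is_record: bool        # explicitly classified as a Record
    on_legal_hold: bool    # under Legal Hold
    last_used: date        # last access or usage date


def is_at_risk(f: FileInfo, sweep_date: date) -> bool:
    """True if the sweep would delete the file: unprotected AND unused for 3 years."""
    protected = f.is_record or f.on_legal_hold
    stale = f.last_used < sweep_date - UNUSED_WINDOW
    return (not protected) and stale


# Usage: three made-up files evaluated against a made-up sweep date.
sweep = date(2020, 1, 1)
files = [
    FileInfo("contracts/lease.pdf", True, False, date(2015, 6, 1)),   # Record: safe
    FileInfo("drafts/old_memo.docx", False, False, date(2015, 6, 1)), # stale, unprotected
    FileInfo("drafts/new_memo.docx", False, False, date(2019, 3, 1)), # recently used
]
at_risk_paths = [f.path for f in files if is_at_risk(f, sweep)]
print(at_risk_paths)
```

The point of the sketch is that a file which should have been declared a Record, but never was, falls into the at-risk set purely because of its missing classification, which is why classifying content before the sweep date mattered.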
Each custodian was provided with a custom report listing every file in their possession: its status as a Record (or not), the Retention Schedule and lifecycle management it was subject to, whether the file was in danger of being deleted, and where and how to manage the content going forward. See sample anonymized report, below.
The Client’s Information Governance team received daily reports, along with live dashboard access via Valora’s BlackCat Data Visualization platform, covering a host of important metrics about the content, its classification status and trends. This kept the project team fully informed at all times.
The system processed more than 35,000 files per hour, churning out several custodian reports each day, and shielding well over 4 million files from imminent destruction (85%). The team was shocked to see how much of their content was actually considered Records, per the requirements of their own Records Disposition policies, and how much of that content was seriously at risk of being destroyed within 90 days!
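To put the throughput figure in perspective, the short calculation below estimates how long the reported rate would take to work through 4 million files. It is a rough sketch that assumes continuous, around-the-clock processing at exactly 35,000 files per hour; the function name `processing_hours` is hypothetical, and since 4 million counts only the shielded files, the result is a lower bound on the total processing time.

```python
FILES_PER_HOUR = 35_000  # throughput reported in the case study


def processing_hours(file_count: int, files_per_hour: int = FILES_PER_HOUR) -> float:
    """Hours needed to process file_count files at a constant rate."""
    return file_count / files_per_hour


hours = processing_hours(4_000_000)
days = hours / 24  # assumes uninterrupted, 24-hour operation
print(f"{hours:.1f} hours, or about {days:.1f} days of continuous processing")
```

Under these assumptions the work takes roughly 114 hours, a little under five days, which fits comfortably inside the 90-day window.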
As a natural by-product of its analysis, Valora also identified significant sources of ROT (redundant, obsolete or trivial content), as well as potentially sensitive content containing Personal Data and PII. This was a major boon to the group in understanding the kind of content they create, hold and need to manage going forward.