Organizing Multiple Divestitures with AutoClassification

Case Study


International Pharmaceutical Company sorts through massive divestiture content for relevance, importance and sensitivity


Valora was engaged by a large, multi-national pharmaceutical company seeking a sophisticated file analysis solution to help sort through thousands of boxes of books and records, as well as volumes of scanned and electronic data, in preparation for several critical strategic divestitures. Having grown mainly through mergers and acquisitions, the client had multiple content storage locations across North America, South America, Europe, and Asia. Each held thousands of paper records that had to be sorted and analyzed for classification, retention and sensitivity, so that their contents could ultimately be allocated appropriately among the post-divestiture parties.

Solutions Applied:


Document Analytics

OCR & Text Extraction









Because the project involved multiple divestitures, the client was focused on determining exactly what the documents were, how relevant they were to specific topics, whether they contained sensitive information such as trade secrets, and how they should be treated before being produced to the to-be-divested parties.


Further complicating the effort, the client is analyzing content at two levels at once: box-level data and file-level data. First, boxes are analyzed to see whether they potentially contain relevant content. Boxes found to be relevant are then broken down into unit documents (unitization) for a more thorough analysis of applicability, sensitivity/threat and overall classification needs.
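The two-level approach can be illustrated with a minimal Python sketch. It is not Valora's implementation: the `Box` structure, the keyword list, and the separator-page unitization rule are all hypothetical stand-ins chosen only to show box-level filtering followed by unitization of the relevant boxes.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical topic keywords for one divestiture (assumption, not from the case study).
RELEVANT_TERMS = {"oncology", "clinical trial", "license agreement"}


@dataclass
class Box:
    box_id: str
    index_description: str                           # inventory text recorded on site
    pages: List[str] = field(default_factory=list)   # page text after scanning and OCR


def box_is_relevant(box: Box) -> bool:
    """Level 1: decide from the box index alone whether the box may hold relevant content."""
    text = box.index_description.lower()
    return any(term in text for term in RELEVANT_TERMS)


def unitize(box: Box, separator: str = "---") -> List[List[str]]:
    """Level 2: split a box's pages into unit documents at separator pages.

    Using a separator page is an illustrative simplification of real unitization.
    """
    docs: List[List[str]] = []
    current: List[str] = []
    for page in box.pages:
        if page.strip() == separator:
            if current:
                docs.append(current)
            current = []
        else:
            current.append(page)
    if current:
        docs.append(current)
    return docs


def triage(boxes: List[Box]) -> Dict[str, List[List[str]]]:
    """Filter boxes by index, then unitize only the boxes that passed the filter."""
    return {b.box_id: unitize(b) for b in boxes if box_is_relevant(b)}
```

The point of the design is cost control: only boxes that pass the inexpensive index-level check incur scanning, OCR and document-level analysis.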


It was clear that this large, multi-step effort would have to support multiple uses within these immediate divestitures and well beyond. First, the team on the ground in each storage location took inventory of the thousands of boxes and recorded what type of content was present in each box. The resulting box indices were then analyzed by Valora’s PowerHouse AutoClassification software to determine content relevance and to select boxes for further processing (scanning, OCR, unitization, classification, and redactions or other special treatments). Unit files were then tagged by PowerHouse’s automated Rules Engine to indicate which content should be produced, kept, or both, post-divestiture. A simultaneous analysis of file-level contents was also used to tag content with basic metadata fields, as well as with retention, sensitivity and relevance to various production and divestiture requests.
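As a rough illustration of rule-based tagging of unit files, the sketch below assigns tags and derives a produce/keep/both disposition. It does not reflect the actual PowerHouse Rules Engine: the `Document` fields, the rule criteria, and the `disposition` helper are assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass
class Document:
    doc_id: str
    text: str
    business_unit: str   # hypothetical metadata field: "divested" or "retained"


@dataclass
class Rule:
    tag: str
    applies: Callable[[Document], bool]


# Illustrative rules only; real criteria would be defined by the project team.
RULES: List[Rule] = [
    Rule("produce", lambda d: d.business_unit == "divested"),
    Rule("keep", lambda d: d.business_unit == "retained" or "tax" in d.text.lower()),
    Rule("sensitive", lambda d: "trade secret" in d.text.lower()),
]


def tag_document(doc: Document, rules: List[Rule] = RULES) -> Set[str]:
    """Return every tag whose rule matches the document."""
    return {rule.tag for rule in rules if rule.applies(doc)}


def disposition(tags: Set[str]) -> str:
    """Collapse the produce/keep tags into one post-divestiture disposition."""
    if {"produce", "keep"} <= tags:
        return "both"
    if "produce" in tags:
        return "produce"
    if "keep" in tags:
        return "keep"
    return "unassigned"
```

Keeping each rule as an independent predicate lets one document carry several tags at once, which is how a file can end up both produced and kept while also being flagged as sensitive.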

Ultimately, all relevant content was uploaded to PowerHouse for full metadata tagging and analysis, as well as ongoing use as a System of Record, with all data viewable in the Valora BlackCat data visualization platform.


Having seen immediate success on the first two of many planned divestitures, the Client has chosen to extend this process to all divestitures going forward. Furthermore, the Client is now piloting the process for acquisitions, where new content arrives at the organization and requires similar analysis, storage, management and governance.

Serving as the System of Record, Valora’s PowerHouse and BlackCat will continue to manage ongoing litigation data requests & productions, acquisitions & divestitures, and general information response/retrieval requests from across the company worldwide.