Maintaining Client Discretion and Data Accuracy

Case Study


Tackling document content so explosive that the parties involved have been in litigation for over two decades


To complete an internal investigation and share the results with the world, Valora’s prominent, public-facing Client required detailed, accurate file analysis, combined with the utmost discretion, for a high-profile, sensitive matter dating from the early 1960s. With nearly a quarter of a million pages amassed, it quickly became clear that the Client needed highly custom fields, such as victim names and allegations, as well as who in the hierarchy was aware of incidents, and which disciplinary actions were taken and when. The documents ran from testimony and court filings to ancient decrees in Latin, so the content varied considerably in purpose, scope and tone.

Solutions Applied:


Document Analytics

Electronic File Processing

OCR & Text Extraction

Analytics & Data Mining






This Client needed, above all else, a discreet and mature approach so that it would never have to undertake this effort again. Having battled these sensitive issues for many years, the Client organization was finally prepared to “come clean” and share its internal investigatory findings with its very public constituency. Ultimately, for this particular Client and the associated matters, it was essential to find a solution and a team that it could trust.


Because of the severity of the allegations, the Client was understandably concerned with maintaining exceedingly high accuracy in the extracted field data. Valora ensured this precision by first benchmarking a stratified sample of documents to determine which values automation alone could extract reliably and which would benefit from further manual review. Valora then configured PowerHouse to pull as many of the specific values as possible and deployed its internal Quality Control team to manually review documents flagged for low accuracy and/or high sensitivity. The combination of automatic and manual coding ensured the highest levels of accuracy and discretion in a short amount of time.
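
For readers who want a concrete picture of that workflow, the Python sketch below illustrates the general idea of benchmarking a stratified sample and then routing documents to manual review. It is not Valora's or PowerHouse's implementation, which this case study does not describe. The field names (`doc_type`, `auto_value`, `reviewed_value`, `sensitive`), the function names, the 50% sampling fraction and the 0.95 accuracy threshold are all assumptions made purely for illustration.

```python
import math
import random
from collections import defaultdict


def stratified_sample(docs, fraction, seed=0):
    """Draw the same fraction of documents from every document type."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for doc in docs:
        strata[doc["doc_type"]].append(doc)
    sample = []
    for group in strata.values():
        # At least one document per type, never more than the type contains.
        k = min(len(group), max(1, math.ceil(len(group) * fraction)))
        sample.extend(rng.sample(group, k))
    return sample


def accuracy_by_type(sample):
    """Share of sampled documents whose automated value matches the hand-coded one."""
    hits, totals = defaultdict(int), defaultdict(int)
    for doc in sample:
        totals[doc["doc_type"]] += 1
        hits[doc["doc_type"]] += doc["auto_value"] == doc["reviewed_value"]
    return {doc_type: hits[doc_type] / totals[doc_type] for doc_type in totals}


def needs_manual_review(doc, type_accuracy, min_accuracy=0.95):
    """Flag a document if it is sensitive or its type benchmarked below the threshold."""
    low_accuracy = type_accuracy.get(doc["doc_type"], 0.0) < min_accuracy
    return doc["sensitive"] or low_accuracy


# Hypothetical documents: automation handles court filings well, Latin decrees poorly.
docs = [
    {"doc_type": "court filing", "auto_value": "A", "reviewed_value": "A", "sensitive": False}
    for _ in range(10)
] + [
    {"doc_type": "Latin decree", "auto_value": "A", "reviewed_value": "B", "sensitive": False}
    for _ in range(4)
]

benchmark = accuracy_by_type(stratified_sample(docs, fraction=0.5))
review_queue = [doc for doc in docs if needs_manual_review(doc, benchmark)]
print(benchmark, len(review_queue))
```

Sampling within each document type, rather than across the whole collection, keeps rare but difficult material such as the Latin decrees from being drowned out by the more common document types in the benchmark.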


The unsavory alternative would have meant thousands of hours of outside counsel’s paralegal time to organize and sort hundreds of thousands of files spanning several decades and administrative regimes. The cost would have been completely prohibitive for this non-profit organization, and the work would have taken more than three person-years of effort.