FAQ

Find answers to common questions about Valora, our technology platform, and our approach. For more details or assistance, feel free to contact us.

AutoClassification 101

What is AutoClassification?

AutoClassification is a suite of software that automates the analysis, classification and decisioning of digital content or files – thus AutoClassification.

AutoClassification software uses both pattern-matching algorithms and machine learning to detect file contents and attributes, and assign contextual attributes (rich metadata) and disposition (rules) for each document or file.

AutoClassification answers:  What is this piece of content? Should we even have it? And, how should it be managed throughout its lifecycle?

What is metadata?

Metadata is data about data. An almost infinite variety of enriched metadata tags can be produced with AutoClassification. The quick rule of thumb is, “if you can describe the tag, AutoClassification can produce it.” At Valora we’ve seen everything from the simplest of tags (Keywords) to the most obscure (Japanese Showa Date) and everything in between. That said, there is a set of basic tags that nearly every AutoClassification effort will produce. They are:

Basic Identifiers, such as Document Type, Title and Date

People Fields, such as Author/Custodian, Recipient, CC/BCC, Audience, Record Holder, Employee Name, Supervisor Name, Signatory, Names Mentioned, etc.

Records Management Fields, such as Record Class, Retention Period, Expiration Date, and ROT

Data Privacy Fields, such as Data Privacy Type, Data Privacy Detail, Sensitivity, Redacted

Attributes Fields such as Keywords, Duplicate, Version, Language, Legal Hold, SKU Number, etc.

From there, the possibilities are endless.

It is easy to create custom AutoClassification tags for certain verticals, document types, and specialized needs that might be for just your organization.

Below are examples of custom AutoClassification tags: 

Custom document type: unique to the client or their industry (ex:  Variation Order, Radiology Report)

Geographic tags: indicating City, Zip Code (also zip + 4), Office or Store Location

Personnel Fields: Employee Termination Date, Employee ID, Promotion Date

Line of Business Tags: Product name/number, Case matter name/number, Requestor, Supervisor

Contractual and Finance Tags: Invoice number and amount, Contract Parties, Contract Terms, Effective and Termination Dates, etc. 

How does AutoClassification work?

AutoClassification uses both pattern-matching algorithms and machine learning to:

  • detect or “read” the contents of a document or file, then
  • assign contextual attributes (rich metadata) to the file,
  • determine the document type based on these facets, then
  • apply disposition (rules) for each document or file.
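
As a rough illustration of the four steps above, here is a minimal Python sketch that covers only the pattern-matching half (no machine learning). The keyword lists, the date pattern and the disposition rules are all hypothetical and are not taken from Valora's software.

```python
import re
from dataclasses import dataclass, field

# Hypothetical patterns and rules, for illustration only.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
DOC_TYPE_KEYWORDS = {
    "Invoice": ("invoice number", "amount due"),
    "Contract": ("effective date", "the parties agree"),
}
DISPOSITION_RULES = {
    "Invoice": "Retain 7 years",            # made-up rule
    "Contract": "Retain until terminated",  # made-up rule
}


@dataclass
class ClassifiedFile:
    text: str
    metadata: dict = field(default_factory=dict)
    disposition: str = "Review manually"


def classify(text: str) -> ClassifiedFile:
    # Step 1: "read" the contents (here the text is already extracted).
    result = ClassifiedFile(text=text)
    lowered = text.lower()

    # Step 2: assign contextual attributes (rich metadata).
    result.metadata["Dates"] = DATE_PATTERN.findall(text)

    # Step 3: determine the document type from keyword hits.
    scores = {
        doc_type: sum(keyword in lowered for keyword in keywords)
        for doc_type, keywords in DOC_TYPE_KEYWORDS.items()
    }
    best_type, best_score = max(scores.items(), key=lambda pair: pair[1])
    result.metadata["Document Type"] = best_type if best_score > 0 else "Unknown"

    # Step 4: apply the disposition rule for that document type.
    result.disposition = DISPOSITION_RULES.get(
        result.metadata["Document Type"], "Review manually"
    )
    return result


sample = "Invoice Number: 1047\nDate: 2024-03-15\nAmount due: $250.00"
classified = classify(sample)
print(classified.metadata, classified.disposition)
```

A production system would add trained models alongside rules like these; the sketch only shows how the steps chain together.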
What are the benefits of AutoClassification?

AutoClassification reduces risk. By tagging enterprise content and records, you actively take responsibility for the data you hold. You remove the costly “surprises” in litigation, eDiscovery, data privacy and information security that arise when unplanned events necessitate a deep dive into “what you have or hold.”

AutoClassification makes data collection easy. Because you will know what you have, you can easily and cost-effectively comply with litigation document productions, records retention dispositions, inquiries and investigations, and of course, breach notifications. Anything that requires you to produce specific content on demand can quickly and accurately be produced, without resorting to panic, expense and extra work. 

AutoClassification reduces costs. By identifying and eliminating files you no longer need to hold (because they are duplicative, junk or past their retention period), you can reduce the amount of data held on prem or hosted in the cloud. Most organizations hold 40-50% ROT (Redundant, Obsolete or Trivial content), so removing it yields substantial storage and hosting savings. 

AutoClassification makes important information available to many. Too often RIM, eDiscovery and Compliance professionals operate in a vacuum, often performing the same kind of work and duplicating efforts and expenses. With strong AutoClassification in place, each of these groups shares the work effort and results, without creating siloed knowledge centers. AutoClassification makes the self-serve, Shared Services vision a reality. 

AutoClassification ensures compliance. Whether it’s company policy on data or records management, personnel or customer files, or email message storage, most organizations have a tough time sticking to their established content lifecycle management and records retention policies, let alone being able to prove that they are in compliance with regulatory requirements. Long-term AutoClassification ensures compliance with all policies, even as they change over time, and that all information is defensibly managed in accordance with those policies. It is the ultimate proof of compliance. 

AutoClassification reduces human effort and error. AutoClassification performs the task that people used to have to do by hand – determining which keywords, attributes and permissions to tag a file with, and then manually entering those tags into some kind of record-keeping application. AutoClassification takes the burden of such attribute and metadata tagging off human shoulders, eliminates the possibility of human error or oversight, and replaces manual effort with intelligent, machine-learning driven software that performs the work consistently, reliably, quickly and cost effectively.

What types of tasks can AutoClassification be used for?

AutoClassification can be used anywhere documents, files, or content need to be located, identified, analyzed and actioned across one or many data environments. Organizations use AutoClassification for:

Data Classification – Identifies and classifies each digital file, applying rich metadata to assign what it is (document type) and what to do with it (lifecycle management).

ROT & File Clean-up – Identifies and removes content considered Redundant, Obsolete and Trivial, saving time, effort and storage costs.

Records Retention – Applies Records Retention Policies to identify records and manage the lifecycle of all corporate content, including defensible disposition and destruction of records upon expiration.

Legal Hold – Identifies which documents are subject to Legal Hold and can quarantine them to prevent accidental destruction.

eDiscovery – Performs complex searches and production of content by keyword, date, topic, etc.

Data Privacy – Identifies and labels content that contains Personally Identifiable Information (PII) and supports DSAR requests.

Data Security – Identifies data that contains PII or other corporate sensitive data, then informs third-party Data Loss Prevention (DLP) platforms which content to lock down.

Data Migration – Identifies which content needs to migrate and defines the target location by providing Recommended File Location (RFL) or building out target taxonomy.

What types of businesses use AutoClassification?

AutoClassification technology is used by organizations where there are large amounts of disparate content across many data stores, including:

Corporations

  • Consumer goods
  • Retail/National chains
  • Telecom
  • High-tech
  • Professional Services
  • NGO

Regulated Industries

  • Energy, Oil & Gas
  • Pharmaceutical
  • Financial Services
  • Healthcare
  • Retail

Legal

  • Corporate Legal Departments
  • Law firms, Consultancy & Advisors

Government

  • Federal & State
  • City & County

Clearinghouse

  • Data, content and file aggregators
  • Data entry and content analysis providers
  • Business process outsourcing and business transformation providers
  • Managed Service Providers

PowerHouse FAQ

What is Valora's AutoClassification platform?
Valora’s AutoClassification solution has two components:
 
PowerHouse is the AutoClassification processing engine that scans, analyzes, applies customized classification tags and automated disposition rules to all content.
 
BlackCat is a metadata management user interface that displays charts, reports, collaborative workflows and enables manual, bulk, or fully automated disposition.
How does PowerHouse work?

Valora’s PowerHouse platform connects to multiple repositories to scan files, then performs a full text analysis on every file to determine key facets about each file.

It identifies the important facets of each file and applies these attributes, or rich metadata, to the file including Document Type, Title, Date(s), Author/Recipient, Record Class, Sensitivity, Data Privacy, Keywords, etc.

It assigns important disposition tags (how to treat and/or manage the file based on its contents and other requirements), such as Expiration Date, Legal Hold, and Access Permissions.

It applies automated, manual or hybrid defensible disposition actions across and between these multiple repositories. 

PowerHouse performs delta scans of each repository to look for new and edited content, and continues processing in perpetuity.
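
To make the idea of a delta scan concrete, here is a small Python sketch. It assumes each scan produces a snapshot mapping file paths to a fingerprint (for example a content hash); the function name and snapshot format are hypothetical and do not describe PowerHouse internals.

```python
def delta_scan(previous: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Compare two snapshots (path -> fingerprint) and list new and edited paths."""
    new = sorted(path for path in current if path not in previous)
    edited = sorted(
        path
        for path in current
        if path in previous and current[path] != previous[path]
    )
    return {"new": new, "edited": edited}


# Hypothetical snapshots from two consecutive scans of one repository.
last_scan = {"a.docx": "h1", "b.pdf": "h2"}
this_scan = {"a.docx": "h1", "b.pdf": "h2-changed", "c.msg": "h3"}

changes = delta_scan(last_scan, this_scan)
print(changes)
```

Only the files listed under "new" and "edited" would need full processing again; unchanged files keep their existing tags.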

Where is PowerHouse deployed?

PowerHouse can be deployed in Valora’s cloud, the Client’s private or public cloud and even on-premises, if need be.

 
What kinds of systems can PowerHouse scan?

Valora PowerHouse connects to structured and unstructured data systems to crawl and analyze content. Some examples of systems we connect to include:

On-prem unstructured systems: Windows fileshares.

On-prem structured systems: ECMs, databases, etc.

Cloud unstructured systems: SharePoint online, Dropbox, Box, etc.

Cloud structured systems: Data lakes/warehouse, Oracle, MSSQL, etc.

Cloud applications: Salesforce, Workday, SAP, NetSuite, etc.

Email systems: Microsoft 365, Google mail, etc.

How does PowerHouse connect to these disparate systems?

Valora’s Development team builds custom connectors using APIs or other direct database extraction methods to access the content at source. Some examples of the technology used for custom connector development include:

Open APIs – application programming interfaces made publicly available to software developers by the application or repository vendor.

Microsoft Graph API – a RESTful web API that enables access to all Microsoft Cloud service resources.

JDBC (Java Database Connectivity) – the Java API that manages connecting to a database, issuing queries and commands, and handling result sets obtained from the database.

Open Database Connectivity (ODBC) interface – a C programming language interface that makes it possible for applications to access data from a variety of database management systems (DBMSs).
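
One way to picture connectors is as small adapters that all expose the same "give me your items" method, whatever the underlying system is. The Python sketch below is an assumption about how such a design could look, not Valora's actual connector code. The class and function names are hypothetical, and the standard-library sqlite3 module stands in for a JDBC or ODBC database source.

```python
import sqlite3
from typing import Iterator, Protocol


class Connector(Protocol):
    """Anything that can yield (item_id, text) pairs from a source system."""

    def iter_items(self) -> Iterator[tuple[str, str]]: ...


class SqlConnector:
    """Reads rows from a database; sqlite3 is a stand-in for JDBC/ODBC here."""

    def __init__(self, connection: sqlite3.Connection, query: str):
        self.connection = connection
        self.query = query  # must return two columns: id, text

    def iter_items(self) -> Iterator[tuple[str, str]]:
        for item_id, body in self.connection.execute(self.query):
            yield str(item_id), body


class InMemoryConnector:
    """Stand-in for a file share or API-backed repository."""

    def __init__(self, items: dict[str, str]):
        self.items = items

    def iter_items(self) -> Iterator[tuple[str, str]]:
        yield from self.items.items()


def crawl(connectors: list[Connector]) -> list[tuple[str, str]]:
    """Collect items from every connector through the shared interface."""
    return [item for connector in connectors for item in connector.iter_items()]


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
db.execute("INSERT INTO notes (body) VALUES ('Quarterly report draft')")

items = crawl([
    SqlConnector(db, "SELECT id, body FROM notes"),
    InMemoryConnector({"share/readme.txt": "Welcome"}),
])
print(items)
```

The benefit of a shared interface is that the analysis step does not need to know whether content came from a database, an API or a file share.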

How long does processing take?

Short answer: It depends.

Long answer: It depends on the number, types and sizes of the files PowerHouse is processing and analyzing. For example: A 2-page document processes faster than a 250-page document. A single email with one attachment processes faster than a .pst file with 100,000 embedded email messages and attachments.

The system will OCR (Optical Character Recognition) PDFs or other scanned images and turn them into readable text as it goes. It can AutoTranslate non-English documents into English (or other languages) and can AutoTranscribe audio and video files into text. These are other examples of “heavy lifting” processing that takes a little longer than “reading” a straight-up Word file.

For digitally native files we ballpark processing at about 1/2 GB per processor per hour. In the first few weeks after set-up and configuration we benchmark how fast things are processing based on your actual content so we can better forecast going forward.

Where do the resulting metadata tags go?

There are many options for what PowerHouse does with the tags, rules and dispositions that it creates. All of this meta-information lives within its internal database, and is available for push or promotion to any/all of the following destinations:

  • As inputs to the BlackCat data visualization/graphical representation platform (or other dashboard)
  • As reports or data files (csv, excel, etc.)
  • As database/repository/DMS/ECM fields
  • As SharePoint or other collaborative file-share fields or metadata
  • As file tags, appended to the file metadata (as supported by the file type) or as part of a file naming convention

BlackCat FAQ

How does BlackCat work?

If PowerHouse is the engine, BlackCat is the cockpit. BlackCat is Valora’s proprietary software platform and is referred to as a Metadata Management Interface. BlackCat brings all enterprise content into one “single pane of glass” view where users can access, view, search, analyze and report on the extracted text and applied metadata that PowerHouse has processed.

What types of metadata can I see in BlackCat?

Most Valora clients have 50-60 metadata values they search for and populate. There is a set of basic tags that nearly every AutoClassification effort will produce. They are: 

Basic Identifiers, such as Document Type, Title & Date  

People Fields such as Author/Custodian, Recipient, CC/BCC, Audience, Record Holder, Employee Name, Supervisor Name, Signatory, Names Mentioned, etc. 

Records Management Fields, such as Record Class, Retention Period, Expiration Date, and ROT 

Data Privacy Fields, such as Data Privacy Type, Data Privacy Detail, Sensitivity, Redacted 

Attributes Fields such as Keywords, Duplicate, Version, Language, Legal Hold, SKU Number, etc. 

From there, the possibilities are endless.  It is easy to create custom AutoClassification tags for certain verticals, document types, and specialized needs that might be for just your organization.  Below are examples of custom AutoClassification tags: 

Custom document types unique to the client or their industry (ex:  Variation Order, Radiology Report) 

Geographic tags indicating City, Zip Code (also zip + 4), Office or Store Location 

Personnel Fields:  Employee Termination Date, Employee ID, Promotion Date 

Line of Business Tags:  Product name/number, Case matter name/number, Requestor, Supervisor 

Contractual and Finance Tags:  Invoice number and amount, Contract Parties, Contract Terms, Effective and Termination Dates, etc. 

What can users do in BlackCat?

After PowerHouse has scanned and processed data, users can interact with all that data in BlackCat. If a metadata value has been applied, it (and its fielded content) can be searched and reported on in BlackCat.

There are a variety of interactive dashboards and reporting tools in BlackCat including:

  • Dashboard view: interactive point-and-click drill down searching
  • List view:  a list of documents with all fielded metadata values
  • Document preview: preview documents without opening the native file
  • On-demand reporting: point-and-click ROT and DSAR reporting (customizable)
  • Chronological reporting: a chronological report of all content in a subset
  • Heatmap reporting: by state, by city (ex. content, data volume, PII risk)

BlackCat users can also:

  • manually edit and apply additional metadata values
  • drill down or multi-value data searches to create subsets or “selection sets” of data
  • collaborate with other users to support approvals and business workflows
  • perform advanced search and document production for DSAR requests, breach reporting, Legal Hold and eDiscovery
  • defensibly delete content directly from BlackCat
How do users collaborate in BlackCat?

BlackCat users can collaborate with each other and notify people outside of BlackCat in several ways, for example when pulling ROT reports, PII reports or specific content searches:

Human-generated email with link: an email is sent from one user to another with a link to the report for review. This is best for collaborating IG users in BlackCat, because the recipient is sent directly to the content view of the sender.

System-generated or human-generated email with report: an email is sent to the user with a report attachment in Excel. This is best for employees that do not have access to BlackCat directly. Recipients can see the entire list of files and locations and “approve” the disposition of documents with a click in the Excel file, which sends a notification back to the system.

System-generated email with link: a notification can be sent to users when a variety of triggers are met. This is best for recurring notifications by content volume (ex. notify me when it gets to 100GB), by date (ex. notify me every 60 days) or by obsolete reason (ex. notify me when documents expire).

Implementation, Services & Licensing

What is Valora's pricing model?
Following one-time set-up and configuration fees, Valora is licensed under an annual Software-as-a-Service (SaaS) model, with ongoing Professional Services to support the implementation and the Client.
 
What is the process of implementing Valora's AutoClassification platform?

We approach an implementation and roll-out in two phases:

  • Set-up & Configuration
  • GoForward

Set-up & Configuration typically takes 8-12 weeks and includes all subject matter expert consulting, project planning and management, infrastructure deployment, connector development, custom configuration and software licensing for both PowerHouse and BlackCat during the set-up phase. We make sure the system is doing what you need it to do and you’re seeing what you need to see. We make adjustments along the way to optimize the machine learning algorithms as they process your data.

The GoForward phase consists of annual agreements under which we ramp up to process all data in your repositories. The order in which we process can be flexible: we can reprioritize as we go, ramping up and scaling back where we need to.

Do you offer a Pilot or Proof of Concept?

Yes, we can offer a paid Pilot. A Pilot is usually deployed in Valora’s cloud with a small subset of your data sent to Valora, rather than direct-connect to source repositories. This subset of data is usually around 20-30GB. During this phase we deploy and implement a live system as we would in a regular Set-up & Configuration phase (see implementation FAQ above).

When the Pilot is successful, we move into a GoForward phase and perform the custom connector development at that time. We then deploy the connectors to the source repositories to begin processing in place.

Why is it an annual license and not project-based?

Any scan is a snapshot of your data stores at that particular moment in time. Organizations are editing existing and creating new content every day. Valora perpetually scans for any new and edited content and applies the appropriate classification and disposition rules. This ensures ongoing compliance.

Data Discovery & Classification FAQ

How fast does Valora crawl and scan repositories?

The first baseline run – scanning and full text analysis of every file – runs at about 0.5 GB per processor-hour of uncompressed/expanded data. Valora’s AutoClassification engine, PowerHouse, is offered in three tiers. The higher the tier, the higher the number of processors and the faster the processing.

  • PowerHouse Starter – can process 2.5GB, or approximately 6,250 files per hour
  • PowerHouse Foundation – can process 7.5GB, or approximately 18,750 files per hour
  • PowerHouse Enterprise – can process 20GB, or approximately 50,000 files per hour

Subsequent data processing runs (for data updates, new configurations or handling rules, etc.) typically run at about 1.5 – 5 GB per processor-hour.
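
For a back-of-the-envelope estimate, the figures above can be plugged into a few lines of Python. This is only a sketch: the processor counts it derives assume each tier's throughput is a simple multiple of the 0.5 GB per processor-hour baseline rate, which this FAQ does not state, and the 1,000 GB volume is a made-up example.

```python
BASELINE_GB_PER_PROCESSOR_HOUR = 0.5  # baseline rate quoted above

# Tier throughput for the baseline run, in GB per hour, as quoted above.
TIER_GB_PER_HOUR = {"Starter": 2.5, "Foundation": 7.5, "Enterprise": 20.0}


def baseline_hours(data_gb: float, tier: str) -> float:
    """Estimated hours for a first baseline run of data_gb on the given tier."""
    return data_gb / TIER_GB_PER_HOUR[tier]


def implied_processors(tier: str) -> int:
    """Processor count implied by the tier rate (an inference, not a published spec)."""
    return round(TIER_GB_PER_HOUR[tier] / BASELINE_GB_PER_PROCESSOR_HOUR)


example_volume_gb = 1000  # hypothetical repository size
for tier in TIER_GB_PER_HOUR:
    hours = baseline_hours(example_volume_gb, tier)
    print(f"{tier}: about {hours:.0f} hours, ~{implied_processors(tier)} processors implied")
```

Actual throughput depends on file types and sizes, so estimates like this are a starting point until real benchmarks are available.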

Can I inventory my content without AutoClassifying every file?

Yes. Some clients opt for Lite Processing – a fast, thorough scan of the file inventory of a file share or other repository. Lite Processing results in a comprehensive file listing, including all identical duplicates, their size, full path, last accessed date, and last modified date. This approach is often used to create a starting point for risk assessment, gap analysis, and processing recommendations. Lite Processing ultimately yields an automated, rules-driven recommendation for remediating each share analyzed, based on its resulting risk profile.
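
A file inventory of this kind can be pictured with a short Python sketch that walks a folder and records each file's full path, size, last accessed date and last modified date without opening the file contents. It uses only the standard library; the function name and row layout are hypothetical and do not describe how Lite Processing is implemented.

```python
import os
import tempfile
from datetime import datetime, timezone


def inventory(root: str) -> list[dict]:
    """List every file under root with path, size and access/modify times."""
    rows = []
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            info = os.stat(path)  # metadata only; file contents are not read
            rows.append({
                "path": os.path.abspath(path),
                "size_bytes": info.st_size,
                "last_accessed": datetime.fromtimestamp(info.st_atime, tz=timezone.utc),
                "last_modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
            })
    return sorted(rows, key=lambda row: row["path"])


# Small demonstration on a temporary folder.
with tempfile.TemporaryDirectory() as demo_root:
    with open(os.path.join(demo_root, "memo.txt"), "w") as handle:
        handle.write("hello")
    demo_rows = inventory(demo_root)

print(demo_rows)
```

Because only file-system metadata is read, a listing like this is much cheaper to produce than full-text analysis.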

Will it be able to classify document types or formats that are unique to my company?

Yes. While we have processed and identified thousands of different Document Types over the years, there may be Document Types unique to your organization or industry that we have not seen before. In such cases, Valora trains the system to identify your unique Document Type formats and attributes for accurate classification going forward.

Can Valora integrate and analyze my physical records?

Yes. There are 2 ways Valora integrates with physical records. 

  1. Valora inherits digitized physical documents and metadata from your document storage provider. Each scanned file acts, appears, and is handled like its electronically-stored siblings, including full-text analysis, enriched metadata, and automated handling rules.
  2. For boxes and documents still in physical format, we integrate with your storage vendor’s inventory tracking systems, representing each physical box (or file) as a “Mockument” – a record placeholder inside BlackCat used to report on or trigger actions at the Box or Document level.
Can it AutoClassify files in other languages?

Yes. Some languages are supported with native, in-language processing, such as French, Spanish, and other Roman character-set languages.

For other language analysis, Valora identifies non-English documents and AutoTranslates them into whatever language your team is comfortable with. We integrate with Google Translate, which supports over 240 world languages. We create a Google API key for you and download languages to the same location where Valora systems are deployed (our cloud, your cloud, or on-prem). This allows for translating “onsite” without the content going to Google’s cloud.

What is the setup impact on my team? Do we need to “train” the system to recognize our files?

For initial set-up, if you know you have certain file types or documents you want to train the system on, it is helpful for your team to provide a description of the document or format, or better yet 2-3 examples of specific files. This guidance data will help us to train the system on what to look for and how to identify your unique Document Types.

Other than providing us a handful of templates (if you have them), the impact to your team is minimal.  We will occasionally have tagging questions for your InfoGov or content-holder teams, and we will require basic, service account access and permissions from your IT team.

Can Valora utilize and/or maintain previously tagged data?

Yes. Valora can read, inherit and apply existing metadata during the analysis process to maintain the metadata values already applied to your data.  Furthermore, we will incorporate prior-tagged data into the resulting taxonomy and any subsequent rules processing.

Do Valora’s tags integrate with other systems that use tags, such as SharePoint, DMS systems and archival/preservation storage systems?

Yes. Valora integrates with third-party systems that use metadata tags. With the correct write permissions, Valora can send or migrate the file itself and its associated metadata to the target system.

ROT & File Clean-up FAQ

How much storage space can I save by removing ROT?

In our experience, most organizations find that between 40% and 50% of their stored data is Redundant, Obsolete or Trivial (ROT) and can be defensibly removed. Depending on your stored data volume, this can easily represent millions of dollars of potential cost savings!

Can we customize what the system considers ROT?

Yes. Organizations can fully customize what the system identifies as ROT.

Clients may dictate their Trivial values based on any combination of metadata attributes, such as: File Extension (.tmp, .exe, .dll), Document Type (out of office email, receipt notice, etc.), File Path and Last Accessed Date (abandoned file share last accessed in 1998), and presence/lack of Key Words and phrases (Draft, Revision, Copy, etc.).

Redundant values can be customized to include/exclude identical duplicates (SHA-256 hash match), functional duplicates (Word saved to PDF), or near duplicates (similar or near versions of the same contents or metadata values).

Obsolete values are usually based on an organization’s retention policies or workflows, but can be customized as new content is discovered or changes to policies are required. 
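
To show how customizable ROT rules of this kind might be expressed, here is a minimal Python sketch. The field names, the trivial extension and document type lists, and the "abandoned" cutoff date are all hypothetical examples, not Valora defaults.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class FileRecord:
    path: str
    document_type: str
    last_accessed: date
    expiration_date: Optional[date] = None
    is_duplicate: bool = False


# Hypothetical, client-configurable rule values.
TRIVIAL_EXTENSIONS = (".tmp", ".exe", ".dll")
TRIVIAL_DOCUMENT_TYPES = {"Out of Office Email", "Receipt Notice"}
ABANDONED_BEFORE = date(2000, 1, 1)


def rot_reason(record: FileRecord, today: date) -> Optional[str]:
    """Return 'Redundant', 'Obsolete', 'Trivial', or None if the file is not ROT."""
    if record.is_duplicate:
        return "Redundant"
    if record.expiration_date is not None and record.expiration_date < today:
        return "Obsolete"
    if (
        record.path.lower().endswith(TRIVIAL_EXTENSIONS)
        or record.document_type in TRIVIAL_DOCUMENT_TYPES
        or record.last_accessed < ABANDONED_BEFORE
    ):
        return "Trivial"
    return None


today = date(2024, 1, 1)
temp_file = FileRecord("share/setup.tmp", "Unknown", date(2023, 5, 1))
old_contract = FileRecord("legal/msa.pdf", "Contract", date(2023, 5, 1), expiration_date=date(2020, 12, 31))
live_contract = FileRecord("legal/new-msa.pdf", "Contract", date(2023, 5, 1), expiration_date=date(2030, 12, 31))

print(rot_reason(temp_file, today), rot_reason(old_contract, today), rot_reason(live_contract, today))
```

Changing what counts as ROT then means editing the rule values rather than the logic.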

Can we tell which files are ROT without a full-text analysis?

Yes, to a degree. Without full-text analysis, Valora limits ROT determinations to identical-duplicate detection only, as identified by SHA-256 hash matching. SHA-256 matches are 100% forensically identical duplicate files, the single biggest contributor to redundancy.

For complete ROT analysis, full-text processing is required to identify functional duplicates (where the content is 100% match, but the files are different, for example: saving a Word doc as a PDF), and near duplicates (where content is similar, but expressly not identical).
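
The difference between identical and functional duplicates is easy to see in code. The Python sketch below groups files by their SHA-256 hash using the standard hashlib module; the file names and contents are invented, and the helper names are hypothetical.

```python
import hashlib
from collections import defaultdict


def sha256_of(content: bytes) -> str:
    """Hex SHA-256 digest of a file's raw bytes."""
    return hashlib.sha256(content).hexdigest()


def identical_duplicate_groups(files: dict[str, bytes]) -> list[list[str]]:
    """Group paths whose bytes hash to the same SHA-256 value."""
    by_hash = defaultdict(list)
    for path, content in files.items():
        by_hash[sha256_of(content)].append(path)
    # Keep only hashes shared by more than one file.
    return sorted(sorted(paths) for paths in by_hash.values() if len(paths) > 1)


# Invented sample: two byte-identical Word files and a PDF of the same content.
files = {
    "a/report.docx": b"Q3 results",
    "b/report-copy.docx": b"Q3 results",
    "c/report.pdf": b"%PDF Q3 results",
}

groups = identical_duplicate_groups(files)
print(groups)
```

The PDF is not grouped with the Word files because its bytes differ, even though the content is the same. That is the functional-duplicate case, which needs full-text comparison rather than hashing.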

For a group of identical duplicates, can we determine which ones are ROT and which ones are not (and should be kept)?

Yes. Clients may dictate which identical duplicate(s) should be deemed the group master(s) or “team captain(s)” and therefore be retained. This determination can be by Source, Author, File Path, Custodian, or any other data element that PowerHouse creates.
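
As an illustration of choosing a “team captain”, the Python sketch below picks one file to retain from a group of identical duplicates using a preference order of sources. The source names, the priority order and the record layout are hypothetical; a real rule could just as well use Author, File Path or Custodian.

```python
# Hypothetical preference order: earlier entries win.
SOURCE_PRIORITY = ["SharePoint", "Fileshare", "Email"]


def pick_master(group: list[dict]) -> dict:
    """Choose the duplicate to keep: best-ranked source, then path as a tie-breaker."""
    def rank(record: dict) -> tuple[int, str]:
        source = record["source"]
        if source in SOURCE_PRIORITY:
            position = SOURCE_PRIORITY.index(source)
        else:
            position = len(SOURCE_PRIORITY)  # unknown sources rank last
        return (position, record["path"])

    return min(group, key=rank)


def split_group(group: list[dict]) -> tuple[dict, list[dict]]:
    """Return (master, remaining duplicates that can be treated as ROT)."""
    master = pick_master(group)
    rot = [record for record in group if record is not master]
    return master, rot


duplicates = [
    {"path": "inbox/policy.pdf", "source": "Email"},
    {"path": "hr/policy.pdf", "source": "SharePoint"},
    {"path": "S:/old/policy.pdf", "source": "Fileshare"},
]

master, rot = split_group(duplicates)
print(master["path"], [record["path"] for record in rot])
```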

How do I get my organization to actually agree to delete ROT?

We get it: actually deleting content is hard.

Some people may resist deleting their data for a few reasons, including:

  • Perception of value: Believing that seemingly redundant or outdated information could be useful in the future, even if it hasn’t been accessed in years.
  • Lack of ownership: Not knowing who “owns” specific content, leading to hesitation in deleting it.
  • A “save everything” mindset: Some organizations have a hoarding culture where everything is kept “just in case.”
  • Regulatory concerns: Misunderstandings about legal or compliance obligations may lead to over-retention of records or sensitive data.
  • Fear of change: Natural resistance to change, even when it involves improving efficiency or meeting compliance obligations.
  • Uncertainty about outcomes: Concerns about how new content management practices might disrupt workflows.

Addressing these concerns often requires a mix of clear communication, effective tools, robust processes, and a cultural shift toward valuing clean, useful, and organized information. Valora can help.

Records Management FAQ

We are early in the process of building our Retention Schedule, can we still use Valora?

Yes. While having a Records Retention Schedule in place is helpful in determining retention periods and handling, some organizations have used Valora to help inform their Record Class and Disposition requirements by first identifying what kind of content is present in their data environments – so they can create policies based on what they actually have.

How do we implement changes and/or updates to our Retention Schedule?

Valora integrates directly with an organization’s Records Retention Schedule, other business policies, and guidance data, so that changes or updates are propagated automatically across the organization at the time of change.

Valora offers native connectivity to cloud-based Retention Schedule applications (such as Policy Center, FilersKeepers, etc.) for those clients who are utilizing such systems.

How does Valora help with regulatory compliance?

Valora supports regulatory compliance by automating the identification, categorization, repair, and management of enterprise data and records in alignment with legal and regulatory requirements. PowerHouse ensures accurate and consistent tagging of information, assigning appropriate retention and disposition policies, and facilitating quick retrieval of records for eDiscovery, audits or investigations.

For instance, in the financial sector, Valora helps identify and securely store financial transaction records for the required 5-10 years to comply with SOX or FINRA regulations, while ensuring older data is disposed of when permitted.
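
The retention arithmetic behind an Expiration Date tag can be sketched in a few lines of Python. The 7-year period used below is a hypothetical value inside the 5-10 year range mentioned above, not a statement of what SOX or FINRA require for any particular record.

```python
from datetime import date


def expiration_date(record_date: date, retention_years: int) -> date:
    """Record date plus the retention period, in whole years."""
    target_year = record_date.year + retention_years
    try:
        return record_date.replace(year=target_year)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap year; fall back to Feb 28.
        return record_date.replace(year=target_year, day=28)


def is_expired(record_date: date, retention_years: int, today: date) -> bool:
    """True once today has reached the expiration date."""
    return today >= expiration_date(record_date, retention_years)


RETENTION_YEARS = 7  # hypothetical policy value

print(expiration_date(date(2018, 6, 30), RETENTION_YEARS))
print(is_expired(date(2018, 6, 30), RETENTION_YEARS, today=date(2024, 1, 1)))
```

When a retention schedule changes, only the retention period changes; the expiration dates can be recomputed from the record dates.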

By leveraging AutoClassification, Valora clients comply with complex regulatory requirements more efficiently while also improving overall data management and reducing compliance-related risks.

How can I account for my physical records with Valora?

There are 2 ways Valora integrates with physical records. 

  1. Valora inherits digitized physical documents and metadata from your document storage provider. Each scanned file acts, appears, and is handled like its electronically-stored siblings, including full-text analysis, enriched metadata, and automated handling rules. 
  2. For boxes and documents still in physical format, we integrate with your storage vendor’s inventory tracking systems, representing each physical box (or file) as a “Mockument” – a record placeholder inside BlackCat used to report on or trigger actions at the Box or Document level.
Can Valora send out notifications to people when their files/records are expiring?

Yes. BlackCat sends users automated email notifications and/or in-app notifications when their files or records are expiring. Valora customizes these workflows based on client business processes, for instance, requesting approval by certain users to delete or migrate content. All approvals, decisions, and actions are tracked and auditable inside BlackCat.

Can Valora track when custodians have (or haven’t) given their approval for file deletion?

Yes. Valora automates these business workflows and alerts the necessary people when retention approvals have been made or decision deadlines have been missed. The cadence of these reminder notifications can be customized to user, Source Repository, Document Type, or Record Class.

Can Valora move records from the wrong locations (ex: fileshare) to the right locations (ex: SharePoint or long-term preservation)?

Yes. When Valora’s connector agents have been granted write permissions to the target repositories, PowerHouse will identify, tag, log, and physically move a file from the inappropriate location to the proper target repository location, including navigating file trees or taxonomies, as appropriate.

Data Privacy & GRC FAQ

Can my organization customize what is considered personal or sensitive data?

Yes. While Valora automatically tags for standardized personal data elements (PII, PHI, PCI), clients may also customize their sensitive data designations to align with their content, business objectives or policies.

Example customizations may include: Employee ID, work or building location, Intellectual Property (IP), or information about mergers and acquisitions.
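
A simple way to picture custom sensitive-data designations is as extra patterns layered on top of standard ones. The Python sketch below uses regular expressions for this; the Employee ID format and the keyword list are hypothetical, and real detection would typically need more than pattern matching.

```python
import re

# A standard pattern: US Social Security number written as 123-45-6789.
STANDARD_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

# Hypothetical client-specific patterns.
CUSTOM_PATTERNS = {
    "Employee ID": re.compile(r"\bEMP-\d{6}\b"),
    "M&A": re.compile(r"\b(merger|acquisition)\b", re.IGNORECASE),
}


def sensitive_tags(text: str) -> list[str]:
    """Names of all standard and custom patterns found in the text."""
    patterns = {**STANDARD_PATTERNS, **CUSTOM_PATTERNS}
    return sorted(name for name, pattern in patterns.items() if pattern.search(text))


memo = "Employee EMP-004512 (SSN 123-45-6789) asked about the acquisition."
print(sensitive_tags(memo))
print(sensitive_tags("Lunch menu for Friday"))
```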

In Pharmaceutical & Life Sciences industries, how does Valora identify & protect unblinding data?
Valora is a powerful tool in the pharmaceutical industry for identifying and managing unblinding data, which is crucial for maintaining the integrity of clinical trials and ensuring regulatory compliance.  

One of the key benefits is the ability to proactively flag or quarantine documents containing unblinding data. Integrated with workflows, Valora alerts relevant teams to potential risks, ensuring that unblinded data is only accessible to authorized personnel. Additionally, Valora organizes and tags sensitive information, enabling seamless segregation of unblinded and blinded data. This makes it easier to maintain proper access controls and ensures data is handled in compliance with regulatory requirements.

We use OneTrust and/or Varonis for our Data Privacy & DLP, how does Valora fit with these systems?

Valora integrates with and can be configured to send data to most Data Loss Prevention (DLP) and Data Privacy platforms, including Varonis and OneTrust, among others.

As data is created, modified, or shared, Valora AutoClassifies it to identify the document type and flag whether it contains important, sensitive or personal information. Sending this information directly to a DLP informs the system of the content type, location and sensitivity class of each document, so that policies enforcing encryption, access controls, or exfiltration blocking are applied without delay and sensitive data remains secure across its lifecycle.

By systematically classifying data and mapping labels and content to DLP systems, organizations can demonstrate consistent security practices and provide detailed audit trails during assessments.
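To make the hand-off to a DLP system more concrete, here is a minimal sketch of how a classification result might be packaged with the policy actions it implies. The `ClassificationRecord` fields, the `POLICY_ACTIONS` mapping, and the JSON shape are all hypothetical; real DLP platforms such as Varonis or OneTrust define their own ingestion formats, which are not shown here.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ClassificationRecord:
    """Hypothetical classification output for a single file."""
    path: str
    document_type: str
    sensitivity: str            # e.g. "Public", "Confidential", "Restricted"
    privacy_types: list[str]    # e.g. ["PII"], ["PHI", "PCI"]


# Assumed mapping from sensitivity class to the policy actions a DLP might enforce.
POLICY_ACTIONS = {
    "Restricted": ["encrypt", "block_external_share"],
    "Confidential": ["encrypt"],
}


def to_dlp_payload(record: ClassificationRecord) -> str:
    """Serialize a record, plus the actions its sensitivity implies, as JSON."""
    payload = asdict(record)
    payload["actions"] = POLICY_ACTIONS.get(record.sensitivity, [])
    return json.dumps(payload, sort_keys=True)
```

Keeping the label-to-action mapping in one table also gives auditors a single place to check which policy applies to which sensitivity class.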

Can Valora distinguish between Employee Personal Data vs. Customer Personal Data, even though they might both contain SSNs or home addresses?

Yes. To distinguish whether personal data belongs to an employee (past or present) or a customer, Valora integrates with and pulls data from third-party systems or information sources (what we call guidance data) to inform Valora on who’s who.

In determining whether personal data belongs to an employee, Valora integrates with an organization’s HR system to identify active and inactive employees, then flags the respective files as belonging to the relevant employee.

In determining whether personal data belongs to a customer, Valora integrates with various CRMs or a customer list to determine whether the individual (or entity) is an active or former customer. Valora then flags the respective files as belonging to the relevant customer.
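The guidance-data idea described above can be sketched as a simple lookup: a name found in a file is checked against an HR roster and a customer list, and the file is tagged accordingly. The `classify_owner` function, the dictionary inputs, and the tag wording are assumptions for illustration; an actual integration would query the HR system and CRM rather than in-memory dictionaries.

```python
def classify_owner(
    name: str,
    hr_roster: dict[str, str],
    crm_accounts: dict[str, str],
) -> list[str]:
    """Tag a person as employee and/or customer using guidance data.

    hr_roster maps a name to "active" or "inactive";
    crm_accounts maps a name to "active" or "former".
    Names are compared case-insensitively with surrounding spaces ignored.
    """
    key = name.strip().casefold()
    hr = {k.strip().casefold(): v for k, v in hr_roster.items()}
    crm = {k.strip().casefold(): v for k, v in crm_accounts.items()}

    tags = []
    if key in hr:
        tags.append(f"Employee ({hr[key]})")
    if key in crm:
        tags.append(f"Customer ({crm[key]})")
    return tags or ["Unknown"]
```

A person can legitimately appear in both sources (an employee who is also a customer), which is why the sketch returns a list of tags rather than a single value.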

Can Valora prevent our AI engines and enterprise search tools from accessing or using sensitive data in their analysis or results?

Yes. Valora prevents AI engines and enterprise search tools from accessing or using sensitive data by analyzing content to identify sensitive information and applying classification labels (ex. “Confidential” or “Restricted”). These labels enforce access control policies, ensuring only authorized users or systems can view or process the data. Additionally, Valora can redact or anonymize sensitive information, allowing non-sensitive parts to be analyzed securely. This approach ensures compliance with data protection policies while maintaining operational efficiency.

Can Valora handle multiple and conflicting jurisdictional requirements on our enterprise data?

Yes. Valora’s flexible deployment allows for cloud and/or on-prem hosting options. A single organization’s system can be set up in multiple environments and across multiple geographic locations. This is important for international organizations where data residency requirements dictate that data must be housed within certain jurisdictions.

Multiple systems can live independently, or roll up into a single view to allow global Information Governance teams to manage content without it leaving its jurisdiction.

Legal Hold & eDiscovery FAQ

1. How is Valora’s approach different from traditional eDiscovery processing?

Valora considers eDiscovery efforts to be one of many inter-related ways of understanding, protecting, managing and utilizing enterprise data.  That is, we believe eDiscovery is one of the 8 Key Pillars of Total Information Governance. Yes, eDiscovery has its own unique challenges, as do Records Management, Data Privacy, AI Readiness and more.  Our goal is to be as efficient as possible in preparing for litigation, utilizing prior work product and processing efforts wherever possible, to mitigate costs and reduce turn times, as eDiscovery matters are often extremely time-sensitive. 

We also strive to provide the maximum value possible for each unique matter, with customized data tags, processing, reporting, and production options.

Our flat monthly licensing fees eliminate the need for indiscriminate custodian or data culling simply to save cost or effort, letting you focus instead on the materiality of the matter and ultimately providing the best and most relevant content to prove your case.

2. Are there additional charges for eDiscovery processing?

No. As long as you are a Valora Total Governance licensee, you will have access to all eDiscovery and Legal Hold functionality at all times, for all data, and across all matters. There are no additional charges for processing (or re-processing), hosting, or culling. In fact, your monthly fee also includes access to our Professional Services team for operational (managed services) support, technical support and case support.

3. Can we make litigation productions directly from BlackCat?

Yes.  BlackCat has built-in functionality for native file productions (with or without redactions), reports, timelines, and data tables.

4. Can Valora maintain a separate database of files per matter?

Yes.  We can similarly restrict access to different data populations for different users.

Many of our clients still manage their holds this way. Fortunately, PowerHouse can parse (read) your spreadsheet and map your custodians and data under hold to the data it is analyzing and managing. Either send Valora your updated spreadsheet(s) or designate a fixed location for them that PowerHouse can automatically scan on a regular cadence or on demand.
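For a sense of what mapping a hold spreadsheet to managed data involves, the sketch below reads a CSV export with hypothetical `Matter` and `Custodian` columns and matches it against a list of files and their custodians. The column names and both function names are assumptions for this example; they do not describe how PowerHouse parses spreadsheets internally.

```python
import csv
import io


def load_holds(csv_text: str) -> dict[str, set[str]]:
    """Map each custodian (lowercased) to the set of matters holding their data."""
    holds: dict[str, set[str]] = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        custodian = row["Custodian"].strip().lower()
        holds.setdefault(custodian, set()).add(row["Matter"].strip())
    return holds


def files_under_hold(
    files: list[tuple[str, str]],
    holds: dict[str, set[str]],
) -> dict[str, list[str]]:
    """Given (path, custodian) pairs, return path -> sorted matters for held files."""
    result: dict[str, list[str]] = {}
    for path, custodian in files:
        matters = holds.get(custodian.strip().lower())
        if matters:
            result[path] = sorted(matters)
    return result
```

Rerunning `load_holds` whenever the spreadsheet changes is the simplest way to mirror the scheduled or on-demand scan described above.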

Yes. BlackCat has built-in notifications for many functions and states, including issuance of (and reminders for) legal holds. Hold notices are typically sent as a tracked, return-receipt email to the custodian, though other methods are also available (such as text, direct messaging, in-app notices, etc.).

AI Readiness FAQ

What kind of AI technology does Valora apply in its own platform?

Valora’s PowerHouse platform encompasses state-of-the-art content analysis and AutoClassification, incorporating elements of probabilistic systems, Bayesian learning, natural language processing (NLP) and machine learning (AI).

It does not incorporate any Generative AI (creating new data), but uses extractive algorithms to “read” and “pull out” data elements for automated classification of content from existing data sets.

What kinds of AI does Valora support?

Valora supports all formats of extractive and generative AI models, using industry-standard classification and output formats.

Do you have native integrations with OpenAI/ChatGPT, Anthropic/Claude, etc.?

No. The typical technique involves using Valora to pre- or live-curate enterprise data, and then feeding the approved corpus to an AI-accessible data lake or directly to the AI engine.

Can Valora exclude specific data from being accessed by AI engines?

Yes. Valora can prevent AI engines and enterprise search tools from accessing or using sensitive data by analyzing content to identify sensitive information and applying classification labels (ex. “Confidential” or “Restricted”). These labels enforce access control policies, ensuring only authorized users or systems can view or process the data. Additionally, Valora can redact or anonymize sensitive information, allowing non-sensitive parts to be analyzed securely. This approach ensures compliance with data protection policies while maintaining operational efficiency.
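The label-then-filter approach described above might look something like the following sketch, which drops documents carrying a blocked label and redacts one kind of sensitive value (US Social Security numbers) from the rest before the corpus is handed to an AI engine. The `curate_for_ai` function, the document dictionary shape, and the choice of SSNs as the redaction target are assumptions for illustration, not Valora's implementation.

```python
import re

# Labels that keep a document out of the AI-accessible corpus entirely.
BLOCKED_LABELS = {"Confidential", "Restricted"}

# Simple pattern for US Social Security numbers written as 123-45-6789.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def curate_for_ai(documents: list[dict]) -> list[dict]:
    """Return the approved corpus: blocked labels removed, SSNs redacted.

    Each document is a dict with at least "label" and "text" keys.
    The input documents are not modified.
    """
    approved = []
    for doc in documents:
        if doc["label"] in BLOCKED_LABELS:
            continue
        redacted = SSN_PATTERN.sub("[REDACTED]", doc["text"])
        approved.append({**doc, "text": redacted})
    return approved
```

Filtering before ingestion means the AI engine never receives the excluded documents, rather than relying on the engine to ignore them.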

Have more questions?

Do you have a question we haven’t covered here? Let us know!