Data Classification
Organize, analyze and tag enterprise documents based on file content, not just file type.
Document classification assigns contextual attributes (rich metadata) based on a file's content, not just its type. By tagging or classifying enterprise data and understanding the context of its content, organizations can make better decisions about what to do with it, where to store it and who should have access to it.
File metadata defines the physical attributes of the files themselves:
- Creation date
- Author
- Last modified date
- File type, file size, file path
- Hash value
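As an illustration of the attributes listed above, the following minimal Python sketch reads them from a local file system. It is not Valora's implementation; the function name `file_metadata` and the returned field names are assumptions made for this example. Author is omitted because it usually lives inside document properties rather than in the file system.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def file_metadata(path: str) -> dict:
    """Collect basic file-level attributes for one file (illustrative only)."""
    p = Path(path)
    info = p.stat()

    # Hash the file in chunks so large files do not need to fit in memory.
    digest = hashlib.sha256()
    with p.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)

    return {
        "path": str(p.resolve()),
        "file_type": p.suffix.lower(),
        "size_bytes": info.st_size,
        "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
        # st_ctime is the creation time on Windows but the last
        # metadata-change time on most Unix systems.
        "created_or_changed": datetime.fromtimestamp(info.st_ctime, tz=timezone.utc),
        "sha256": digest.hexdigest(),
    }
```

Two files with the same `sha256` value have identical bytes, which is why a hash value is useful for spotting exact duplicates.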
Rich metadata defines the contextual attributes based on the content of the document and can include:
- Risk level: based on types of PII present
- Document category: contract, blueprint, health record, mortgage application, etc.
- Keyword identification: for complex contextual searches
- Defining disposition: expiration or destruction dates
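One way to picture rich metadata is as a small record attached to each document. The sketch below is hypothetical: the `RichMetadata` class, the `PII_RISK` weights and the risk labels are assumptions chosen for illustration, not part of any Valora product.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional

# Hypothetical weights per PII type; a real policy would come from
# the organization's compliance requirements.
PII_RISK = {"email": 1, "phone": 1, "ssn": 3, "health_record": 3}


def risk_level(pii_types: set[str]) -> str:
    """Map the PII types found in a document to a coarse risk level."""
    # The riskiest PII type present decides the level; unknown types count as 1.
    score = max((PII_RISK.get(t, 1) for t in pii_types), default=0)
    return {0: "none", 1: "low", 2: "medium", 3: "high"}[score]


@dataclass
class RichMetadata:
    category: str                                   # e.g. "contract"
    pii_types: set[str] = field(default_factory=set)
    keywords: list[str] = field(default_factory=list)
    destroy_after: Optional[date] = None            # disposition date, if any

    @property
    def risk(self) -> str:
        return risk_level(self.pii_types)
```

For example, `RichMetadata("contract", {"ssn", "email"}).risk` evaluates to `"high"`, because the highest-weighted PII type present drives the level.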
The only way to understand your data is to classify it.
Valora’s technology platform automates classification, tagging documents with rich and custom metadata to expedite processing and eliminate human error or omission.
Automating Classification = AutoClassification
Valora’s approach to automating the classification of data combines the machine-learning functionality of Valora’s PowerHouse AutoClassification Platform with a proven 5-step methodology for locating, identifying, analyzing, actioning and monitoring content across multiple data stores.
1. Scan & Locate - Where is it?
- Scan one or many (100,000,000+) documents
- Single and multiple shared drives
- Email repositories
- Document Management & Enterprise Content Management Systems (ECM)
- eDiscovery repositories
- HR, ERP & billing systems
- On-prem and cloud-based document repositories
2. Search & Identify - What is it?
- Search by file metadata: doc type, size, date, revisions
- Search by keywords or pattern-matching (regular expression)
- Search by creator or custodian
- Identify duplicates and near-duplicates
- Identify meaningful content vs. Redundant, Obsolete & Trivial content (ROT)
- Identify different types of content (contracts, blueprints, etc.)
- Identify PII (Personally Identifiable Information)
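To make two of the items above concrete, here is a minimal Python sketch of pattern-matching for PII and of exact-duplicate detection by hash. The patterns are simplified, US-centric assumptions that will both miss and over-match real data, and the function names are hypothetical. Near-duplicate detection needs similarity techniques that are not shown here.

```python
import hashlib
import re
from collections import defaultdict

# Simplified patterns for illustration only.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def find_pii(text: str) -> dict[str, list[str]]:
    """Return every match for each PII pattern that occurs in the text."""
    hits = {name: pat.findall(text) for name, pat in PII_PATTERNS.items()}
    return {name: found for name, found in hits.items() if found}


def exact_duplicates(docs: dict[str, bytes]) -> list[list[str]]:
    """Group document names whose bytes hash to the same SHA-256 digest."""
    groups = defaultdict(list)
    for name, content in docs.items():
        groups[hashlib.sha256(content).hexdigest()].append(name)
    # Only groups with more than one member are duplicates.
    return [sorted(names) for names in groups.values() if len(names) > 1]
```

A call such as `find_pii("Contact jane.doe@example.com, SSN 123-45-6789.")` reports one email address and one SSN-shaped string. A production system would validate such matches further before tagging a document.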
3. Analyze & Understand - What am I looking at?
- Preview documents
- OCR unreadable files (images, scanned PDFs)
- Translate foreign-language content into English
- Transcribe audio files into text
- Produce reports (high level and drill-down)
4. Decide & Action - What do I do with it?
- Rules-based and machine learning automation for disposition
- Apply rich metadata
- Apply retention schedules
- Apply security access controls
- Migrate on demand
- Delete and sequester
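Rules-based disposition can be sketched as a short, ordered list of checks. The example below is a hypothetical illustration, not Valora's rule engine: the `Document` fields, the `RETENTION_DAYS` values and the action names are all assumptions, and the retention periods are not legal guidance.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention periods in days, keyed by document category.
RETENTION_DAYS = {"contract": 7 * 365, "health record": 10 * 365, "draft": 90}


@dataclass
class Document:
    name: str
    category: str
    last_modified: date
    has_pii: bool = False


def decide(doc: Document, today: date) -> str:
    """Return one action for a document, evaluating rules in priority order."""
    limit = RETENTION_DAYS.get(doc.category)
    if limit is None:
        return "review"  # unknown category: leave the decision to a person
    if today - doc.last_modified > timedelta(days=limit):
        return "delete"  # past its retention period
    if doc.has_pii:
        return "restrict-access"
    return "retain"
```

Putting the unknown-category check first means that nothing is deleted unless a retention rule explicitly covers it.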
5. Monitor & Audit - How often do I update?
- Set customized refresh and retention schedules based on content type or location
- Crawl, identify and action only new or edited data
- Run in the background with no performance draw on systems or repositories
- Ensure retention schedules and compliance requirements are executed on time
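Processing only new or edited data is commonly done by comparing each file against a record of the previous crawl. The following Python sketch shows that idea for a local directory tree; it is an assumption about how such a refresh could work, not a description of Valora's crawler, and the function name `changed_files` is hypothetical.

```python
import os


def changed_files(root: str, seen: dict[str, tuple[float, int]]) -> list[str]:
    """Return files under root that are new or edited since the last crawl.

    `seen` maps path -> (mtime, size) from the previous run and is
    updated in place so the next call only reports later changes.
    """
    changed = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            signature = (st.st_mtime, st.st_size)
            if seen.get(path) != signature:
                changed.append(path)
                seen[path] = signature
    return sorted(changed)
```

Comparing modification time and size is cheap but can miss an edit that preserves both. Comparing content hashes is more reliable at the cost of reading every file.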
Data Privacy
See how Valora locates and actions Personally Identifiable Information (PII) across the enterprise.
Records Management
Discover how Records and Information Management professionals leverage Valora’s technology.
ROT & File Clean-Up
See how Valora reduces the amount of Redundant, Obsolete and Trivial data across the enterprise.
Related Resources
Explore Valora Technologies’ Resource Library for helpful articles, videos, presentations, white papers, blog posts and more.
BlackCat MetaData Management & Data Visualization
Easily manage data and document control, reporting and analytics in a secure, private web browser environment.
PowerHouse™ AutoClassification to Manage Content from 30,000 Employees Worldwide
This technology and AI leader has a large and diverse workforce, with many new hires and terminations each year…
Automating Enriched Metadata Creation & Tagging
Learn the methodologies, tools and best practices for automating the creation, assignment and application of metadata across enterprise information silos…
How to Manage Data Privacy While Managing Records
Learn how best practices around records retention naturally lend themselves to proper and defensible data privacy protection.