Coding & Indexing
Custom data extraction, coding, document review and disposition without the need for large temp “armies” or offshore providers. Discover what fully and semi-automated solutions can do.
Valora is the undisputed industry leader in onshore, domestic document coding, whether for Litigation & eDiscovery, Information Governance, or Records & Knowledge Management efforts. With a fifteen-year, 22-billion-page history providing document coding for some of the world’s most respected organizations and governments, Valora has earned a strong reputation as a high-quality, rapid-turnaround, cost-efficient provider of strategic data capture and document disposition services.
Using our best-of-breed PowerHouse™ automated services technology platform, combined with decades of best practices and US security-cleared professional staff, Valora delivers document coding and indexing solutions like no one else. Imagine being able to automatically redact your population at the touch of a button, or to add 50+ issue codes to your Document Review protocol, or to automatically recognize hundreds of unique document types from a single, heterogeneous document population.
Valora has been proving the future is now with its automated solutions for document and data capture for over sixteen years. Come see why Valora was voted Best Document Coding solution by our clients nine years in a row! Valora’s Indexing & Coding Services are available in fully-automated, semi-automated, and fully manual approaches. Contact us to learn which method is right for your project.
“Whether it’s AutoCoding a single field, Basic Bib coding for litigation, or complex Issue Coding for Document Review, the Valora approach simply can’t be beat for efficiency, price and accuracy.”
Managing Attorney, US federal agency
What is Coding and Indexing?
Document coding is the practice of determining important information about or from a document and then populating a fielded database with that extracted information. Document coding is typically utilized when there is limited other data available about a document or file (no metadata), so that a database can easily be searched or sorted by the extracted “codes.” Document coding is frequently used in the legal, records, financial and medical communities to quickly organize and classify records for later use.
Indexing is a similar activity, but it is closer to simple data entry and is typically used to tag documents for storage and retrieval rather than for active use. Document coding is therefore the more informed activity, and it requires some skill and understanding of the matter at hand. At Valora, we tend to use the two terms interchangeably.
Documents are indexed for an almost unlimited list of possible fields. There are fields for:
- people associated with a document, transaction or matter (Ex: Authors, Recipients, Copyees, Blind Copyees, Signatories, Co-Signatories, Mortgagees, Third parties and more)
- attributes of the specific document (Ex: language, page count, source location, duplicates/near duplicates, packaging & attachment ranges)
- the content of the document (Ex: issues, keywords, subject matter, sensitive information, tone & intent).
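As a rough illustration of what a "fielded database" record might look like, here is a minimal Python sketch that groups a few of the fields listed above into one record per document and then searches and sorts by those codes. The class name `CodedDocument`, its field names, the helper `find_by_author`, and the sample records are all hypothetical; they are not Valora's actual schema or data.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class CodedDocument:
    """One row of a hypothetical fielded coding database."""
    doc_id: str
    # People associated with the document
    authors: List[str] = field(default_factory=list)
    recipients: List[str] = field(default_factory=list)
    # Attributes of the document itself
    doc_type: Optional[str] = None
    doc_date: Optional[date] = None
    page_count: Optional[int] = None
    # Content of the document
    issues: List[str] = field(default_factory=list)


def find_by_author(records, name):
    """Return records coded with `name` as an author, oldest first.

    Records with no coded date sort after all dated records.
    """
    hits = [r for r in records if name in r.authors]
    return sorted(hits, key=lambda r: (r.doc_date is None, r.doc_date or date.min))


# Made-up sample records, for illustration only.
records = [
    CodedDocument("DOC-0002", authors=["J. Smith"], doc_type="Letter",
                  doc_date=date(2011, 3, 4)),
    CodedDocument("DOC-0001", authors=["J. Smith"], doc_type="Invoice",
                  doc_date=date(2011, 1, 9)),
    CodedDocument("DOC-0003", authors=["A. Jones"], doc_type="Memo"),
]

hit_ids = [r.doc_id for r in find_by_author(records, "J. Smith")]
print(hit_ids)  # ['DOC-0001', 'DOC-0002']
```

Once the codes exist as fields, searching and sorting become simple operations on the record, with no need to reread the document.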
There is an equally long list of reasons that people code documents. Coded documents are easy to search and sort. Coding helps people understand complex documents at a high level, without having to read through them. Coding helps group together similar documents for storage, retrieval, assessment, routing or production. Sometimes there are government or investor mandates to maintain a fully coded database. Whatever the reasons, all coded databases require the same things: accurate & complete data capture; prompt, repeatable service; and responsible, competent project management.
At Valora, we have been coding documents for more than a dozen years and we are truly world experts at it. We support 175 unique document types, more than 110 data types (fields) and 15 languages. Our services come in fully automated, semi-automated and fully manual solutions. There is almost nothing we cannot code (or haven’t already)!
Just ask.

Example Usage Scenarios for Coding and Indexing
- Provide a set of bibliographic “codes” for each document in a litigation or records management database. Typical bibliographic codes include: Document Type, Date of Document, Title or Subject, and Author, Recipient, Copyees & BCC’s.
- Provide a preliminary legal disposition “code” on the relevance/responsiveness of a document to a particular claim or inquiry.
- Similar to above, provide preliminary legal disposition “codes” for privilege, key/hot issues, witness documents and such.
- Create simple tag and sort systems for records. Typical records management fields include: Document Date, Document Type or Title, and Source.
- Create a summary of information about documents or database records. Ex: “Invoice 12654 from XYZ Corp to ABC, Inc. for $13,462.00, dated 01/09/2011.”
- Sometimes coding is used as a type of classification technique. For example, documents or records may be coded as: “delete record,” “hot document,” or “classified,” depending on different Rules-Based criteria. From there, other workflow steps may be initiated.
- Fully or semi-automated coding methods are often used to expedite the coding effort and/or to keep the costs of manual coding labor to a minimum. With AutoCoding, it is common to process hundreds of thousands of records in a single day!
- Highlight or redact key information in the record.
- Enter key information into the database about a customer, transaction, patient, document or record.
- Data mining across files to determine statistical patterns, trends, and forecasts.
- Routine analysis and searching of data for compliance purposes, such as for improper protocols or language use, disclosure of PII or sensitive information (SI), or security violations.
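To make the rules-based classification and sensitive-information scenarios above a little more concrete, here is a minimal Python sketch. It assumes a simple "first matching rule wins" scheme with two made-up rules, and it uses a basic pattern for US Social Security numbers written as NNN-NN-NNNN. The rule list, the disposition labels, and the function names `classify` and `redact_ssns` are hypothetical illustrations, not a description of how AutoCoding works.

```python
import re

# Assumed SSN format: NNN-NN-NNNN. Real-world data needs broader patterns.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Hypothetical rules as (disposition code, pattern) pairs; first match wins.
RULES = [
    ("sensitive information", SSN_PATTERN),
    ("hot document", re.compile(r"\b(destroy|shred)\b", re.IGNORECASE)),
]


def classify(text, default="no action"):
    """Return the disposition code of the first rule that matches `text`."""
    for code, pattern in RULES:
        if pattern.search(text):
            return code
    return default


def redact_ssns(text):
    """Replace every SSN-like string with a redaction marker."""
    return SSN_PATTERN.sub("[REDACTED]", text)


print(classify("SSN 123-45-6789 on file"))     # sensitive information
print(classify("Please shred the drafts"))     # hot document
print(redact_ssns("SSN 123-45-6789 on file"))  # SSN [REDACTED] on file
```

The disposition code returned by a rule can then drive later workflow steps, such as routing the record for review or redaction.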

Important Things to Know and Ask About Coding and Indexing
1. Coding work is usually performed according to the specifications provided in a Coding Manual. The manual can be created by the client, outside counsel, a consultant or the service provider. The important thing is for all relevant parties to agree and sign off on the specifications of the manual. A good Coding Manual should always provide concrete examples of how to and how not to code particulars.
2. Consider the optimal time for the coding work to be performed. The typical options are as part of a scanning or conversion process, or taking place in batch mode at some other time. In Valora’s experience, the post-conversion batch mode works the best in providing strong accuracy and consistency in the coding work product. We recommend tight, controlled batch coding, with step-by-step quality control as a best practice.
3. Coding can be charged by the file or document coded, by the page, by the character typed or by the hour. Make sure you understand how the charges work and how this will affect your overall budget. If you need to compare methods, consider these common metric conversions.
4. Valora recommends providing both text and images for coding projects. This way the Quality Control staff can easily search for information and also see it in its true format onscreen. Don’t be fooled into thinking that skipping OCR/text extraction or imaging will save you significant money. In the long run, the coding will take longer without these easily obtainable file elements.
5. If you are unsure which coded fields you will need, start small. You can always have additional fields added at a later date. For fully and semi-autocoded projects, this is a very easy (and expected) effort.
6. It is possible that not all of your documents will need to be coded the same way or for the same set of fields across the board. Consider different Levels of Treatment to save on coding expenses by narrowing the coding effort by Document Type, Document Status or Date.
7. Who will make the final judgment call on codes? Sometimes sensitive information can be tricky to assess. Who in the client organization (or designee) will have final authority on close calls?
8. How will you handle partial scenarios? For example, suppose only part of an SSN or Date is present. Should that information be captured as if it were present in full, or is it preferable to leave the partial information as it stands? Do you want the Coding Team to estimate dates or other partial info?
9. As long as documents and files are being analyzed and coded on a page-by-page basis, or possibly a line-by-line basis, what other information should be gathered at the same time, to save effort later on? Typical examples might include a document summary, first pass legal document review or redactions of sensitive information.
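The "partial scenarios" question in item 8 is worth settling in the Coding Manual before work begins. The Python sketch below shows one way such a decision could be expressed for dates. It assumes US-style input (MM/DD/YYYY, MM/YYYY or YYYY) and a convention of estimating missing parts as the first month or first day. The function name `code_date`, the status labels, and the estimation convention are all hypothetical choices made for illustration.

```python
import re
from datetime import date


def code_date(raw, estimate=False):
    """Return (coded_date, status) for a date string found in a document.

    Status is "complete", "partial", "estimated" or "unreadable".
    A partial date is left uncoded unless `estimate` is True.
    """
    raw = raw.strip()
    try:
        full = re.fullmatch(r"(\d{1,2})/(\d{1,2})/(\d{4})", raw)
        if full:
            month, day, year = (int(g) for g in full.groups())
            return date(year, month, day), "complete"

        partial = re.fullmatch(r"(?:(\d{1,2})/)?(\d{4})", raw)
        if partial:
            if not estimate:
                return None, "partial"
            # Assumed convention: a missing month or day is estimated as 1.
            month = int(partial.group(1)) if partial.group(1) else 1
            return date(int(partial.group(2)), month, 1), "estimated"
    except ValueError:
        # Digits were present but do not form a real calendar date.
        pass
    return None, "unreadable"


print(code_date("01/09/2011"))              # (datetime.date(2011, 1, 9), 'complete')
print(code_date("03/2011"))                 # (None, 'partial')
print(code_date("03/2011", estimate=True))  # (datetime.date(2011, 3, 1), 'estimated')
```

Keeping the status alongside the value lets a reviewer later distinguish dates read directly from the document from dates the Coding Team estimated.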
