Input requirements
The input for the document parsing job must meet these criteria:
- PDF documents with English text and up to 100 pages. Smaller files usually give better results.
- Embedded text or scanned documents.
- Documents that describe a single asset or piece of equipment.
- Key-value pair data representation.
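If you want to screen files before you ingest them, the criteria above can be sketched as a small pre-check. This is an illustrative sketch only: `DocumentInfo` and `check_input` are hypothetical names, not part of CDF, and the metadata (MIME type, page count, language code) is assumed to come from whatever tooling you already use to inspect files.

```python
from dataclasses import dataclass

MAX_PAGES = 100  # page limit stated in the input requirements


@dataclass
class DocumentInfo:
    # Hypothetical metadata record for illustration; not a CDF type.
    name: str
    mime_type: str
    page_count: int
    language: str  # assumed to be an ISO 639-1 code such as "en"


def check_input(doc: DocumentInfo) -> list[str]:
    """Return the reasons a document fails the criteria (empty list if none)."""
    problems = []
    if doc.mime_type != "application/pdf":
        problems.append("not a PDF")
    if doc.page_count > MAX_PAGES:
        problems.append(f"{doc.page_count} pages exceeds the {MAX_PAGES}-page limit")
    if doc.language.lower() != "en":
        problems.append("text is not English")
    return problems
```

The check covers only the criteria that can be tested mechanically. Whether a document describes a single asset and presents its data as key-value pairs still needs a manual look.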
Before you start
- Ingest the documents into CDF.
- Set up access capabilities.
- Create a view in a data model with properties that reflect the key-value data.
Parse documents
Create parsing task
Select Create parsing task, and then select the documents you want to parse.
You can parse several documents at the same time, but data from each document is ingested into a separate data model view.
Review the parsed data
If many properties show low confidence scores, your view property names may not align with the field names in the document, for example because of abbreviations, different wording, or spelling differences. Rename the properties in the view so that they more closely resemble the field names, and then run the parsing again.
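To spot property names that drift from the field names in a document, a rough string-similarity check can help. The sketch below uses Python's standard `difflib` module. The helpers `normalize` and `best_match` are hypothetical, and the similarity ratio is only a plain text comparison, not the confidence score that document parsing reports.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    # Lowercase and keep only letters and digits, so that
    # "Max. Pressure" and "maxPressure" compare as equal.
    return "".join(ch for ch in name.lower() if ch.isalnum())


def best_match(property_name: str, field_names: list[str]) -> tuple[str, float]:
    """Return the most similar document field name and its similarity ratio (0 to 1)."""
    target = normalize(property_name)
    scored = [
        (field, SequenceMatcher(None, target, normalize(field)).ratio())
        for field in field_names
    ]
    return max(scored, key=lambda pair: pair[1])
```

For example, `best_match("maxPressure", ["Max. Pressure", "Serial No."])` picks `"Max. Pressure"`, while a heavily abbreviated property name such as `sn` gets a low ratio against `"Serial Number"`, which suggests that renaming the property may help.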
- Select a property in the Parsed data sidebar to zoom into a field in the document.
- Hover over a field to update the value.
- Enable Confidence score to view the confidence level for each extracted property. Use the confidence score to decide how much to trust each value and where to focus your review before you approve:
| Score range | Interpretation |
|---|---|
| High (for example, 80–100%) | Strong match. The extracted value is likely correct; spot-check if needed. |
| Medium (for example, 50–79%) | Moderate match. Review the value and the document to confirm it maps to the right property. |
| Low (for example, below 50%) | Weak match. The field name in the document may differ from the property name. Verify or correct the value before approving. |
Exact score bands may vary. Focus review on lower-confidence properties; properties with higher scores usually require less review and can be trusted for automated workflows.
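If you export the scores and want to plan your review outside the UI, the triage described above can be sketched as follows. The thresholds of 80 and 50 are the example values from the table, not fixed product limits, and `review_band` and `review_order` are hypothetical helper names.

```python
def review_band(score: float, high: float = 80.0, medium: float = 50.0) -> str:
    """Map a confidence score in percent to a review band (example thresholds)."""
    if not 0.0 <= score <= 100.0:
        raise ValueError(f"score out of range: {score}")
    if score >= high:
        return "high"
    if score >= medium:
        return "medium"
    return "low"


def review_order(scores: dict[str, float]) -> list[str]:
    """Return property names sorted so the lowest-confidence values come first."""
    return sorted(scores, key=lambda name: scores[name])
```

Sorting by ascending score puts the properties that most need verification at the top of the review list. Adjust the thresholds to match the bands you observe in your own documents.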
Further reading
- About document parsing – Overview of document parsing, what the confidence score means, and how it is calculated