Key Capabilities
OCR Processing
Extract text from scanned documents and images.
Multiple Formats
Works with PDF, PNG, and JPEG (.jpeg or .jpg) files.
Page-by-Page Output
For multi-page documents, get text from each page separately.
Ready for AI
Output flows directly to AI tasks for analysis.
When to Use It
Use Opus Text Extraction when you need to convert documents or images into text your workflow can process:
- Pull text from scanned invoices or receipts
- Extract text from screenshots or photos
- Convert PDFs into text for AI analysis
- Digitize paper documents for processing
Opus Text Extraction outputs raw text. To extract specific fields (like amounts from invoices), follow it with an Opus Agent task to parse and structure the text.
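To illustrate what "parse and structure" means for raw text, here is a minimal Python sketch that pulls dollar amounts out of a string. It is not how Opus Agent works internally; it swaps in a plain regular expression to show the kind of transformation a downstream task performs. The function name `find_amounts` and the sample text are assumptions made for illustration.

```python
import re
from decimal import Decimal

# Matches a dollar sign, then digits with optional thousands separators
# and optional two-digit cents. Other currencies are not handled.
AMOUNT_PATTERN = re.compile(r"\$\s?(\d+(?:,\d{3})*(?:\.\d{2})?)")


def find_amounts(raw_text: str) -> list[Decimal]:
    """Return every dollar amount found in the raw text, in order of appearance."""
    matches = AMOUNT_PATTERN.findall(raw_text)
    # Strip thousands separators so Decimal can parse the number.
    return [Decimal(m.replace(",", "")) for m in matches]


if __name__ == "__main__":
    sample = "Total due: $1,234.50\nTax: $98.76"  # hypothetical extracted text
    print(find_amounts(sample))
```

A regular expression like this is brittle on real OCR output (misread characters, other currency formats), which is one reason to hand the parsing step to an agent task instead.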
Supported File Types
| Format | Extensions |
|---|---|
| PDF | .pdf |
| PNG | .png |
| JPEG | .jpeg, .jpg |
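If files reach the workflow from an upstream script, a quick extension check can catch unsupported types before they are sent for extraction. The sketch below is a minimal Python example that uses only the extensions listed in the table; the function name `is_supported` is an assumption, and the check looks at the file name only, not the file contents.

```python
from pathlib import Path

# Extensions taken from the Supported File Types table.
SUPPORTED_EXTENSIONS = {".pdf", ".png", ".jpeg", ".jpg"}


def is_supported(filename: str) -> bool:
    """Return True if the file extension is in the supported set (case-insensitive)."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for name in ["invoice.PDF", "receipt.jpg", "notes.docx"]:
        print(name, is_supported(name))
```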
How to Add Opus Text Extraction
1. Drop it into your workflow
Drag an Opus Text Extraction task into your workflow where you need to convert documents to text.
2. Connect the input file
Link the file input to a source, such as a Workflow Input, an Import Data task, or another task's file output.
3. Name the output
Give the extracted text output a meaningful name so you can reference it in later tasks.
4. Connect to downstream tasks
Wire the text output to tasks that will process the content, such as Opus Agent or Custom Agent.
5. Test it
Run a preview with a sample document to verify the extraction works correctly.
Tips for Better Results
Use high-quality source documents
OCR accuracy depends on input quality:
- Use high-resolution scans (300 DPI or higher)
- Make sure documents are properly aligned
- Avoid blurry or distorted images
- Ensure good contrast between text and background
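The 300 DPI guideline can be checked with simple arithmetic when you know a scan's pixel dimensions and the physical page size: DPI is pixels divided by inches. The Python sketch below is an illustration under that assumption; the function names are hypothetical, and it does not read metadata from the image file.

```python
MIN_DPI = 300  # threshold from the tip on high-resolution scans


def effective_dpi(width_px: int, height_px: int, width_in: float, height_in: float) -> float:
    """Return the lower of the horizontal and vertical pixels-per-inch values."""
    if width_in <= 0 or height_in <= 0:
        raise ValueError("physical dimensions must be positive")
    return min(width_px / width_in, height_px / height_in)


def meets_min_dpi(width_px: int, height_px: int, width_in: float, height_in: float) -> bool:
    """Return True if the scan resolution is at least MIN_DPI on both axes."""
    return effective_dpi(width_px, height_px, width_in, height_in) >= MIN_DPI


if __name__ == "__main__":
    # A US Letter page (8.5 x 11 in) scanned to 2550 x 3300 pixels is 300 DPI.
    print(effective_dpi(2550, 3300, 8.5, 11))
    # The same page at 1275 x 1650 pixels is only 150 DPI.
    print(meets_min_dpi(1275, 1650, 8.5, 11))
```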
Validate extracted text
OCR isn’t perfect—plan for errors:
- Use Agentic Review to check extraction quality
- Add Human Review for critical documents
- Handle low-confidence extractions appropriately
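One way to think about these three options is as a routing decision. The Python sketch below is purely illustrative: this page does not say whether Opus Text Extraction exposes a confidence score, so the `confidence` value, the 0.85 threshold, and the route names are all assumptions. In a real workflow this logic would be expressed with the review tasks themselves.

```python
def route_extraction(confidence: float, critical: bool = False, threshold: float = 0.85) -> str:
    """Pick a review path for an extraction result.

    confidence: hypothetical score between 0.0 and 1.0 (an assumption).
    critical:   True for documents that always need a human check.
    threshold:  illustrative cutoff below which the result gets an automated review.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    if critical:
        return "human_review"
    if confidence < threshold:
        return "agentic_review"
    return "accept"


if __name__ == "__main__":
    print(route_extraction(0.95))
    print(route_extraction(0.60))
    print(route_extraction(0.95, critical=True))
```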
Structure text with downstream agents
Opus Text Extraction outputs raw text:
- Use Opus Agent to parse into structured fields
- Use Custom Agent to extract specific data points
- Validate structure before further processing
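The last point, validating structure, can be sketched as a check that a parsed record has the fields and types you expect before it moves on. The Python example below is a minimal illustration; the field names `invoice_number` and `total_amount` and the shape of the record are hypothetical, not an Opus output format.

```python
# Hypothetical schema: field name -> expected Python type.
REQUIRED_FIELDS = {"invoice_number": str, "total_amount": float}


def validate_fields(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record passed."""
    problems = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected_type):
            actual = type(record[name]).__name__
            problems.append(f"{name}: expected {expected_type.__name__}, got {actual}")
    return problems


if __name__ == "__main__":
    print(validate_fields({"invoice_number": "INV-1", "total_amount": 12.5}))
    print(validate_fields({"invoice_number": "INV-1", "total_amount": "12.50"}))
```

Returning a list of problems rather than raising on the first one makes it easier to report everything wrong with a record in a single review pass.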