Opus Text Extraction

The Opus Text Extraction task uses OCR (Optical Character Recognition) to pull text out of documents and images. Feed it a PDF, PNG, or JPEG, and it returns the text content—ready for processing by Opus Agent, Custom Agent, or other tasks in your workflow.

Key Capabilities

OCR Processing

Extract text from scanned documents and images.

Multiple Formats

Works with PDF, PNG, and JPEG (`.jpeg` or `.jpg`) files.

Page-by-Page Output

For multi-page documents, get text from each page separately.

Ready for AI

Output flows directly to AI tasks for analysis.

When to Use It

Use Opus Text Extraction when you need to convert documents or images into text your workflow can process:

Pull text from scanned invoices or receipts
Extract text from screenshots or photos
Convert PDFs into text for AI analysis
Digitize paper documents for processing

Opus Text Extraction outputs raw text. To extract specific fields (like amounts from invoices), follow it with an Opus Agent task to parse and structure the text.

Supported File Types

Format	Extensions
PDF	`.pdf`
PNG	`.png`
JPEG	`.jpeg`, `.jpg`

Only PDF, PNG, and JPEG (`.jpeg` or `.jpg`) files are supported. Other file types cause the task to fail. The maximum file size is 10 MB.
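
Because an unsupported file makes the task fail, it can help to check files before they reach the workflow. The following is a minimal sketch of such a pre-check, not part of Opus itself: `check_file` is a hypothetical helper, the byte value used for "10 MB" is an assumption, and so is treating extensions as case-insensitive.

```python
from pathlib import Path

# Extensions from the Supported File Types table.
SUPPORTED_EXTENSIONS = {".pdf", ".png", ".jpeg", ".jpg"}
# Assumption: "10 MB" is read here as 10 * 1024 * 1024 bytes.
MAX_FILE_SIZE_BYTES = 10 * 1024 * 1024


def check_file(name: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    # Assumption: extensions are compared case-insensitively.
    extension = Path(name).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {extension or '(none)'}")
    if size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append(f"file is {size_bytes} bytes; limit is {MAX_FILE_SIZE_BYTES}")
    return problems


print(check_file("invoice.PDF", 2_000_000))  # → []
print(check_file("notes.docx", 500))         # → ['unsupported file type: .docx']
```

Returning a list of problems rather than raising on the first one lets the caller report every issue with a file at once.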

Input

The File input accepts the document to extract text from. Only the File (single) type is supported; passing multiple files is not currently supported.

Synchronous Extraction

The Synchronous Extraction toggle (off by default) controls how the extraction is processed:

Off (Asynchronous): Supports multi-page PDFs. The extraction runs in the background and returns results when it completes. Use this for documents with multiple pages or for larger files.
On (Synchronous): Returns results immediately, but supports single-page documents only. If you pass a multi-page PDF with synchronous extraction enabled, the extraction fails.

If page counts vary or are not known in advance, keep synchronous extraction off so that multi-page documents do not fail.
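
The rule above can be written as a small decision function. This is only an illustrative sketch: `use_synchronous` is a hypothetical helper, and it assumes the page count is obtained elsewhere, or passed as `None` when unknown.

```python
from typing import Optional


def use_synchronous(page_count: Optional[int]) -> bool:
    """Suggest a value for the Synchronous Extraction toggle.

    page_count is None when the number of pages is not known in advance.
    """
    # Synchronous mode only supports single-page documents, so suggest it
    # only when the document is known to have exactly one page.
    return page_count == 1


print(use_synchronous(1))     # → True
print(use_synchronous(12))    # → False
print(use_synchronous(None))  # → False
```

Defaulting to asynchronous when the page count is unknown matches the toggle's own default (off).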

How to Add Opus Text Extraction

Drop it into your workflow

Drag an Opus Text Extraction task into your workflow where you need to convert documents to text.

Connect the input file

Link the file input to a source—like a Workflow Input, Import Data task, or another task’s file output.

Name the output

Give the extracted text output a meaningful name so you can reference it in later tasks.

Connect to downstream tasks

Wire the text output to tasks that will process the content—like Opus Agent or Custom Agent.

Test it

Run a preview with a sample document to verify the extraction works correctly.

For multi-page PDFs, the output is a list where each item contains the text from one page. This makes it easier to process documents page by page.
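
As an illustration of working with page-by-page output, the sketch below combines a list of page texts into one string with page markers. It assumes the output arrives as a plain list of strings, one per page. `join_pages` and the sample text are hypothetical.

```python
def join_pages(pages: list[str]) -> str:
    """Combine per-page text into one string, labeling each page."""
    sections = []
    for number, text in enumerate(pages, start=1):
        # Strip stray whitespace so markers and text line up consistently.
        sections.append(f"[Page {number}]\n{text.strip()}")
    return "\n\n".join(sections)


# Hypothetical extraction output for a two-page PDF.
pages = ["Invoice summary\nTotal due: 250.00", "Payment terms: net 30"]
print(join_pages(pages))
```

Keeping the page markers in the combined text lets a downstream agent refer back to the page a value came from.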

Tips for Better Results

Use high-quality source documents

OCR accuracy depends on input quality:

Use high-resolution scans (300 DPI or higher)
Make sure documents are properly aligned
Avoid blurry or distorted images
Ensure good contrast between text and background
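
To check a scan against the 300 DPI guideline, you can estimate its resolution from the image's pixel dimensions and the physical page size. A rough sketch, assuming the page size in inches is known. `effective_dpi` is a hypothetical helper.

```python
MIN_DPI = 300  # guideline from the list above


def effective_dpi(width_px: int, height_px: int,
                  width_in: float, height_in: float) -> float:
    """Estimate scan resolution from pixel size and physical page size."""
    # Take the lower of the two axes so the estimate is conservative.
    return min(width_px / width_in, height_px / height_in)


# A US Letter page (8.5 x 11 in) scanned to 2550 x 3300 pixels.
dpi = effective_dpi(2550, 3300, 8.5, 11.0)
print(dpi, dpi >= MIN_DPI)  # → 300.0 True
```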

Validate extracted text

OCR isn’t perfect—plan for errors:

Use Agentic Review to check extraction quality
Add Human Review for critical documents
Handle low-confidence extractions appropriately

Structure text with downstream agents

Opus Text Extraction outputs raw text:

Use Opus Agent to parse into structured fields
Use Custom Agent to extract specific data points
Validate structure before further processing
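
One way to picture "validate structure before further processing" is a check on the fields a downstream agent returns. The sketch below is an assumption-heavy illustration: the field names (`invoice_number`, `total`), the dictionary shape with string values, and the `validate_fields` helper are all hypothetical and not defined by Opus.

```python
from decimal import Decimal, InvalidOperation

# Hypothetical field names for an invoice-parsing agent.
REQUIRED_FIELDS = ("invoice_number", "total")


def validate_fields(fields: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the fields pass."""
    errors = []
    for name in REQUIRED_FIELDS:
        # Treat missing keys, None, and blank strings as missing.
        if not str(fields.get(name) or "").strip():
            errors.append(f"missing field: {name}")
    total = fields.get("total")
    if total:
        try:
            # Drop thousands separators and a leading currency sign before parsing.
            Decimal(str(total).replace(",", "").lstrip("$"))
        except InvalidOperation:
            errors.append(f"total is not a number: {total!r}")
    return errors


print(validate_fields({"invoice_number": "1001", "total": "$1,250.00"}))  # → []
# OCR often confuses the letter O with the digit 0.
print(validate_fields({"invoice_number": "1001", "total": "12O.00"}))
```

A failed check is a natural point to route the document to Agentic Review or Human Review instead of continuing the workflow.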

Related

Opus Agent

Process extracted text with AI reasoning.

Custom Agent

Parse extracted text into structured data.

Import Data

Pull documents from connected services.

Review Task

Validate extraction quality with human or AI review.
