OnnxOCR for Dify

Description

OnnxOCR is a Dify plugin that extracts text from images by calling an OnnxOCR service. It converts images containing text into machine-readable text for use in Dify workflows and applications.

Features

Image URL Support: Process images from web URLs
High Performance: Powered by OnnxOCR's ONNX-optimized PaddleOCR backend
Configurable API Endpoint: Use your own OnnxOCR service endpoint
Dual Output Format: Provides both clean text and structured JSON data
Robust Error Handling: Built-in retry mechanism and comprehensive error reporting
Workflow Integration: Provides structured variables for use in Dify workflows

Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `image_url` | string | Yes | - | URL of the image to process |
| `api_endpoint` | string | No | `http://127.0.0.1:5005` | OCR API endpoint URL |

Output Variables

The plugin provides the following variables for use in workflows:

`extracted_text` (string): Clean extracted text from the image (line-separated)
`ocr_results` (object): Complete OCR API response with coordinates and confidence scores
`processing_time` (number): Time taken to process the image in seconds

Output Format

Each call returns the text form and the JSON form of the result together:

Text Display: Clean text format shown to users
JSON Data: Complete OCR response available as structured data
Workflow Variables: Both text and JSON formats available for workflow nodes

Usage Examples

Text Output (User Display)

```
Name
Header
Additional Text
```

JSON Output (Workflow Data)

```
{
  "processing_time": 0.456,
  "results": [
    {
      "text": "Name",
      "confidence": 0.9999361634254456,
      "bounding_box": [[4.0, 8.0], [31.0, 8.0], [31.0, 24.0], [4.0, 24.0]]
    },
    {
      "text": "Header",
      "confidence": 0.9998759031295776,
      "bounding_box": [[233.0, 7.0], [258.0, 7.0], [258.0, 23.0], [233.0, 23.0]]
    }
  ]
}
```
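As a rough illustration of how the JSON output can be consumed in a later code node, the sketch below keeps only the lines whose confidence meets a threshold. The function name `lines_above_confidence` and the default threshold are hypothetical, and the second sample confidence has been lowered on purpose so that the filter has something to drop.

```python
def lines_above_confidence(ocr_results: dict, min_confidence: float = 0.5) -> str:
    """Join recognized text lines whose confidence meets the threshold."""
    kept = []
    for item in ocr_results.get("results", []):
        # Each item carries "text", "confidence", and "bounding_box",
        # matching the sample JSON output.
        if item.get("confidence", 0.0) >= min_confidence:
            kept.append(item.get("text", ""))
    return "\n".join(kept)


# Sample data shaped like the JSON output; the 0.42 value is illustrative only.
sample = {
    "processing_time": 0.456,
    "results": [
        {"text": "Name", "confidence": 0.9999,
         "bounding_box": [[4.0, 8.0], [31.0, 8.0], [31.0, 24.0], [4.0, 24.0]]},
        {"text": "Header", "confidence": 0.42,
         "bounding_box": [[233.0, 7.0], [258.0, 7.0], [258.0, 23.0], [233.0, 23.0]]},
    ],
}

print(lines_above_confidence(sample, min_confidence=0.9))  # prints: Name
```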

Workflow Integration Guide

Basic Integration Steps

Add OnnxOCR Node: Add the OnnxOCR tool to your workflow
Configure Parameters: Set the image URL and optional API endpoint
Use Output Variables: Reference the extracted text in subsequent nodes

Processing Uploaded Images

If you need to process user-uploaded images, use a code execution node to extract file URLs:

```
def main(files: list) -> dict:
    image_urls = []

    if files:
        for file_obj in files:
            if isinstance(file_obj, dict) and file_obj.get('type') == 'image' and 'url' in file_obj:
                image_urls.append(file_obj['url'])

    return {
        'image_urls': image_urls
    }
```

Usage Steps:

Add a Code Execution Node with the above code
Pass file uploads as input to the code node
The code node outputs image_urls (Array[string])
Use an Iteration Node to loop through the URL array
Call the OnnxOCR node within the iteration to process each image
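
Before wiring the code node into a workflow, it may help to run the function locally against a few made-up inputs. The sketch below repeats the code-node function and feeds it hypothetical file objects; the `type` and `url` keys come from the code above, while the example URLs and the non-image entries are assumptions chosen to exercise the filter.

```python
def main(files: list) -> dict:
    """Collect the URLs of uploaded files that are images."""
    image_urls = []
    if files:
        for file_obj in files:
            # Keep only dict entries that are typed as images and carry a URL.
            if isinstance(file_obj, dict) and file_obj.get("type") == "image" and "url" in file_obj:
                image_urls.append(file_obj["url"])
    return {"image_urls": image_urls}


# Hypothetical uploads: one valid image, one document, one image without a URL,
# and one entry that is not a dict at all.
uploads = [
    {"type": "image", "url": "https://example.com/a.png"},
    {"type": "document", "url": "https://example.com/b.pdf"},
    {"type": "image"},
    "not-a-dict",
]

result = main(uploads)
print(result)  # prints: {'image_urls': ['https://example.com/a.png']}
```

Only the first entry survives, so the Iteration Node would call the OnnxOCR node once for this input.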

Installation

Step 1: Install OnnxOCR Service

This plugin requires a running OnnxOCR service. Install and start the original OnnxOCR project using one of the following options:

Option 1: Using the Original OnnxOCR Project

Clone the repository:

```
git clone https://github.com/jingsongliujing/OnnxOCR.git
cd OnnxOCR
```

Install dependencies:
```
pip install -r requirements.txt
```
Run the service:
```
python app.py
```
The service will start on http://127.0.0.1:5005 by default.

Option 2: Docker Installation

Build the Docker image:

```
git clone https://github.com/jingsongliujing/OnnxOCR.git
cd OnnxOCR
docker build -t onnxocr .
```

Run the Docker container:
```
docker run -p 5005:5005 onnxocr
```
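
Whichever option is used, it can be worth confirming that something is listening on the endpoint before installing the plugin. The following is a minimal sketch using only the Python standard library; the helper names `endpoint_host_port` and `is_service_listening` are hypothetical, and a successful TCP connection only shows that the port is open, not that the OCR API itself is healthy.

```python
import socket
from urllib.parse import urlparse


def endpoint_host_port(api_endpoint: str) -> tuple:
    """Split an endpoint URL into (host, port), defaulting the port by scheme."""
    parsed = urlparse(api_endpoint)
    host = parsed.hostname or "127.0.0.1"
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    return host, port


def is_service_listening(api_endpoint: str = "http://127.0.0.1:5005", timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the endpoint's host and port succeeds."""
    try:
        with socket.create_connection(endpoint_host_port(api_endpoint), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    print("listening" if is_service_listening() else "not reachable")
```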

Step 2: Install Dify Plugin

Download or clone this plugin repository
Upload the plugin to your Dify instance
Configure the API endpoint if using a custom OnnxOCR service
Start using the tool in your applications or workflows

API Requirements

The plugin requires a running OnnxOCR service that is reachable at the configured endpoint. The service must accept POST requests to `/ocr` in the following format:

Request:

```
{
  "image": "base64_encoded_image_data"
}
```

Response:

```
{
  "processing_time": 0.456,
  "results": [
    {
      "text": "extracted_text",
      "confidence": 0.99,
      "bounding_box": [[x1, y1], [x2, y2], [x3, y3], [x4, y4]]
    }
  ]
}
```
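
To make the request format concrete, here is a sketch of a client that downloads an image from a URL, base64-encodes it, and posts it to `/ocr`. It is not taken from the plugin's source; the function names, the `Content-Type` header, and the 30-second timeout are assumptions, and only the `{"image": ...}` body and the `/ocr` path come from the format described above.

```python
import base64
import json
import urllib.request


def build_ocr_payload(image_bytes: bytes) -> bytes:
    """Encode raw image bytes into the JSON body expected by POST /ocr."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"image": encoded}).encode("utf-8")


def ocr_url(api_endpoint: str) -> str:
    """Append the /ocr path to the configured endpoint, tolerating a trailing slash."""
    return api_endpoint.rstrip("/") + "/ocr"


def run_ocr(image_url: str, api_endpoint: str = "http://127.0.0.1:5005",
            timeout: float = 30.0) -> dict:
    """Download an image, send it to the OCR service, and return the parsed JSON."""
    with urllib.request.urlopen(image_url, timeout=timeout) as resp:
        image_bytes = resp.read()
    request = urllib.request.Request(
        ocr_url(api_endpoint),
        data=build_ocr_payload(image_bytes),
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `run_ocr("https://example.com/receipt.png")` against a running service would be expected to return a dictionary with the `processing_time` and `results` keys shown in the response format; the example URL is a placeholder.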

Error Handling

The plugin includes comprehensive error handling for:

Invalid image URLs
Network connectivity issues
API endpoint unavailability
Image processing failures

All errors are reported with descriptive messages to help troubleshoot issues.
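
The retry behavior mentioned under Features is not documented in detail here, so the following is only a generic sketch of retrying with exponential backoff, not the plugin's actual implementation. The name `call_with_retry`, the three attempts, and the delay values are all assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(func: Callable[[], T], attempts: int = 3, base_delay: float = 0.5,
                    retry_on: tuple = (OSError,)) -> T:
    """Call func, retrying on the given exceptions with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            # urllib.error.URLError and socket timeouts are subclasses of OSError.
            if attempt == attempts:
                raise  # out of attempts: surface the last error to the caller
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

A network call can then be wrapped as `call_with_retry(lambda: fetch(url))`, where `fetch` stands for whatever function performs the request.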

Acknowledgments

This plugin is built on top of the excellent OnnxOCR project by jingsongliujing. OnnxOCR provides a high-performance OCR solution using ONNX Runtime with PaddleOCR models, delivering fast and accurate text recognition capabilities.

Original Project: https://github.com/jingsongliujing/OnnxOCR

We extend our gratitude to the OnnxOCR project maintainers and contributors for their excellent work in making OCR accessible and efficient.

Supported Image Formats

The plugin accepts the image formats that the underlying OnnxOCR service can read:

JPEG
PNG
BMP
GIF
WebP
TIFF

License

This plugin is provided as-is for use with Dify. Please refer to the original OnnxOCR project for licensing information regarding the OCR service.

Contributing

Contributions to improve this plugin are welcome. Please ensure that any changes maintain compatibility with the OnnxOCR API format.

Support

For issues related to:

Plugin functionality: Open an issue in this repository
OCR service: Refer to the OnnxOCR project
Dify integration: Check the Dify documentation
