OnnxOCR is a Dify plugin that extracts text from images using the OnnxOCR API. This plugin provides a simple and efficient way to convert images containing text into machine-readable text format within Dify workflows and applications.
- Image URL Support: Process images from web URLs
- High Performance: Powered by OnnxOCR's ONNX-optimized PaddleOCR backend
- Configurable API Endpoint: Use your own OnnxOCR service endpoint
- Dual Output Format: Provides both clean text and structured JSON data
- Robust Error Handling: Built-in retry mechanism and comprehensive error reporting
- Workflow Integration: Provides structured variables for use in Dify workflows
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| image_url | string | Yes | - | URL of the image to process |
| api_endpoint | string | No | http://127.0.0.1:5005 | OCR API endpoint URL |
The plugin provides the following variables for use in workflows:
- extracted_text (string): Clean extracted text from the image (line-separated)
- ocr_results (object): Complete OCR API response with coordinates and confidence scores
- processing_time (number): Time taken to process the image in seconds
The plugin automatically outputs both formats simultaneously:
- Text Display: Clean text format shown to users
- JSON Data: Complete OCR response available as structured data
- Workflow Variables: Both text and JSON formats available for workflow nodes
Name
Header
Additional Text
{
  "processing_time": 0.456,
  "results": [
    {
      "text": "Name",
      "confidence": 0.9999361634254456,
      "bounding_box": [[4.0, 8.0], [31.0, 8.0], [31.0, 24.0], [4.0, 24.0]]
    },
    {
      "text": "Header",
      "confidence": 0.9998759031295776,
      "bounding_box": [[233.0, 7.0], [258.0, 7.0], [258.0, 23.0], [233.0, 23.0]]
    }
  ]
}
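The relationship between the two output formats can be sketched in a few lines of Python. This is only an illustration, assuming that extracted_text is the newline-joined "text" fields of ocr_results; the function name `results_to_text` and the `min_confidence` filter are hypothetical and not part of the plugin.

```python
def results_to_text(ocr_results: dict, min_confidence: float = 0.0) -> str:
    """Join the recognized text fragments into line-separated text.

    Assumption: each entry in "results" has "text" and "confidence" keys,
    as in the sample response above.
    """
    lines = [
        item["text"]
        for item in ocr_results.get("results", [])
        if item.get("text") and item.get("confidence", 0.0) >= min_confidence
    ]
    return "\n".join(lines)


sample = {
    "processing_time": 0.456,
    "results": [
        {
            "text": "Name",
            "confidence": 0.9999361634254456,
            "bounding_box": [[4.0, 8.0], [31.0, 8.0], [31.0, 24.0], [4.0, 24.0]],
        },
        {
            "text": "Header",
            "confidence": 0.9998759031295776,
            "bounding_box": [[233.0, 7.0], [258.0, 7.0], [258.0, 23.0], [233.0, 23.0]],
        },
    ],
}

print(results_to_text(sample))
```

A downstream code node could apply the same idea to the ocr_results variable, for example to drop low-confidence fragments before passing text to an LLM node.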
- Add OnnxOCR Node: Add the OnnxOCR tool to your workflow
- Configure Parameters: Set the image URL and optional API endpoint
- Use Output Variables: Reference the extracted text in subsequent nodes
If you need to process user-uploaded images, use a code execution node to extract file URLs:
def main(files: list) -> dict:
    # Collect the URL of each uploaded file whose type is 'image'.
    image_urls = []
    if files:
        for file_obj in files:
            if isinstance(file_obj, dict) and file_obj.get('type') == 'image' and 'url' in file_obj:
                image_urls.append(file_obj['url'])
    return {
        'image_urls': image_urls
    }
Usage Steps:
- Add a Code Execution Node with the above code
- Pass file uploads as input to the code node
- The code node outputs image_urls (Array[string])
- Use an Iteration Node to loop through the URL array
- Call the OnnxOCR node within the iteration to process each image
This plugin requires an OnnxOCR service to be running. You can install and set up the original OnnxOCR project:
- Clone the repository:
  git clone https://github.com/jingsongliujing/OnnxOCR.git
  cd OnnxOCR
- Install dependencies:
  pip install -r requirements.txt
- Run the service:
  python app.py
  The service will start on http://127.0.0.1:5005 by default.
- Build the Docker image:
  git clone https://github.com/jingsongliujing/OnnxOCR.git
  cd OnnxOCR
  docker build -t onnxocr .
- Run the Docker container:
  docker run -p 5005:5005 onnxocr
- Download or clone this plugin repository
- Upload the plugin to your Dify instance
- Configure the API endpoint if using a custom OnnxOCR service
- Start using the tool in your applications or workflows
This plugin requires an OnnxOCR service running and accessible at the configured endpoint. The service should accept POST requests to /ocr with the following format:
Request:
{
  "image": "base64_encoded_image_data"
}
Response:
{
  "processing_time": 0.456,
  "results": [
    {
      "text": "extracted_text",
      "confidence": 0.99,
      "bounding_box": [[x1, y1], [x2, y2], [x3, y3], [x4, y4]]
    }
  ]
}
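To test a service against this format outside Dify, a request can be built with the Python standard library. The sketch below is not the plugin's actual code: the helper names `build_ocr_request` and `run_ocr`, the `Content-Type: application/json` header, and the 30-second timeout are assumptions made for illustration.

```python
import base64
import json
import urllib.request


def build_ocr_request(image_bytes: bytes,
                      api_endpoint: str = "http://127.0.0.1:5005") -> urllib.request.Request:
    """Build a POST request to <api_endpoint>/ocr with a base64-encoded image."""
    body = json.dumps({"image": base64.b64encode(image_bytes).decode("ascii")})
    return urllib.request.Request(
        api_endpoint.rstrip("/") + "/ocr",
        data=body.encode("utf-8"),
        # Assumption: the service expects a JSON body.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_ocr(image_url: str,
            api_endpoint: str = "http://127.0.0.1:5005",
            timeout: float = 30.0) -> dict:
    """Download the image, send it to the OCR service, and return the parsed JSON."""
    with urllib.request.urlopen(image_url, timeout=timeout) as resp:
        image_bytes = resp.read()
    request = build_ocr_request(image_bytes, api_endpoint)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example (requires a running service and a reachable image URL):
# result = run_ocr("https://example.com/sample.png")
# print(result["processing_time"], len(result["results"]))
```

Splitting request construction from the network call keeps the payload format easy to inspect without a running service.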
The plugin includes comprehensive error handling for:
- Invalid image URLs
- Network connectivity issues
- API endpoint unavailability
- Image processing failures
All errors are reported with descriptive messages to help troubleshoot issues.
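The plugin's internal retry logic is not documented here, so the following is only a sketch of one common approach: retrying network failures with exponential backoff. The function name `call_with_retries`, the default of three attempts, and the 0.5-second base delay are all hypothetical.

```python
import time
import urllib.error


def call_with_retries(func, attempts: int = 3, base_delay: float = 0.5, sleep=time.sleep):
    """Call func(), retrying on network errors with exponential backoff.

    Raises RuntimeError with a descriptive message once all attempts fail.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return func()
        except (urllib.error.URLError, TimeoutError, ConnectionError) as exc:
            last_error = exc
            # Wait 0.5s, 1s, 2s, ... between attempts, but not after the last one.
            if attempt < attempts - 1:
                sleep(base_delay * (2 ** attempt))
    raise RuntimeError(
        f"OCR request failed after {attempts} attempts: {last_error}"
    ) from last_error
```

The `sleep` parameter is injectable so that the backoff schedule can be checked in tests without real delays.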
This plugin is built on top of the excellent OnnxOCR project by jingsongliujing. OnnxOCR provides a high-performance OCR solution using ONNX Runtime with PaddleOCR models, delivering fast and accurate text recognition capabilities.
Original Project: https://github.com/jingsongliujing/OnnxOCR
We extend our gratitude to the OnnxOCR project maintainers and contributors for their excellent work in making OCR accessible and efficient.
The plugin handles every image format that the underlying OnnxOCR service supports:
- JPEG
- PNG
- BMP
- GIF
- WebP
- TIFF
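If a workflow should reject unsupported files before calling the service, the format can be detected from the first bytes of the downloaded image. This is an optional client-side sketch based on the well-known file signatures of the formats listed above; the function name `sniff_image_format` is hypothetical, and neither the plugin nor the service is known to perform this check.

```python
from typing import Optional


def sniff_image_format(data: bytes) -> Optional[str]:
    """Return the image format implied by the file signature, or None if unknown."""
    if data.startswith(b"\xff\xd8\xff"):
        return "JPEG"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "PNG"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "GIF"
    if data.startswith(b"BM"):
        return "BMP"
    # WebP is a RIFF container with "WEBP" at byte offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "WebP"
    # TIFF starts with a little-endian ("II") or big-endian ("MM") marker.
    if data.startswith((b"II*\x00", b"MM\x00*")):
        return "TIFF"
    return None
```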
This plugin is provided as-is for use with Dify. Please refer to the original OnnxOCR project for licensing information regarding the OCR service.
Contributions to improve this plugin are welcome. Please ensure that any changes maintain compatibility with the OnnxOCR API format.
For issues related to:
- Plugin functionality: Open an issue in this repository
- OCR service: Refer to the OnnxOCR project
- Dify integration: Check the Dify documentation