Forge OCR Server v0.1.0

Multi-layer document OCR with consensus and validation

Active Layers: 2 | Document Types: 2 | Max Upload: 50 MB | Avg Response: ~1-3s

API Endpoints

POST  /api/v1/ocr/{doc_type}       OCR any document (multipart)
POST  /api/v1/ocr/{doc_type}/json  OCR any document (base64 JSON)
POST  /api/v1/pan/ocr              PAN card OCR (multipart)
POST  /api/v1/pan/ocr/json         PAN card OCR (base64 JSON)
GET   /api/v1/documents            List document types
GET   /api/v1/pan/config           Default PAN layer config
GET   /api/v1/layers               Available OCR layers
GET   /api/v1/health               Health check
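The /json endpoint variants accept a base64-encoded image in the request body. Below is a minimal Python sketch of building that payload; note that the "image" key name is an assumption, since the exact JSON schema is not shown on this page.

```python
import base64
import json

def build_ocr_payload(image_path: str) -> str:
    """Read an image file and wrap it in a base64 JSON body.

    NOTE: the "image" key is an assumption; check the server docs
    for the actual field name expected by the /json endpoints.
    """
    with open(image_path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    return json.dumps({"image": encoded})
```

POST the returned string with Content-Type: application/json to, for example, /api/v1/pan/ocr/json.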

Supported Documents

PAN Card (India)             doc_type: pan_card
Aadhaar Card (UIDAI, India)  doc_type: aadhaar_card

OCR Layers

ID                  Description                             Weight  PSM  Preprocess           Status
ocrs_deep_learning  OCRS Deep Learning OCR (best accuracy)  0.90    3    grayscale, denoise   active
tesseract           Tesseract LSTM (reliable for PAN/DOB)   0.70    3    grayscale            active
onnx_ocr            ONNX OCR (slow on large images)         0.85    6    grayscale, binarize  disabled
llm_anthropic       LLM Vision (garbles names)              0.95    11   none                 disabled
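Each layer entry follows the schema shown in the Quick Start config override (id, description, weight, psm, preprocess flags, enabled). A sketch of building such a config programmatically; the field names are taken from that example and should be treated as the expected schema:

```python
import json

def layer(id_, description, weight, psm, *, grayscale=False, binarize=False,
          denoise=False, deskew=False, enabled=True):
    """Build one layer entry matching the schema used by the
    optional `config` form field on the OCR endpoints."""
    return {
        "id": id_,
        "description": description,
        "weight": weight,
        "psm": psm,
        "preprocess": {
            "grayscale": grayscale,
            "binarize": binarize,
            "denoise": denoise,
            "deskew": deskew,
        },
        "enabled": enabled,
    }

# Reproduce the two active layers from the table above.
config = [
    layer("ocrs_deep_learning", "OCRS Deep Learning OCR", 0.90, 3,
          grayscale=True, denoise=True),
    layer("tesseract", "Tesseract LSTM", 0.70, 3, grayscale=True),
]
config_json = json.dumps(config)
```

Pass config_json as the `config` form field to run only the layers you list.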

Quick Start

# Health check
curl https://forge-ocr.sachinkumarskrose.workers.dev/api/v1/health

# PAN card OCR (multipart)
curl -X POST https://forge-ocr.sachinkumarskrose.workers.dev/api/v1/pan/ocr \
  -F "image=@pan_card.jpg"

# With layer details
curl -X POST https://forge-ocr.sachinkumarskrose.workers.dev/api/v1/pan/ocr \
  -F "image=@pan_card.jpg" \
  -F "include_layers=true"

# Single layer override (fastest)
curl -X POST https://forge-ocr.sachinkumarskrose.workers.dev/api/v1/pan/ocr \
  -F "image=@pan_card.jpg" \
  -F 'config=[{"id":"ocrs_deep_learning","description":"OCRS","weight":0.9,"psm":3,"preprocess":{"grayscale":true,"binarize":false,"denoise":true,"deskew":false},"enabled":true}]'

# Generic route (any document type)
curl -X POST https://forge-ocr.sachinkumarskrose.workers.dev/api/v1/ocr/pan_card \
  -F "image=@pan_card.jpg"

How It Works

  1. Upload image (multipart or base64 JSON)
  2. Multiple OCR layers process image in parallel
  3. Fields parsed from raw text using document-specific regex
  4. Results merged via weighted consensus per field
  5. Each field validated (format, checksum, date range)
  6. Response: extracted fields, confidence score, review recommendation
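Steps 4 and 5 above can be sketched as follows. This is a hypothetical illustration of weighted per-field consensus and validation, not the server's actual code; the field names are invented, and only the PAN format rule (5 letters, 4 digits, 1 letter) is the standard public format.

```python
import re
from collections import defaultdict

def consensus(layer_results):
    """Merge per-layer field values via weighted voting.

    layer_results: list of (weight, {field: value}) pairs.
    Returns {field: (winning_value, confidence)}, where confidence
    is the winner's share of the total weight voting on that field.
    """
    votes = defaultdict(lambda: defaultdict(float))
    for weight, fields in layer_results:
        for field, value in fields.items():
            votes[field][value] += weight
    merged = {}
    for field, candidates in votes.items():
        total = sum(candidates.values())
        value, score = max(candidates.items(), key=lambda kv: kv[1])
        merged[field] = (value, score / total)
    return merged

PAN_RE = re.compile(r"^[A-Z]{5}[0-9]{4}[A-Z]$")  # standard PAN format

def validate_pan(fields):
    """Format check plus a minimum-confidence threshold (threshold is illustrative)."""
    value, confidence = fields["pan_number"]
    return bool(PAN_RE.match(value)) and confidence >= 0.5

# Two layers agree on the PAN but disagree on the name.
results = [
    (0.90, {"pan_number": "ABCDE1234F", "name": "RAVI KUMAR"}),
    (0.70, {"pan_number": "ABCDE1234F", "name": "RAV1 KUMAR"}),
]
merged = consensus(results)
```

Here the PAN wins with full confidence (both layers agree), while the name's confidence drops to the heavier layer's share of the vote, which is the kind of signal that drives the review recommendation in step 6.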