Forge OCR Server v0.1.0

Multi-layer document OCR with consensus and validation

Active Layers: 2 | Document Types: 2 | Max Upload: 50 MB | Avg Response: ~1-3s

API Endpoints

POST  /api/v1/ocr/{doc_type}       OCR any document (multipart)
POST  /api/v1/ocr/{doc_type}/json  OCR any document (base64 JSON)
POST  /api/v1/pan/ocr              PAN card OCR (multipart)
POST  /api/v1/pan/ocr/json         PAN card OCR (base64 JSON)
GET   /api/v1/documents            List document types
GET   /api/v1/pan/config           Default PAN layer config
GET   /api/v1/layers               Available OCR layers
GET   /api/v1/health               Health check
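The /json endpoint variants accept a base64-encoded image in the request body. Below is a minimal Python sketch of building that payload; note that the "image" key name is an assumption, since the exact JSON schema is not shown on this page.

```python
import base64
import json

def build_ocr_payload(image_path: str) -> str:
    """Read an image file and wrap it in a base64 JSON body.

    NOTE: the "image" key is an assumption; check the server docs
    for the actual field name expected by the /json endpoints.
    """
    with open(image_path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    return json.dumps({"image": encoded})
```

POST the returned string with Content-Type: application/json to, for example, /api/v1/pan/ocr/json.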

Supported Documents

PAN Card (India)             doc_type: pan_card
Aadhaar Card (UIDAI, India)  doc_type: aadhaar_card

OCR Layers

ID                  Description                             Weight  PSM  Preprocess           Status
ocrs_deep_learning  OCRS Deep Learning OCR (best accuracy)  0.90    3    grayscale, denoise   active
tesseract           Tesseract LSTM (reliable for PAN/DOB)   0.70    3    grayscale            active
onnx_ocr            ONNX OCR (slow on large images)         0.85    6    grayscale, binarize  disabled
llm_anthropic       LLM Vision (garbles names)              0.95    11   none                 disabled
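Each layer entry follows the schema shown in the Quick Start config override (id, description, weight, psm, preprocess flags, enabled). A sketch of building such a config programmatically; the field names are taken from that example and should be treated as the expected schema:

```python
import json

def layer(id_, description, weight, psm, *, grayscale=False, binarize=False,
          denoise=False, deskew=False, enabled=True):
    """Build one layer entry matching the schema used by the
    optional `config` form field on the OCR endpoints."""
    return {
        "id": id_,
        "description": description,
        "weight": weight,
        "psm": psm,
        "preprocess": {
            "grayscale": grayscale,
            "binarize": binarize,
            "denoise": denoise,
            "deskew": deskew,
        },
        "enabled": enabled,
    }

# Reproduce the two active layers from the table above.
config = [
    layer("ocrs_deep_learning", "OCRS Deep Learning OCR", 0.90, 3,
          grayscale=True, denoise=True),
    layer("tesseract", "Tesseract LSTM", 0.70, 3, grayscale=True),
]
config_json = json.dumps(config)
```

Pass config_json as the `config` form field to run only the layers you list.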

Quick Start

# Health check
curl https://forge-ocr.sachinkumarskrose.workers.dev/api/v1/health

# PAN card OCR (multipart)
curl -X POST https://forge-ocr.sachinkumarskrose.workers.dev/api/v1/pan/ocr \
  -F "image=@pan_card.jpg"

# With layer details
curl -X POST https://forge-ocr.sachinkumarskrose.workers.dev/api/v1/pan/ocr \
  -F "image=@pan_card.jpg" \
  -F "include_layers=true"

# Single layer override (fastest)
curl -X POST https://forge-ocr.sachinkumarskrose.workers.dev/api/v1/pan/ocr \
  -F "image=@pan_card.jpg" \
  -F 'config=[{"id":"ocrs_deep_learning","description":"OCRS","weight":0.9,"psm":3,"preprocess":{"grayscale":true,"binarize":false,"denoise":true,"deskew":false},"enabled":true}]'

# Generic route (any document type)
curl -X POST https://forge-ocr.sachinkumarskrose.workers.dev/api/v1/ocr/pan_card \
  -F "image=@pan_card.jpg"

How It Works

  1. Upload image (multipart or base64 JSON)
  2. Multiple OCR layers process image in parallel
  3. Fields parsed from raw text using document-specific regex
  4. Results merged via weighted consensus per field
  5. Each field validated (format, checksum, date range)
  6. Response: extracted fields, confidence score, review recommendation
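Steps 4 and 5 above can be sketched as follows. This is a hypothetical illustration of weighted per-field consensus and validation, not the server's actual code; the field names are invented, and only the PAN format rule (5 letters, 4 digits, 1 letter) is the standard public format.

```python
import re
from collections import defaultdict

def consensus(layer_results):
    """Merge per-layer field values via weighted voting.

    layer_results: list of (weight, {field: value}) pairs.
    Returns {field: (winning_value, confidence)}, where confidence
    is the winner's share of the total weight voting on that field.
    """
    votes = defaultdict(lambda: defaultdict(float))
    for weight, fields in layer_results:
        for field, value in fields.items():
            votes[field][value] += weight
    merged = {}
    for field, candidates in votes.items():
        total = sum(candidates.values())
        value, score = max(candidates.items(), key=lambda kv: kv[1])
        merged[field] = (value, score / total)
    return merged

PAN_RE = re.compile(r"^[A-Z]{5}[0-9]{4}[A-Z]$")  # standard PAN format

def validate_pan(fields):
    """Format check plus a minimum-confidence threshold (threshold is illustrative)."""
    value, confidence = fields["pan_number"]
    return bool(PAN_RE.match(value)) and confidence >= 0.5

# Two layers agree on the PAN but disagree on the name.
results = [
    (0.90, {"pan_number": "ABCDE1234F", "name": "RAVI KUMAR"}),
    (0.70, {"pan_number": "ABCDE1234F", "name": "RAV1 KUMAR"}),
]
merged = consensus(results)
```

Here the PAN wins with full confidence (both layers agree), while the name's confidence drops to the heavier layer's share of the vote, which is the kind of signal that drives the review recommendation in step 6.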