Image OCR — Text Extraction
Recognize text in photos and screenshots right in your browser. Great for receipts, business cards, book pages, and subtitle captures.
For Korean receipts, enabling Korean + English usually improves accuracy. The first run downloads a recognition model for each selected language (~5–15MB each); later runs use the cached copy.
How to Use
Select one or more languages to recognize. For Korean receipts, enabling Korean + English usually improves accuracy.
Drag and drop an image, paste one from the clipboard, or pick a file. The first run downloads a recognition model (~5–15MB) per language; it is cached after that.
Edit the recognized text inline, then copy it or download as .txt.
FAQ
How accurate is recognition?
Clean printed text (receipts, books, screenshots) typically reaches 90%+ accuracy. Handwriting, skewed photos, and low-resolution images are often less accurate. Try enlarging the image or boosting its contrast for better results.
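As an illustration of the contrast tip, here is a minimal sketch of a linear contrast stretch you could apply to an image before recognition. It assumes RGBA pixel data in the layout used by canvas ImageData; the function name stretchContrast is hypothetical and is not part of this tool or of Tesseract.js.

```typescript
// Linear contrast stretch on RGBA pixel data (4 bytes per pixel).
// Each pixel is converted to grayscale, then rescaled so the darkest
// value in the image maps to 0 and the brightest to 255.
// Alpha is copied through unchanged.
function stretchContrast(rgba: Uint8ClampedArray): Uint8ClampedArray {
  const out = new Uint8ClampedArray(rgba.length);
  const gray: number[] = [];
  let min = 255;
  let max = 0;

  for (let i = 0; i + 3 < rgba.length; i += 4) {
    // Rec. 601 luma weights for the grayscale conversion.
    const g = 0.299 * rgba[i] + 0.587 * rgba[i + 1] + 0.114 * rgba[i + 2];
    gray.push(g);
    if (g < min) min = g;
    if (g > max) max = g;
  }

  const range = max - min;
  for (let p = 0; p < gray.length; p++) {
    // A flat image (range 0) has no contrast to stretch; keep its gray level.
    const v = range === 0 ? gray[p] : ((gray[p] - min) / range) * 255;
    const i = p * 4;
    out[i] = v; // Uint8ClampedArray rounds and clamps to 0..255
    out[i + 1] = v;
    out[i + 2] = v;
    out[i + 3] = rgba[i + 3];
  }
  return out;
}
```

In a browser, the input would typically come from a canvas 2D context's getImageData call, and the result would be written back with putImageData before the image is passed to recognition. Whether this helps depends on the image; it is most likely to matter for faded or low-contrast scans.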
Why is the first recognition slow?
The per-language model (~5–15MB) is downloaded once from the CDN. Subsequent runs in the same language load it from your browser cache, so they start much faster.
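The download-once behaviour can be sketched with a small per-language cache. This is an in-memory illustration only: the page itself relies on the browser cache, and the names ModelLoader and ModelCache are hypothetical.

```typescript
// Hypothetical loader: fetches the model bytes for one language code.
type ModelLoader = (lang: string) => Promise<Uint8Array>;

// Caches one in-flight or completed download per language, so the
// loader runs at most once per language while the download succeeds.
class ModelCache {
  private readonly pending = new Map<string, Promise<Uint8Array>>();
  private readonly load: ModelLoader;

  constructor(load: ModelLoader) {
    this.load = load;
  }

  get(lang: string): Promise<Uint8Array> {
    let model = this.pending.get(lang);
    if (model === undefined) {
      model = this.load(lang);
      // Forget a failed download so a later call can retry it.
      model.catch(() => this.pending.delete(lang));
      this.pending.set(lang, model);
    }
    return model;
  }
}
```

Storing the promise rather than the resolved bytes means two recognitions started at the same moment share a single download instead of fetching the same model twice.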
Can I recognize multiple languages at once?
Yes — select more than one language to recognize together. More languages mean longer processing and higher memory use.
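Tesseract conventionally combines languages by joining their codes with a plus sign, for example kor+eng. The sketch below builds such a string from a selection. The AVAILABLE list and the buildLangString helper are assumptions for illustration, not this tool's actual code or language list.

```typescript
// Language codes this sketch treats as available (an assumed list).
const AVAILABLE: string[] = ["eng", "kor", "chi_sim"];

// Validates the selection, drops duplicates while preserving order,
// and joins the codes with "+" in the usual Tesseract style.
function buildLangString(selected: string[]): string {
  const unique: string[] = [];
  for (let i = 0; i < selected.length; i++) {
    const code = selected[i];
    if (AVAILABLE.indexOf(code) === -1) {
      throw new Error("Unsupported language: " + code);
    }
    if (unique.indexOf(code) === -1) {
      unique.push(code);
    }
  }
  if (unique.length === 0) {
    throw new Error("Select at least one language");
  }
  return unique.join("+");
}
```

For a Korean receipt, buildLangString(["kor", "eng"]) yields "kor+eng". Each extra code means one more model to download and load, which is where the added processing time and memory use come from.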
Do you support Traditional Chinese?
Not yet. Only Simplified Chinese (chi_sim) is currently supported. Traditional Chinese needs a separate model, which isn't bundled.
Are images uploaded to a server?
No. Tesseract.js runs entirely in your browser. Models are fetched from the jsDelivr CDN, but the image itself never leaves the page.