Run optical character recognition on any image using the device’s native on-device text engine, returning the extracted text straight to your web app. Read from a hosted image, a photo the user picks, a multi-page document scan, or raw image bytes already held in memory. Recognition runs entirely on-device, so it works offline and never sends image data anywhere. Useful for receipts, invoices, business cards, ID capture, handwritten notes, and any flow where you would otherwise ship an image to a cloud OCR service.
Assign window.onVisionEvent before issuing the first call. Results are delivered to that callback as soon as recognition finishes, and any event emitted before the callback exists is dropped rather than queued.
Installation
- Bundle
- CDN
How it works
OCR is a two-part flow. Assign window.onVisionEvent to receive results, then call vision://ocr with the image you want recognized. Each call fires a queued event the moment it is accepted, then a success event carrying the extracted text once recognition completes, or an error event if it fails. Every event echoes back the id you passed, so a single callback can route results across many requests running at once.
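A minimal sketch of that two-part flow. The `send` argument is a hypothetical stand-in for however your app dispatches a vision:// URL to the native layer, and the helper name `startBasicOcr` and the id `receipt-1` are illustrative only. The event field `type` and its values are assumed from the event descriptions on this page.

```javascript
// Sketch only: `send` is a hypothetical function that hands a vision:// URL
// to the native layer. It is injected so the flow can be exercised anywhere.
function startBasicOcr(send, imageUrl) {
  // 1. Install the callback first; events emitted before it exists are dropped.
  //    In a WebView, globalThis is the same object as window.
  globalThis.onVisionEvent = (evt) => {
    if (evt.type === "success") {
      console.log("recognized text:", evt.text);
    } else if (evt.type === "error") {
      console.warn("OCR failed:", evt.error.code);
    }
  };

  // 2. Then issue the call. URL values must be wrapped with encodeURIComponent.
  const url =
    "vision://ocr?src=" + encodeURIComponent(imageUrl) + "&id=receipt-1";
  send(url);
  return url;
}
```

Installing the callback inside the same function that issues the call keeps the ordering requirement visible; in a real app you would more likely assign it once at startup.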
The lines array is left as the engine produced it, so each line object holds the raw recognized text if you need it. evt.text is ready to display or parse without further cleanup.
| Parameter | Required | Description |
|---|---|---|
| src | Yes | The image to recognize. Accepts a hosted HTTPS URL, a picker token (@imagepicker, @filepicker, @documentscanner), a data: URI, or a raw base64 string. URL and data URI values must be wrapped with encodeURIComponent. |
| id | No | A label echoed on every event for this request. Use it to correlate results when several jobs run concurrently. Defaults to an auto-generated UUID. |
| lang | No | BCP-47 language hint, comma-separated for multiple scripts. Advisory only; both platforms auto-detect by default. See Choosing a recognition language. |
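The parameters above can be assembled with a small helper. This is a sketch, not part of any SDK: `buildOcrUrl` is a hypothetical name, and two choices in it are assumptions, namely that id is safe to pass through encodeURIComponent and that lang is sent as written.

```javascript
// Hypothetical helper: builds a vision://ocr URL from the documented parameters.
function buildOcrUrl({ src, id, lang }) {
  if (!src) throw new Error("src is required");

  // Hosted URLs and data: URIs must be wrapped with encodeURIComponent.
  // Picker tokens (@imagepicker, ...) are passed through untouched, and so is
  // bare base64, since the table only requires encoding for URLs and data URIs.
  const needsEncoding = src.startsWith("https://") || src.startsWith("data:");
  const parts = ["src=" + (needsEncoding ? encodeURIComponent(src) : src)];

  // Assumption: encoding the id is harmless for simple labels.
  if (id) parts.push("id=" + encodeURIComponent(id));
  // Assumption: lang is sent as-is, e.g. "ja-JP" or "en-US,ja-JP".
  if (lang) parts.push("lang=" + lang);

  return "vision://ocr?" + parts.join("&");
}
```

For example, `buildOcrUrl({ src: "@imagepicker", id: "card-1" })` yields `vision://ocr?src=@imagepicker&id=card-1`, while a data URI exported from a canvas would be percent-encoded before being placed in the query string.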
Reading the result
Assign window.onVisionEvent once. It receives every event for every request, each tagged with the id you supplied and a status describing what happened.
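On iOS, a data: URI or hosted receipt produces a success event whose lines each carry a confidence score. On Android, confidence is absent; the same receipt otherwise produces the same text and lines. A failure produces an error event with a stable code and an advisory message.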
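To make these shapes concrete, the objects below sketch a success event with confidence, one without, and an error event. Every value is invented for illustration; only the field names (type, id, text, lines, confidence, error.code, error.message) come from this page, and the literal type strings are an assumption. The helper `describeLine` is hypothetical.

```javascript
// Illustrative values only. Field names follow the descriptions on this page.
const successWithConfidence = {
  type: "success",
  id: "receipt-1",
  text: "TOTAL\n12.50",
  lines: [
    { text: "TOTAL", confidence: 0.98 },
    { text: "12.50", confidence: 0.91 },
  ],
};

// Same receipt where the recognizer exposes no per-line score.
const successWithoutConfidence = {
  type: "success",
  id: "receipt-1",
  text: "TOTAL\n12.50",
  lines: [{ text: "TOTAL" }, { text: "12.50" }],
};

const failure = {
  type: "error",
  id: "receipt-1",
  error: { code: "fetch_failed", message: "illustrative message" },
};

// Treat a missing confidence as unknown, never as zero.
function describeLine(line) {
  return typeof line.confidence === "number"
    ? `${line.text} (${Math.round(line.confidence * 100)}%)`
    : `${line.text} (confidence unknown)`;
}
```

Checking `typeof line.confidence === "number"` rather than truthiness keeps a genuine score of 0 distinct from an absent one.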
- queued: The request was accepted and recognition is running. Carries only type and id.
- success: Recognition completed. Carries text, the full extracted string with lines joined by \n, and lines, an array of { text, confidence? } objects in reading order. confidence is a float from 0 to 1 on iOS; it is omitted on Android, where the recognizer does not expose a per-line score. Treat a missing confidence as unknown rather than zero.
- error: Recognition failed. Carries error.code, a stable machine-readable string you can branch on, and error.message, a human-readable detail for logging. See Error reference for the full list.
- dismissed: The user closed a picker or the document scanner without selecting anything. No text was produced. Carries only type and id.
Recognizing a hosted image
Pass an HTTPS URL as src to recognize an image already hosted on your CDN or storage. The native side fetches it with the WebView’s cookies and user-agent attached, so images behind your app’s own session are reachable without extra authentication.
Letting the user choose an image
Three picker tokens open a native chooser instead of taking a URL. Pass one as src and the user’s selection flows straight into recognition.
@imagepicker opens the system photo library. @filepicker opens a file browser filtered to images. Both let the user pick an existing image; the difference is purely which native chooser appears.
If the user closes a picker without selecting anything, the callback receives a dismissed event on that request’s id. Pickers are modal, so only one can be open at a time; a second picker request issued while one is already open returns picker_busy immediately while the first stays on screen.
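Because a second picker request is rejected with picker_busy, it can be convenient to refuse it on the web side before it is ever sent. The sketch below is one way to do that; `createPickerGate` is a hypothetical helper, `send` is a hypothetical function that dispatches a vision:// URL to the native layer, and the `type` field values are assumed from the event descriptions on this page.

```javascript
// Hypothetical wrapper: allows one picker request at a time on the web side.
function createPickerGate(send) {
  let openId = null; // id of the request that currently owns the picker

  return {
    // Returns false without sending anything if a picker request is in flight.
    open(token, id) {
      if (openId !== null) return false;
      openId = id;
      send("vision://ocr?src=" + token + "&id=" + encodeURIComponent(id));
      return true;
    },

    // Call this from window.onVisionEvent for every event.
    handle(evt) {
      if (evt.id !== openId) return;
      // queued only means the request was accepted; the picker may still be
      // open, so the gate is released on success, error, or dismissed.
      if (evt.type !== "queued") openId = null;
    },
  };
}
```

This gate is deliberately conservative: it stays closed until the whole request finishes, not just until the chooser leaves the screen, because no separate event on this page signals that moment.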
Scanning a multi-page document
@documentscanner opens the native document camera with automatic edge detection and perspective correction. The user captures one or more pages, confirms the batch, and every page is recognized together and returned in a single success event. The pages are concatenated in capture order into evt.text, separated by line breaks like any other text, so you can render or parse the whole document as one string.
A multi-page scan arrives as one success event, with every page’s lines flattened into a single lines array and joined into text.
On devices where the document scanner is not available, the request fails with scanner_unsupported. If the scanner opens but errors before recognition starts, it fails with scanner_failed. Closing the scanner without confirming any pages fires dismissed.
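One way to handle these scanner outcomes is sketched below. Falling back to @imagepicker when the scanner is unsupported is a design choice made for this example, not documented behavior; `handleScanEvent` and `send` are hypothetical names, and the `type` values are assumed from the event descriptions on this page.

```javascript
// Hypothetical handler for events belonging to a @documentscanner request.
// `send` is a stand-in for whatever dispatches a vision:// URL natively.
function handleScanEvent(evt, send) {
  switch (evt.type) {
    case "queued":
      return { status: "pending" };
    case "success":
      // All pages arrive together; lines from every page share one array.
      return { status: "done", lines: evt.lines.map((line) => line.text) };
    case "dismissed":
      return { status: "cancelled" };
    case "error":
      if (evt.error.code === "scanner_unsupported") {
        // Design choice for this sketch: retry with the photo library,
        // reusing the same id so downstream routing does not change.
        send("vision://ocr?src=@imagepicker&id=" + encodeURIComponent(evt.id));
        return { status: "fallback" };
      }
      return { status: "failed", code: evt.error.code };
    default:
      return { status: "ignored" };
  }
}
```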
Recognizing an in-memory image
When the image already exists in the page as bytes (a canvas export, a generated graphic, a freshly decoded blob), pass it inline as a data URI and skip the upload entirely. A bare base64 string is also accepted as a fallback, though a data URI is preferred because it declares the image type.
Choosing a recognition language
Both platforms auto-detect the script by default, so most apps never set lang. Pass it only when you already know the script and want to constrain recognition, which improves accuracy for non-Latin text. On Android the hint selects the recognizer for Latin, Chinese, Japanese, Korean, or Devanagari script; on iOS it narrows the candidate languages the engine considers.
Running several jobs at once
Recognition jobs are independent. Issue as many vision://ocr calls as you need with distinct id values, and each result arrives on the callback as it finishes. Results come back in completion order, not the order you submitted them, so always key off evt.id rather than assuming a sequence.
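Keying off evt.id lends itself to a small promise wrapper, sketched here under a few assumptions: `createOcrClient` is a hypothetical helper, `send` is a hypothetical function that dispatches a vision:// URL to the native layer, the `type` values are assumed from the event descriptions on this page, and the caller passes src already encoded as the parameter table requires.

```javascript
// Hypothetical client: turns id-tagged events into one promise per job.
function createOcrClient(send) {
  const pending = new Map(); // id -> { resolve, reject }
  let counter = 0;

  // `encodedSrc` must already be encoded where the parameter table requires it.
  function recognize(encodedSrc) {
    const id = "job-" + ++counter; // own ids, so each result can be matched
    return new Promise((resolve, reject) => {
      pending.set(id, { resolve, reject });
      send("vision://ocr?src=" + encodedSrc + "&id=" + id);
    });
  }

  // Assign this (or a function that calls it) to window.onVisionEvent.
  function onEvent(evt) {
    const job = pending.get(evt.id);
    if (!job || evt.type === "queued") return; // unknown id or still running
    pending.delete(evt.id);
    if (evt.type === "success") {
      job.resolve(evt.text);
    } else if (evt.type === "dismissed") {
      job.resolve(null); // the user cancelled; no text was produced
    } else {
      job.reject(Object.assign(new Error(evt.error.message), { code: evt.error.code }));
    }
  }

  return { recognize, onEvent };
}
```

Because each promise settles only when its own id comes back, `Promise.all` over several `recognize` calls returns results in submission order even though the events arrive in completion order.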
Error reference
Every failure arrives as an error event with a stable code. Messages are advisory and may change; branch on the code.
| Code | Cause |
|---|---|
| unknown_command | The URL host was something other than ocr |
| missing_src | No src parameter was provided |
| invalid_src | src did not match any supported form |
| invalid_data_uri | A data: URI could not be decoded |
| fetch_failed | The hosted image could not be fetched |
| fetch_empty | The fetch succeeded but returned no data |
| file_unreadable | A local or picked file could not be read |
| decode_failed | The bytes could not be decoded into an image |
| ocr_failed | The recognition engine returned an error |
| picker_busy | A picker is already open and the new request was rejected |
| picker_failed | A selection was made but the image could not be loaded |
| no_presenter | A picker was requested before a screen was available to present it |
| scanner_unsupported | The document scanner is not available on this device |
| scanner_failed | The document scanner errored before recognition began |
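Branching on the code might look like the sketch below. The grouping into retry, fix-request, fallback, and report is one possible reading of the causes in the table, not something the platform prescribes, and `classifyOcrError` is a hypothetical helper.

```javascript
// Assumed grouping: conditions that may clear up if the call is repeated later.
const RETRYABLE_CODES = new Set([
  "fetch_failed",
  "fetch_empty",
  "picker_busy",
  "no_presenter",
]);

// Assumed grouping: the request itself was malformed, so retrying cannot help.
const REQUEST_BUG_CODES = new Set([
  "unknown_command",
  "missing_src",
  "invalid_src",
  "invalid_data_uri",
]);

// Maps a stable error code to a coarse action. Never inspect error.message
// for this; messages are advisory and may change.
function classifyOcrError(code) {
  if (RETRYABLE_CODES.has(code)) return "retry";
  if (REQUEST_BUG_CODES.has(code)) return "fix-request";
  if (code === "scanner_unsupported") return "fallback";
  return "report"; // unreadable files, decode or engine failures, unknown codes
}
```

Unknown codes fall through to "report", so a code added in a future release degrades to a generic failure path instead of being silently ignored.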
Resources
NPM Package
despia-native