Complete API reference for Despia Local Intelligence schemes, streaming callbacks, error codes, and available models.
Despia Local Intelligence is in beta. The API spec will likely change before the official launch. A dedicated NPM package for Local Intelligence will be released before launch to make setup more convenient - the current despia-native integration is temporary.
Despia Local Intelligence requires Despia V4, which is currently in beta. To request access, email beta@despia.com.
On-device inference via HuggingFace models. Models are downloaded once and cached locally; after that, all inference runs without a network connection.
HuggingFace model inference runs on both iOS and Android. The appleintelligence:// one-shot scheme is iOS only.
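Because the two schemes differ in platform support, it can help to pick the scheme from the user agent. A minimal sketch, assuming the standard `iPhone`/`iPad` user agent markers (the helper `pickScheme` is illustrative, not part of the Despia API):

```javascript
// Sketch: choose a scheme per platform. The substrings checked here
// ('iphone', 'ipad') are common user agent markers, not part of the
// Despia spec - verify against your target devices.
function pickScheme(userAgent) {
  const ua = userAgent.toLowerCase()
  const isIOS = ua.includes('iphone') || ua.includes('ipad')
  // appleintelligence:// is iOS only; intelligence:// runs on both platforms
  return isIOS ? 'appleintelligence://' : 'intelligence://'
}
```

In a Despia webview you would call it as `pickScheme(navigator.userAgent)` and build the rest of the URL from there.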
Runs a prompt to completion and calls a named function on window with the full response string.
// Note: This API is not final and subject to change.
const isDespia = navigator.userAgent.toLowerCase().includes('despia')

if (isDespia) {
  despia(
    `appleintelligence://?prompt=${encodeURIComponent('What is the capital of France?')}`
  )
}

function handleAIResponse(response) {
  console.log(response)
}
With system instructions:
// Note: This API is not final and subject to change.
const system = 'You are a concise assistant. Reply in one sentence.'
const prompt = 'Explain what a transformer model is.'

despia(
  `appleintelligence://?instructions=${encodeURIComponent(system)}&prompt=${encodeURIComponent(prompt)}`
)
Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `prompt` | string | Yes | The user prompt |
| `instructions` | string | No | System-level instruction context for the session |
| `callback` | string | No | Name of the global JS function to receive the response. Defaults to `handleAIResponse` |
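Since the callback name travels inside the URL, a small URL builder keeps the encoding in one place. A sketch, assuming the parameters in the table above (`buildOneShotUrl` is a hypothetical helper, not part of the Despia API):

```javascript
// Sketch: build a one-shot URL from the documented parameters.
// buildOneShotUrl is an illustrative helper, not part of the Despia API.
function buildOneShotUrl({ prompt, instructions, callback }) {
  let url = `appleintelligence://?prompt=${encodeURIComponent(prompt)}`
  if (instructions) url += `&instructions=${encodeURIComponent(instructions)}`
  if (callback) url += `&callback=${encodeURIComponent(callback)}`
  return url
}
```

Usage would look like `despia(buildOneShotUrl({ prompt: 'Hi', callback: 'onAnswer' }))`, with `window.onAnswer` defined beforehand.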
Callback

The native layer calls `window[callback](response)` on success, or `window[callback](errorMessage)` on failure.
// Note: This API is not final and subject to change.
function handleAIResponse(response) {
  document.getElementById('output').textContent = response
}
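If you prefer promises over global callbacks, the one-shot scheme can be wrapped by registering a uniquely named global and passing that name via the `callback` parameter. A sketch under that assumption (`askLocal` and its generated callback names are illustrative, not part of the Despia API):

```javascript
// Sketch: a hypothetical Promise wrapper around the one-shot scheme.
// Registers a unique global, passes its name via the callback parameter,
// and resolves with the response string the native layer hands back.
function askLocal(prompt, send = (url) => despia(url)) {
  return new Promise((resolve) => {
    const name = `handleAI_${Math.random().toString(36).slice(2)}`
    window[name] = (response) => {
      delete window[name] // one-shot: clean up the temporary global
      resolve(response)
    }
    send(
      `appleintelligence://?prompt=${encodeURIComponent(prompt)}&callback=${encodeURIComponent(name)}`
    )
  })
}
```

Note that per the callback contract above, failures also arrive through the same function as an error message string, so callers may want to inspect the resolved value.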
Streams tokens as they are generated. Set up callbacks before firing the scheme call.
// Note: This API is not final and subject to change.
const isDespia = navigator.userAgent.toLowerCase().includes('despia')

if (isDespia) {
  const jobId = crypto.randomUUID()

  window.onMLToken = (id, chunk) => {
    if (id === jobId) {
      // chunk is the full accumulated response so far - replace, do not append
      document.getElementById('output').textContent = chunk
    }
  }

  window.onMLComplete = (id, fullText) => {
    if (id === jobId) {
      console.log('Complete:', fullText)
    }
  }

  window.onMLError = ({ errorCode, errorMessage }) => {
    console.error(`Error ${errorCode}: ${errorMessage}`)
  }

  despia(
    `intelligence://?id=${encodeURIComponent(jobId)}&prompt=${encodeURIComponent('What is the capital of France?')}`
  )
}
With system instructions:
// Note: This API is not final and subject to change.
const jobId = crypto.randomUUID()
const system = 'You are a concise assistant. Reply in three sentences or fewer.'
const prompt = 'What is the difference between TCP and UDP?'

despia(
  `intelligence://?id=${encodeURIComponent(jobId)}&system=${encodeURIComponent(system)}&prompt=${encodeURIComponent(prompt)}`
)
Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `prompt` | string | Yes | The user prompt |
| `id` | string | Yes | Unique job ID used to correlate token and completion events |
| `system` | string | No | System-level instruction context for the session |
| `webhook` | string | No | Reserved. Parsed by the native layer but not yet active |
Callbacks
window.onMLToken(id, chunk)
Called for each snapshot as it is generated. chunk is the full accumulated response so far, not just the new token. Replace the output element’s content rather than appending.
// Note: This API is not final and subject to change.
window.onMLToken = (id, chunk) => {
  if (id === jobId) {
    document.getElementById('output').textContent = chunk
  }
}
window.onMLComplete(id, fullText)
Called once when inference finishes. fullText is the complete response.
// Note: This API is not final and subject to change.
window.onMLComplete = (id, fullText) => {
  if (id === jobId) {
    saveToHistory(fullText)
  }
}
window.onMLError({ errorCode, errorMessage })
Called on any failure. See error codes below.
// Note: This API is not final and subject to change.
window.onMLError = ({ errorCode, errorMessage }) => {
  console.error(`Error ${errorCode}: ${errorMessage}`)
}
Use unique id values per job to handle concurrent streams without collision.
// Note: This API is not final and subject to change.
const jobs = new Map()

window.onMLToken = (id, chunk) => {
  const el = jobs.get(id)
  if (el) el.textContent = chunk
}

window.onMLComplete = (id) => {
  jobs.delete(id)
}

function runJob(prompt, outputElement) {
  const jobId = crypto.randomUUID()
  jobs.set(jobId, outputElement)
  despia(`intelligence://?id=${encodeURIComponent(jobId)}&prompt=${encodeURIComponent(prompt)}`)
}
Gate all Despia Local Intelligence calls behind a user agent check so the feature degrades gracefully in a standard browser.
// Note: This API is not final and subject to change.
const isDespia = navigator.userAgent.toLowerCase().includes('despia')

if (isDespia) {
  // Despia Local Intelligence calls here
} else {
  // Fallback - cloud API or disabled state
}