Engines

Engines turn model providers into runtime services.

An engine implements the runtime methods configured for it, such as chat, streamed generation, embeddings, speech-to-text (STT), text-to-speech (TTS), or image analysis. The application asks for a model or an objective; the runtime resolves that request to a provider and invokes it.
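The resolution step described above can be sketched roughly as follows. This is a minimal illustration, not the platform's actual API: `Provider`, `Runtime`, the objective name "summarize", and the model id "local-small" are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Provider:
    """Hypothetical provider: a name plus the runtime methods it implements."""
    name: str
    methods: Dict[str, Callable[[str], str]]  # runtime method name -> implementation


class Runtime:
    """Hypothetical runtime that maps objectives and model ids to providers."""

    def __init__(self) -> None:
        self.providers: Dict[str, Provider] = {}  # model id -> provider
        self.objectives: Dict[str, str] = {}      # objective -> model id

    def register(self, model_id: str, provider: Provider) -> None:
        self.providers[model_id] = provider

    def bind_objective(self, objective: str, model_id: str) -> None:
        self.objectives[objective] = model_id

    def resolve(self, target: str) -> Provider:
        # An objective is first mapped to a model id; a model id is used as-is.
        model_id = self.objectives.get(target, target)
        if model_id not in self.providers:
            raise LookupError(f"no provider configured for {target!r}")
        return self.providers[model_id]

    def invoke(self, target: str, method: str, payload: str) -> str:
        provider = self.resolve(target)
        if method not in provider.methods:
            raise NotImplementedError(f"{provider.name} does not support {method!r}")
        return provider.methods[method](payload)


runtime = Runtime()
runtime.register("local-small", Provider("echo-provider", {"chat": lambda text: f"echo: {text}"}))
runtime.bind_objective("summarize", "local-small")

# Application code names an objective and a runtime method, never a provider SDK.
reply = runtime.invoke("summarize", "chat", "hello")
```

Calling `invoke` with either the objective or the model id reaches the same provider, which is the point of the indirection: application code does not need to know which provider sits behind the name.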

Provider lifecycle

Model capability is configured before application code asks for it.

The system module handles install and activation. Feature code asks for a configured objective or model id.

Figure: Engine orchestration. An objective or model id resolves to a provider, runs in an isolated engine worker, and streams results back.

Figure: Engine spawning and sandboxing sequence. Model resolution, local worker process spawn, access policy injection, and execution isolation.

01

Install

Engines are installed and configured through the system module and provider catalog.

02

Configure

Models store capabilities, runtime methods, provider options, and performance defaults.

03

Warmup

Model loading can be requested outside the UI process so first use does not freeze clients.
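The three lifecycle steps above (install, configure, warmup) could be modeled along these lines. The sketch assumes a simple in-memory catalog; `Stage`, `ModelConfig`, `ModelEntry`, and `Catalog` are hypothetical names, and the field names only mirror what the text says a model stores.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, FrozenSet, Optional


class Stage(Enum):
    INSTALLED = "installed"
    CONFIGURED = "configured"
    WARM = "warm"


@dataclass
class ModelConfig:
    """Hypothetical record of what a configured model stores."""
    model_id: str
    capabilities: FrozenSet[str]
    runtime_methods: FrozenSet[str]
    provider_options: Dict[str, Any] = field(default_factory=dict)
    performance_defaults: Dict[str, Any] = field(default_factory=dict)


@dataclass
class ModelEntry:
    stage: Stage = Stage.INSTALLED
    config: Optional[ModelConfig] = None


class Catalog:
    """Hypothetical stand-in for the system module's view of installed models."""

    def __init__(self) -> None:
        self.entries: Dict[str, ModelEntry] = {}

    def install(self, model_id: str) -> None:
        self.entries[model_id] = ModelEntry()

    def configure(self, config: ModelConfig) -> None:
        entry = self.entries[config.model_id]  # KeyError if the model was never installed
        entry.config = config
        entry.stage = Stage.CONFIGURED

    def warm_up(self, model_id: str) -> None:
        entry = self.entries[model_id]
        if entry.stage is Stage.INSTALLED:
            raise RuntimeError("configure the model before warming it up")
        entry.stage = Stage.WARM  # a real engine would load model weights here


catalog = Catalog()
catalog.install("local-small")
catalog.configure(ModelConfig(
    model_id="local-small",
    capabilities=frozenset({"text"}),
    runtime_methods=frozenset({"chat", "embeddings"}),
    performance_defaults={"context_length": 4096},
))
catalog.warm_up("local-small")
```

Keeping the stage explicit makes the ordering in the text enforceable: a model that was installed but never configured cannot be warmed up.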

Invocation model

The app calls a runtime method, not a provider-specific client.

This keeps modules independent of provider SDK details and lets the engine layer normalize local and remote runtimes behind the same calls.
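One way to picture that normalization is a shared method contract with a thin adapter per provider. The sketch below is illustrative only: `ChatResult`, `ChatEngine`, `RemoteStub`, and `LocalStub` are hypothetical, and the raw response shapes are invented for the example rather than taken from any real SDK.

```python
from dataclasses import dataclass
from typing import Any, Dict, Protocol


@dataclass
class ChatResult:
    """Normalized result returned to modules, whatever the provider."""
    text: str
    provider: str


class ChatEngine(Protocol):
    """The runtime method contract that application code depends on."""
    def chat(self, prompt: str) -> ChatResult: ...


class RemoteStub:
    """Stand-in for a remote API client; the nested dict shape is invented."""

    def _call(self, prompt: str) -> Dict[str, Any]:
        return {"choices": [{"text": f"remote:{prompt}"}]}

    def chat(self, prompt: str) -> ChatResult:
        raw = self._call(prompt)
        return ChatResult(text=raw["choices"][0]["text"], provider="remote-stub")


class LocalStub:
    """Stand-in for a local runtime that returns a plain string."""

    def _call(self, prompt: str) -> str:
        return f"local:{prompt}"

    def chat(self, prompt: str) -> ChatResult:
        return ChatResult(text=self._call(prompt), provider="local-stub")


def answer(engine: ChatEngine, prompt: str) -> str:
    # Module code sees only the runtime method, never the provider's raw response.
    return engine.chat(prompt).text


remote_text = answer(RemoteStub(), "hi")
local_text = answer(LocalStub(), "hi")
```

The module-level function `answer` is unchanged when the provider is swapped, which is the property the text describes.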

01

Runtime methods

Providers expose only the methods they support. Test pages and UI code should use those actual method contracts.

02

Local and remote

OpenAI-compatible APIs, local runtimes, and specialized engines share the same high-level provider boundary.

03

Observability

Requests carry ids and context so failures, timings, and output can be traced.
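The observability point above can be sketched as a wrapper that attaches an id and context to each invocation and records timing, output, and failure. This is an assumption about how such tracing might look, not the platform's actual mechanism; `TraceRecord`, `TRACE_LOG`, and `traced_invoke` are hypothetical names.

```python
import time
import uuid
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class TraceRecord:
    request_id: str
    method: str
    context: Dict[str, str]
    duration_s: float
    output: Optional[str] = None
    error: Optional[str] = None


# In-memory log for the example; a real system would ship records elsewhere.
TRACE_LOG: List[TraceRecord] = []


def traced_invoke(
    method: str,
    fn: Callable[[str], str],
    payload: str,
    context: Dict[str, str],
) -> str:
    """Run one runtime method call and record what happened, success or failure."""
    request_id = uuid.uuid4().hex
    started = time.perf_counter()
    output: Optional[str] = None
    error: Optional[str] = None
    try:
        output = fn(payload)
        return output
    except Exception as exc:
        error = repr(exc)
        raise  # the caller still sees the failure; the trace just records it
    finally:
        TRACE_LOG.append(TraceRecord(
            request_id=request_id,
            method=method,
            context=context,
            duration_s=time.perf_counter() - started,
            output=output,
            error=error,
        ))


# str.upper stands in for a provider's chat implementation.
traced_invoke("chat", str.upper, "hello", {"module": "composer"})
```

Because the record is written in the `finally` branch, a failed request leaves a trace with the same id and context fields as a successful one.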

Resource behavior

Local engines need resource-aware startup.

Local model runtimes can be expensive to load. The platform should keep those costs outside UI-critical paths.
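A rough sketch of keeping load cost off the UI-critical path: loading starts in the background and the caller only waits if it invokes before loading finishes. For brevity the example uses a background thread inside one process, which is a simplification of the separate engine processes described on this page. `LazyEngine` is a hypothetical class, and the sleep stands in for an expensive model load.

```python
import threading
import time
from typing import Callable, Optional


class LazyEngine:
    """Hypothetical engine whose model loads in the background."""

    def __init__(self, load_seconds: float = 0.05) -> None:
        self._load_seconds = load_seconds
        self._ready = threading.Event()
        self._lock = threading.Lock()
        self._thread: Optional[threading.Thread] = None
        self._model: Optional[Callable[[str], str]] = None

    def _load(self) -> None:
        time.sleep(self._load_seconds)  # stand-in for loading model weights
        self._model = str.upper         # stand-in for the loaded model
        self._ready.set()

    def warm_up(self) -> None:
        """Start loading in the background and return immediately."""
        with self._lock:
            if self._thread is None:
                self._thread = threading.Thread(target=self._load, daemon=True)
                self._thread.start()

    def is_ready(self) -> bool:
        return self._ready.is_set()

    def invoke(self, text: str, timeout: float = 5.0) -> str:
        self.warm_up()  # no-op if a warmup was already requested
        if not self._ready.wait(timeout):
            raise TimeoutError("engine did not finish loading in time")
        assert self._model is not None
        return self._model(text)


engine = LazyEngine()
engine.warm_up()                 # requested early, for example at application start
result = engine.invoke("hello")  # waits only for whatever loading time remains
```

If the warmup request is issued well before the first user-facing call, `invoke` finds the model already loaded and returns without the load delay.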

01

Background workers

Engine processes can hold model memory away from the main application/UI process.

02

Warmup requests

A provider can be resolved first, then warmed up before the first user-facing invocation.

03

Configuration limits

Context length, runtime options, and performance defaults belong to model/provider configuration rather than page code.
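The last card can be illustrated by resolving effective options from configuration, so page code never hard-codes limits. This is a sketch under stated assumptions: `ModelLimits`, `effective_options`, and the option names `max_tokens` and `temperature` are hypothetical, and clamping to the context length is just one plausible policy.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class ModelLimits:
    """Hypothetical limits and defaults stored with the model/provider configuration."""
    context_length: int
    defaults: Dict[str, Any] = field(default_factory=dict)


def effective_options(
    limits: ModelLimits,
    overrides: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Merge configured defaults with per-call overrides, then enforce the limit."""
    options = dict(limits.defaults)
    options.update(overrides or {})
    requested = options.get("max_tokens", limits.context_length)
    # The configured context length wins over whatever a page asked for.
    options["max_tokens"] = min(requested, limits.context_length)
    return options


limits = ModelLimits(context_length=4096, defaults={"temperature": 0.2, "max_tokens": 512})

default_options = effective_options(limits)
page_options = effective_options(limits, {"max_tokens": 100_000})
```

Here `default_options` keeps the configured `max_tokens` of 512, while the oversized page request in `page_options` is clamped to the configured context length of 4096.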

Implemented engines

Available engine implementations.

These are the engine implementations exposed through the runtime/provider layer.

Chat and reasoning

Remote APIs, OpenAI-compatible providers, and local LLM runtimes used for chat and streamed generation.

OpenAI
OpenAI-compatible
Anthropic
Gemini
Ollama
llama.cpp
vLLM
ONNX

Speech

Speech-to-text and text-to-speech providers used by composer voice input and audio generation flows.

Whisper
OpenAI Whisper
OpenAI TTS
Edge TTS
Edge
eSpeak
Parler TTS
Qwen TTS

Vision and detection

Specialized engines for visual analysis and object detection workflows.

YOLO
