This section covers the methods you use when a module needs to request extractor installation, synchronize the live extractor runtime after lifecycle changes, or check whether an extractor is ready on the current node.
These methods are intended for setup screens, admin tools, and extractor activation flows. They mirror the shape of the engine lifecycle methods but operate only on extractors.
request_install(extractor_id: str, force: bool = False, install_config: dict | None = None, requested_by: dict | None = None, task_id: str | None = None)¶
This method publishes an extractor installation request event and returns the event payload that was emitted.
Example:

    install_event = await module_sdk.extractors.request_install(
        extractor_id="docling",
        force=False,
        install_config={
            "ocr_engine": "rapidocr",
        },
        requested_by={
            "user_id": 42,
            "organization_id": 7,
        },
        task_id=task_id,
    )

What It Is For¶
Use request_install(...) when your module wants to start the official extractor installation flow rather than directly running install logic itself.
This method does not mean "install the extractor inside this action". It means "publish an extractor.install.requested event and let the runtime install pipeline process it".
That distinction matters because extractor installation is multi-node runtime work. Each node that consumes the event installs and checks its own local dependencies and records its own node-level install status.
What It Does¶
Under the hood, the method delegates to publish_extractor_install_requested(...).
That flow:
- builds a normalized install event
- includes manifest and version metadata
- records the source runtime node
- updates ExtractorRegistry rows to installing when appropriate
- records source-node install status
- broadcasts the event on the extractor install stream
- returns the event payload to the caller
The returned payload includes fields such as:
- event_id
- event_name
- extractor_id
- manifest_version
- extractor_version
- force
- install_config
- source_node_id
- requested_by
- task_id
- requested_at
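As a rough illustration of consuming the returned payload, the sketch below builds a one-line log summary from it. The field names come from the list above; the helper name summarize_install_event and the decision to treat every listed field as required are assumptions made for this example, not part of the SDK.

```python
from typing import Any, Dict

# Field names taken from the payload list above. Treating all of them as
# required is an assumption of this sketch.
EXPECTED_FIELDS = (
    "event_id", "event_name", "extractor_id", "manifest_version",
    "extractor_version", "force", "install_config", "source_node_id",
    "requested_by", "task_id", "requested_at",
)


def summarize_install_event(event: Dict[str, Any]) -> str:
    """Build a one-line log summary from an install event payload (hypothetical helper)."""
    missing = [name for name in EXPECTED_FIELDS if name not in event]
    if missing:
        # Report absent fields instead of raising, so logging never breaks the caller.
        return "install event is missing fields: " + ", ".join(missing)
    return (
        f"{event['event_name']} for {event['extractor_id']} "
        f"(event {event['event_id']}, node {event['source_node_id']}, "
        f"force={event['force']})"
    )
```

A module could pass the value returned by request_install(...) straight into a helper like this before writing an audit log entry.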
Install Configuration¶
install_config is the pre-install configuration selected by the user before
the install request is published. It is different from runtime extractor config.
Use it for choices that affect what the installer downloads or prepares, such as OCR backends or model artifacts. Runtime behavior that can be changed after installation belongs in the extractor registry config instead.
The install event carries install_config so background installers and node
install status records can reconstruct the same request. Do not store
pre-install choices only in UI state.
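One way to keep the two kinds of configuration apart is to split the user's selections before publishing the request. The sketch below is only an illustration: INSTALL_TIME_KEYS, split_settings, and the max_pages setting are hypothetical names, and only ocr_engine comes from the example earlier in this section.

```python
from typing import Any, Dict, Set, Tuple

# Hypothetical set of settings that affect what the installer downloads or
# prepares. "ocr_engine" comes from the request_install example; a real module
# would define its own set.
INSTALL_TIME_KEYS: Set[str] = {"ocr_engine"}


def split_settings(selections: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split form selections into (install_config, runtime_config)."""
    install_config = {k: v for k, v in selections.items() if k in INSTALL_TIME_KEYS}
    runtime_config = {k: v for k, v in selections.items() if k not in INSTALL_TIME_KEYS}
    return install_config, runtime_config


# Usage: the first dict would be passed as install_config to request_install(...),
# the second would be stored in the extractor registry config.
install_config, runtime_config = split_settings(
    {"ocr_engine": "rapidocr", "max_pages": 50}  # max_pages is a made-up runtime setting
)
```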
Background Task Attachment¶
Pass task_id when the request is launched from a background task and the
installer should stream progress/output into that task.
This is the same pattern used by engine installation: the action creates the
background task, passes its id to the SDK method, and the UI attaches a
BackgroundTask component to that id.
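The pattern can be sketched as follows. The real task-creation API is not shown in this section, so create_task here is a hypothetical callable that returns a task id, and StubExtractors is a stand-in that only imitates the request_install(...) signature so the sketch runs on its own.

```python
import asyncio
from types import SimpleNamespace
from typing import Any, Callable, Dict, List, Optional


class StubExtractors:
    """Stand-in for module_sdk.extractors so the sketch runs without the real SDK."""

    def __init__(self) -> None:
        self.calls: List[Dict[str, Any]] = []

    async def request_install(
        self,
        extractor_id: str,
        force: bool = False,
        install_config: Optional[dict] = None,
        requested_by: Optional[dict] = None,
        task_id: Optional[str] = None,
    ) -> Dict[str, Any]:
        # The real method publishes an event; the stub only echoes its arguments.
        payload = {
            "extractor_id": extractor_id,
            "force": force,
            "install_config": install_config,
            "requested_by": requested_by,
            "task_id": task_id,
        }
        self.calls.append(payload)
        return payload


async def start_install(
    module_sdk: Any,
    create_task: Callable[[str], str],  # hypothetical: creates a background task, returns its id
    extractor_id: str,
    install_config: Optional[dict] = None,
) -> Dict[str, Any]:
    # 1. The action creates the background task first.
    task_id = create_task(f"Install extractor {extractor_id}")
    # 2. The task id travels with the install request so the installer can
    #    stream progress and output into that task.
    event = await module_sdk.extractors.request_install(
        extractor_id=extractor_id,
        install_config=install_config,
        task_id=task_id,
    )
    # 3. The id is returned so the UI can attach its BackgroundTask component to it.
    return {"task_id": task_id, "event": event}
```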
When To Use force¶
Pass force=True only when the module is explicitly asking the runtime to reinstall or refresh the extractor even if it may already look ready.
In normal interactive installs, force=False is usually the correct default.
When To Use requested_by¶
Use requested_by when you want the installation event to retain user and organization context for auditability, debugging, or admin workflows.
sync_runtime() -> None¶
This method tells the runtime extractor manager to synchronize the currently active extractors.
Example:

    await module_sdk.extractors.sync_runtime()

Use sync_runtime() after a module changes extractor lifecycle state in a way that should be reflected in the live runtime manager.
Typical examples:
- after activating an extractor
- after deactivating an extractor
- after a configuration change that affects active extractor handles
Updating the database or UI is not enough. The runtime manager keeps active extractor handles, and those handles need to be brought back in sync with persisted extractor state.
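A minimal sketch of that ordering: persist the state change first, then ask the runtime manager to resynchronize. StubRegistry and its set_active method are hypothetical placeholders for however a module persists extractor state, and StubExtractors models only sync_runtime().

```python
import asyncio
from types import SimpleNamespace
from typing import Any, Dict, List


class StubExtractors:
    """Stand-in for module_sdk.extractors; only sync_runtime() is modeled."""

    def __init__(self, log: List[str]) -> None:
        self.log = log

    async def sync_runtime(self) -> None:
        self.log.append("sync_runtime")


class StubRegistry:
    """Hypothetical persistence layer for extractor rows (not part of the SDK)."""

    def __init__(self, log: List[str]) -> None:
        self.log = log
        self.active: Dict[str, bool] = {}

    async def set_active(self, extractor_id: str, active: bool) -> None:
        self.active[extractor_id] = active
        self.log.append("persist")


async def set_active_and_sync(
    module_sdk: Any, registry: StubRegistry, extractor_id: str, active: bool
) -> None:
    # Persist the lifecycle change first ...
    await registry.set_active(extractor_id, active)
    # ... then bring the live extractor handles back in line with persisted state.
    await module_sdk.extractors.sync_runtime()
```

Keeping both calls in one helper makes it harder to update the stored state and forget the sync step.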
check_ready(extractor_id: str) -> dict¶
This method checks whether an extractor runtime is installed and importable on the current node.
Example:

    ready = await module_sdk.extractors.check_ready(extractor_id="docling")
    if not ready.get("ready"):
        missing = list(ready.get("missing_local") or [])

What It Is For¶
Use check_ready(...) before activation when the UI needs to decide whether the extractor can be loaded now or whether an installation step is still needed.
This method answers a different question from resolution and extraction:
- resolve(...) decides which configured MIME binding should handle a file.
- check_ready(...) checks whether an extractor's runtime dependencies are present.
- extract(...) executes the resolved extractor.
Return Value¶
The current readiness payload uses these fields:
- ready: whether the extractor runtime is available
- missing_local: extractor-local dependencies that are missing
- message: human-readable readiness message
Activation flows can use missing_local and message to decide whether to show an install prompt instead of trying to instantiate the extractor.
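For instance, an activation flow might turn the readiness payload into prompt text along these lines. The helper name install_prompt_for and the fallback wording are assumptions made for this sketch; only the three payload fields come from the list above.

```python
from typing import Any, Dict, Optional


def install_prompt_for(ready: Dict[str, Any]) -> Optional[str]:
    """Return prompt text when an install step is needed, or None when ready."""
    if ready.get("ready"):
        return None
    # missing_local may be absent or None, so normalize it to a list.
    missing = list(ready.get("missing_local") or [])
    # Fallback wording is an assumption used when the payload carries no message.
    message = ready.get("message") or "Extractor runtime is not ready."
    if missing:
        return f"{message} Missing: {', '.join(missing)}"
    return message
```

A UI handler could show the returned string in an install dialog and proceed to activation only when the result is None.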
Typical Extractor Lifecycle Flow¶
A realistic module flow often looks like this:
- list manifests or registered extractor rows
- check whether the extractor runtime is ready with check_ready(...)
- request installation if dependencies are missing
- activate or update the extractor registry row
- call sync_runtime()
- configure MIME bindings with set_mime_type_binding(...)
- resolve and run extraction with resolve(...) or extract(...)
This ordering keeps discovery, installation, activation, and execution as separate concerns.
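The readiness, installation, activation, and sync steps of that flow can be sketched as below. The stubs stand in for the SDK and for a hypothetical registry update (StubRegistry.activate); the MIME binding and extraction steps are left out because their signatures are not covered in this section. The sketch also assumes that activation waits until a later pass once an install has been requested, since installation is processed asynchronously by the runtime pipeline.

```python
import asyncio
from types import SimpleNamespace
from typing import Any, Awaitable, Callable, Dict, List, Optional


class StubExtractors:
    """Stand-in for module_sdk.extractors so the sketch runs without the real SDK."""

    def __init__(self, ready: bool, log: List[str]) -> None:
        self._ready = ready
        self.log = log

    async def check_ready(self, extractor_id: str) -> Dict[str, Any]:
        self.log.append("check_ready")
        missing = [] if self._ready else ["example-dependency"]  # placeholder name
        return {"ready": self._ready, "missing_local": missing, "message": ""}

    async def request_install(
        self,
        extractor_id: str,
        force: bool = False,
        install_config: Optional[dict] = None,
        requested_by: Optional[dict] = None,
        task_id: Optional[str] = None,
    ) -> Dict[str, Any]:
        self.log.append("request_install")
        return {"event_name": "extractor.install.requested", "extractor_id": extractor_id}

    async def sync_runtime(self) -> None:
        self.log.append("sync_runtime")


class StubRegistry:
    """Hypothetical registry update; not part of the SDK."""

    def __init__(self, log: List[str]) -> None:
        self.log = log

    async def activate(self, extractor_id: str) -> None:
        self.log.append("activate")


async def prepare_extractor(
    module_sdk: Any,
    activate_row: Callable[[str], Awaitable[None]],
    extractor_id: str,
) -> str:
    # Check readiness on the current node before trying to activate.
    ready = await module_sdk.extractors.check_ready(extractor_id=extractor_id)
    if not ready.get("ready"):
        # Dependencies are missing: publish the install request and stop here.
        await module_sdk.extractors.request_install(extractor_id=extractor_id)
        return "install_requested"
    # Runtime is ready: persist activation, then sync the live runtime manager.
    await activate_row(extractor_id)
    await module_sdk.extractors.sync_runtime()
    return "active"
```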