operonx.providers¶
LLM, embedding, reranker, and ONNX provider ops, plus the `chat()` / `ask()` shorthand helpers. Provider backends are loaded lazily: a tier-1 install (`pip install operonx`) can import `operonx.providers` without pulling in `openai` / `httpx` / `numpy` / `torch`. Missing-dependency errors surface only when the corresponding backend is actually accessed.

See *Pick an extra* on the installation page for which extra each provider needs.
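A minimal sketch of that lazy-loading contract (assuming a tier-1 install with no provider extras):

```python
# Tier-1 install: the import itself never pulls provider SDKs.
from operonx.providers import LLMOp

# Constructing the op is also safe -- backends are resolved by name later.
llm = LLMOp.of(
    resource="gpt-4o-mini",
    messages=[{"role": "user", "content": "hi"}],
)

# Only when the graph runs and the backend is actually resolved does a
# missing dependency surface, as an ImportError pointing at the extra to add.
```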
Provider ops¶
The four core provider op types. Each exposes an Op.of(...)
classmethod for concise construction with explicit keyword args —
that's the recommended style.
LLMOp¶
LLMOp(
resource: Optional[Union[str, List[str]]] = None,
ratios: Optional[List[float]] = None,
fallback: Optional[List[str]] = None,
batch_mode: bool = False,
seed: Optional[int] = None,
inputs: Dict[str, Any] = None,
outputs: Dict[str, Any] = None,
**kwargs: Any,
)
Bases: BaseOp
Op that calls a language model via ResourceHub.
Supports streaming, weighted load balancing across multiple models, fallback chains with retry, and OpenAI Batch API mode (50% cheaper).
Inputs
messages (list): Chat messages in OpenAI format. Required.
temperature (float): Sampling temperature. Default: 0.0.
max_tokens (int): Max output tokens. Default: None (model default).
tools (list): Tool/function definitions. Default: None.
tool_choice (str | dict): Tool selection strategy. Default: None.
response_format (dict): Structured output format. Default: None.
Outputs
content (str): Generated text.
role (str): Message role (usually "assistant").
finish_reason (str): Stop reason ("stop", "tool_calls", etc.).
model_used (str): Actual model that served the request.
tool_calls (list): Tool-call objects (empty list when absent).
usage (dict): Flat token-cost metrics with keys
prompt_tokens, completion_tokens, total_tokens,
cached_tokens (cache hit), cache_write_tokens
(Anthropic cache write), reasoning_tokens.
extras (dict): Bag of uncommon fields — thinking_content,
refusal, logprobs. Values are None when absent.
Example::
llm = LLMOp.of(resource="gpt-4o", messages=PARENT["messages"])
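A fuller sketch exercising the load-balancing and fallback options described above (resource keys illustrative):

```python
llm = LLMOp.of(
    resource=["gpt-4o", "claude-3"],  # weighted load balancing across models
    ratios=[0.7, 0.3],                # must sum to 1.0
    fallback=["gpt-4o-mini"],         # tried in order if the primaries fail
    seed=42,                          # reproducible balancer selection
    messages=PARENT["messages"],
    temperature=0.2,
    outputs={"*": PARENT},            # expose content / usage / extras to the parent
)
```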
Initialize LLMOp.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `resource` | `Optional[Union[str, List[str]]]` | Resource key(s) for LLM in ResourceHub. Single string: `"gpt-4"`; list for load balancing: `["gpt-4", "claude-3"]`. | `None` |
| `ratios` | `Optional[List[float]]` | Weight ratios for load balancing. Must sum to 1.0. Only used when `resource` is a list. | `None` |
| `fallback` | `Optional[List[str]]` | Fallback resource key(s) to use when the primary model fails. List of resource keys from ResourceHub, tried in order. | `None` |
| `batch_mode` | `bool` | Whether to use the OpenAI Batch API (50% cheaper, async processing). | `False` |
| `seed` | `Optional[int]` | Optional seed for the load-balancing RNG. Provides reproducible selection. | `None` |
| `inputs` | `Dict[str, Any]` | Input variable mappings. | `None` |
| `outputs` | `Dict[str, Any]` | Output variable mappings. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments for BaseOp. | `{}` |
Source code in operonx/providers/ops/llm.py
Attributes¶
specific_metadata property¶
Return LLM-specific metadata dictionary.
Functions¶
warmup¶
normalize_trace_io¶
Wrap OpenAI chat-format multimodal blocks as Media for tracing.
Runs inside the tracing collector — returns a shallow-copied inputs
dict with messages rewritten so image_url / input_audio
blocks become Media instances. Real state is untouched; this is
only the trace-time view.
Source code in operonx/providers/ops/llm.py
of¶
Create an LLMOp with flat kwargs.
Example::
llm = LLMOp.of(resource="gpt-4", messages=PARENT["messages"], outputs={"*": PARENT})
llm = LLMOp.of(resource=["gpt-4", "claude-3"], ratios=[0.7, 0.3], messages=PARENT["messages"])
Source code in operonx/providers/ops/llm.py
serialize¶
Serialize LLMOp for Rust backend, including backend configs.
Source code in operonx/providers/ops/llm.py
EmbeddingOp¶
EmbeddingOp(
resource: Optional[str] = None,
inputs: Dict[str, Any] = None,
outputs: Dict[str, Any] = None,
**kwargs: Any,
)
Bases: BaseOp
Op that converts texts to vector embeddings via ResourceHub.
Wraps an embedding backend (e.g. BGE-M3, OpenAI, TEI) and returns a list of embedding vectors matching the input order.
Inputs
texts (list[str]): Texts to embed. Required.
Outputs
embeddings (list[list[float]]): Embedding vectors.
Example::
embed = EmbeddingOp.of(resource="bge-m3", texts=PARENT["texts"])
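A sketch of consuming the output; the cosine helper below is illustrative, not part of operonx:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    # Plain cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# embeddings[i] corresponds to texts[i] -- the op preserves input order.
```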
Initialize EmbeddingOp.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `resource` | `Optional[str]` | Resource key for the embedding model in ResourceHub (e.g., `"bge-m3"`). | `None` |
| `inputs` | `Dict[str, Any]` | Input variable mappings. | `None` |
| `outputs` | `Dict[str, Any]` | Output variable mappings. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments for BaseOp. | `{}` |
Source code in operonx/providers/ops/embedding.py
Attributes¶
specific_metadata property¶
Return embedding-specific metadata dictionary.
Functions¶
warmup¶
of¶
Create an EmbeddingOp with flat kwargs.
Example::
embed = EmbeddingOp.of(resource="bge-m3", texts=PARENT["texts"], outputs={"*": PARENT})
Source code in operonx/providers/ops/embedding.py
serialize¶
Serialize EmbeddingOp for Rust backend, including backend config.
Source code in operonx/providers/ops/embedding.py
RerankOp¶
RerankOp(
resource: Optional[str] = None,
inputs: Dict[str, Any] = None,
outputs: Dict[str, Any] = None,
**kwargs: Any,
)
Bases: BaseOp
Op that scores and re-orders documents by relevance to a query.
Wraps a reranker backend (e.g. BGE-M3, Pinecone, TEI) accessed via ResourceHub. Returns documents sorted by relevance score.
Inputs
query (str): The query to rank against. Required.
documents (list[str]): Documents to rerank. Required.
top_k (int): Max results to return. Default: -1 (all).
threshold (float): Min score cutoff. Default: 0.0.
Outputs
reranks (list[dict]): Reranked results with index, score,
and document fields.
Example::
rerank = RerankOp.of(
resource="bge-m3",
query=PARENT["query"],
documents=PARENT["docs"],
top_k=5,
)
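A sketch of the reranks output shape downstream (scores and documents illustrative):

```python
# Entries arrive sorted by relevance score, best first; with top_k=5 at most
# five survive, and threshold drops anything scoring below the cutoff.
reranks = [
    {"index": 2, "score": 0.91, "document": "most relevant passage"},
    {"index": 0, "score": 0.57, "document": "second-best passage"},
]
best = reranks[0]["document"]
```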
Initialize RerankOp.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `resource` | `Optional[str]` | Resource key for the reranker in ResourceHub (e.g., `"bge-m3"`). | `None` |
| `inputs` | `Dict[str, Any]` | Input variable mappings. | `None` |
| `outputs` | `Dict[str, Any]` | Output variable mappings. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments for BaseOp. | `{}` |
Source code in operonx/providers/ops/rerank.py
Attributes¶
specific_metadata property¶
Return rerank-specific metadata dictionary.
Functions¶
warmup¶
of¶
Create a RerankOp with flat kwargs.
Example::
rerank = RerankOp.of(resource="bge-m3", query=PARENT["q"], documents=PARENT["docs"])
Source code in operonx/providers/ops/rerank.py
serialize¶
Serialize RerankOp for Rust backend, including backend config.
Source code in operonx/providers/ops/rerank.py
PromptOp¶
Bases: BaseOp
Op that formats a template into chat messages (OpenAI format).
Accepts three template formats:

- str — becomes a single user message: `"Hello {name}"`.
- dict — `{"system": "...", "user": "..."}` with `{var}` placeholders.
- list — full messages array for multimodal / complex prompts.
All non-reserved input keys are substituted as template variables.
Inputs
template (str | dict | list): Message template. Default: None.
conversation_history (list): Prior messages to prepend. Default: [].
tool_results (list): Tool-call results to append. Default: [].
(any): Template variables ({var} placeholders).
Outputs
messages (list): Formatted chat messages ready for an LLM.
Example::
p = PromptOp.of(
template={"system": "You are {role}.", "user": "{query}"},
role="helpful",
query=PARENT["query"],
)
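Sketches of the other two template forms (the image URL is illustrative):

```python
# str template -- a single user message with {var} substitution:
p1 = PromptOp.of(template="Summarize: {text}", text=PARENT["text"])

# list template -- a full messages array, e.g. for multimodal prompts:
p2 = PromptOp.of(
    template=[
        {"role": "system", "content": "You are a careful assistant."},
        {"role": "user", "content": [
            {"type": "text", "text": "Describe this image."},
            {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
        ]},
    ],
)
```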
Initialize PromptOp.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `inputs` | `Dict[str, Any]` | Input mappings. Reserved keys are for templates; all others are template variables. | `None` |
| `outputs` | `Dict[str, Any]` | Output variable mappings. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments for BaseOp. | `{}` |
Source code in operonx/providers/ops/prompt.py
Functions¶
of¶
Create a PromptOp with flat kwargs.
Example::
p = PromptOp.of(template={"system": "You are {role}.", "user": "{query}"}, role="helpful", query=PARENT["q"])
Source code in operonx/providers/ops/prompt.py
High-level helpers¶
chat() glues a PromptOp and LLMOp into one op call (most
LLM-using examples use this). ask() is the structured-output
variant that pairs chat with a parser.
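Roughly, chat() assembles the same graph you could build by hand; a sketch using the two ops' own wiring conventions (approximate, since the actual graph construction is framework-internal):

```python
# What chat(template=..., resource=...) builds for you, approximately:
p = PromptOp.of(
    template={"system": "You are helpful.", "user": "{query}"},
    query=PARENT["query"],
    outputs={"*": PARENT},  # the formatted `messages` land in parent state
)
llm = LLMOp.of(resource="claude-haiku", messages=PARENT["messages"])
```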
chat¶
chat(
template: Any = None,
resource: Optional[Union[str, List[str]]] = None,
ratios: Optional[List[float]] = None,
fallback: Optional[List[str]] = None,
response_format: Optional[Dict[str, Any]] = None,
delay: float = 0,
) -> Any
Prompt → LLM graph for text generation.
Returns raw LLM output: content, role, model_used, usage, extras, etc.
Example::
c = chat(
resource="claude-haiku",
template={"system": "You are helpful.", "user": "{query}"},
query=PARENT["query"],
)
Source code in operonx/providers/ops/chain.py
ask¶
ask(
error: str = None,
*,
template: Any = None,
resource: Optional[Union[str, List[str]]] = None,
ratios: Optional[List[float]] = None,
fallback: Optional[List[str]] = None,
fields: Optional[List[str]] = None,
parser: str = "xml",
delay: float = 0,
response_format: Optional[Dict[str, Any]] = None,
validators: Optional[Dict[str, list]] = None,
until: str = None,
max_iterations: int = 2,
) -> Any
Prompt → LLM → Parser graph for structured extraction.
Simple mode (no retry)::
a = ask(
resource="claude-haiku",
template="Classify: {speech}",
fields=["result: str"],
parser="xml",
validators={"result": ["CONFIRM", "DENY", "@FALLBACK"]},
speech=PARENT["speech"],
)
Retry mode (pass until=)::
a = ask(
resource="claude-haiku",
template="Classify: {speech}",
fields=["result: str"],
parser="xml",
validators={"result": ["CONFIRM", "DENY", "@FALLBACK"]},
until="error == None",
max_iterations=3,
error="init",
speech=PARENT["speech"],
)
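The retry loop's semantics, written out as plain control flow (illustrative pseudocode; `run_prompt_llm_parse` is a hypothetical stand-in for the generated Prompt → LLM → Parser graph):

```python
# ask(..., error="init", until="error == None", max_iterations=3) behaves like:
error = "init"                      # the seeded loop variable
for _ in range(3):                  # max_iterations
    result, error = run_prompt_llm_parse(speech=speech)  # hypothetical helper
    if error is None:               # the `until` condition
        break
```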
Source code in operonx/providers/ops/chain.py
Resource resolution¶
Backend selection happens by name, not by direct construction.
Wire your resources.yaml:
```yaml
llm:gpt-4o-mini:
  api_type: openai
  api_key: ${OPENAI_API_KEY}
  base_url: https://api.openai.com/v1
  model: gpt-4o-mini

embedding:openai:
  api_type: openai
  api_key: ${OPENAI_API_KEY}
  base_url: https://api.openai.com/v1
  model: text-embedding-3-small
  dimensions: 1536
```

Then reference by key in your op definitions:

```python
llm = LLMOp.of(resource="gpt-4o-mini", messages=PARENT["msgs"])
embed = EmbeddingOp.of(resource="openai", texts=PARENT["docs"])
```
Full reference — including the five disambiguated failure branches when a key is missing or unset — is in Resource hub.
Config classes¶
The Pydantic models behind resources.yaml. You rarely construct
these directly; the framework loads them from YAML. Listed here for
reference.
- LLM — `LLMConfig`, `OpenAIConfig`, `AzureConfig`, `GeminiConfig`, `AnthropicConfig`, `LLMType` (in `operonx.providers.llms`).
- Embedding — `EmbeddingConfig`, `EmbeddingType` (in `operonx.providers.embeddings`).
- Reranker — `RerankingConfig`, `RerankingType` (in `operonx.providers.rerankers`).
- Auth — `KeycloakTokenConfig` (in `operonx.providers.auth`).
Factory functions¶
Resolve a config to a backend instance. Used internally by
ResourceHub — most
users don't call these directly.
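A sketch of direct factory use; the `OpenAIConfig` fields are assumed from the resources.yaml example above, and importing `create_llm` from its source module is also an assumption:

```python
from operonx.providers.llms import OpenAIConfig
from operonx.providers.llms.factory import create_llm  # path per the "Source code" note

cfg = OpenAIConfig(
    api_type="openai",                     # selects the OpenAI backend
    api_key="sk-...",
    base_url="https://api.openai.com/v1",
    model="gpt-4o-mini",
)
backend = create_llm(cfg)                  # returns a BaseLLM instance
```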
create_llm¶
Create an LLM backend from config.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `config` | `LLMConfig` | LLMConfig with `api_type` determining which backend to create. | *required* |

Returns:

| Type | Description |
|---|---|
| `BaseLLM` | BaseLLM instance. |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If `api_type` is unsupported. |
| `ImportError` | With a helpful pointer to the right extra. |
Source code in operonx/providers/llms/factory.py
create_embedding¶
Create an embedding backend from config.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `config` | `EmbeddingConfig` | EmbeddingConfig with `api_type` determining which backend to create. | *required* |

Returns:

| Type | Description |
|---|---|
| `BaseEmbedder` | BaseEmbedder instance. |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If `api_type` is unsupported. |
| `ImportError` | With a helpful pointer to the right extra. |
Source code in operonx/providers/embeddings/factory.py
create_reranking¶
Create a reranking backend from config.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `config` | `RerankingConfig` | RerankingConfig with `api_type` determining which backend to create. | *required* |

Returns:

| Type | Description |
|---|---|
| `BaseReranker` | BaseReranker instance. |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If `api_type` is unsupported. |
| `ImportError` | With a helpful pointer to the right extra. |
Source code in operonx/providers/rerankers/factory.py
create_auth¶
Create a KeycloakTokenProvider from config.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `config` | `KeycloakTokenConfig` | KeycloakTokenConfig instance. | *required* |

Returns:

| Type | Description |
|---|---|
| `KeycloakTokenProvider` | KeycloakTokenProvider instance. |

Raises:

| Type | Description |
|---|---|
| `ImportError` | With a pointer to the right extra. |