
operonx.providers

LLM, embedding, reranker, and ONNX provider ops — plus the chat() / ask() shorthand helpers. Provider backends are loaded lazily — a tier-1 install (pip install operonx) can import operonx.providers without pulling openai / httpx / numpy / torch. Missing-dep errors surface only when the corresponding backend is actually accessed.
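
A tier-1 import is therefore safe even before any extra is installed. A minimal sketch (assuming LLMOp is re-exported at the operonx.providers package root):

import operonx.providers  # succeeds on a bare `pip install operonx`

llm = operonx.providers.LLMOp.of(resource="gpt-4o")  # still fine: backends are lazy

# The ImportError pointing at the right operonx[<extra>] install surfaces only
# when the backend is first resolved, e.g. at warmup() or first execution.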

See Pick an extra on the installation page for which extra each provider needs.

Provider ops

The four core provider op types. Each exposes an Op.of(...) classmethod for concise construction with explicit keyword args — that's the recommended style.

LLMOp

LLMOp(
    resource: Optional[Union[str, List[str]]] = None,
    ratios: Optional[List[float]] = None,
    fallback: Optional[List[str]] = None,
    batch_mode: bool = False,
    seed: Optional[int] = None,
    inputs: Dict[str, Any] = None,
    outputs: Dict[str, Any] = None,
    **kwargs: Any,
)

Bases: BaseOp

Op that calls a language model via ResourceHub.

Supports streaming, weighted load balancing across multiple models, fallback chains with retry, and OpenAI Batch API mode (50% cheaper).

Inputs

  • messages (list): Chat messages in OpenAI format. Required.
  • temperature (float): Sampling temperature. Default: 0.0.
  • max_tokens (int): Max output tokens. Default: None (model default).
  • tools (list): Tool/function definitions. Default: None.
  • tool_choice (str | dict): Tool selection strategy. Default: None.
  • response_format (dict): Structured output format. Default: None.
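
Inputs can be bound through the of() shorthand either as PARENT mappings or as plain literals. A sketch (the messages here are ordinary OpenAI chat dicts):

llm = LLMOp.of(
    resource="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
    temperature=0.2,
    max_tokens=256,
)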

Outputs

  • content (str): Generated text.
  • role (str): Message role (usually "assistant").
  • finish_reason (str): Stop reason ("stop", "tool_calls", etc.).
  • model_used (str): Actual model that served the request.
  • tool_calls (list): Tool-call objects (empty list when absent).
  • usage (dict): Flat token-cost metrics with keys prompt_tokens, completion_tokens, total_tokens, cached_tokens (cache hit), cache_write_tokens (Anthropic cache write), reasoning_tokens.
  • extras (dict): Bag of uncommon fields (thinking_content, refusal, logprobs). Values are None when absent.
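
A typical non-tool-call result, matching the schema above (values are illustrative):

{
    "content": "Hi there!",
    "role": "assistant",
    "finish_reason": "stop",
    "model_used": "gpt-4o",
    "tool_calls": [],
    "usage": {"prompt_tokens": 9, "completion_tokens": 4, "total_tokens": 13, "cached_tokens": 0},
    "extras": {"thinking_content": None, "refusal": None, "logprobs": None},
}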

Example::

llm = LLMOp.of(resource="gpt-4o", messages=PARENT["messages"])

Initialize LLMOp.

Parameters:

  • resource (Optional[Union[str, List[str]]]): Resource key(s) for LLM in ResourceHub. Single string ("gpt-4") or a list for load balancing (["gpt-4", "claude-3"]). Default: None.
  • ratios (Optional[List[float]]): Weight ratios for load balancing. Must sum to 1.0. Only used when resource is a list. Default: None.
  • fallback (Optional[List[str]]): Fallback resource key(s) to use when the primary model fails. Resource keys from ResourceHub, tried in order. Default: None.
  • batch_mode (bool): Whether to use the OpenAI Batch API (50% cheaper, async processing). Default: False.
  • seed (Optional[int]): Optional seed for the load-balancing RNG. Provides reproducible selection. Default: None.
  • inputs (Dict[str, Any]): Input variable mappings. Default: None.
  • outputs (Dict[str, Any]): Output variable mappings. Default: None.
  • **kwargs (Any): Additional keyword arguments for BaseOp.
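
Putting the balancing knobs together, a sketch of a 70/30 split with a fallback chain and a pinned RNG seed:

llm = LLMOp.of(
    resource=["gpt-4", "claude-3"],
    ratios=[0.7, 0.3],          # must sum to 1.0
    fallback=["gpt-4o-mini"],   # tried in order when the primaries fail
    seed=42,                    # reproducible weighted selection
    messages=PARENT["messages"],
)
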
Source code in operonx/providers/ops/llm.py
def __init__(
    self,
    resource: Optional[Union[str, List[str]]] = None,
    ratios: Optional[List[float]] = None,
    fallback: Optional[List[str]] = None,
    batch_mode: bool = False,
    seed: Optional[int] = None,
    inputs: Dict[str, Any] = None,
    outputs: Dict[str, Any] = None,
    **kwargs: Any,
):
    """Initialize LLMOp.

    Args:
        resource: Resource key(s) for LLM in ResourceHub.
            - Single string: "gpt-4"
            - List for load balancing: ["gpt-4", "claude-3"]
        ratios: Weight ratios for load balancing. Must sum to 1.0.
            Only used when resource is a list.
        fallback: Fallback resource key(s) to use when primary model fails.
            List of resource keys from ResourceHub, tried in order.
        batch_mode: Whether to use OpenAI Batch API (50% cheaper, async processing)
        seed: Optional seed for load balancing RNG. Provides reproducible selection.
        inputs: Input variable mappings
        outputs: Output variable mappings
        **kwargs: Additional keyword arguments for BaseOp
    """
    kwargs.setdefault("bound", "io")
    super().__init__(**kwargs)

    self.batch_mode = batch_mode
    self.contain_generation = True
    self.fallback = fallback
    self._rng = random.Random(seed)

    # Validate resource + ratios
    if isinstance(resource, list):
        self.resource = resource
        self.ratios = ratios or [1.0 / len(resource)] * len(resource)
        if len(self.ratios) != len(self.resource):
            raise ValueError(
                f"ratios length ({len(self.ratios)}) must match "
                f"resource length ({len(self.resource)})"
            )
        if abs(sum(self.ratios) - 1.0) > 0.01:
            raise ValueError(f"ratios must sum to 1.0, got {sum(self.ratios)}")
    else:
        self.resource = resource
        self.ratios = [1.0] if resource else None

    # I/O schema
    input_schema = {
        "messages": Param(type=list, required=True),
        "temperature": Param(type=float, default=0.0),
        "max_tokens": Param(type=int, default=None),
        "tools": Param(type=list, default=None),
        "tool_choice": Param(type=(str, dict), default=None),
        "response_format": Param(type=dict, default=None),
        "top_p": Param(type=float, default=None),
        "stop": Param(type=(str, list), default=None),
        "frequency_penalty": Param(type=float, default=None),
        "presence_penalty": Param(type=float, default=None),
        "seed": Param(type=int, default=None),
        "logprobs": Param(type=bool, default=None),
        "top_logprobs": Param(type=int, default=None),
        "n": Param(type=int, default=None),
        "user": Param(type=str, default=None),
    }

    output_schema = {
        "role": Param(type=str, default="assistant"),
        "content": Param(type=str, required=True),
        "finish_reason": Param(type=str, default=None),
        "model_used": Param(type=str, required=True),
        "tool_calls": Param(type=list, default=[]),
        "usage": Param(type=dict, default={}),
        "extras": Param(type=dict, default={}),
    }

    normalized_inputs = self._normalize_params(inputs)
    normalized_outputs = self._normalize_params(outputs)
    self.inputs = self._merge_params(input_schema, normalized_inputs)
    self.outputs = self._merge_params(output_schema, normalized_outputs)

    # Lazy-initialized from ResourceHub on first use
    self._llms: List["BaseLLM"] = []
    self._fallback_llms: List["BaseLLM"] = []
    self._batch_coordinator = None
    self._initialized = False

    # Core: stream → _stream_core, else → _generate_core
    if self.stream:
        self._set_core(self._stream_core)
    else:
        self._set_core(self._generate_core)

Attributes

specific_metadata property

specific_metadata: Dict[str, Any]

Return LLM-specific metadata dictionary.

Functions

warmup

warmup() -> None

Eagerly initialize LLM backends on engine startup.

Source code in operonx/providers/ops/llm.py
def warmup(self) -> None:
    """Eagerly initialize LLM backends on engine startup."""
    self._ensure_initialized()

normalize_trace_io

normalize_trace_io(inputs: Dict[str, Any], outputs: Dict[str, Any]) -> tuple

Wrap OpenAI chat-format multimodal blocks as Media for tracing.

Runs inside the tracing collector — returns a shallow-copied inputs dict with messages rewritten so image_url / input_audio blocks become Media instances. Real state is untouched; this is only the trace-time view.

Source code in operonx/providers/ops/llm.py
def normalize_trace_io(self, inputs: Dict[str, Any], outputs: Dict[str, Any]) -> tuple:
    """Wrap OpenAI chat-format multimodal blocks as ``Media`` for tracing.

    Runs inside the tracing collector — returns a shallow-copied inputs
    dict with ``messages`` rewritten so ``image_url`` / ``input_audio``
    blocks become ``Media`` instances. Real state is untouched; this is
    only the trace-time view.
    """
    msgs = inputs.get("messages")
    if msgs:
        wrapped = self._wrap_openai_media_blocks(msgs)
        if wrapped is not msgs:
            inputs = {**inputs, "messages": wrapped}
    return inputs, outputs

of

of(
    resource=None,
    *,
    ratios=None,
    fallback=None,
    batch_mode=False,
    seed=None,
    **kwargs,
) -> LLMOp

Create an LLMOp with flat kwargs.

Example::

llm = LLMOp.of(resource="gpt-4", messages=PARENT["messages"], outputs={"*": PARENT})
llm = LLMOp.of(resource=["gpt-4", "claude-3"], ratios=[0.7, 0.3], messages=PARENT["messages"])
Source code in operonx/providers/ops/llm.py
@shorthand
def of(
    cls, resource=None, *, ratios=None, fallback=None, batch_mode=False, seed=None, **kwargs
) -> "LLMOp":
    """Create an LLMOp with flat kwargs.

    Example::

        llm = LLMOp.of(resource="gpt-4", messages=PARENT["messages"], outputs={"*": PARENT})
        llm = LLMOp.of(resource=["gpt-4", "claude-3"], ratios=[0.7, 0.3], messages=PARENT["messages"])
    """
    input_mappings, init_kwargs = split_shorthand_kwargs(kwargs)
    return cls(
        resource=resource,
        ratios=ratios,
        fallback=fallback,
        batch_mode=batch_mode,
        seed=seed,
        inputs=input_mappings or None,
        **init_kwargs,
    )

serialize

serialize() -> dict

Serialize LLMOp for Rust backend, including backend configs.

Source code in operonx/providers/ops/llm.py
def serialize(self) -> dict:
    """Serialize LLMOp for Rust backend, including backend configs."""
    self._ensure_initialized()
    base = super().serialize()

    base["resource"] = self.resource
    base["ratios"] = self.ratios
    base["fallback"] = self.fallback
    base["batch_mode"] = self.batch_mode

    configs = []
    for llm in self._llms:
        if llm and hasattr(llm, "config"):
            configs.append(llm.config.model_dump(mode="json"))
    if configs:
        base["resource_configs"] = configs

    fallback_configs = []
    for llm in self._fallback_llms:
        if llm and hasattr(llm, "config"):
            fallback_configs.append(llm.config.model_dump(mode="json"))
    if fallback_configs:
        base["fallback_configs"] = fallback_configs

    return base

EmbeddingOp

EmbeddingOp(
    resource: Optional[str] = None,
    inputs: Dict[str, Any] = None,
    outputs: Dict[str, Any] = None,
    **kwargs: Any,
)

Bases: BaseOp

Op that converts texts to vector embeddings via ResourceHub.

Wraps an embedding backend (e.g. BGE-M3, OpenAI, TEI) and returns a list of embedding vectors matching the input order.

Inputs

texts (list[str]): Texts to embed. Required.

Outputs

embeddings (list[list[float]]): Embedding vectors.
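
The result preserves input order: one vector per text. Illustrative shape:

# texts      = ["first doc", "second doc"]
# embeddings = [[0.12, -0.03, ...], [0.08, 0.41, ...]]   # len(embeddings) == len(texts)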

Example::

embed = EmbeddingOp.of(resource="bge-m3", texts=PARENT["texts"])

Initialize EmbeddingOp.

Parameters:

  • resource (Optional[str]): Resource key for embedding model in ResourceHub (e.g., "bge-m3"). Default: None.
  • inputs (Dict[str, Any]): Input variable mappings. Default: None.
  • outputs (Dict[str, Any]): Output variable mappings. Default: None.
  • **kwargs (Any): Additional keyword arguments for BaseOp.
Source code in operonx/providers/ops/embedding.py
def __init__(
    self,
    resource: Optional[str] = None,
    inputs: Dict[str, Any] = None,
    outputs: Dict[str, Any] = None,
    **kwargs: Any,
):
    """Initialize EmbeddingOp.

    Args:
        resource: Resource key for embedding model in ResourceHub (e.g., "bge-m3")
        inputs: Input variable mappings
        outputs: Output variable mappings
        **kwargs: Additional keyword arguments for BaseOp
    """
    # Provider ops are I/O-bound by default (HTTP calls to embedding backends)
    kwargs.setdefault("bound", "io")
    super().__init__(**kwargs)

    self.resource = resource

    # Define input/output schema
    input_schema = {
        "texts": Param(type=list, required=True),
    }

    output_schema = {
        "embeddings": Param(type=list, required=True),
    }

    # Merge with user-provided
    self.inputs = self._merge_params(input_schema, inputs)
    self.outputs = self._merge_params(output_schema, outputs)

    # Embedding backend — lazy-initialized on first use to allow
    # graph construction before ResourceHub is set up
    self.backend = None
    self._initialized = False
    self._set_core(self._process)

Attributes

specific_metadata property

specific_metadata: Dict[str, Any]

Return embedding-specific metadata dictionary.

Functions

warmup

warmup() -> None

Eagerly initialize embedding backend on engine startup.

Source code in operonx/providers/ops/embedding.py
def warmup(self) -> None:
    """Eagerly initialize embedding backend on engine startup."""
    self._ensure_initialized()

of

of(resource=None, **kwargs) -> EmbeddingOp

Create an EmbeddingOp with flat kwargs.

Example::

embed = EmbeddingOp.of(resource="bge-m3", texts=PARENT["texts"], outputs={"*": PARENT})
Source code in operonx/providers/ops/embedding.py
@shorthand
def of(cls, resource=None, **kwargs) -> "EmbeddingOp":
    """Create an EmbeddingOp with flat kwargs.

    Example::

        embed = EmbeddingOp.of(resource="bge-m3", texts=PARENT["texts"], outputs={"*": PARENT})
    """
    input_mappings, init_kwargs = split_shorthand_kwargs(kwargs)
    return cls(resource=resource, inputs=input_mappings or None, **init_kwargs)

serialize

serialize() -> dict

Serialize EmbeddingOp for Rust backend, including backend config.

Source code in operonx/providers/ops/embedding.py
def serialize(self) -> dict:
    """Serialize EmbeddingOp for Rust backend, including backend config."""
    self._ensure_initialized()
    base = super().serialize()
    base["resource"] = self.resource
    if self.backend and hasattr(self.backend, "config"):
        base["resource_config"] = self.backend.config.model_dump(mode="json")
    return base

RerankOp

RerankOp(
    resource: Optional[str] = None,
    inputs: Dict[str, Any] = None,
    outputs: Dict[str, Any] = None,
    **kwargs: Any,
)

Bases: BaseOp

Op that scores and re-orders documents by relevance to a query.

Wraps a reranker backend (e.g. BGE-M3, Pinecone, TEI) accessed via ResourceHub. Returns documents sorted by relevance score.

Inputs

  • query (str): The query to rank against. Required.
  • documents (list[str]): Documents to rerank. Required.
  • top_k (int): Max results to return. Default: -1 (all).
  • threshold (float): Min score cutoff. Default: 0.0.

Outputs

reranks (list[dict]): Reranked results with index, score, and document fields.
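
Each entry keeps the document's original position alongside its score, sorted best-first. Illustrative shape (scores invented):

reranks = [
    {"index": 2, "score": 0.91, "document": "third doc"},
    {"index": 0, "score": 0.57, "document": "first doc"},
]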

Example::

rerank = RerankOp.of(
    resource="bge-m3",
    query=PARENT["query"],
    documents=PARENT["docs"],
    top_k=5,
)

Initialize RerankOp.

Parameters:

  • resource (Optional[str]): Resource key for reranker in ResourceHub (e.g., "bge-m3"). Default: None.
  • inputs (Dict[str, Any]): Input variable mappings. Default: None.
  • outputs (Dict[str, Any]): Output variable mappings. Default: None.
  • **kwargs (Any): Additional keyword arguments for BaseOp.
Source code in operonx/providers/ops/rerank.py
def __init__(
    self,
    resource: Optional[str] = None,
    inputs: Dict[str, Any] = None,
    outputs: Dict[str, Any] = None,
    **kwargs: Any,
):
    """Initialize RerankOp.

    Args:
        resource: Resource key for reranker in ResourceHub (e.g., "bge-m3")
        inputs: Input variable mappings
        outputs: Output variable mappings
        **kwargs: Additional keyword arguments for BaseOp
    """
    # Provider ops are I/O-bound by default (HTTP calls to reranker backends)
    kwargs.setdefault("bound", "io")
    super().__init__(**kwargs)

    self.resource = resource

    # Define input/output schema
    input_schema = {
        "query": Param(type=str, required=True),
        "documents": Param(type=list, required=True),
        "top_k": Param(type=int, default=-1),
        "threshold": Param(type=float, default=0.0),
    }

    output_schema = {
        "reranks": Param(type=list, required=True),
    }

    # Merge with user-provided
    self.inputs = self._merge_params(input_schema, inputs)
    self.outputs = self._merge_params(output_schema, outputs)

    # Reranker backend — lazy-initialized on first use to allow
    # graph construction before ResourceHub is set up
    self.backend = None
    self._initialized = False
    self._set_core(self._process)

Attributes

specific_metadata property

specific_metadata: Dict[str, Any]

Return rerank-specific metadata dictionary.

Functions

warmup

warmup() -> None

Eagerly initialize reranker backend on engine startup.

Source code in operonx/providers/ops/rerank.py
def warmup(self) -> None:
    """Eagerly initialize reranker backend on engine startup."""
    self._ensure_initialized()

of

of(resource=None, **kwargs) -> RerankOp

Create a RerankOp with flat kwargs.

Example::

rerank = RerankOp.of(resource="bge-m3", query=PARENT["q"], documents=PARENT["docs"])
Source code in operonx/providers/ops/rerank.py
@shorthand
def of(cls, resource=None, **kwargs) -> "RerankOp":
    """Create a RerankOp with flat kwargs.

    Example::

        rerank = RerankOp.of(resource="bge-m3", query=PARENT["q"], documents=PARENT["docs"])
    """
    input_mappings, init_kwargs = split_shorthand_kwargs(kwargs)
    return cls(resource=resource, inputs=input_mappings or None, **init_kwargs)

serialize

serialize() -> dict

Serialize RerankOp for Rust backend, including backend config.

Source code in operonx/providers/ops/rerank.py
def serialize(self) -> dict:
    """Serialize RerankOp for Rust backend, including backend config."""
    self._ensure_initialized()
    base = super().serialize()
    base["resource"] = self.resource
    if self.backend and hasattr(self.backend, "config"):
        base["resource_config"] = self.backend.config.model_dump(mode="json")
    return base

PromptOp

PromptOp(
    inputs: Dict[str, Any] = None, outputs: Dict[str, Any] = None, **kwargs: Any
)

Bases: BaseOp

Op that formats a template into chat messages (OpenAI format).

Accepts three template formats:

  • str — becomes a single user message: "Hello {name}".
  • dict{"system": "...", "user": "..."} with {var} placeholders.
  • list — full messages array for multimodal / complex prompts.

All non-reserved input keys are substituted as template variables.
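
The same greeting in each of the three formats, via the of() shorthand (a sketch):

p_str = PromptOp.of(template="Hello {name}", name=PARENT["name"])

p_dict = PromptOp.of(
    template={"system": "Be brief.", "user": "Hello {name}"},
    name=PARENT["name"],
)

p_list = PromptOp.of(
    template=[{"role": "user", "content": "Hello {name}"}],
    name=PARENT["name"],
)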

Inputs

  • template (str | dict | list): Message template. Default: None.
  • conversation_history (list): Prior messages to prepend. Default: [].
  • tool_results (list): Tool-call results to append. Default: [].
  • (any): Template variables ({var} placeholders).

Outputs

messages (list): Formatted chat messages ready for an LLM.

Example::

p = PromptOp.of(
    template={"system": "You are {role}.", "user": "{query}"},
    role="helpful",
    query=PARENT["query"],
)
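
With query resolving to, say, "Hi", the example above formats to:

messages = [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "Hi"},
]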

Initialize PromptOp.

Parameters:

  • inputs (Dict[str, Any]): Input mappings. Reserved keys drive templating; all other keys are template variables. Default: None.
  • outputs (Dict[str, Any]): Output variable mappings. Default: None.
  • **kwargs (Any): Additional keyword arguments for BaseOp.
Source code in operonx/providers/ops/prompt.py
def __init__(
    self, inputs: Dict[str, Any] = None, outputs: Dict[str, Any] = None, **kwargs: Any
):
    """Initialize PromptOp.

    Args:
        inputs: Input mappings. Reserved keys for templates, others are template vars.
        outputs: Output variable mappings
        **kwargs: Additional keyword arguments for BaseOp.
    """
    super().__init__(**kwargs)

    # Copy fixed schema
    parsed_inputs = {
        k: Param(type=v.type, required=v.required, default=v.default)
        for k, v in self.INPUT_SCHEMA.items()
    }
    parsed_outputs = {
        k: Param(type=v.type, required=v.required, default=v.default)
        for k, v in self.OUTPUT_SCHEMA.items()
    }

    # Normalize user inputs
    normalized_inputs = self._normalize_params(inputs)
    normalized_outputs = self._normalize_params(outputs)

    # Wildcard handling: infer template variable names from the source op
    if "__FORWARD_WILDCARD__" in normalized_inputs:
        for var in self._infer_wildcard_vars(normalized_inputs["__FORWARD_WILDCARD__"]):
            if var not in parsed_inputs:
                parsed_inputs[var] = Param(type=Any, required=False, default=None)

    # Add non-reserved keys from user inputs to schema (template variables)
    for key, param in normalized_inputs.items():
        if key not in RESERVED_KEYS and key != "__FORWARD_WILDCARD__":
            parsed_inputs[key] = Param(type=Any, required=False, default=None)

    self.inputs = self._merge_params(parsed_inputs, normalized_inputs)
    self.outputs = self._merge_params(parsed_outputs, normalized_outputs)

    # Set core function
    self._set_core(self._format)

Attributes

specific_metadata property

specific_metadata: Dict[str, Any]

Return prompt-specific metadata.

Functions

of

of(template=None, **kwargs) -> PromptOp

Create a PromptOp with flat kwargs.

Example::

p = PromptOp.of(template={"system": "You are {role}.", "user": "{query}"}, role="helpful", query=PARENT["q"])
Source code in operonx/providers/ops/prompt.py
@shorthand
def of(cls, template=None, **kwargs) -> "PromptOp":
    """Create a PromptOp with flat kwargs.

    Example::

        p = PromptOp.of(template={"system": "You are {role}.", "user": "{query}"}, role="helpful", query=PARENT["q"])
    """
    input_mappings, init_kwargs = split_shorthand_kwargs(kwargs)
    if template is not None:
        input_mappings["template"] = template
    return cls(inputs=input_mappings or None, **init_kwargs)

High-level helpers

chat() glues a PromptOp and an LLMOp into a single call (most LLM-using examples rely on it). ask() is the structured-output variant, pairing the same prompt → LLM chain with a parser.

chat

chat(
    template: Any = None,
    resource: Optional[Union[str, List[str]]] = None,
    ratios: Optional[List[float]] = None,
    fallback: Optional[List[str]] = None,
    response_format: Optional[Dict[str, Any]] = None,
    delay: float = 0,
) -> Any

Prompt → LLM graph for text generation.

Returns raw LLM output: content, role, model_used, usage, extras, etc.

Example::

c = chat(
    resource="claude-haiku",
    template={"system": "You are helpful.", "user": "{query}"},
    query=PARENT["query"],
)
Source code in operonx/providers/ops/chain.py
@graph
def chat(
    template: Any = None,
    resource: Optional[Union[str, List[str]]] = None,
    ratios: Optional[List[float]] = None,
    fallback: Optional[List[str]] = None,
    response_format: Optional[Dict[str, Any]] = None,
    delay: float = 0,
) -> Any:
    """Prompt → LLM graph for text generation.

    Returns raw LLM output: content, role, model_used, usage, extras, etc.

    Example::

        c = chat(
            resource="claude-haiku",
            template={"system": "You are helpful.", "user": "{query}"},
            query=PARENT["query"],
        )
    """
    _prompt = PromptOp(name="prompt", inputs={"template": template, "*": PARENT})

    _llm = LLMOp(
        name="llm",
        resource=resource,
        ratios=ratios,
        fallback=fallback,
        inputs={"messages": _prompt["messages"], "response_format": response_format},
        outputs={"*": PARENT},
        delay=delay,
    )

    START >> _prompt >> _llm >> END

ask

ask(
    error: str = None,
    *,
    template: Any = None,
    resource: Optional[Union[str, List[str]]] = None,
    ratios: Optional[List[float]] = None,
    fallback: Optional[List[str]] = None,
    fields: Optional[List[str]] = None,
    parser: str = "xml",
    delay: float = 0,
    response_format: Optional[Dict[str, Any]] = None,
    validators: Optional[Dict[str, list]] = None,
    until: str = None,
    max_iterations: int = 2,
) -> Any

Prompt → LLM → Parser graph for structured extraction.

Simple mode (no retry)::

a = ask(
    resource="claude-haiku",
    template="Classify: {speech}",
    fields=["result: str"],
    parser="xml",
    validators={"result": ["CONFIRM", "DENY", "@FALLBACK"]},
    speech=PARENT["speech"],
)

Retry mode (pass until=)::

a = ask(
    resource="claude-haiku",
    template="Classify: {speech}",
    fields=["result: str"],
    parser="xml",
    validators={"result": ["CONFIRM", "DENY", "@FALLBACK"]},
    until="error == None",
    max_iterations=3,
    error="init",
    speech=PARENT["speech"],
)
Source code in operonx/providers/ops/chain.py
@graph
def ask(
    # Loop state (before *) — only these become loop state variables
    error: str = None,
    *,
    # Config (keyword-only) — passed directly, NOT loop state
    template: Any = None,
    resource: Optional[Union[str, List[str]]] = None,
    ratios: Optional[List[float]] = None,
    fallback: Optional[List[str]] = None,
    fields: Optional[List[str]] = None,
    parser: str = "xml",
    delay: float = 0,
    response_format: Optional[Dict[str, Any]] = None,
    validators: Optional[Dict[str, list]] = None,
    until: str = None,
    max_iterations: int = 2,
) -> Any:
    """Prompt → LLM → Parser graph for structured extraction.

    Simple mode (no retry)::

        a = ask(
            resource="claude-haiku",
            template="Classify: {speech}",
            fields=["result: str"],
            parser="xml",
            validators={"result": ["CONFIRM", "DENY", "@FALLBACK"]},
            speech=PARENT["speech"],
        )

    Retry mode (pass ``until=``)::

        a = ask(
            resource="claude-haiku",
            template="Classify: {speech}",
            fields=["result: str"],
            parser="xml",
            validators={"result": ["CONFIRM", "DENY", "@FALLBACK"]},
            until="error == None",
            max_iterations=3,
            error="init",
            speech=PARENT["speech"],
        )
    """
    if not fields:
        raise TypeError("fields is required for ask()")

    _prompt = PromptOp(name="prompt", inputs={"template": template, "*": PARENT})

    _llm = LLMOp(
        name="llm",
        resource=resource,
        ratios=ratios,
        fallback=fallback,
        inputs={"messages": _prompt["messages"], "response_format": response_format},
        delay=delay,
    )

    _parser = ParserOp(
        name="parser",
        format=parser,
        extract=fields,
        inputs={"text": _llm["content"], "validators": validators},
        outputs={"*": PARENT},
    )

    # In loop mode, feed error back to PARENT for until check
    if error is not None:
        _parser["error"] >> PARENT["error"]

    START >> _prompt >> _llm >> _parser >> END

Resource resolution

Backend selection happens by name, not by direct construction. Declare each backend in resources.yaml:

llm:gpt-4o-mini:
  api_type: openai
  api_key: ${OPENAI_API_KEY}
  base_url: https://api.openai.com/v1
  model: gpt-4o-mini

embedding:openai:
  api_type: openai
  api_key: ${OPENAI_API_KEY}
  base_url: https://api.openai.com/v1
  model: text-embedding-3-small
  dimensions: 1536

Then reference each resource by its unprefixed key in your op definitions:

llm = LLMOp.of(resource="gpt-4o-mini", messages=PARENT["msgs"])
embed = EmbeddingOp.of(resource="openai", texts=PARENT["docs"])
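
The same keys work anywhere a resource= is accepted, e.g. the chat() helper:

c = chat(
    resource="gpt-4o-mini",
    template={"system": "You are helpful.", "user": "{query}"},
    query=PARENT["query"],
)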

Full reference — including the five disambiguated failure branches when a key is missing or unset — is in Resource hub.

Config classes

The Pydantic models behind resources.yaml. You rarely construct these directly; the framework loads them from YAML. Listed here for reference.

  • LLM — LLMConfig, OpenAIConfig, AzureConfig, GeminiConfig, AnthropicConfig, LLMType (in operonx.providers.llms).
  • Embedding — EmbeddingConfig, EmbeddingType (in operonx.providers.embeddings).
  • Reranker — RerankingConfig, RerankingType (in operonx.providers.rerankers).
  • Auth — KeycloakTokenConfig (in operonx.providers.auth).

Factory functions

Resolve a config to a backend instance. Used internally by ResourceHub — most users don't call these directly.
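
For completeness, a direct-use sketch. The OpenAIConfig field names are assumed to mirror the resources.yaml keys shown earlier; the import paths follow the source listings below:

from operonx.providers.llms import OpenAIConfig
from operonx.providers.llms.factory import create_llm

config = OpenAIConfig(          # field names assumed from the YAML example
    api_type="openai",
    api_key="sk-...",
    base_url="https://api.openai.com/v1",
    model="gpt-4o-mini",
)
backend = create_llm(config)    # -> OpenAISDKModel for api_type "openai"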

create_llm

create_llm(config: LLMConfig) -> BaseLLM

Create an LLM backend from config.

Parameters:

  • config (LLMConfig): LLMConfig with api_type determining which backend to create. Required.

Returns:

  • BaseLLM: the created backend instance.

Raises:

  • ValueError: if api_type is unsupported.
  • ImportError: with a helpful pointer to the right operonx[<extra>] install when an optional dependency is missing.

Source code in operonx/providers/llms/factory.py
def create_llm(config: LLMConfig) -> BaseLLM:
    """Create an LLM backend from config.

    Args:
        config: LLMConfig with api_type determining which backend to create.

    Returns:
        BaseLLM instance.

    Raises:
        ValueError: If api_type is unsupported.
        ImportError: With a helpful pointer to the right ``operonx[<extra>]``
            install when an optional dependency is missing.
    """
    if config.api_type in [LLMType.VLLM, LLMType.OPENAI]:
        try:
            from .openai import OpenAISDKModel
        except ImportError as e:
            raise ImportError(_missing_extra_message("OpenAISDKModel", "providers", e)) from e
        return OpenAISDKModel(config=config)
    if config.api_type == LLMType.AZURE:
        try:
            from .azure import AzureSDKModel
        except ImportError as e:
            raise ImportError(_missing_extra_message("AzureSDKModel", "providers", e)) from e
        return AzureSDKModel(config=config)
    if config.api_type == LLMType.GEMINI:
        try:
            from .gemini import GeminiOpenAISDKModel
        except ImportError as e:
            raise ImportError(_missing_extra_message("Gemini", "gemini", e)) from e
        return GeminiOpenAISDKModel(config=config)
    if config.api_type == LLMType.ANTHROPIC:
        try:
            from .anthropic import AnthropicModel
        except ImportError as e:
            raise ImportError(_missing_extra_message("AnthropicModel", "anthropic", e)) from e
        return AnthropicModel(config=config)
    raise ValueError(f"Unsupported Model: {config.api_type}")

create_embedding

create_embedding(config: EmbeddingConfig) -> BaseEmbedder

Create an embedding backend from config.

Parameters:

  • config (EmbeddingConfig): EmbeddingConfig with api_type determining which backend to create. Required.

Returns:

  • BaseEmbedder: the created backend instance.

Raises:

  • ValueError: if api_type is unsupported.
  • ImportError: with a helpful pointer to the right operonx[<extra>] install when an optional dependency is missing.

Source code in operonx/providers/embeddings/factory.py
def create_embedding(config: EmbeddingConfig) -> BaseEmbedder:
    """Create an embedding backend from config.

    Args:
        config: EmbeddingConfig with api_type determining which backend to create.

    Returns:
        BaseEmbedder instance.

    Raises:
        ValueError: If api_type is unsupported.
        ImportError: With a helpful pointer to the right ``operonx[<extra>]``
            install when an optional dependency is missing.
    """
    if config.api_type == EmbeddingType.TEXT_EMBEDDING_INFERENCE:
        try:
            from operonx.providers.embeddings.tei import TEIEmbedding
        except ImportError as e:
            raise ImportError(_missing_extra_message("TEIEmbedding", "providers", e)) from e
        return TEIEmbedding(config)
    if config.api_type in (EmbeddingType.VLLM, EmbeddingType.OPENAI, EmbeddingType.AZURE):
        try:
            from operonx.providers.embeddings.vllm import VLLMEmbedding
        except ImportError as e:
            raise ImportError(_missing_extra_message("VLLMEmbedding", "providers", e)) from e
        return VLLMEmbedding(config)
    if config.api_type == EmbeddingType.HF:
        try:
            from operonx.providers.embeddings.huggingface import HFEmbedding
        except ImportError as e:
            raise ImportError(_missing_extra_message("HFEmbedding", "huggingface", e)) from e
        return HFEmbedding(config)
    if config.api_type == EmbeddingType.ONNX:
        try:
            from operonx.providers.embeddings.onnx import ONNXEmbedding
        except ImportError as e:
            raise ImportError(_missing_extra_message("ONNXEmbedding", "onnx", e)) from e
        return ONNXEmbedding(config)
    raise ValueError(f"Unsupported Model: {config.api_type}")

create_reranking

create_reranking(config: RerankingConfig) -> BaseReranker

Create a reranking backend from config.

Parameters:

  • config (RerankingConfig): RerankingConfig with api_type determining which backend to create. Required.

Returns:

  • BaseReranker: the created backend instance.

Raises:

  • ValueError: if api_type is unsupported.
  • ImportError: with a helpful pointer to the right operonx[<extra>] install when an optional dependency is missing.

Source code in operonx/providers/rerankers/factory.py
def create_reranking(config: RerankingConfig) -> BaseReranker:
    """Create a reranking backend from config.

    Args:
        config: RerankingConfig with api_type determining which backend to create.

    Returns:
        BaseReranker instance.

    Raises:
        ValueError: If api_type is unsupported.
        ImportError: With a helpful pointer to the right ``operonx[<extra>]``
            install when an optional dependency is missing.
    """
    if config.api_type == RerankingType.TEXT_EMBEDDING_INFERENCE:
        try:
            from operonx.providers.rerankers.tei import TEIReranker
        except ImportError as e:
            raise ImportError(_missing_extra_message("TEIReranker", "providers", e)) from e
        return TEIReranker(config)
    if config.api_type == RerankingType.VLLM:
        try:
            from operonx.providers.rerankers.vllm import VLLMReranker
        except ImportError as e:
            raise ImportError(_missing_extra_message("VLLMReranker", "providers", e)) from e
        return VLLMReranker(config)
    if config.api_type == RerankingType.PINECONE:
        try:
            from operonx.providers.rerankers.pinecone import PineconeReranker
        except ImportError as e:
            raise ImportError(_missing_extra_message("PineconeReranker", "providers", e)) from e
        return PineconeReranker(config)
    if config.api_type == RerankingType.HF:
        try:
            from operonx.providers.rerankers.huggingface import HFReranker
        except ImportError as e:
            raise ImportError(_missing_extra_message("HFReranker", "huggingface", e)) from e
        return HFReranker(config)
    if config.api_type == RerankingType.ONNX:
        try:
            from operonx.providers.rerankers.onnx import ONNXReranker
        except ImportError as e:
            raise ImportError(_missing_extra_message("ONNXReranker", "onnx", e)) from e
        return ONNXReranker(config)
    raise ValueError(f"Unsupported Model: {config.api_type}")

create_auth

create_auth(config: KeycloakTokenConfig) -> KeycloakTokenProvider

Create a KeycloakTokenProvider from config.

Parameters:

  • config (KeycloakTokenConfig): KeycloakTokenConfig instance. Required.

Returns:

  • KeycloakTokenProvider: the token provider instance.

Raises:

  • ImportError: with a pointer to the right operonx[<extra>] install if httpx (or another keycloak dep) is missing.

Source code in operonx/providers/auth/factory.py
def create_auth(config: KeycloakTokenConfig) -> "KeycloakTokenProvider":  # noqa: F821
    """Create a KeycloakTokenProvider from config.

    Args:
        config: KeycloakTokenConfig instance.

    Returns:
        KeycloakTokenProvider instance.

    Raises:
        ImportError: with a pointer to the right `operonx[<extra>]`
            install if `httpx` (or another keycloak dep) is missing.
    """
    try:
        from .keycloak import KeycloakTokenProvider
    except ImportError as e:
        raise ImportError(
            "KeycloakTokenProvider requires additional packages.\n"
            "  Install with: pip install operonx[providers]\n"
            f"  Original error: {e}"
        ) from e
    return KeycloakTokenProvider(config)