# Deployment
Operonx ships two HTTP servers: a Python FastAPI server (`operonx[serve]`)
and a Rust Axum server (the `operonx-serve` binary). Both expose the same
endpoints over the same JSON contract; pick whichever matches your
operational profile.
## Python: `operonx[serve]`
```python
from operonx.serve import build_app
from operonx.core import Operon

# my_graph is a graph you have defined elsewhere; the factory defers
# engine construction until the server starts.
app = build_app(engine_factory=lambda: Operon(my_graph))
```
Run with uvicorn (the module path below assumes the snippet above lives in `main.py`):
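```bash
uvicorn main:app --host 0.0.0.0 --port 8000
```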
Endpoints:
- `POST /run` — synchronous run; returns the final state.
- `POST /stream` — server-sent events stream of frames.
- `GET /healthz` — readiness probe.
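A minimal client sketch for the two run endpoints, assuming the server from the snippet above on port 8000; the request body shape (an `"input"` key) is an assumption, so match it to your graph's input schema:

```python
import httpx

BASE = "http://localhost:8000"

# Synchronous run: blocks until the graph finishes, returns the final state.
final = httpx.post(f"{BASE}/run", json={"input": "hello"}, timeout=60).json()
print(final)

# Streaming run: iterate over server-sent event frames as they arrive.
with httpx.stream("POST", f"{BASE}/stream", json={"input": "hello"}, timeout=None) as resp:
    for line in resp.iter_lines():
        if line.startswith("data:"):
            print(line[len("data:"):].strip())
```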
## Rust: `operonx-serve` binary
The Rust server is a single static binary, which makes it useful for edge deployment and for containers without a Python runtime. It serves the same endpoints with the same JSON contract, and graphs are portable between the two servers via the shared schema.
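A minimal container sketch, assuming the binary is statically linked for the target platform and reads `resources.yaml` from its working directory; the base image and the `--max-concurrent` value are illustrative (the flag itself appears in the production checklist below):

```dockerfile
# No shell and no Python runtime: just the static binary and its config.
FROM gcr.io/distroless/static-debian12
WORKDIR /app
COPY operonx-serve /app/operonx-serve
COPY resources.yaml /app/resources.yaml
ENTRYPOINT ["/app/operonx-serve", "--max-concurrent", "64"]
```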
## Configuration
Both servers honour the standard Operonx setup:

- `.env` for credentials.
- `resources.yaml` for model and tracer configs.
- `bootstrap()` (Python) / the equivalent Rust call at startup.
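Tying those together, a startup-wiring sketch for the Python server; the top-level import path for `bootstrap` is an assumption, so adjust it to wherever your install exposes the call:

```python
from operonx import bootstrap  # import path is an assumption
from operonx.core import Operon
from operonx.serve import build_app

# Load .env credentials and resources.yaml before any engine is built.
bootstrap()

app = build_app(engine_factory=lambda: Operon(my_graph))
```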
For Kubernetes / containerised deployments, mount `resources.yaml` and
provide credentials through the platform's secret store rather than a
file-based `.env`.
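For example, a Deployment fragment along these lines; every name, path, and port here is illustrative, and only the `/healthz` readiness endpoint comes from the documented contract:

```yaml
containers:
  - name: operonx
    image: registry.example.com/operonx-serve:1.0.0  # pinned, not latest
    envFrom:
      - secretRef:
          name: operonx-credentials    # platform secret store, not a .env file
    volumeMounts:
      - name: resources
        mountPath: /app/resources.yaml
        subPath: resources.yaml
    readinessProbe:
      httpGet:
        path: /healthz
        port: 8000                     # port is an assumption
volumes:
  - name: resources
    configMap:
      name: operonx-resources
```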
## Production checklist
- Set `OPERON_TRACES_DIR` to a persistent volume (or skip the local tracer and use Langfuse / OTEL).
- Cap concurrent requests via uvicorn's `--limit-concurrency` (Python) or the Rust server's `--max-concurrent` flag.
- Wire health checks: `/healthz` returns 200 once the engine is built and the resource hub is loaded.
- Pin model versions in `resources.yaml` — never reference `latest`.
- Watch the tracing backend for token-cost and latency drift.
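As a concrete instance of the first two items, a launch line for the Python server; the traces path and concurrency cap are illustrative:

```bash
OPERON_TRACES_DIR=/var/lib/operonx/traces \
  uvicorn main:app --host 0.0.0.0 --port 8000 --limit-concurrency 64
```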
## Where to go next
- Architecture overview — internals.
- Rust and Python — backend choice.