LLM chat¶
This guide adds a real LLM call to the workflow. You will need:

- An OPENAI_API_KEY in .env.
- A resources.yaml listing the model.
Configure resources¶
resources.yaml:
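A minimal sketch of what this file might contain, assuming each entry names a model and the provider that serves it (the resources/provider/model keys here are assumptions, not the documented schema):

resources:
  gpt-4o:
    provider: openai
    model: gpt-4o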
.env:
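Only the credential that bootstrap loads is required; the value below is a placeholder:

OPENAI_API_KEY=sk-...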
Use chat for a one-shot prompt¶
chat is a shorthand that combines a prompt template with an LLM call.
Use ask when you want a single-string answer instead of a structured
chat response.
import asyncio

import operonx
from operonx.core import Operon, GraphOp, START, END, PARENT
from operonx.providers import chat


async def main():
    operonx.bootstrap()  # loads .env + resources.yaml

    with GraphOp(name="chat") as graph:
        c = chat(
            resource="gpt-4o",
            template={
                "system": "You are a concise assistant.",
                "user": "{question}",
            },
            question=PARENT["question"],
        )
        START >> c >> END

    engine = Operon(graph)
    result = await engine.run(inputs={"question": "What is Python?"})
    print(result["content"])


asyncio.run(main())
chat always uses keyword arguments — never positional. The output
key is content by default. Map it explicitly with outputs={...} if
you want a different name.
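For instance, a sketch of renaming the output; the assumption here is that outputs maps the op's default key to the name downstream ops and the final result should see:

c = chat(
    resource="gpt-4o",
    template={
        "system": "You are a concise assistant.",
        "user": "{question}",
    },
    question=PARENT["question"],
    outputs={"content": "answer"},  # assumed direction: default key -> new name
)

With that mapping, the final result would be read as result["answer"] instead of result["content"].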
Use LLMOp.of for structured calls¶
When you already have a list of messages (e.g. a multi-turn conversation),
use LLMOp.of directly:
from operonx.providers import LLMOp
llm = LLMOp.of(resource="gpt-4o", messages=PARENT["messages"])
START >> llm >> END
messages is a list of {"role": "system" | "user" | "assistant", "content": ...}
dicts.
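As a fuller sketch, here is the same pattern run end to end, reusing the bootstrap and engine setup from the chat example; it assumes LLMOp shares chat's default content output key:

import asyncio

import operonx
from operonx.core import Operon, GraphOp, START, END, PARENT
from operonx.providers import LLMOp


async def main():
    operonx.bootstrap()  # loads .env + resources.yaml

    with GraphOp(name="multi-turn") as graph:
        # The op reads the full conversation from the graph inputs.
        llm = LLMOp.of(resource="gpt-4o", messages=PARENT["messages"])
        START >> llm >> END

    engine = Operon(graph)
    result = await engine.run(
        inputs={
            "messages": [
                {"role": "system", "content": "You are a concise assistant."},
                {"role": "user", "content": "What is Python?"},
                {"role": "assistant", "content": "A general-purpose programming language."},
                {"role": "user", "content": "Who created it?"},
            ]
        }
    )
    print(result["content"])


asyncio.run(main())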
Streaming a response¶
LLMOp and chat both support streaming. The op yields one frame per
token chunk; downstream ops consume them as they arrive.
See Streaming for the consumption side.
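As a rough sketch of what the consuming side can look like (the stream() method name and the per-frame content field are assumptions here, not the documented API):

# Hypothetical: iterate frames as the op emits them.
async for frame in engine.stream(inputs={"question": "What is Python?"}):
    print(frame["content"], end="", flush=True)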