How a run works

A run is one turn: you send a message, the agent works on it, and you get a reply. Understanding what happens in between explains why changes take effect on the next message, why some models handle longer conversations better, and how the little 👍 / 👎 / ✏️ bar under each reply actually teaches your agent.

The run lifecycle

Every message you send starts a fresh run. Nothing is pre-baked — the agent is assembled from scratch each time, so whatever you've changed in its Settings is already in effect.

Stage	What happens
1. Assemble	The agent is built for this run from your instructions, the tools you've toggled on, the knowledge & connectors you've attached, and the model you picked.
2. Work the request	The model can work through the task in several steps (searching, reading a page, running a tool, then answering), all in one turn. You see short live "what it's doing now" labels stream by as it goes.
3. Reply	The answer streams back to you.
4. Housekeeping	After answering, the platform quietly records the run's cost and — for most agents — reflects on the conversation and saves anything worth remembering. See Memory.

Changes take effect on the next run

Because the agent is rebuilt for every message, editing its instructions, tools, knowledge, connectors, or model applies on the next reply — you don't need to restart the conversation. Make the change in Settings, then send your next message.
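
As a rough illustration of why edits apply on the next message, here is a minimal sketch in Python. It is not MyChatBot's actual code: `Settings`, `AssembledAgent`, `assemble`, and `run` are hypothetical names, and the reply is a placeholder string rather than a real model call. The point is only that the agent is built from the current settings at the start of every run.

```python
from dataclasses import dataclass, field


@dataclass
class Settings:
    """Hypothetical stand-in for an agent's Settings page."""
    instructions: str
    tools: list[str] = field(default_factory=list)
    model: str = "example-model"  # placeholder model name


@dataclass(frozen=True)
class AssembledAgent:
    """Snapshot of the settings, taken when a run starts."""
    instructions: str
    tools: tuple[str, ...]
    model: str


def assemble(settings: Settings) -> AssembledAgent:
    # Stage 1: build the agent from whatever the settings say right now.
    return AssembledAgent(settings.instructions, tuple(settings.tools), settings.model)


def run(settings: Settings, message: str) -> str:
    agent = assemble(settings)  # rebuilt for every message, nothing is reused
    # Stages 2 and 3 are reduced to a placeholder reply in this sketch.
    return f"[{agent.model}] ({agent.instructions}) reply to: {message}"


settings = Settings(instructions="Be brief.")
first = run(settings, "hello")

settings.instructions = "Quote prices in euros."  # edit made between messages
second = run(settings, "how much?")               # the edit is already in effect
```

Because `run` calls `assemble` every time, the second reply is produced under the edited instructions without restarting anything.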

One reply can be many steps

A single reply isn't one model call — the agent can take several steps (search, read, run a tool, then write) before it answers. Those streamed labels are the steps happening in real time, not separate replies.
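
The idea of one reply containing several steps can be sketched as a loop that emits a label per step and a single answer at the end. This is an assumption-laden toy: the `search` and `read_page` tools, the fixed `plan`, and the label text are all hypothetical, and in a real agent the model chooses each next step instead of following a list.

```python
from typing import Iterator


def search(query: str) -> str:
    # Hypothetical tool: pretend to search.
    return f"results for {query!r}"


def read_page(url: str) -> str:
    # Hypothetical tool: pretend to fetch a page.
    return f"text of {url}"


TOOLS = {"search": search, "read_page": read_page}


def work_request(plan: list[tuple[str, str]]) -> Iterator[str]:
    """Yield one live label per step, then exactly one final reply."""
    notes = []
    for tool_name, argument in plan:
        yield f"label: running {tool_name}"        # streamed while the step runs
        notes.append(TOOLS[tool_name](argument))   # the step itself
    yield f"reply: answer based on {len(notes)} tool results"


events = list(work_request([
    ("search", "pricing"),
    ("read_page", "https://example.com/pricing"),
]))
```

Here `events` holds two labels followed by one reply, which mirrors what you see in the chat: labels streaming by, then a single answer.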

Context window & history

Each model can only "see" so much at once. That limit is the model's context window — a fixed property of the model you choose, not a setting you can turn up.

Behavior	What it means for you
Context window	How much the model can hold at once. A property of the model — pick one with a big enough window for your task.
Recent history replayed	Roughly the last 8 turns of the conversation are fed back as context. Older turns aren't replayed word-for-word.
Long outputs auto-summarized	When several long tool outputs pile up in a turn, they're automatically summarized to stay within the window.
Overflow → compress & retry	If a turn still runs over the window, the platform compresses the conversation and retries rather than failing the reply.
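
The replay and compress-and-retry rows above can be sketched as follows. This is a simplified model under stated assumptions, not the platform's implementation: tokens are approximated by counting words, "compression" is plain truncation where a real system would summarize, and `REPLAYED_TURNS`, `recent_history`, `compress`, and `build_context` are hypothetical names.

```python
REPLAYED_TURNS = 8  # the text says "roughly" 8; the exact cutoff is an assumption


def count_tokens(text: str) -> int:
    # Crude stand-in for a tokenizer: one token per whitespace-separated word.
    return len(text.split())


def recent_history(turns: list[str], limit: int = REPLAYED_TURNS) -> list[str]:
    # Only the most recent turns are replayed; older ones are dropped.
    return turns[-limit:]


def compress(turns: list[str], keep_words: int) -> list[str]:
    # Placeholder compression: keep the first few words of each turn.
    return [" ".join(turn.split()[:keep_words]) for turn in turns]


def build_context(turns: list[str], window: int) -> list[str]:
    context = recent_history(turns)
    keep = max((count_tokens(t) for t in context), default=0)
    # Overflow: compress harder and retry until the context fits.
    # The loop stops shrinking once turns are down to one word each.
    while sum(count_tokens(t) for t in context) > window and keep > 1:
        keep //= 2
        context = compress(context, keep)
    return context


turns = [f"turn {i} " + "word " * 20 for i in range(12)]
context = build_context(turns, window=100)
```

With twelve turns and a window of 100 word-tokens, only turns 4 through 11 are considered, and they are shortened until they fit together.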

Pick the right window, then keep it lean

For long or document-heavy work, choose a model with a large context window (see Models & reasoning). Rely on Memory and the agent's Instructions for facts that must persist — not on the raw chat scrollback, since only the recent turns are replayed. Keeping instructions and knowledge lean leaves more of the window for the actual work.

Don't count on old turns being remembered verbatim

Only the recent turns come back as context. If something from earlier in a long thread must stay in play, restate it, put it in the agent's Instructions, or teach it with a correction (below) — don't assume the agent still "sees" a message from 20 turns ago.

Caching

On Claude models, MyChatBot reuses the stable parts of each request so repeat turns are cheaper and faster. This is fully automatic — there's nothing to switch on.

What's cached	Why it helps
Your instructions + the tool list	These barely change turn to turn, so they're cached. Repeat turns reuse the cached copy instead of re-processing the same text, which is cheaper and faster because cached input costs only a fraction of fresh input.
The agent's browser session	Not caching, but related reuse: the browser stays logged in across turns in a conversation, so it doesn't have to sign in again on every message.

Frequent tiny edits reduce cache hits

The cache works because the stable part of the prompt stays stable between turns. Making lots of small instruction tweaks mid-conversation resets that stable part, so the next turn can't reuse the cache. Batch your instruction changes rather than nudging them one word at a time.
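
To see why even a one-word tweak causes a miss, consider a toy cache keyed on the stable part of the prompt. This is a deliberate simplification and an assumption about the general idea, not a description of how MyChatBot or any model provider implements caching; `prefix_key` and `PrefixCache` are hypothetical names.

```python
import hashlib
import json


def prefix_key(instructions: str, tools: list[str]) -> str:
    # The "stable part" of the request: instructions plus the tool list.
    stable = json.dumps({"instructions": instructions, "tools": tools})
    return hashlib.sha256(stable.encode("utf-8")).hexdigest()


class PrefixCache:
    """Toy cache that only remembers which stable prefixes it has seen."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def lookup(self, instructions: str, tools: list[str]) -> bool:
        key = prefix_key(instructions, tools)
        hit = key in self._seen
        self._seen.add(key)  # the next identical turn can reuse this entry
        return hit


cache = PrefixCache()
results = [
    cache.lookup("Be brief.", ["search"]),            # first turn: nothing cached yet
    cache.lookup("Be brief.", ["search"]),            # same prefix: reused
    cache.lookup("Be brief and polite.", ["search"]), # small edit: new prefix
]
```

Any change to the instructions produces a different key, so the turn right after an edit starts cold. Batching edits means paying that cost once instead of after every tweak.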

Claude models cache; others don't

Prompt caching applies to Claude models. Other models still run fine, but they don't get this discount — one reason a chatty, high-volume agent can be cheaper on a Claude model.

Rate & correct replies

Under each finished reply is a small feedback bar. It isn't just analytics — it's a real loop that shapes what your agent does next.

Control	What it does
👍 Thumbs up	A positive signal on that reply.
👎 Thumbs down	A negative signal — and it also makes the agent forget what it auto-learned from that exchange, so a bad turn doesn't poison its memory.
✏️ Correction	Opens a box: "what should the agent have said or done?" What you write becomes a durable instruction the agent remembers and applies on future runs. This is the main way to teach it.

Corrections are how you teach the agent

A correction isn't a one-off fix — it sticks. Write it as the rule you want followed going forward ("always quote prices in euros", "never promise same-day shipping") and the agent carries it into later runs. See Memory for how durable facts are kept.

Thumbs are one per reply; corrections are latest-wins

👍 and 👎 are a single choice per reply — clicking the other flips it. You can leave a correction as many times as you like; the most recent one wins.

Use 👎 to undo a bad lesson

If the agent picked up the wrong habit from a specific exchange, thumbs-down that reply — it clears what the agent auto-learned there. Pair it with a ✏️ correction to replace the bad lesson with the behavior you actually want.
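
The feedback rules described in this section can be modeled in a few lines. This is a sketch under assumptions, not the platform's data model: `ReplyFeedback`, `AgentMemory`, and the reply id `"r1"` are hypothetical, and the auto-learned fact is invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ReplyFeedback:
    thumb: Optional[str] = None       # "up", "down", or None: one choice per reply
    correction: Optional[str] = None  # only the latest correction is kept


@dataclass
class AgentMemory:
    # Facts the agent auto-learned, grouped by the reply they came from.
    auto_learned: dict[str, list[str]] = field(default_factory=dict)
    feedback: dict[str, ReplyFeedback] = field(default_factory=dict)

    def _entry(self, reply_id: str) -> ReplyFeedback:
        return self.feedback.setdefault(reply_id, ReplyFeedback())

    def thumb(self, reply_id: str, value: str) -> None:
        if value not in ("up", "down"):
            raise ValueError("thumb must be 'up' or 'down'")
        self._entry(reply_id).thumb = value       # clicking the other one flips it
        if value == "down":
            self.auto_learned.pop(reply_id, None)  # forget what that exchange taught

    def correct(self, reply_id: str, text: str) -> None:
        self._entry(reply_id).correction = text    # latest wins

    def durable_instructions(self) -> list[str]:
        return [fb.correction for fb in self.feedback.values() if fb.correction]


memory = AgentMemory(auto_learned={"r1": ["customer expects same-day shipping"]})
memory.thumb("r1", "up")
memory.thumb("r1", "down")  # flips the thumb and clears the auto-learned fact
memory.correct("r1", "always quote prices in dollars")
memory.correct("r1", "always quote prices in euros")  # replaces the earlier one
```

After these calls the reply has a single thumbs-down, the bad lesson is gone, and only the euro rule remains as a durable instruction.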
