How a run works

A run is one turn: you send a message, the agent works on it, and you get a reply. Understanding what happens in between explains why changes take effect on the next message, why some models handle longer conversations better, and how the little 👍 / 👎 / ✏️ bar under each reply actually teaches your agent.

The run lifecycle

Every message you send starts a fresh run. Nothing is pre-baked: the agent is assembled from scratch each time, so whatever you've changed in its Settings is already in effect.

Stage | What happens
1. Assemble | The agent is built for this run from your instructions, the tools you've toggled on, the knowledge & connectors you've attached, and the model you picked.
2. Work the request | The model works through the task in several steps (searching, reading a page, running a tool, then answering), all in one turn. You see short live "what it's doing now" labels stream by as it goes.
3. Reply | The answer streams back to you.
4. Housekeeping | After answering, the platform quietly records the run's cost and, for most agents, reflects on the conversation and saves anything worth remembering. See Memory.

Changes take effect on the next run

Because the agent is rebuilt for every message, editing its instructions, tools, knowledge, connectors, or model applies on the next reply. You don't need to restart the conversation. Make the change in Settings, then send your next message.
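The rebuild-per-message behavior can be pictured with a minimal sketch. Everything here is hypothetical (the `Settings` and `AssembledAgent` classes, the `assemble` and `run` functions, and the stubbed reply); it only illustrates why an edit made between two messages shows up in the second reply, and is not the platform's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Settings:
    """Hypothetical stand-in for what you edit in the agent's Settings."""
    instructions: str
    tools: list = field(default_factory=list)
    model: str = "example-model"  # placeholder name, not a real model


@dataclass(frozen=True)
class AssembledAgent:
    """Snapshot built at the start of one run and never reused by a later run."""
    instructions: str
    tools: tuple
    model: str


def assemble(settings: Settings) -> AssembledAgent:
    # Stage 1: read the current settings every time a message arrives.
    return AssembledAgent(settings.instructions, tuple(settings.tools), settings.model)


def run(settings: Settings, message: str) -> str:
    agent = assemble(settings)
    # Stages 2 and 3 are stubbed: a real run would call the model and tools here.
    return f"[{agent.model} | {agent.instructions}] reply to: {message}"


settings = Settings(instructions="Be brief")
first = run(settings, "hello")

settings.instructions = "Answer in French"  # edit made between messages
second = run(settings, "hello again")       # picks up the edit without a restart
```

Because `assemble` is called inside `run`, there is no long-lived agent object that could hold stale settings.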

One reply can be many steps

A single reply isn't one model call: the agent can take several steps (search, read, run a tool, then write) before it answers. Those streamed labels are the steps happening in real time, not separate replies.
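A rough sketch of one reply made of several steps is shown below. The tool names, the fixed `plan`, and the event format are all assumptions made for illustration; in a real run the model decides which steps to take.

```python
from typing import Callable, Iterator

# Hypothetical tools: each takes the request and returns some text.
TOOLS: dict[str, Callable[[str], str]] = {
    "search": lambda request: f"results for {request!r}",
    "read_page": lambda request: f"page text about {request!r}",
}


def work_request(request: str, plan: list[str]) -> Iterator[tuple[str, str]]:
    """Yield (kind, text) events: a 'label' per step, then exactly one 'reply'."""
    notes = []
    for tool_name in plan:
        yield ("label", f"Running {tool_name}...")   # the live "what it's doing now" label
        notes.append(TOOLS[tool_name](request))
    yield ("label", "Writing answer...")
    yield ("reply", f"Answer based on {len(notes)} step(s)")


events = list(work_request("pricing", ["search", "read_page"]))
labels = [text for kind, text in events if kind == "label"]
replies = [text for kind, text in events if kind == "reply"]
```

However many labels stream by, the turn still ends with a single reply.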

Context window & history

Each model can only "see" so much at once. That limit is the model's context window, a fixed property of the model you choose, not a setting you can turn up.

Behavior | What it means for you
Context window | How much the model can hold at once. It is a property of the model, so pick one with a big enough window for your task.
Recent history replayed | Roughly the last 8 turns of the conversation are fed back as context. Older turns aren't replayed word-for-word.
Long outputs auto-summarized | When several long tool outputs pile up in a turn, they're automatically summarized to stay within the window.
Overflow → compress & retry | If a turn still runs over the window, the platform compresses the conversation and retries rather than failing the reply.
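The behaviors in the table above can be sketched as a small context-building routine. This is a toy model under stated assumptions: the word-count tokenizer, the truncating `summarize` function, the "half the window" threshold, and the exact value of `HISTORY_TURNS` are all invented for illustration and are not how the platform measures or compresses anything.

```python
HISTORY_TURNS = 8  # "roughly the last 8 turns"; the exact value is an assumption


def count_tokens(text: str) -> int:
    # Crude stand-in for a tokenizer: one token per whitespace-separated word.
    return len(text.split())


def total_tokens(items: list[str]) -> int:
    return sum(count_tokens(item) for item in items)


def summarize(text: str, max_words: int = 20) -> str:
    # Stand-in for a model-written summary: keep only the first few words.
    words = text.split()
    if len(words) <= max_words:
        return text
    return " ".join(words[:max_words]) + " [summarized]"


def build_context(turns: list[str], tool_outputs: list[str], window: int) -> list[str]:
    recent = turns[-HISTORY_TURNS:]            # older turns are not replayed verbatim
    outputs = list(tool_outputs)
    if total_tokens(outputs) > window // 2:    # hypothetical "piled up" threshold
        outputs = [summarize(output) for output in outputs]
    context = recent + outputs
    if total_tokens(context) > window:
        # Overflow: compress the conversation and retry instead of failing.
        context = [summarize(" ".join(recent))] + outputs
    return context


short_turns = [f"turn {i}" for i in range(1, 21)]
replayed = build_context(short_turns, [], window=100)

long_output = " ".join(["data"] * 200)
with_tools = build_context(short_turns, [long_output, long_output], window=100)

wordy_turns = [" ".join(["word"] * 10) for _ in range(8)]
compressed = build_context(wordy_turns, [], window=60)
```

In the first call only turns 13 through 20 come back, which is why a detail from early in a long thread needs to live in Instructions or Memory rather than in the scrollback.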

Pick the right window, then keep it lean

For long or document-heavy work, choose a model with a large context window (see Models & reasoning). Rely on Memory and the agent's Instructions for facts that must persist, not on the raw chat scrollback, since only the recent turns are replayed. Keeping instructions and knowledge lean leaves more of the window for the actual work.

Don't count on old turns being remembered verbatim

Only the recent turns come back as context. If something from earlier in a long thread must stay in play, restate it, put it in the agent's Instructions, or teach it with a correction (below). Don't assume the agent still "sees" a message from 20 turns ago.

Caching

On Claude models, MyChatBot reuses the stable parts of each request so repeat turns are cheaper and faster. This is fully automatic; there's nothing to switch on.

What's cached | Why it helps
Your instructions + the tool list | These barely change turn to turn, so they're cached. Repeat turns reuse the cache instead of re-processing it, which is cheaper and faster because cached input costs only a fraction of fresh input.
The agent's browser session | Not caching, but related reuse: the browser stays logged in across turns in a conversation, so it doesn't have to sign in again on every message.

Frequent tiny edits reduce cache hits

The cache works because the stable part of the prompt stays stable between turns. Making lots of small instruction tweaks mid-conversation resets that stable part, so the next turn can't reuse the cache. Batch your instruction changes rather than nudging them one word at a time.
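A toy model makes the effect of small edits concrete. The sketch below fingerprints the stable part of a request (instructions plus tool list) and counts hits and misses. The `prefix_key` function and `PrefixCache` class are hypothetical, and hashing is only a stand-in for whatever matching the real cache does.

```python
import hashlib
import json


def prefix_key(instructions: str, tools: list[str]) -> str:
    """Fingerprint of the stable part of a request (instructions + tool list)."""
    payload = json.dumps({"instructions": instructions, "tools": tools}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class PrefixCache:
    """Hypothetical cache that only hits when the stable prefix is unchanged."""

    def __init__(self) -> None:
        self._seen: set[str] = set()
        self.hits = 0
        self.misses = 0

    def request(self, instructions: str, tools: list[str]) -> bool:
        key = prefix_key(instructions, tools)
        hit = key in self._seen
        if hit:
            self.hits += 1
        else:
            self.misses += 1
            self._seen.add(key)
        return hit


cache = PrefixCache()
tools = ["search", "browser"]
results = [
    cache.request("Be brief.", tools),  # first turn: nothing cached yet
    cache.request("Be brief.", tools),  # same prefix: reused
    cache.request("Be brief!", tools),  # one-character edit: new prefix, no reuse
]
```

Even a one-character change produces a different fingerprint, which is the reason to batch instruction edits.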

Claude models cache; others don't

Prompt caching applies to Claude models. Other models still run fine, but they don't get this discount, which is one reason a chatty, high-volume agent can be cheaper on a Claude model.
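To see why the discount matters for chatty agents, here is a back-of-the-envelope sketch. Every number in it is made up for illustration: the token counts, the price of 1.0 unit per 1,000 input tokens, and the cached-input fraction of 0.1 are assumptions, not real rates. It also ignores output tokens and any extra cost of writing to the cache.

```python
def turn_cost(stable_tokens: int, fresh_tokens: int, price_per_1k: float,
              cached_fraction: float, cache_hit: bool) -> float:
    """Input cost of one turn; the stable part is discounted on a cache hit."""
    stable_rate = cached_fraction if cache_hit else 1.0
    billable = stable_tokens * stable_rate + fresh_tokens
    return billable / 1000 * price_per_1k


def conversation_cost(turns: int, stable_tokens: int, fresh_tokens: int,
                      price_per_1k: float, cached_fraction: float,
                      caching: bool) -> float:
    total = 0.0
    for turn in range(turns):
        hit = caching and turn > 0  # the first turn always pays full price
        total += turn_cost(stable_tokens, fresh_tokens, price_per_1k,
                           cached_fraction, hit)
    return total


# Hypothetical conversation: 10 turns, 4,000 stable tokens, 500 fresh tokens per turn.
without_cache = conversation_cost(10, 4000, 500, 1.0, 0.1, caching=False)
with_cache = conversation_cost(10, 4000, 500, 1.0, 0.1, caching=True)
```

The more turns a conversation has and the larger the stable part is relative to each new message, the bigger the gap between the two totals.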

Rate & correct replies

Under each finished reply is a small feedback bar. It isn't just analytics; it's a real loop that shapes what your agent does next.

Control | What it does
👍 Thumbs up | A positive signal on that reply.
👎 Thumbs down | A negative signal. It also makes the agent forget what it auto-learned from that exchange, so a bad turn doesn't poison its memory.
✏️ Correction | Opens a box: "what should the agent have said or done?" What you write becomes a durable instruction the agent remembers and applies on future runs. This is the main way to teach it.

Corrections are how you teach the agent

A correction isn't a one-off fix; it sticks. Write it as the rule you want followed going forward ("always quote prices in euros", "never promise same-day shipping") and the agent carries it into later runs. See Memory for how durable facts are kept.

Thumbs are one per reply; corrections are latest-wins

👍 and 👎 are a single choice per reply; clicking the other flips it. You can leave a correction as many times as you like; the most recent one wins.

Use 👎 to undo a bad lesson

If the agent picked up the wrong habit from a specific exchange, thumbs-down that reply; it clears what the agent auto-learned there. Pair it with a ✏️ correction to replace the bad lesson with the behavior you actually want.
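The feedback rules described in this section can be summarized in a short sketch. The `ReplyFeedback` and `AgentMemory` classes and their fields are hypothetical, chosen only to show the three rules together: one thumb per reply, thumbs-down clears what was auto-learned from that exchange, and the latest correction wins.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ReplyFeedback:
    thumb: Optional[str] = None       # "up", "down", or None
    correction: Optional[str] = None  # only the most recent correction is kept


@dataclass
class AgentMemory:
    # reply_id -> facts the agent auto-learned from that exchange
    auto_learned: dict = field(default_factory=dict)
    # reply_id -> durable instruction written by the user
    durable_instructions: dict = field(default_factory=dict)
    feedback: dict = field(default_factory=dict)

    def rate(self, reply_id: str, thumb: str) -> None:
        entry = self.feedback.setdefault(reply_id, ReplyFeedback())
        entry.thumb = thumb  # a single choice: clicking the other thumb flips it
        if thumb == "down":
            # Thumbs-down also forgets what was auto-learned from this exchange.
            self.auto_learned.pop(reply_id, None)

    def correct(self, reply_id: str, text: str) -> None:
        entry = self.feedback.setdefault(reply_id, ReplyFeedback())
        entry.correction = text                    # latest wins
        self.durable_instructions[reply_id] = text


memory = AgentMemory(auto_learned={"r1": ["customer wants prices in dollars"]})
memory.rate("r1", "up")
memory.rate("r1", "down")                                # flips the thumb, clears the lesson
memory.correct("r1", "quote prices in euros")
memory.correct("r1", "always quote prices in euros")     # replaces the earlier correction
```

Pairing the thumbs-down with a correction, as in the last lines, removes the bad lesson and leaves the intended rule in its place.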

See also

  • Memory: what the agent remembers between runs, and for how long
  • Models & reasoning: picking a model with the right context window and reasoning depth
  • Usage & billing: what each run costs and how to read the Usage page
  • Tools & toggles: what the agent is assembled from each run
  • Tasks & schedules: unattended runs on a recurring cadence