Summary
An agent conversation can reach a state where tool-result messages are persisted without their preceding assistant tool_calls message ("orphan" tool results). Once that happens, replaying the conversation history fails on every provider I tried (OpenAI, DeepSeek, MiniMax): the chat request returns a 400 on tool-call/result pairing, and the conversation becomes permanently unusable regardless of which model is selected.
This is not provider-specific: the built-in OpenAI driver hits it too, and the persistence path (server/ai/runtime/runner.ts) is unchanged core code.
Steps to reproduce
I was not able to pin down an exact deterministic sequence. It surfaced during heavy agent use (building a full site) while switching models mid-conversation, with some interrupted / duplicate sends. The minimal conditions I believe trigger it:
- Run the site agent through a long, tool-heavy turn (many insertHtml/token tools).
- Interrupt or re-submit a prompt while a tool turn is mid-flight (or switch the model mid-turn).
- Continue the same conversation.
The resulting persisted state is deterministic and is the real evidence (below): the conversation ends up with tool-result rows that have no matching assistant tool_calls row before them.
Expected behavior
A turn that errors, is interrupted, or is double-submitted should never leave the conversation in a state that bricks all future turns. Replaying history should always send well-formed tool-call/result pairing to the provider.
Actual behavior
ai_messages ends up with orphan role:'tool' rows (a tool result whose tool_call_id has no preceding assistant tool_calls message). Every subsequent turn replays that malformed history and the provider rejects it with a tool-pairing error — across providers. Switching models does not help, because the corruption is in the stored history, not the model.
Version or commit
CoreBunch/Instatic@a125a4a (main), exercised through a local branch that adds an OpenAI-compatible provider — but runner.ts/persistence is untouched core code and the built-in OpenAI driver reproduces the same error, so this is not specific to that branch.
Deployment mode
Local dev with Bun
Logs or screenshots
# Provider errors on replay (all the same underlying tool-pairing problem):
OpenAI (400): No tool call found for function call output with call_id call_37ac.
DeepSeek (400): An assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'. (insufficient tool messages following tool_calls message)
MiniMax (400): invalid params, tool call result does not follow tool call (2013)
# Persisted ai_messages for the bricked conversation (pos | role | tool_call_id | tool_name | content/notes):
88 | user | -         | -             | "Build a single-page website..."
89 | user | -         | -             | (duplicate of 88, double-submit)
90 | tool | call_37ac | read_document | toolResult ok <-- ORPHAN (no assistant tool_calls before it)
91 | tool | call_dc2b | insertHtml    | toolResult ok <-- ORPHAN
92 | tool | call_a9df | insertHtml    | toolResult ok <-- ORPHAN
93 | user | -         | -             | "you only did the hero section"
# Count mismatch confirms it: 35 tool-result rows vs 32 assistant tool_call rows = 3 orphans.
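For anyone checking their own conversations, the orphan condition described above can be tested per tool_call_id instead of by comparing row counts. The following is a minimal sketch, not code from the repository: the MessageRow shape, the toolCallIds field, and findOrphanToolResults are hypothetical names, and the real ai_messages schema may differ.

```typescript
// Hypothetical row shape (an assumption); the real ai_messages schema may differ.
interface MessageRow {
  pos: number;
  role: "user" | "assistant" | "tool";
  toolCallId?: string; // set on role:"tool" rows
  toolCallIds?: string[]; // ids announced by an assistant tool_calls row
}

// Returns every tool-result row whose id was not announced by an earlier
// assistant tool_calls row in the same conversation.
function findOrphanToolResults(rows: MessageRow[]): MessageRow[] {
  const announced = new Set<string>();
  const orphans: MessageRow[] = [];
  const ordered = [...rows].sort((a, b) => a.pos - b.pos);
  for (const row of ordered) {
    if (row.role === "assistant") {
      for (const id of row.toolCallIds ?? []) announced.add(id);
    } else if (row.role === "tool") {
      if (!row.toolCallId || !announced.has(row.toolCallId)) orphans.push(row);
    }
  }
  return orphans;
}

// The tail of the bricked conversation from the dump above.
const tail: MessageRow[] = [
  { pos: 88, role: "user" },
  { pos: 89, role: "user" },
  { pos: 90, role: "tool", toolCallId: "call_37ac" },
  { pos: 91, role: "tool", toolCallId: "call_dc2b" },
  { pos: 92, role: "tool", toolCallId: "call_a9df" },
  { pos: 93, role: "user" },
];

// Logs the positions of the orphan rows: 90, 91 and 92.
console.log(findOrphanToolResults(tail).map((r) => r.pos));
```

Checking by id also catches the case where the call and result counts happen to match but the ids do not.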
Likely cause (hypothesis)
In server/ai/runtime/runner.ts, a turn persists the assistant tool_calls (appendToolCall) and then the tool results (appendToolResult) as separate events. Orphan results imply a turn persisted results while its assistant tool_calls rows were lost — consistent with an interrupted / concurrent / double-submitted turn racing on one conversation (the tail of the corrupted conversation showed duplicate prompts and repeated sends). I did not fully isolate the exact race, so treat this as a hypothesis backed by the corrupted end-state, not a confirmed root cause.
Suggested mitigation
Make history replay resilient to malformed pairing when building the provider request from persisted history: drop orphan tool-result messages (a result with no matching prior assistant tool_call) and assistant tool_calls with no following result. That way a single interrupted/duplicated turn can't permanently brick a conversation, and it also hardens against cross-provider replay differences. (Optionally, also persist a turn's assistant tool_calls + results atomically and/or guard against concurrent turns on the same conversation, to prevent the orphans from being written in the first place.)
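A rough sketch of what such a replay-time filter could look like is below. It assumes an OpenAI chat-completions-style message shape, and the names ChatMessage and sanitizeToolPairing are hypothetical; nothing here is taken from runner.ts. It keeps a tool result only when it directly follows an assistant message that announced its id, and trims assistant tool_calls entries that never received a result.

```typescript
// Assumed message shape, modeled on OpenAI-style chat messages.
type ToolMessage = { role: "tool"; tool_call_id: string; content: string };
type AssistantMessage = {
  role: "assistant";
  content: string | null;
  tool_calls?: { id: string }[];
};
type ChatMessage =
  | { role: "user" | "system"; content: string }
  | AssistantMessage
  | ToolMessage;

function sanitizeToolPairing(history: ChatMessage[]): ChatMessage[] {
  const out: ChatMessage[] = [];
  let i = 0;
  while (i < history.length) {
    const msg = history[i];
    if (msg.role === "tool") {
      // Not consumed by a preceding assistant tool_calls message: orphan, drop it.
      i++;
      continue;
    }
    const calls = msg.role === "assistant" ? msg.tool_calls ?? [] : [];
    if (msg.role !== "assistant" || calls.length === 0) {
      out.push(msg);
      i++;
      continue;
    }
    // Collect the contiguous run of tool results that follows this assistant message.
    const wanted = new Set(calls.map((c) => c.id));
    const seen = new Set<string>();
    const answered: ToolMessage[] = [];
    let j = i + 1;
    while (j < history.length) {
      const next = history[j];
      if (next.role !== "tool") break;
      if (wanted.has(next.tool_call_id) && !seen.has(next.tool_call_id)) {
        seen.add(next.tool_call_id);
        answered.push(next);
      }
      j++;
    }
    const keptCalls = calls.filter((c) => seen.has(c.id));
    if (keptCalls.length > 0) {
      out.push({ ...msg, tool_calls: keptCalls }, ...answered);
    } else if (msg.content) {
      // No call was answered: keep the assistant text, drop the tool_calls.
      out.push({ role: "assistant", content: msg.content });
    }
    i = j;
  }
  return out;
}

// Example: one orphan result, and one assistant call (call_b) with no result.
const replay: ChatMessage[] = [
  { role: "user", content: "Build a single-page website..." },
  { role: "tool", tool_call_id: "call_37ac", content: "ok" },
  { role: "assistant", content: null, tool_calls: [{ id: "call_a" }, { id: "call_b" }] },
  { role: "tool", tool_call_id: "call_a", content: "ok" },
  { role: "user", content: "you only did the hero section" },
];
const cleaned = sanitizeToolPairing(replay);
console.log(cleaned.map((m) => m.role).join(","));
```

The filter only changes what is sent to the provider. The persisted rows stay as they are, so the corrupted state remains available for debugging the underlying race.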
Disclosure: surfaced while testing a local OpenAI-compatible provider against a reasoning-model gateway; the root-cause analysis was AI-assisted (Claude Code). Reproduced the symptom on the built-in OpenAI driver, so it is not provider-specific.