Question 1

The LLM is reading full JSON order payloads every turn

Accepted Answer

The past-orders and order-detail tools return raw JSON with dozens of fields. The model re-reads and re-summarises that payload on every subsequent turn, burning tokens each time. A thin server-side normaliser that passes through only the fields the model needs cuts that cost by roughly 75% per tool call.
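A minimal sketch of such a normaliser, assuming the tool handler receives each order as a Python dict. The field names in KEEP_FIELDS are hypothetical and would need to match the real order schema; the saving depends on how many fields the raw payload actually carries.

```python
import json

# Hypothetical whitelist: the few fields the model needs to discuss an order.
KEEP_FIELDS = (
    "order_id",
    "status",
    "product",
    "quantity_m3",
    "delivery_address",
    "delivery_date",
)


def normalise_order(raw: dict) -> dict:
    """Return a compact view of a raw order, dropping non-whitelisted and null fields."""
    return {
        key: raw[key]
        for key in KEEP_FIELDS
        if key in raw and raw[key] is not None
    }


def tool_result(raw_orders: list[dict]) -> str:
    """Serialise the compact orders without whitespace padding for the tool output."""
    compact = [normalise_order(order) for order in raw_orders]
    return json.dumps(compact, separators=(",", ":"))
```

Because the whitelist lives in one place, adding a field the model turns out to need is a one-line change, and everything else stays out of the context window by default.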

Question 2

Buttons and map-pickers are emitted by the LLM

Accepted Answer

UI element choices (button labels, option arrays, when to render the map) travel through the model as tool instructions and tool outputs. The UI is paid for in tokens twice: once to instruct the model, and again to retry when it skips a required render. Moving UI emission into deterministic code saves tokens and makes fabricated IDs structurally impossible, because every ID the user can select comes from application data rather than from model output.
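One way to sketch deterministic UI emission, assuming orders arrive as dicts from the application's own data layer. The Button type, the field names, and the rule for showing the map picker are all hypothetical illustrations, not a description of any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Button:
    label: str
    value: str  # order ID copied from application data, never from model text


def order_buttons(orders: list[dict]) -> list[Button]:
    """Build one button per order directly from stored rows."""
    return [
        Button(label=f"{o['product']}, {o['delivery_date']}", value=o["order_id"])
        for o in orders
    ]


def should_render_map(draft: dict) -> bool:
    """Assumed rule: show the map picker only while no address is set."""
    return not draft.get("delivery_address")


def resolve_choice(buttons: list[Button], chosen_value: str) -> str:
    """Accept a selection only if it matches a button that was actually rendered."""
    allowed = {b.value for b in buttons}
    if chosen_value not in allowed:
        raise ValueError(f"unknown option: {chosen_value!r}")
    return chosen_value
```

In this arrangement the model never sees or emits option IDs, so there is nothing for it to invent, and a skipped render cannot happen because rendering is not the model's job.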

Question 3

Reorder flow re-asks for data the system already has

Accepted Answer

Load size, truck spacing, quantity, and address are all already in the past-orders table. A well-designed reorder flow prefills all of them server-side in one step; a naive one turns them into four follow-up questions and four extra LLM turns.
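A rough sketch of the server-side prefill, assuming a past order is available as a dict. The column names are hypothetical, and treating the delivery date as the one thing still to ask is an assumption made for illustration.

```python
# Hypothetical column names for the four values the past-orders table already holds.
PREFILL_FIELDS = (
    "load_size_m3",
    "truck_spacing_min",
    "quantity_m3",
    "delivery_address",
)

# Assumption: a new delivery date is the only value a reorder cannot copy.
REQUIRED_FIELDS = PREFILL_FIELDS + ("delivery_date",)


def prefill_reorder(past_order: dict) -> dict:
    """Copy every reusable value from the past order into a new draft in one step."""
    draft = {
        field: past_order[field]
        for field in PREFILL_FIELDS
        if past_order.get(field) is not None
    }
    draft["source_order_id"] = past_order["order_id"]
    return draft


def fields_to_ask(draft: dict) -> list[str]:
    """List only the required fields the draft is still missing."""
    return [field for field in REQUIRED_FIELDS if field not in draft]
```

With a complete past order, fields_to_ask returns a single entry, so the conversation needs one question instead of five.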

Question 4

System prompt carries rules the code could enforce

Accepted Answer

Typical dispatch prompts devote 40–60% of their content to rules that the surrounding application could enforce deterministically. Every turn pays for those rules in tokens. Stripping them out of the prompt and enforcing them in code is the fastest structural win.
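As an illustration of a rule moved from the prompt into code, the sketch below validates a draft order before it is submitted. The three rules, their limits, and the field names are invented for the example; the real rules would come from whatever the prompt currently spells out.

```python
from datetime import date

# Hypothetical limits; real values would come from plant and fleet configuration.
MIN_QUANTITY_M3 = 1.0
MAX_LOAD_M3 = 10.0


def validate_order(draft: dict, today: date) -> list[str]:
    """Return human-readable violations; an empty list means the draft passes."""
    errors = []
    if draft["quantity_m3"] < MIN_QUANTITY_M3:
        errors.append("quantity is below the minimum order size")
    if draft["load_size_m3"] > MAX_LOAD_M3:
        errors.append("load size exceeds truck capacity")
    if draft["delivery_date"] < today:
        errors.append("delivery date is in the past")
    return errors
```

Each rule enforced this way can be deleted from the system prompt, and the check then runs the same way on every request regardless of what the model produced.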

AI for Concrete Dispatch Software