Description
Spotted here for gemini-2.5-pro-preview-05-06: https://gist.github.com/simonw/87a59e7f5c12274d65e2ac053b0eacdb#token-usage
264 input, 104 output, {"promptTokensDetails": [{"modality": "TEXT", "tokenCount": 6}, {"modality": "IMAGE", "tokenCount": 258}], "thoughtsTokenCount": 989}
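To make the arithmetic explicit, here is a small illustrative check (the dict is reconstructed from the numbers above; mapping the 264/104 figures onto `promptTokenCount` and `candidatesTokenCount` is my assumption, this is not the project's code):

```python
# Illustrative reconstruction of the usage metadata quoted above.
usage = {
    "promptTokenCount": 264,
    "candidatesTokenCount": 104,
    "thoughtsTokenCount": 989,
    "promptTokensDetails": [
        {"modality": "TEXT", "tokenCount": 6},
        {"modality": "IMAGE", "tokenCount": 258},
    ],
}

# The prompt details add up to the input count: 6 + 258 = 264.
assert sum(d["tokenCount"] for d in usage["promptTokensDetails"]) == usage["promptTokenCount"]

# 104 output tokens cannot possibly contain the 989 thinking tokens, so in this
# response the thoughts were reported outside candidatesTokenCount.
assert usage["thoughtsTokenCount"] > usage["candidatesTokenCount"]
```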
And then Aider wrote about the same problem: https://aider.chat/2025/05/07/gemini-cost.html
An investigation determined that the primary cause was the litellm package (used by aider for its LLM API connections) not properly including reasoning tokens in the token counts it reported.
Here's where LiteLLM fixed it: BerriAI/litellm@a7db0df
This note is very important:
> Note that the Gemini API returns different usage metadata than Vertex AI. With the Gemini API, `candidatesTokenCount` includes thinking tokens, but on Vertex AI, `candidatesTokenCount` does not include thinking tokens.
But that doesn't fit what I'm seeing here, because I did NOT use Vertex but still got that response where `thoughtsTokenCount` did not get included in the `candidatesTokenCount`.
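One defensive option, whichever API behaviour applies, would be to track the thinking tokens as their own bucket rather than assuming they are folded into the output count. A minimal sketch (the helper name and return shape are assumptions for illustration, not the plugin's actual code):

```python
def extract_usage(usage_metadata: dict) -> dict:
    """Pull token counts out of a Gemini usageMetadata dict, keeping the
    thinking tokens as a separate figure so they are never silently dropped."""
    return {
        "input": usage_metadata.get("promptTokenCount", 0),
        "output": usage_metadata.get("candidatesTokenCount", 0),
        # Reported separately: in the response above these 989 tokens were
        # clearly not part of candidatesTokenCount, whatever the docs say.
        "thoughts": usage_metadata.get("thoughtsTokenCount", 0),
    }


print(extract_usage({
    "promptTokenCount": 264,
    "candidatesTokenCount": 104,
    "thoughtsTokenCount": 989,
}))
# {'input': 264, 'output': 104, 'thoughts': 989}
```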
For reference, here's our current code for that (lines 365 to 379 at 902519b).