feat(ai): Add countTokens to languageModelv3 #12176
Draft · Und3rf10w wants to merge 13 commits into vercel:main from Und3rf10w:und3rf10w/ai-count-tokens
Conversation
Add optional doCountTokens method to LanguageModelV3 interface for
counting tokens in prompts before making generation requests. This
enables providers to implement native token counting APIs or local
estimation methods to help users:
- Estimate costs before making API calls
- Validate prompt lengths against model context limits
- Optimize prompt construction
New types:
- LanguageModelV3CountTokensOptions: Input options including prompt and tools
- LanguageModelV3CountTokensResult: Result with token count, warnings, and metadata
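A minimal sketch of what these additions might look like; the type names come from this commit, but the field shapes below are assumptions rather than the merged API:

```ts
// Placeholder aliases so this sketch stands alone; the real types live in
// @ai-sdk/provider. Field shapes below are assumptions, not the merged API.
type LanguageModelV3Prompt = unknown;
type LanguageModelV3FunctionTool = unknown;
type LanguageModelV3CallWarning = unknown;

interface LanguageModelV3CountTokensOptions {
  prompt: LanguageModelV3Prompt;             // standardized prompt format
  tools?: LanguageModelV3FunctionTool[];     // optional tool definitions
  providerOptions?: Record<string, unknown>; // provider-specific passthrough
  abortSignal?: AbortSignal;
}

interface LanguageModelV3CountTokensResult {
  tokens: number;                            // counted or estimated tokens
  warnings?: LanguageModelV3CallWarning[];
  providerMetadata?: Record<string, unknown>;
}

interface LanguageModelV3 {
  // ...existing members (doGenerate, doStream, etc.)
  doCountTokens?(
    options: LanguageModelV3CountTokensOptions,
  ): PromiseLike<LanguageModelV3CountTokensResult>;
}
```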
Add high-level countTokens function that wraps the provider's
doCountTokens method with standardized prompt conversion, retry logic,
and telemetry. The function accepts the same prompt formats as
generateText/streamText (prompt string, messages array, or system +
messages) and converts them to the standard LanguageModelV3Prompt
format.
Key features:
- Automatic prompt conversion from user-facing format to provider format
- Support for tools/functions in token counting
- Configurable retry behavior with exponential backoff
- AbortSignal support for cancellation
- OpenTelemetry integration for observability
- Provider-specific options passthrough
Also adds mock doCountTokens method to MockLanguageModelV3 for testing.
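A minimal sketch of calling this function, assuming the export location and option names described in this commit:

```ts
import { countTokens } from 'ai'; // assumed export location for this PR
import { anthropic } from '@ai-sdk/anthropic';

// Same prompt formats as generateText/streamText: a prompt string,
// a messages array, or system + messages, as used here.
const { tokens, warnings } = await countTokens({
  model: anthropic('claude-3-5-sonnet-20241022'),
  system: 'You are a terse assistant.',
  messages: [{ role: 'user', content: 'Summarize the plot of Dune.' }],
  abortSignal: AbortSignal.timeout(5_000), // cancellation support
});

console.log(tokens, warnings);
```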
Add doCountTokens method to AnthropicMessagesLanguageModel that uses
Anthropic's native /v1/messages/count_tokens API endpoint.
The implementation:
- Converts prompts to Anthropic message format
- Includes tool definitions in the request when provided
- Returns input_tokens count from the API response
- Includes raw request/response data for debugging
- Handles provider-specific options (cacheControl, thinking)
Also adds AnthropicCountTokensResponseSchema for response validation.
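For reference, a sketch of the raw HTTP call this method wraps; the endpoint, headers, and { input_tokens } response shape follow Anthropic's public count-tokens API:

```ts
const res = await fetch('https://api.anthropic.com/v1/messages/count_tokens', {
  method: 'POST',
  headers: {
    'content-type': 'application/json',
    'x-api-key': process.env.ANTHROPIC_API_KEY ?? '',
    'anthropic-version': '2023-06-01',
  },
  body: JSON.stringify({
    model: 'claude-3-5-sonnet-20241022',
    messages: [{ role: 'user', content: 'Hello, world' }],
  }),
});

// Anthropic returns the prompt's token count, e.g. { "input_tokens": 10 }
const { input_tokens } = (await res.json()) as { input_tokens: number };
console.log(input_tokens);
```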
Add comprehensive tests for the Anthropic doCountTokens implementation:
- Basic token counting for simple prompts
- Token counting with tool definitions
- Response headers and body verification
- Request body validation
- API error handling
Also adds explicit baseURL configuration to test providers to prevent
environment variable interference during testing.
Add doCountTokens method to GoogleGenerativeAILanguageModel that uses
Google's native :countTokens API endpoint.
The implementation:
- Converts prompts to Google's GenerateContentRequest format
- Includes tool definitions when provided
- Returns totalTokens from the API response
- Exposes cachedContentTokenCount in providerMetadata when available
- Includes raw request/response data for debugging
Supports both Google AI and Vertex AI through the existing URL
configuration.
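For reference, a sketch of the REST call behind this method; the :countTokens endpoint and { totalTokens } response shape follow Google's public Generative Language API:

```ts
const model = 'gemini-1.5-flash';
const url =
  `https://generativelanguage.googleapis.com/v1beta/models/${model}:countTokens` +
  `?key=${process.env.GOOGLE_GENERATIVE_AI_API_KEY}`;

const res = await fetch(url, {
  method: 'POST',
  headers: { 'content-type': 'application/json' },
  body: JSON.stringify({
    contents: [{ role: 'user', parts: [{ text: 'Hello, Gemini' }] }],
  }),
});

// Response includes totalTokens (and cachedContentTokenCount when caching applies)
const { totalTokens } = (await res.json()) as { totalTokens: number };
console.log(totalTokens);
```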
Add comprehensive tests for the Google doCountTokens implementation:
- Basic token counting for simple prompts
- Token counting with tool definitions
- Cached content token metadata exposure
- Provider options passthrough (cachedContent)
- API error handling
Add doCountTokens method to OpenAI language models (chat, completion,
and responses) that uses js-tiktoken for local token estimation.
OpenAI doesn't provide a native token counting API, so this uses
tiktoken to estimate tokens locally.
The implementation:
- Uses the appropriate encoding for each model (o200k_base, cl100k_base)
- Handles chat format with special tokens (<|im_start|>, <|im_end|>)
- Includes tool/function definitions in token count
- Returns providerMetadata.openai.estimatedTokenCount: true to indicate
  this is a local estimate rather than an API-verified count
Adds js-tiktoken as a dependency for accurate token estimation. This
approach provides immediate token counts without API latency, though
actual usage may vary slightly from the estimate.
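A sketch of tiktoken-based chat estimation in the spirit of OpenAI's cookbook guidance; the per-message overhead constants below are the cookbook's, and this PR's actual accounting may differ:

```ts
import { getEncoding } from 'js-tiktoken';

const enc = getEncoding('o200k_base'); // cl100k_base for older model families

// Each chat message is wrapped in <|im_start|>...<|im_end|> framing, so a
// few tokens of overhead are added per message, plus reply priming tokens.
function estimateChatTokens(
  messages: Array<{ role: string; content: string }>,
): number {
  const tokensPerMessage = 3;
  let count = 3; // every reply is primed with <|im_start|>assistant
  for (const m of messages) {
    count += tokensPerMessage;
    count += enc.encode(m.role).length;
    count += enc.encode(m.content).length;
  }
  return count;
}

console.log(estimateChatTokens([{ role: 'user', content: 'Hello!' }]));
```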
Add tests for the OpenAI doCountTokens implementation:
- Basic token counting for simple prompts
- Token counting with tool definitions
- Verification of estimatedTokenCount metadata flag
- Multiple message format support
Tests verify the tiktoken-based estimation works correctly without
requiring network calls.
Add doCountTokens method to BedrockChatLanguageModel that uses
Bedrock's native /model/{modelId}/count-tokens API endpoint.
The implementation:
- Converts prompts to Bedrock's Converse format
- Includes tool definitions when provided
- Returns inputTokens count from the API response
- Handles Mistral model differences in message format
- Uses SigV4 authentication via the configured fetch handler
- Includes raw request/response data for debugging
This enables accurate token counting for all Bedrock-hosted models
including Claude, Mistral, and others that support the Converse API.
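A heavily hedged sketch of the request shape; the Converse-style input body below is our reading of AWS's CountTokens API and may not match this PR exactly, and real requests need SigV4 signing (handled in the provider by the configured fetch handler):

```ts
const modelId = 'anthropic.claude-3-5-sonnet-20241022-v2:0';
const url =
  `https://bedrock-runtime.us-east-1.amazonaws.com/model/${modelId}/count-tokens`;

// Assumed body shape: the prompt expressed as Converse-format input.
const body = {
  input: {
    converse: {
      messages: [{ role: 'user', content: [{ text: 'Hello, Bedrock!' }] }],
    },
  },
};

// A SigV4-signed POST of `body` to `url` returns e.g. { "inputTokens": 12 }.
console.log(url, JSON.stringify(body));
```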
Add tests for the Bedrock doCountTokens implementation:
- Basic token counting for simple prompts
- Token counting with tool definitions
- Request/response body verification
- API error handling
Uses a separate test server for the count-tokens endpoint to properly
mock the Bedrock API responses.
Add comprehensive documentation for the countTokens function including:
- Overview and use cases (cost estimation, context management, optimization)
- API signature with all parameters and return types
- Examples for each supported provider:
  - Anthropic (native API)
  - Google AI/Vertex AI (native API)
  - OpenAI/Azure (tiktoken estimation)
  - Amazon Bedrock (native API)
- Error handling patterns for unsupported providers
- Provider-specific metadata (e.g., estimatedTokenCount for OpenAI)
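The unsupported-provider pattern might look like the following; the exact error class thrown when a model lacks doCountTokens is an assumption here:

```ts
import { countTokens, type LanguageModel } from 'ai';

async function tryCountTokens(model: LanguageModel, prompt: string) {
  try {
    const { tokens } = await countTokens({ model, prompt });
    return tokens;
  } catch (error) {
    // Models without a doCountTokens implementation are expected to throw.
    console.warn('Token counting not supported for this model:', error);
    return undefined;
  }
}
```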
Add runnable examples demonstrating countTokens usage:
- anthropic.ts: Basic Anthropic token counting
- anthropic-tools.ts: Token counting with tool definitions
- google.ts: Google AI token counting
- google-vertex.ts: Vertex AI token counting
- openai.ts: OpenAI token counting with tiktoken
- azure.ts: Azure OpenAI token counting
- amazon-bedrock.ts: Amazon Bedrock token counting
Each example shows the basic pattern of calling countTokens and logging
the result. Run with: pnpm tsx src/count-tokens/<file>.ts
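A sketch of what one of these example files (e.g. anthropic.ts) presumably contains, following the pattern the commit describes; the file path and result field name are assumptions:

```ts
// examples/ai-core/src/count-tokens/anthropic.ts (hypothetical path)
import { anthropic } from '@ai-sdk/anthropic';
import { countTokens } from 'ai'; // assumed export location for this PR
import 'dotenv/config';

async function main() {
  const result = await countTokens({
    model: anthropic('claude-3-5-sonnet-20241022'),
    prompt: 'Invent a new holiday and describe its traditions.',
  });

  console.log('Token count:', result.tokens);
}

main().catch(console.error);
```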
Und3rf10w (Contributor, Author):
I need to do more thorough testing, but I'm open to any initial reviews. I'm unsure whether it's okay to use tiktoken for the Azure/OpenAI providers, but that's OpenAI's official guidance.
Labels: ai/core, ai/provider, provider/amazon-bedrock, provider/anthropic
Background
Token counting is a fundamental capability needed by developers building AI applications. Before making API calls, developers often need to:
- Estimate costs
- Validate prompt lengths against model context limits
- Optimize prompt construction
Currently, developers must either:
- Call each provider's token-counting endpoint directly, outside the SDK's unified interface, or
- Estimate locally with a tokenizer library and keep the encoding matched to the model
This PR adds a native countTokens() function to the AI SDK that provides accurate token counts using each provider's native API or best-available estimation method.

Summary
Core Implementation
New countTokens() function (packages/ai/src/count-tokens/)
- Follows the generateText/streamText API pattern
- Accepts a prompt string, a messages array, or system + messages
- Converts input to LanguageModelV3Prompt internally

New Provider Interface (packages/provider/src/language-model/v3/)
- Adds an optional doCountTokens() method to the LanguageModelV3 interface
- Adds LanguageModelV3CountTokensOptions and LanguageModelV3CountTokensResult

Provider Implementations
- Anthropic: native API (/v1/messages/count_tokens)
- Google / Vertex AI: native API (:countTokens); exposes cachedContentTokenCount in metadata
- Amazon Bedrock: native API (/model/{modelId}/count-tokens)
- OpenAI / Azure: local estimation (js-tiktoken) with o200k_base/cl100k_base encodings; returns providerMetadata.openai.estimatedTokenCount: true

Files Changed
API Usage
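A hedged reconstruction of the basic usage this section presumably demonstrated; the import location and result field names (tokens, providerMetadata) follow this PR's commit messages, not a published release:

```ts
import { countTokens } from 'ai'; // assumed export location for this PR
import { openai } from '@ai-sdk/openai';

const result = await countTokens({
  model: openai('gpt-4o'),
  prompt: 'How many tokens am I?',
});

console.log(result.tokens);
// OpenAI/Azure counts are local tiktoken estimates, flagged in metadata:
console.log(result.providerMetadata?.openai); // { estimatedTokenCount: true }
```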
Manual Verification
Running Examples
Expected Output
Running Tests
Build Verification
Checklist
- Added a changeset (pnpm changeset in the project root)
Related Issues
- convert-to-anthropic-messages-prompt.ts and convert-to-google-generative-ai-messages.ts (#9391)