Und3rf10w (Contributor) commented on Jan 31, 2026

Background

Token counting is a fundamental capability needed by developers building AI applications. Before making API calls, developers often need to:

  1. Estimate costs - Know how much an API call will cost before committing to it
  2. Validate context limits - Ensure prompts fit within model context windows (e.g., 200K for Claude, 128K for GPT-4)
  3. Optimize prompts - Compare token counts across different prompt variations to find the most efficient approach
  4. Budget management - Track and limit token usage across applications

Currently, developers must either:

  • Use external tokenization libraries with no guarantee of accuracy
  • Make actual API calls and check usage after the fact
  • Guess based on rough character-to-token ratios

This PR adds a native countTokens() function to the AI SDK that provides accurate token counts using each provider's native API or best-available estimation method.

Summary

Core Implementation

New countTokens() function (packages/ai/src/count-tokens/)

  • High-level function matching the generateText/streamText API pattern
  • Accepts same prompt formats: prompt string, messages array, or system + messages
  • Supports tool definitions in token counting
  • Includes retry logic, abort signal support, and OpenTelemetry integration
  • Converts user-facing prompt format to LanguageModelV3Prompt internally

New Provider Interface (packages/provider/src/language-model/v3/)

  • Added optional doCountTokens() method to LanguageModelV3 interface
  • New types: LanguageModelV3CountTokensOptions and LanguageModelV3CountTokensResult
  • Returns token count, warnings, request/response metadata, and provider-specific data
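
A minimal sketch of the added surface (the type shapes below are illustrative approximations based on this PR's description, not the exact definitions; see the v3 type files for those):

// Illustrative only: approximations of the new types from
// packages/provider/src/language-model/v3/.
type LanguageModelV3Prompt = unknown; // defined elsewhere in @ai-sdk/provider

interface LanguageModelV3CountTokensOptions {
  prompt: LanguageModelV3Prompt;
  tools?: unknown[]; // tool definitions are included in the count
  abortSignal?: AbortSignal;
  providerOptions?: Record<string, Record<string, unknown>>;
}

interface LanguageModelV3CountTokensResult {
  tokens: number;
  warnings: unknown[];
  request?: { body?: unknown }; // raw request/response data for debugging
  response?: { body?: unknown };
  providerMetadata?: Record<string, Record<string, unknown>>;
}

interface LanguageModelV3 {
  // Optional: providers without a token-counting strategy simply omit it.
  doCountTokens?(
    options: LanguageModelV3CountTokensOptions,
  ): Promise<LanguageModelV3CountTokensResult>;
}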

Provider Implementations

| Provider | Method | Notes |
| --- | --- | --- |
| Anthropic | Native API (/v1/messages/count_tokens) | Exact token count from Claude's tokenizer |
| Google AI / Vertex AI | Native API (:countTokens) | Includes cachedContentTokenCount in metadata |
| Amazon Bedrock | Native API (/model/{modelId}/count-tokens) | Works with all Converse-compatible models |
| OpenAI / Azure OpenAI | Local estimation (js-tiktoken) | Uses o200k_base/cl100k_base encodings; returns providerMetadata.openai.estimatedTokenCount: true |

Files Changed

packages/provider/                    # Interface definition
├── src/language-model/v3/
│   ├── language-model-v3.ts         # Added doCountTokens? method
│   ├── language-model-v3-count-tokens-options.ts   # New
│   ├── language-model-v3-count-tokens-result.ts    # New
│   └── index.ts                     # Re-exports

packages/ai/                          # Core function
├── src/count-tokens/
│   ├── count-tokens.ts              # Main implementation
│   ├── count-tokens.test.ts         # 10 test cases
│   ├── count-tokens-result.ts       # Result type
│   └── index.ts
├── src/index.ts                     # Export countTokens
└── src/test/mock-language-model-v3.ts  # Mock for testing

packages/anthropic/                   # Anthropic provider
├── src/anthropic-messages-language-model.ts      # +90 lines
├── src/anthropic-messages-language-model.test.ts # +107 lines (5 tests)
└── src/anthropic-messages-api.ts    # Response schema

packages/google/                      # Google provider
├── src/google-generative-ai-language-model.ts      # +79 lines
└── src/google-generative-ai-language-model.test.ts # +106 lines (5 tests)

packages/openai/                      # OpenAI provider
├── src/openai-count-tokens.ts       # New (202 lines) - tiktoken wrapper
├── src/chat/openai-chat-language-model.ts         # +23 lines
├── src/chat/openai-chat-language-model.test.ts    # +71 lines (4 tests)
├── src/completion/openai-completion-language-model.ts  # +24 lines
├── src/responses/openai-responses-language-model.ts    # +23 lines
├── src/internal/index.ts            # Re-export
└── package.json                     # Added js-tiktoken dependency

packages/amazon-bedrock/              # Bedrock provider
├── src/bedrock-chat-language-model.ts      # +73 lines
└── src/bedrock-chat-language-model.test.ts # +89 lines (4 tests)

content/docs/                         # Documentation
└── 07-reference/01-ai-sdk-core/07-count-tokens.mdx  # New (313 lines)

examples/ai-functions/src/count-tokens/  # Examples
├── anthropic.ts
├── anthropic-tools.ts
├── google.ts
├── google-vertex.ts
├── openai.ts
├── azure.ts
└── amazon-bedrock.ts

API Usage

import { anthropic } from '@ai-sdk/anthropic';
import { openai } from '@ai-sdk/openai';
import { countTokens, tool } from 'ai';
import { z } from 'zod';

// Basic usage
const { tokens } = await countTokens({
  model: anthropic('claude-sonnet-4-5-20250929'),
  messages: [{ role: 'user', content: 'Explain quantum computing.' }],
});

console.log(`This prompt uses ${tokens} tokens`);

// With tools
const result = await countTokens({
  model: anthropic('claude-sonnet-4-5-20250929'),
  messages: [{ role: 'user', content: 'What is the weather?' }],
  tools: {
    weather: tool({
      description: 'Get current weather',
      inputSchema: z.object({ location: z.string() }),
    }),
  },
});

// OpenAI (local estimation)
const { tokens, providerMetadata } = await countTokens({
  model: openai('gpt-4o'),
  messages: [{ role: 'user', content: 'Hello' }],
});

if (providerMetadata?.openai?.estimatedTokenCount) {
  console.log('Note: This is a tiktoken estimate');
}
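
Token counting is optional per provider. For a model whose provider does not implement doCountTokens, a conservative handling pattern looks like the following (the specific error class thrown is defined by this PR and is deliberately not assumed here):

// Fall back gracefully when a model's provider has no doCountTokens.
try {
  const { tokens } = await countTokens({
    model: anthropic('claude-sonnet-4-5-20250929'),
    prompt: 'Hello, world!',
  });
  console.log(`Prompt uses ${tokens} tokens`);
} catch (error) {
  console.warn('Token counting unavailable for this model:', error);
}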

Manual Verification

Running Examples

cd examples/ai-functions

# Anthropic (requires ANTHROPIC_API_KEY)
pnpm tsx src/count-tokens/anthropic.ts
pnpm tsx src/count-tokens/anthropic-tools.ts

# Google (requires GOOGLE_GENERATIVE_AI_API_KEY)
pnpm tsx src/count-tokens/google.ts

# OpenAI (requires OPENAI_API_KEY for provider setup; counting itself is local)
pnpm tsx src/count-tokens/openai.ts

# Bedrock (requires AWS credentials)
pnpm tsx src/count-tokens/amazon-bedrock.ts

Expected Output

# Anthropic
Token count: 12
Warnings: []

# OpenAI
Estimated tokens: 11
Note: This is a tiktoken estimate

# Bedrock
Token count: 14
Warnings: []

Running Tests

# All token-counting tests (core and providers)
pnpm test -- --run -t "countTokens"
pnpm test -- --run -t "doCountTokens"

# By provider
cd packages/anthropic && pnpm test -- --run -t "doCountTokens"
cd packages/google && pnpm test -- --run -t "doCountTokens"
cd packages/openai && pnpm test -- --run -t "doCountTokens"
cd packages/amazon-bedrock && pnpm test -- --run -t "doCountTokens"
cd packages/ai && pnpm test -- --run -t "countTokens"

Build Verification

pnpm build           # 63/63 tasks successful
pnpm type-check:full # No errors
pnpm test            # All relevant tests pass

Checklist

  • Tests have been added / updated (for bug fixes / features)
  • Documentation has been added / updated (for bug fixes / features)
  • A patch changeset for relevant packages has been added (for bug fixes / features - run pnpm changeset in the project root)
  • I have reviewed this pull request (self-review)

Future Work

  • Additional providers: Add support for other providers as they expose token counting APIs

Commits

Add optional doCountTokens method to LanguageModelV3 interface for
counting tokens in prompts before making generation requests.

This enables providers to implement native token counting APIs or
local estimation methods to help users:
- Estimate costs before making API calls
- Validate prompt lengths against model context limits
- Optimize prompt construction

New types:
- LanguageModelV3CountTokensOptions: Input options including prompt and tools
- LanguageModelV3CountTokensResult: Result with token count, warnings, and metadata

Add high-level countTokens function that wraps the provider's doCountTokens
method with standardized prompt conversion, retry logic, and telemetry.

The function accepts the same prompt formats as generateText/streamText
(prompt string, messages array, or system + messages) and converts them
to the standard LanguageModelV3Prompt format.

Key features:
- Automatic prompt conversion from user-facing format to provider format
- Support for tools/functions in token counting
- Configurable retry behavior with exponential backoff
- AbortSignal support for cancellation
- OpenTelemetry integration for observability
- Provider-specific options passthrough

Also adds mock doCountTokens method to MockLanguageModelV3 for testing.
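
A usage sketch of the retry and cancellation behavior this commit describes, assuming countTokens mirrors generateText's option names (maxRetries, abortSignal are assumptions based on the feature list above):

// Sketch: cancellation and configurable retries.
import { anthropic } from '@ai-sdk/anthropic';
import { countTokens } from 'ai';

const controller = new AbortController();
const timeout = setTimeout(() => controller.abort(), 5_000); // cancel after 5s

const { tokens } = await countTokens({
  model: anthropic('claude-sonnet-4-5-20250929'),
  prompt: 'Summarize the attached report.',
  maxRetries: 2, // retried with exponential backoff on transient failures
  abortSignal: controller.signal,
});

clearTimeout(timeout);
console.log(tokens);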

Add doCountTokens method to AnthropicMessagesLanguageModel that uses
Anthropic's native /v1/messages/count_tokens API endpoint.

The implementation:
- Converts prompts to Anthropic message format
- Includes tool definitions in the request when provided
- Returns input_tokens count from the API response
- Includes raw request/response data for debugging
- Handles provider-specific options (cacheControl, thinking)

Also adds AnthropicCountTokensResponseSchema for response validation.
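
For reference, this is roughly the underlying request: a direct call to Anthropic's documented count_tokens endpoint, which returns an exact count without generating any text:

// Direct call to the endpoint the provider wraps.
const res = await fetch('https://api.anthropic.com/v1/messages/count_tokens', {
  method: 'POST',
  headers: {
    'x-api-key': process.env.ANTHROPIC_API_KEY!,
    'anthropic-version': '2023-06-01',
    'content-type': 'application/json',
  },
  body: JSON.stringify({
    model: 'claude-sonnet-4-5-20250929',
    messages: [{ role: 'user', content: 'Explain quantum computing.' }],
  }),
});
const { input_tokens } = await res.json();
console.log(input_tokens); // exact count from Claude's tokenizer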

Add comprehensive tests for the Anthropic doCountTokens implementation:
- Basic token counting for simple prompts
- Token counting with tool definitions
- Response headers and body verification
- Request body validation
- API error handling

Also adds explicit baseURL configuration to test providers to prevent
environment variable interference during testing.

Add doCountTokens method to GoogleGenerativeAILanguageModel that uses
Google's native :countTokens API endpoint.

The implementation:
- Converts prompts to Google's GenerateContentRequest format
- Includes tool definitions when provided
- Returns totalTokens from the API response
- Exposes cachedContentTokenCount in providerMetadata when available
- Includes raw request/response data for debugging

Supports both Google AI and Vertex AI through the existing URL configuration.
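
For reference, roughly the underlying request against the Gemini API's :countTokens method (cachedContentTokenCount appears in the response when cached content is in play):

// Direct call to the endpoint the provider wraps.
const model = 'gemini-2.0-flash';
const url =
  `https://generativelanguage.googleapis.com/v1beta/models/${model}` +
  `:countTokens?key=${process.env.GOOGLE_GENERATIVE_AI_API_KEY}`;

const res = await fetch(url, {
  method: 'POST',
  headers: { 'content-type': 'application/json' },
  body: JSON.stringify({
    contents: [{ role: 'user', parts: [{ text: 'Explain quantum computing.' }] }],
  }),
});
const { totalTokens } = await res.json();
console.log(totalTokens);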

Add comprehensive tests for the Google doCountTokens implementation:
- Basic token counting for simple prompts
- Token counting with tool definitions
- Cached content token metadata exposure
- Provider options passthrough (cachedContent)
- API error handling

Add doCountTokens method to OpenAI language models (chat, completion, and
responses) that uses js-tiktoken for local token estimation.

OpenAI doesn't provide a native token counting API, so this uses tiktoken
to estimate tokens locally. The implementation:
- Uses the appropriate encoding for each model (o200k_base, cl100k_base)
- Handles chat format with special tokens (<|im_start|>, <|im_end|>)
- Includes tool/function definitions in token count
- Returns providerMetadata.openai.estimatedTokenCount: true to indicate
  this is a local estimate rather than an API-verified count

Adds js-tiktoken as a dependency for accurate token estimation.

This approach provides immediate token counts without API latency, though
actual usage may vary slightly from the estimate.
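
The essence of that estimation, simplified (the per-message constants below are assumptions for illustration; see openai-count-tokens.ts for the real accounting):

// Encode each message with the model's encoding, plus assumed overhead
// for the chat special tokens.
import { getEncoding } from 'js-tiktoken';

const enc = getEncoding('o200k_base'); // GPT-4o family; cl100k_base for older models
const messages = [{ role: 'user' as const, content: 'Hello' }];

let tokens = 0;
for (const message of messages) {
  tokens += 4; // assumed overhead for <|im_start|>{role} ... <|im_end|>
  tokens += enc.encode(message.content).length;
}
tokens += 2; // assumed priming tokens for the assistant reply

console.log(`Estimated tokens: ${tokens}`);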

Add tests for the OpenAI doCountTokens implementation:
- Basic token counting for simple prompts
- Token counting with tool definitions
- Verification of estimatedTokenCount metadata flag
- Multiple message format support

Tests verify the tiktoken-based estimation works correctly
without requiring network calls.

Add doCountTokens method to BedrockChatLanguageModel that uses
Bedrock's native /model/{modelId}/count-tokens API endpoint.

The implementation:
- Converts prompts to Bedrock's Converse format
- Includes tool definitions when provided
- Returns inputTokens count from the API response
- Handles Mistral model differences in message format
- Uses SigV4 authentication via the configured fetch handler
- Includes raw request/response data for debugging

This enables accurate token counting for all Bedrock-hosted models
including Claude, Mistral, and others that support the Converse API.

Add tests for the Bedrock doCountTokens implementation:
- Basic token counting for simple prompts
- Token counting with tool definitions
- Request/response body verification
- API error handling

Uses a separate test server for the count-tokens endpoint
to properly mock the Bedrock API responses.

Add comprehensive documentation for the countTokens function including:
- Overview and use cases (cost estimation, context management, optimization)
- API signature with all parameters and return types
- Examples for each supported provider:
  - Anthropic (native API)
  - Google AI/Vertex AI (native API)
  - OpenAI/Azure (tiktoken estimation)
  - Amazon Bedrock (native API)
- Error handling patterns for unsupported providers
- Provider-specific metadata (e.g., estimatedTokenCount for OpenAI)

Add runnable examples demonstrating countTokens usage:
- anthropic.ts: Basic Anthropic token counting
- anthropic-tools.ts: Token counting with tool definitions
- google.ts: Google AI token counting
- google-vertex.ts: Vertex AI token counting
- openai.ts: OpenAI token counting with tiktoken
- azure.ts: Azure OpenAI token counting
- amazon-bedrock.ts: Amazon Bedrock token counting

Each example shows the basic pattern of calling countTokens
and logging the result. Run with: pnpm tsx src/count-tokens/<file>.ts
The vercel-ai-sdk bot added the ai/core, ai/provider, provider/amazon-bedrock, and provider/anthropic labels on Jan 31, 2026.

Und3rf10w (Contributor, Author) commented on Jan 31, 2026

I need to do more thorough testing, but I'm open to any initial reviews. I'm unsure whether it's okay to use tiktoken for the Azure/OpenAI providers, but that is OpenAI's official guidance.
