Wolfram Language & System Documentation Center
"Tokens" (Net Encoder)

NetEncoder["Tokens"]

represents an encoder that converts the words in a string to a sequence of integer codes using a standard English vocabulary.

NetEncoder[{"Tokens","language"}]

represents an encoder that uses a standard vocabulary for the given language.

NetEncoder[{"Tokens",{token1,token2,…}}]

represents an encoder that uses a specified list of tokens as the vocabulary.

NetEncoder[{"Tokens",…,"param"value}]

represents an encoder in which additional parameters have been specified.

Details

  • NetEncoder[…][input] applies the encoder to an input to produce an output.
  • NetEncoder[…][{input1,input2,…}] applies the encoder to a list of strings to produce a list of outputs.
  • The input to the encoder must be a string or a TextElement with a sequence of strings that represents tokens. If it is a string, the segmentation into tokens will be done using a regular expression based on the value of "SplitPattern".
  • The output of the encoder is a sequence of integers between 1 and d+1, where d is the number of tokens in the vocabulary. The integer d+1 signifies tokens in the input that do not occur in the vocabulary.
  • The output is a NumericArray whose type is the smallest unsigned integer type that can represent all possible output values.
  • An encoder can be attached to an input port of a net by specifying "port"->NetEncoder[…] when constructing the net.
  • Parameters
  • The following parameters can be specified:

    "IgnoreCase"      True           whether to ignore case when matching tokens from the string
    "SplitPattern"    WordBoundary   the string pattern to use in order to split the input string into tokens
    "TargetLength"    All            the length of the final sequence to crop or pad to

  • With the parameter "IgnoreCase"->True, tokens are effectively converted to lowercase before encoding.
  • With the parameter "TargetLength"->All, all tokens found in the input string are encoded.
  • With the parameter "TargetLength"->n, the first n tokens found in the input string are encoded, with padding applied if fewer than n tokens are found. The padding value is d+1, where d is the number of tokens in the vocabulary.
  • With the parameter "SplitPattern"->None, the input to the encoder is assumed to be a pre-tokenized list of strings of the form {"token1","token2",…}.
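
The points above can be sketched as follows. This is a minimal illustration, not an excerpt from the original page: the vocabulary, the variable names and the choice of EmbeddingLayer[4] are assumptions made for the example.

```wolfram
(* Hypothetical three-token vocabulary, so d = 3 *)
vocab = {"good", "bad", "movie"};
enc = NetEncoder[{"Tokens", vocab}];

(* Apply the encoder to a single string, then to a list of strings *)
enc["good movie"]
enc[{"good movie", "bad movie"}]

(* "film" is not in vocab, so per the details above it should be encoded as d+1 = 4 *)
enc["good film"]

(* Attach the encoder to the "Input" port of a small net;
   EmbeddingLayer[4] is an arbitrary layer chosen only for illustration *)
net = NetInitialize[NetChain[{EmbeddingLayer[4]}, "Input" -> enc]];
net["good movie"]
```

Once the encoder is attached, the net accepts strings directly and the tokenization happens at the input port.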

Examples

Basic Examples  (1)

Create a token encoder for English text:

Encode an English sentence:

Out-of-vocabulary words are encoded as the maximum code:

By default, words are detected using a simple regular expression:

The list of words can be explicitly passed using TextElement:
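
The input cells for these basic examples are not reproduced here. A rough sketch of what they might look like is given below; the sentences and the made-up word are assumptions, and the resulting integer codes depend on the built-in English vocabulary, so none are shown.

```wolfram
(* Default token encoder with the standard English vocabulary *)
enc = NetEncoder["Tokens"];

(* Encode a sentence into a sequence of integer codes *)
enc["the quick brown fox jumps over the lazy dog"]

(* A made-up word, assumed to be absent from the vocabulary,
   should receive the maximum (out-of-vocabulary) code *)
enc["hello xyzzyplugh"]

(* Pass the words explicitly as a TextElement instead of a raw string *)
enc[TextElement[{"hello", "world"}]]
```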

Scope  (6)

Use the default token encoder to encode a sentence:

Give a specific list of tokens:

Give a specific list of tokens, including a split pattern:

Specify that the sequence should be padded or trimmed to be 4 elements long:

Use a built-in dictionary for a specific language:
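
A hedged sketch of the custom-vocabulary, "TargetLength" and language cases follows. The token list and sentences are illustrative assumptions, and "French" is assumed to be one of the supported language names.

```wolfram
(* Custom vocabulary of d = 3 tokens *)
tokens = {"cat", "dog", "bird"};
enc = NetEncoder[{"Tokens", tokens}];
enc["dog cat fish"]   (* "fish" is not in the list and should map to d+1 = 4 *)

(* Pad or trim every output to exactly 4 elements *)
enc4 = NetEncoder[{"Tokens", tokens, "TargetLength" -> 4}];
enc4["cat dog"]                (* two tokens, padded with d+1 *)
enc4["cat dog bird cat dog"]   (* five tokens, only the first 4 are encoded *)

(* Built-in vocabulary for another language (assumes "French" is available) *)
encFr = NetEncoder[{"Tokens", "French"}];
encFr["le chat dort"]
```

A fixed "TargetLength" is useful when a net expects inputs of constant length.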

Use a custom tokenization with TextElement:

Use the output of TextStructure to compute a list of token indices:

A tree structure gets flattened:

Parameters  (3)

"IgnoreCase"  (1)

An encoder with "IgnoreCase"->True treats tokens that differ only by the case of their constituent characters as equivalent:

An encoder with "IgnoreCase"->False does not do this:
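
The difference between the two settings can be sketched as below, assuming a hypothetical two-token lowercase vocabulary (d = 2):

```wolfram
vocab = {"cat", "dog"};

(* Case-insensitive matching (the default setting) *)
encIgnore = NetEncoder[{"Tokens", vocab, "IgnoreCase" -> True}];

(* Case-sensitive matching *)
encExact = NetEncoder[{"Tokens", vocab, "IgnoreCase" -> False}];

encIgnore["Cat DOG"]   (* both words should match their lowercase tokens *)
encExact["Cat DOG"]    (* neither word matches exactly, so both should map to d+1 = 3 *)
```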

"SplitPattern"  (2)

Create an encoder that isolates digit characters, using "SplitPattern":

The encoder outputs one token for each digit character:

It is different from the default behavior, which gathers all consecutive digit characters together:

Create an encoder with "SplitPattern"->None and two tokens:

The encoder now expects a list of tokens as input:

The encoder still maps across a batch of examples:
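
A minimal sketch of the "SplitPattern"->None case only; the two tokens are illustrative assumptions. Because no splitting is performed, each input is already a list of token strings.

```wolfram
(* Encoder with two tokens that performs no splitting *)
enc = NetEncoder[{"Tokens", {"yes", "no"}, "SplitPattern" -> None}];

(* A single example is a pre-tokenized list of strings *)
enc[{"yes", "no", "yes"}]

(* A batch is a list of such lists *)
enc[{{"yes", "no"}, {"no"}}]
```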

See Also

NetEncoder  NetDecoder  NetChain  NetGraph  TextElement

Net Encoders: Class  Characters  SubwordTokens

Net Decoders: Tokens  Characters  Class  SubwordTokens

Tech Notes

  • Neural Networks in the Wolfram Language

Related Guides

  • Neural Networks

History

Introduced in 2018 (11.3) | Updated in 2019 (12.0)
