Wolfram Language & System Documentation Center
LongShortTermMemoryLayer

LongShortTermMemoryLayer[n]

represents a trainable recurrent layer that takes a sequence of vectors and produces a sequence of vectors, each of size n.

LongShortTermMemoryLayer[n,opts]

includes options for weights and other parameters.

Details and Options

  • LongShortTermMemoryLayer[n] represents a net that takes an input matrix representing a sequence of vectors and outputs a sequence of the same length.
  • Each element of the input sequence is a vector of size k, and each element of the output sequence is a vector of size n.
  • The size k of the input vectors is usually inferred automatically within a NetGraph, NetChain, etc.
  • The input and output ports of the net represented by LongShortTermMemoryLayer[n] are:
  • "Input": a sequence of vectors of size k
    "Output": a sequence of vectors of size n
  • Given an input sequence {x_1, x_2, …, x_T}, the LSTM outputs a sequence of states {s_1, s_2, …, s_T} using the following recurrence relation:
  • input gate: i_t = LogisticSigmoid[W_ix . x_t + W_is . s_(t-1) + b_i]
    output gate: o_t = LogisticSigmoid[W_ox . x_t + W_os . s_(t-1) + b_o]
    forget gate: f_t = LogisticSigmoid[W_fx . x_t + W_fs . s_(t-1) + b_f]
    memory gate: m_t = Tanh[W_mx . x_t + W_ms . s_(t-1) + b_m]
    cell state: c_t = f_t * c_(t-1) + i_t * m_t
    state: s_t = o_t * Tanh[c_t]
  • LongShortTermMemoryLayer[n] has the following state ports:
  • "State": a vector of size n
    "CellState": a vector of size n
  • Within a NetGraph, a connection of the form src -> NetPort[layer, "state"] can be used to provide the initial value of "State" or "CellState" for a LongShortTermMemoryLayer, corresponding to s_0 and c_0 in the recurrence relation. The default initial values are zero vectors.
  • Within a NetGraph, a connection of the form NetPort[layer, "state"] -> dst can be used to obtain the final value of "State" or "CellState" for a LongShortTermMemoryLayer, corresponding to s_T and c_T in the recurrence relation.
  • NetStateObject can be used to create a net that will remember values for the state of LongShortTermMemoryLayer that update when the net is applied to inputs.
  • An initialized LongShortTermMemoryLayer[…] that operates on vectors of size k contains the following trainable arrays:
  • "InputGateInputWeights": W_ix, matrix of size n×k
    "InputGateStateWeights": W_is, matrix of size n×n
    "InputGateBiases": b_i, vector of size n
    "OutputGateInputWeights": W_ox, matrix of size n×k
    "OutputGateStateWeights": W_os, matrix of size n×n
    "OutputGateBiases": b_o, vector of size n
    "ForgetGateInputWeights": W_fx, matrix of size n×k
    "ForgetGateStateWeights": W_fs, matrix of size n×n
    "ForgetGateBiases": b_f, vector of size n
    "MemoryGateInputWeights": W_mx, matrix of size n×k
    "MemoryGateStateWeights": W_ms, matrix of size n×n
    "MemoryGateBiases": b_m, vector of size n
  • In LongShortTermMemoryLayer[n,opts], initial values can be given to the trainable arrays using a rule of the form "array"->value.
  • The following training parameters can be included:
  • "Dropout" (default None): dropout regularization, in which units are probabilistically set to zero
    LearningRateMultipliers (default Automatic): learning rate multipliers for the trainable arrays
  • Specifying "Dropout"->None disables dropout during training.
  • Specifying "Dropout"->p uses an automatically chosen dropout method having dropout probability p.
  • Specifying "Dropout"->{"method1"->p1,"method2"->p2,…} can be used to combine specific methods of dropout with the corresponding dropout probabilities. Possible methods include:
  • "VariationalWeights": dropout applied to the recurrent connections between weight matrices (default)
    "VariationalInput": dropout applied to the gate contributions from the input, using the same pattern of units at each sequence step
    "VariationalState": dropout applied to the gate contributions from the previous state, using the same pattern of units at each sequence step
    "StateUpdate": dropout applied to the state update vector before it is added to the previous state, using a different pattern of units at each sequence step
  • The dropout methods "VariationalInput" and "VariationalState" are based on the Gal et al. 2016 method, while "StateUpdate" is based on the Semeniuta et al. 2016 method and "VariationalWeights" is based on the Merity et al. 2017 method.
  • LongShortTermMemoryLayer[n,"Input"->shape] allows the shape of the input to be specified. Possible forms for shape are:
  • NetEncoder[…]: encoder producing a sequence of vectors
    {len, k}: sequence of len length-k vectors
    {len, Automatic}: sequence of len vectors whose length is inferred
    {"Varying", k}: varying number of vectors each of length k
    {"Varying", Automatic}: varying number of vectors each of inferred length
  • When given a NumericArray as input, the output will be a NumericArray.
  • Options[LongShortTermMemoryLayer] gives the list of default options to construct the layer. Options[LongShortTermMemoryLayer[…]] gives the list of default options to evaluate the layer on some data.
  • Information[LongShortTermMemoryLayer[…]] gives a report about the layer.
  • Information[LongShortTermMemoryLayer[…],prop] gives the value of the property prop of LongShortTermMemoryLayer[…]. Possible properties are the same as for NetGraph.
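As a concrete check of the trainable-array table above, an array can be pulled out of an initialized layer with NetExtract and its dimensions inspected. This is a sketch; the sizes n = 3 and k = 2 are illustrative:

```wolfram
(* with n = 3 outputs and k = 2 inputs, "InputGateInputWeights" is W_ix of size n×k *)
lstm = NetInitialize[LongShortTermMemoryLayer[3, "Input" -> {"Varying", 2}]];
Dimensions[NetExtract[lstm, "InputGateInputWeights"]]
(* {3, 2} *)
```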

Examples


Basic Examples  (2)

Create a LongShortTermMemoryLayer that produces a sequence of length-3 vectors:
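The input cell for this example is missing from the extraction; it was presumably just the bare constructor:

```wolfram
LongShortTermMemoryLayer[3]
```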

Create a randomly initialized LongShortTermMemoryLayer that takes a sequence of length-2 vectors and produces a sequence of length-3 vectors:

Apply the layer to an input sequence:
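The two cells above are missing; a plausible reconstruction (the concrete numbers are illustrative):

```wolfram
(* a randomly initialized layer taking length-2 vectors, producing length-3 vectors *)
lstm = NetInitialize[LongShortTermMemoryLayer[3, "Input" -> {"Varying", 2}]];
lstm[{{0.1, 0.2}, {0.3, 0.4}, {0.5, 0.6}}]  (* a sequence of three length-3 vectors *)
```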

Scope  (4)

Create a randomly initialized LongShortTermMemoryLayer that takes a string and produces a sequence of length-2 vectors:

Apply the layer to an input string:

Thread the layer over a batch of inputs:
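The missing cells might look like the following, using a "Characters" NetEncoder to turn a string into a sequence of vectors (the example strings are illustrative):

```wolfram
lstm = NetInitialize[
   LongShortTermMemoryLayer[2, "Input" -> NetEncoder[{"Characters"}]]];
lstm["hello"]              (* one length-2 vector per character *)
lstm[{"hello", "world"}]   (* threads over a batch of strings *)
```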

Create a randomly initialized net that takes a sequence of length-2 vectors and produces a single length-3 vector:

Apply the layer to an input:

Thread the layer across a batch of inputs:
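A sketch of this example, combining the LSTM with SequenceLastLayer inside a NetChain to keep only the last state (inputs are illustrative):

```wolfram
net = NetInitialize[NetChain[
    {LongShortTermMemoryLayer[3], SequenceLastLayer[]},
    "Input" -> {"Varying", 2}]];
net[{{0.1, 0.2}, {0.3, 0.4}}]        (* a single length-3 vector *)
net[{{{0.1, 0.2}}, {{0.3, 0.4}}}]    (* threads over a batch of sequences *)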

Create a NetGraph that allows the initial state and cell state of a LongShortTermMemoryLayer to be set:

Apply the graph to some data:
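A sketch of the missing graph, using the src -> NetPort[layer, "state"] form from Details to expose the initial "State" and "CellState" as graph inputs (port names and data are illustrative):

```wolfram
net = NetInitialize[NetGraph[
    {LongShortTermMemoryLayer[3]},
    {NetPort["State"] -> NetPort[1, "State"],
     NetPort["CellState"] -> NetPort[1, "CellState"]},
    "Input" -> {"Varying", 2}]];
net[<|"Input" -> {{0.5, 0.5}},
  "State" -> {1., 0., 0.}, "CellState" -> {0., 1., 0.}|>]
```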

Create a NetGraph that allows the final state and cell state of a LongShortTermMemoryLayer to be obtained:

The final state is the last element of the output sequence:
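A sketch of the missing graph, using the NetPort[layer, "state"] -> dst form from Details to route the final state and cell state to named output ports (port names are illustrative):

```wolfram
net = NetInitialize[NetGraph[
    {LongShortTermMemoryLayer[3]},
    {NetPort[1, "State"] -> NetPort["FinalState"],
     NetPort[1, "CellState"] -> NetPort["FinalCellState"]},
    "Input" -> {"Varying", 2}]];
out = net[{{0.1, 0.2}, {0.3, 0.4}}];
out["FinalState"] == Last[out["Output"]]  (* the final state is the last output *)
```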

Options  (2)

"Dropout"  (2)

Create a LongShortTermMemoryLayer with the dropout method specified:

Create a randomly initialized LongShortTermMemoryLayer with specified dropout probability:

Evaluate the layer on a sequence of vectors:

Dropout has no effect during evaluation:

Use NetEvaluationMode to force the training behavior of dropout:

Multiple evaluations on the same input can give different results:
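The cells in this subsection might be reconstructed along these lines; the method, probability, and input data are illustrative:

```wolfram
(* construction with an explicit dropout method, then the probability-only form *)
LongShortTermMemoryLayer[3, "Dropout" -> {"VariationalInput" -> 0.1}]
lstm = NetInitialize[
   LongShortTermMemoryLayer[2, "Dropout" -> 0.5, "Input" -> {"Varying", 1}]];
in = {{0.1}, {0.2}, {0.3}};
lstm[in]                                (* dropout is inactive during ordinary evaluation *)
lstm[in, NetEvaluationMode -> "Train"]  (* forces training behavior; results are stochastic *)
```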

Applications  (2)

Create training data consisting of strings that describe two-digit additions and the corresponding numeric result:

Create a network using stacked LongShortTermMemoryLayer layers that reads the input string and predicts the numeric result:

Train the network:

Apply the trained network to a list of inputs:
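The training cells for this application are missing; the following is a hypothetical reconstruction in which the layer sizes, sample count, and network shape are guesses, not the original settings:

```wolfram
(* strings like "25+33" mapped to their numeric sums *)
data = Flatten@Table[
    ToString[i] <> "+" <> ToString[j] -> N[i + j], {i, 10, 99}, {j, 10, 99}];
net = NetChain[{
    LongShortTermMemoryLayer[32], LongShortTermMemoryLayer[32],
    SequenceLastLayer[], LinearLayer[1]},
   "Input" -> NetEncoder[{"Characters"}], "Output" -> NetDecoder["Scalar"]];
trained = NetTrain[net, RandomSample[data, 5000]];
trained[{"25+33", "41+17"}]
```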

Create training data based on strings containing x's and y's and either Less, Greater or Equal by comparing the number of x's and y's. The training data consists of all possible sentences up to length 8:

Create a network containing a LongShortTermMemoryLayer to read an input string and predict one of Less, Greater or Equal:

Train the network:

Apply the trained network to a list of inputs:

Measure the accuracy on the entire training set:
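A hypothetical reconstruction of this classification example; the layer sizes and training setup are illustrative guesses:

```wolfram
(* label each string by comparing its counts of "x" and "y" *)
label[s_] := With[{x = StringCount[s, "x"], y = StringCount[s, "y"]},
   Which[x < y, Less, x > y, Greater, True, Equal]];
sentences = Flatten[Table[StringJoin /@ Tuples[{"x", "y"}, n], {n, 8}]];
data = # -> label[#] & /@ sentences;
net = NetChain[{
    LongShortTermMemoryLayer[16], SequenceLastLayer[],
    LinearLayer[3], SoftmaxLayer[]},
   "Input" -> NetEncoder[{"Characters", {"x", "y"}}],
   "Output" -> NetDecoder[{"Class", {Less, Greater, Equal}}]];
trained = NetTrain[net, data];
trained[{"xxy", "xyxy"}]
(* accuracy over the whole training set *)
Mean@Boole[MapThread[SameQ, {trained[Keys[data]], Values[data]}]]
```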

Properties & Relations  (1)

NetStateObject can be used to create a net that remembers the state of LongShortTermMemoryLayer:

Each evaluation modifies the state stored inside the NetStateObject:
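A sketch of the missing cells, wrapping an LSTM-containing net in NetStateObject so the state persists between calls (sizes and inputs are illustrative):

```wolfram
lstm = NetInitialize[LongShortTermMemoryLayer[3, "Input" -> {"Varying", 2}]];
snet = NetStateObject[NetChain[{lstm, SequenceLastLayer[]}]];
snet[{{0.1, 0.2}}]  (* first call starts from the zero state *)
snet[{{0.1, 0.2}}]  (* same input, different result: the stored state was updated *)
```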

See Also

NetStateObject  BasicRecurrentLayer  GatedRecurrentLayer  NetBidirectionalOperator  NetFoldOperator  NetMapOperator  SequenceLastLayer  LinearLayer  NetChain  NetGraph  NetExtract

Tech Notes

  • Neural Networks in the Wolfram Language

Related Guides

  • Natural Language Processing
  • Neural Network Layers
  • Speech Computation

History

Introduced in 2017 (11.1) | Updated in 2020 (12.1)

Cite this Page

Text

Wolfram Research (2017), LongShortTermMemoryLayer, Wolfram Language function, https://reference.wolfram.com/language/ref/LongShortTermMemoryLayer.html (updated 2020).

CMS

Wolfram Language. 2017. "LongShortTermMemoryLayer." Wolfram Language & System Documentation Center. Wolfram Research. Last Modified 2020. https://reference.wolfram.com/language/ref/LongShortTermMemoryLayer.html.

APA

Wolfram Language. (2017). LongShortTermMemoryLayer. Wolfram Language & System Documentation Center. Retrieved from https://reference.wolfram.com/language/ref/LongShortTermMemoryLayer.html

BibTeX

@misc{reference.wolfram_2025_longshorttermmemorylayer, author="Wolfram Research", title="{LongShortTermMemoryLayer}", year="2020", howpublished="\url{https://reference.wolfram.com/language/ref/LongShortTermMemoryLayer.html}", note="Accessed: 01-December-2025"}

BibLaTeX

@online{reference.wolfram_2025_longshorttermmemorylayer, organization={Wolfram Research}, title={LongShortTermMemoryLayer}, year={2020}, url={https://reference.wolfram.com/language/ref/LongShortTermMemoryLayer.html}, note={Accessed: 01-December-2025}}
