This repository is primarily for learning and experimenting with state space models, especially readable Mamba-2-style and Mamba-3-style architectures.
The current end-to-end example tasks are:
- raw language-model pretraining on TinyStories or local text
- fine-tuning on a toy subject-predicate-object extraction task
The architecture is the main subject. The SPO pipeline is only one downstream example used to exercise the model.
The default mamba3 settings now target an approximately 1M-parameter model, keeping the repo small while remaining usable on modest GPUs.
Beyond those tasks, the repo is intended for:
- studying how selective state space updates work in practice
- comparing a simpler Mamba-2-style block against a richer Mamba-3-style block
- running small CPU-friendly architecture experiments
- using pretraining plus fine-tuning as a test harness for new block ideas
This is an educational codebase, not a production implementation.
The repo includes two block variants:

- `mamba2`: a small Mamba-2-style block with depthwise convolution and selective state updates.
- `mamba3`: an educational Mamba-3-style block with exponential-trapezoidal updates, rotary or complex state mixing, BC normalization, B/C biases, and optional MIMO rank.

On top of the blocks it provides:

- byte and character tokenizers
- raw language-model pretraining
- a toy downstream fine-tuning task
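To make the block descriptions above concrete, here is a scalar toy sketch of the two update rules. This is illustrative only, not the repo's implementation: the real blocks are multi-dimensional and derive `dt`, `B`, `C` from the input, and the exact Mamba-3 parameterization may differ from the textbook trapezoid rule shown here.

```python
import math

def zoh_step(h, x_t, dt, a, b):
    """Mamba-2-style zero-order-hold update: h' = exp(dt*a)*h + dt*b*x_t."""
    return math.exp(dt * a) * h + dt * b * x_t

def trapezoidal_step(h, x_prev, x_t, dt, a, b):
    """One standard exponential-trapezoidal form (Mamba-3-flavored):
    the input term averages the decayed previous input and the current one."""
    decay = math.exp(dt * a)
    return decay * h + 0.5 * dt * b * (decay * x_prev + x_t)

def selective_scan(xs, dts, bs, cs, a=-1.0):
    """'Selective' means dt, B, C vary per step (computed from the input
    in a real block); here they are simply passed in as lists."""
    h, ys = 0.0, []
    for x_t, dt, b, c in zip(xs, dts, bs, cs):
        h = zoh_step(h, x_t, dt, a, b)
        ys.append(c * h)
    return ys
```

Because `dt`, `b`, and `c` change at every step, the state can decide per token how much to absorb and how much to forget, which is the behavior the experiments below probe.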
The detailed docs live in docs/:
- Chapter 1: SSM Foundations
- Chapter 2: Mamba
- Chapter 3: Mamba-2
- Chapter 4: Mamba-3
- Chapter 5: Repository Overview
- Chapter 6: Project Workflows
If you want to understand the architecture, start with the first four. If you want to run experiments, go straight to the last two.
Key files:

- `ssmlab/mamba/model.py`: core architecture: config, Mamba-2 block, Mamba-3 block, and the tiny causal language model.
- `ssmlab/common/tokenizer.py`: character and byte tokenizers.
- `ssmlab/common/pretrain_data.py`: raw-text dataset utilities for language-model pretraining.
- `ssmlab/mamba/pretrain.py`: pretraining entry point for TinyStories or local text.
- `ssmlab/common/data.py`: synthetic SPO task generation and output parsing.
- `ssmlab/mamba/train.py`: fine-tuning loop for the SPO demo task.
- `ssmlab/mamba/infer.py`: inference CLI for the SPO demo task.
Install the dependencies with uv:

```shell
uv sync
```

If you want uv to manage Python too:

```shell
uv python install 3.13
uv sync --python 3.13
```

The top-level workflow is now driven by `ssmlab`, a Python CLI backed by `ssmlab.yaml`.
The CLI shape is:
```shell
uv run ssmlab pretrain --model <name> --target <target>
uv run ssmlab train --model <name> --target <target>
uv run ssmlab infer --model <name> --target <target>
```

`--model` selects a named model version from YAML, for example `mamba3-1b-a1b2`. `--target` selects where to run it, for example `local` or a named SSH machine such as `gpu-box`.
The config file is split into:
- `models`: named model versions and their task configs
- `data_sources`: reusable dataset definitions for training tasks
- `targets`: local or SSH machines
- `tasks.pretrain` / `tasks.train` / `tasks.infer`: top-level CLI tasks
Right now, the named data-source layer is implemented only for `tinystories`. The schema is there for additional sources later, but the sample config keeps it on TinyStories only.
The default ssmlab.yaml in the repo already shows two model versions:
- `mamba3-1b-a1b2`
- `mamba3-1b-a1b2-debug`
Each model keeps its shared architecture params once under `shared_args`, then adds task-specific args under `tasks.pretrain.args` and `tasks.train.args`. The TinyStories dataset settings now live under top-level `data_sources`, and `tasks.pretrain` references one with `data_source: ...`.
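For orientation, a hypothetical skeleton of that layout is sketched below. The top-level keys (`models`, `shared_args`, `tasks`, `args`, `data_sources`, `data_source`, `targets`) come from the description above, but the field names inside `shared_args`/`args` and the `data_sources` entry are made up for illustration and are not the repo's real schema:

```yaml
models:
  mamba3-1b-a1b2:
    shared_args:
      # architecture params shared by all tasks (names illustrative)
      d_model: 160
      n_layers: 5
    tasks:
      pretrain:
        data_source: tinystories
        args:
          output-dir: runs/tinystories_mamba3_1m   # illustrative
      train:
        args:
          output-dir: runs/spo_mamba3_1m_ft        # illustrative

data_sources:
  tinystories: {}   # dataset definition fields omitted here

targets:
  local:
    kind: local     # assumed; only kind: ssh is shown in this README
```

Consult the shipped `ssmlab.yaml` for the authoritative field names.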
If you need one-off flag overrides, append them after --:
```shell
uv run ssmlab pretrain --model mamba3-1b-a1b2 --target local -- --max-train-stories 4000 --log-every 5
```

For the default ~1M-parameter mamba3 setup, the cheapest sensible Vast target is usually a single RTX 3060 12GB or Tesla T4 16GB. A 6GB card is the practical floor, but 8GB+ is a safer starting point because it leaves more headroom for batch size and longer sequences.
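As a back-of-envelope sanity check on the ~1M figure, using the `--d-model 160` / `--n-layers 5` values from the low-level CLI example below: the `8 * d_model**2` per-block estimate is a rough ballpark for Mamba-style blocks with expansion factor 2, not the repo's exact count.

```python
def rough_param_count(d_model, n_layers, vocab_size=256):
    # ~8 * d_model^2 per block is a ballpark for Mamba-style blocks with
    # expansion 2; the true count also depends on conv width, state_dim,
    # head_dim, norms, etc.
    block = 8 * d_model * d_model
    embed = vocab_size * d_model  # byte tokenizer -> 256-entry embedding
    return n_layers * block + embed

print(rough_param_count(160, 5))  # ~1.06M, consistent with "approximately 1M"
```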
Configure the machine once in ssmlab.yaml under targets. For example:
```yaml
targets:
  gpu-box:
    kind: ssh
    host: root@example.com
    port: 22
    remote_dir: /workspace/ssm
    bootstrap_python: python3
    python_bin: python
    venv_dir: /workspace/ssm/.venv
    device: cuda
    detach: true
    log_file: runs/remote-gpu.log
    pid_file: runs/remote-gpu.pid
```

Then launch the configured model version on that machine:

```shell
uv run ssmlab pretrain --model mamba3-1b-a1b2 --target gpu-box
```

The SSH target path:
- syncs the repo with `rsync`
- creates or refreshes a remote virtualenv
- reuses the machine's system PyTorch when available
- installs a CUDA-enabled `torch` wheel if needed
- installs this package plus runtime dependencies
- validates `torch.cuda.is_available()`
- runs the configured task in the remote repo
With `detach: true`, the remote target runs under nohup-style detached execution and writes logs and PID files to the configured paths.
```shell
uv run ssmlab pretrain --model mamba3-1b-a1b2 --target local
```

This trains the architecture as a plain language model and is the cleanest way to study how the SSM behaves on raw text.
To use a local corpus instead of TinyStories, either update the YAML model definition or append a one-off override:
```shell
uv run ssmlab pretrain --model mamba3-1b-a1b2 --target local -- --text-file ./some_corpus.txt --output-dir runs/local_lm
```

The second CLI task is `train`. It resolves `tasks.train` for the same named model. In the sample YAML, `train` is wired to fine-tuning:
```shell
uv run ssmlab train --model mamba3-1b-a1b2 --target local
```

If you want a different action for `train`, point that task at another module or command in `ssmlab.yaml`.
If you want to bypass the YAML layer entirely, the low-level entry points are still available:
```shell
uv run ssmlab-mamba-train \
  --architecture mamba3 \
  --tokenizer byte \
  --mimo-rank 2 \
  --d-model 160 \
  --n-layers 5 \
  --head-dim 16 \
  --state-dim 16 \
  --init-checkpoint runs/tinystories_mamba3_1m/best.pt \
  --output-dir runs/spo_mamba3_1m_ft
```

The top-level YAML CLI now supports inference too:
```shell
uv run ssmlab infer --model mamba3-1b-a1b2 --target local -- --text "Alice reads a book and Bob drives a car."
```

This resolves `tasks.infer` from `ssmlab.yaml`, which points at the model's fine-tuned checkpoint by default.
The low-level CLI still works if you want to pass the checkpoint explicitly:
```shell
uv run ssmlab-mamba-infer \
  --checkpoint runs/spo_mamba3_1m_ft/best.pt \
  --text "Alice reads a book and Bob drives a car."
```

Example output:

```
raw: [(alice, book, reads), (bob, car, drives)]
triples: [('alice', 'book', 'reads'), ('bob', 'car', 'drives')]
```
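The `raw:` line can be turned into tuples with a small regex. This is only an illustrative sketch, not the actual parsing logic in `ssmlab/common/data.py`:

```python
import re

def parse_triples(raw: str):
    """Extract comma-separated tuples from output like
    '[(alice, book, reads), (bob, car, drives)]'."""
    return [
        tuple(part.strip() for part in m.group(1).split(","))
        for m in re.finditer(r"\(([^)]*)\)", raw)
    ]

print(parse_triples("[(alice, book, reads), (bob, car, drives)]"))
# [('alice', 'book', 'reads'), ('bob', 'car', 'drives')]
```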
Notes and limitations:

- `mamba2` requires `--mimo-rank 1`.
- TinyStories pretraining is configured around the byte tokenizer.
- Fine-tuning can use either tokenizer, but `--init-checkpoint` requires the tokenizer and model shapes to match exactly.
- The implementation is intentionally readable and CPU-friendly, not optimized.
- The current downstream task is synthetic and narrow by design.
Good first experiments:
- compare `mamba2` vs `mamba3`
- vary `state_dim`
- vary `head_dim`
- vary `mimo_rank` for `mamba3`
- pretrain first, then fine-tune
Those experiments are more aligned with the purpose of the repo than the specific SPO benchmark itself.