llmfit

A terminal tool that right-sizes LLM models to your system's RAM, CPU, and GPU. Detects your hardware, compares it against a database of 33 popular models, and tells you which ones will actually run on your machine.

Ships with an interactive TUI (default) and a classic CLI mode.

Quick install

curl -fsSL https://llmfit.axjns.dev/install.sh | sh

Downloads the latest release binary from GitHub and installs it to /usr/local/bin (or ~/.local/bin).

Screenshot: example run on a medium-performance home laptop (home_laptop.png).

Install

From source

git clone https://github.com/AlexsJones/llmfit.git
cd llmfit
cargo build --release
# binary is at target/release/llmfit

Usage

TUI (default)

llmfit

Launches the interactive terminal UI. Your system specs are shown at the top. Models are listed in a scrollable table sorted by compatibility.

Key	Action
`Up` / `Down` or `j` / `k`	Navigate models
`/`	Enter search mode (partial match on name, provider, params, use case)
`Esc` or `Enter`	Exit search mode
`Ctrl-U`	Clear search
`f`	Cycle fit filter: All, Runnable, Perfect, Good, Marginal
`1`-`9`	Toggle provider visibility
`Enter`	Toggle detail view for selected model
`PgUp` / `PgDn`	Scroll by 10
`g` / `G`	Jump to top / bottom
`q`	Quit
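
The `f` key's filter cycle can be pictured as a small state machine. The following is a minimal sketch, not the code in tui_app.rs: the type and method names are hypothetical, and it assumes that "Runnable" means every fit level except Too Tight.

```rust
/// Fit levels as named in this README.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum FitLevel {
    Perfect,
    Good,
    Marginal,
    TooTight,
}

/// Filter states in the order the `f` key cycles through them.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum FitFilter {
    All,
    Runnable,
    Perfect,
    Good,
    Marginal,
}

impl FitFilter {
    /// Advance to the next filter, wrapping back to `All` after `Marginal`.
    pub fn next(self) -> Self {
        match self {
            FitFilter::All => FitFilter::Runnable,
            FitFilter::Runnable => FitFilter::Perfect,
            FitFilter::Perfect => FitFilter::Good,
            FitFilter::Good => FitFilter::Marginal,
            FitFilter::Marginal => FitFilter::All,
        }
    }

    /// Whether a model with the given fit level is shown under this filter.
    /// ASSUMPTION: "Runnable" means every level except Too Tight.
    pub fn matches(self, level: FitLevel) -> bool {
        match self {
            FitFilter::All => true,
            FitFilter::Runnable => level != FitLevel::TooTight,
            FitFilter::Perfect => level == FitLevel::Perfect,
            FitFilter::Good => level == FitLevel::Good,
            FitFilter::Marginal => level == FitLevel::Marginal,
        }
    }
}
```

With this shape, a key handler would only need to call `filter = filter.next()` on `f` and re-filter the visible rows with `matches`.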

CLI mode

Use --cli or any subcommand to get classic table output:

# Table of all models ranked by fit
llmfit --cli

# Only perfectly fitting models, top 5
llmfit fit --perfect -n 5

# Show detected system specs
llmfit system

# List all models in the database
llmfit list

# Search by name, provider, or size
llmfit search "llama 8b"

# Detailed view of a single model
llmfit info "Mistral-7B"
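
A query such as "llama 8b" suggests a case-insensitive partial match across several fields. The sketch below shows one way that could work; it is an illustration rather than llmfit's actual search code. The `ModelEntry` struct and its field names are hypothetical, and splitting the query on whitespace so that every term must match some field is an assumption.

```rust
/// Minimal stand-in for a database entry; the field names are hypothetical.
pub struct ModelEntry {
    pub name: String,
    pub provider: String,
    pub params: String, // e.g. "8B"
    pub use_case: String,
}

/// Case-insensitive partial match.
/// ASSUMPTION: the query is split on whitespace and every term must appear
/// as a substring of at least one field. An empty query matches everything.
pub fn matches_query(model: &ModelEntry, query: &str) -> bool {
    let fields = [
        model.name.to_lowercase(),
        model.provider.to_lowercase(),
        model.params.to_lowercase(),
        model.use_case.to_lowercase(),
    ];
    query
        .to_lowercase()
        .split_whitespace()
        .all(|term| fields.iter().any(|field| field.contains(term)))
}
```

Under these assumptions, an entry named Llama-3.1-8B matches "llama 8b" because both terms occur in the name, while "llama 70b" does not match because "70b" occurs in no field.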

How it works

1. Hardware detection -- Reads total and available RAM via sysinfo, counts CPU cores, and probes for NVIDIA (nvidia-smi) or AMD (rocm-smi) GPUs.
2. Model database -- 33 models sourced from the HuggingFace API, stored in data/hf_models.json and embedded at compile time. Memory requirements are estimated from parameter counts assuming Q4_K_M quantization, approximated as 0.5 bytes per parameter (so an 8B model needs about 4 GB). VRAM is the primary constraint for GPU inference; system RAM is the fallback for CPU-only execution.
3. Fit analysis -- Each model is scored against available memory, taking into account whether it would run on GPU, CPU, or both:

Run modes:
- GPU -- Model fits in VRAM. Fast inference.
- CPU+GPU -- VRAM insufficient, model spills to system RAM with partial GPU offload.
- CPU -- No GPU detected. Model loaded entirely into system RAM. Slow.

Fit levels:
- Perfect -- Recommended memory met on GPU (VRAM). Requires GPU acceleration.
- Good -- Fits with headroom. Best achievable for CPU+GPU offload.
- Marginal -- Tight fit, or CPU-only (CPU-only always caps here).
- Too Tight -- Not enough VRAM or system RAM anywhere.
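
The run modes and fit levels above can be sketched as a single function. This is a rough illustration under stated assumptions, not the logic in fit.rs. The 0.5 bytes/param figure comes from this README; the 1.2x headroom factor for "recommended" memory, pooling VRAM and RAM for offload, and treating a GPU fit without headroom as Marginal are all assumptions made for the example.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum RunMode {
    Gpu,
    CpuGpu,
    Cpu,
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum FitLevel {
    Perfect,
    Good,
    Marginal,
    TooTight,
}

/// Bytes per parameter for Q4_K_M, as stated in this README.
const BYTES_PER_PARAM: f64 = 0.5;
/// ASSUMPTION: "recommended" memory is the minimum plus 20% headroom.
const HEADROOM: f64 = 1.2;

/// Minimum memory in GB (1 GB = 1e9 bytes) for a parameter count in billions.
pub fn min_memory_gb(params_billion: f64) -> f64 {
    params_billion * BYTES_PER_PARAM
}

/// Classify a model against the detected hardware.
/// `vram_gb` is `None` when no GPU was detected.
pub fn analyze(params_billion: f64, vram_gb: Option<f64>, ram_gb: f64) -> (RunMode, FitLevel) {
    let minimum = min_memory_gb(params_billion);
    let recommended = minimum * HEADROOM;
    match vram_gb {
        // Recommended memory met on the GPU.
        Some(vram) if vram >= recommended => (RunMode::Gpu, FitLevel::Perfect),
        // Fits in VRAM but without headroom: treated here as a tight fit.
        Some(vram) if vram >= minimum => (RunMode::Gpu, FitLevel::Marginal),
        // Spill to system RAM. ASSUMPTION: VRAM and RAM are pooled. Good is the cap.
        Some(vram) => {
            let total = vram + ram_gb;
            let level = if total >= recommended {
                FitLevel::Good
            } else if total >= minimum {
                FitLevel::Marginal
            } else {
                FitLevel::TooTight
            };
            (RunMode::CpuGpu, level)
        }
        // CPU-only never scores above Marginal.
        None => {
            let level = if ram_gb >= minimum {
                FitLevel::Marginal
            } else {
                FitLevel::TooTight
            };
            (RunMode::Cpu, level)
        }
    }
}
```

For instance, with these assumed thresholds an 8B model on an 8 GB GPU comes out as GPU / Perfect, while the same model on a CPU-only machine with 16 GB of RAM comes out as CPU / Marginal.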

Model database

The model list is generated by scripts/scrape_hf_models.py, a standalone Python script (stdlib only, no pip dependencies) that queries the HuggingFace REST API. Models include families from Meta Llama, Mistral, Qwen, Google Gemma, Microsoft Phi, DeepSeek, Cohere, BigCode, and Nomic.

See MODELS.md for the full list of all 33 included models with parameters, quantization, context length, and use case.

To refresh:

python3 scripts/scrape_hf_models.py
cargo build

The scraper writes data/hf_models.json, which is baked into the binary via include_str!.

Project structure

src/
  main.rs         -- CLI argument parsing, entrypoint, TUI launch
  hardware.rs     -- System RAM/CPU/GPU detection
  models.rs       -- Model database loaded from embedded JSON
  fit.rs          -- Compatibility analysis (FitLevel scoring)
  display.rs      -- Classic CLI table rendering (tabled crate)
  tui_app.rs      -- TUI application state, filters, navigation
  tui_ui.rs       -- TUI rendering (ratatui)
  tui_events.rs   -- TUI keyboard event handling (crossterm)
data/
  hf_models.json  -- Model database (33 models)
scripts/
  scrape_hf_models.py  -- HuggingFace API scraper

Publishing to crates.io

The Cargo.toml already includes the required metadata (description, license, repository). To publish:

# Dry run first to catch issues
cargo publish --dry-run

# Publish for real (requires a crates.io API token)
cargo login
cargo publish

Before publishing, make sure:

- The version in Cargo.toml is correct (bump with each release).
- A LICENSE file exists in the repo root. Create one if missing:

# The Cargo.toml declares license = "MIT", so LICENSE should contain the MIT license text.
# Copy the plain text from https://opensource.org/license/MIT into LICENSE by hand;
# fetching that URL with curl saves the HTML page rather than the license text.

- data/hf_models.json is committed. It is embedded at compile time and must be present in the published crate.
- The exclude list in Cargo.toml leaves target/, scripts/, and demo.gif out of the published crate to keep the download small.

To publish updates:

# Bump version
# Edit Cargo.toml: version = "0.2.0"
cargo publish

Dependencies

Crate	Purpose
`clap`	CLI argument parsing with derive macros
`sysinfo`	Cross-platform RAM and CPU detection
`serde` / `serde_json`	JSON deserialization for model database
`tabled`	CLI table formatting
`colored`	CLI colored output
`ratatui`	Terminal UI framework
`crossterm`	Terminal input/output backend for ratatui

Platform support

- Linux -- Full support. GPU detection via nvidia-smi (NVIDIA) and rocm-smi (AMD).
- macOS (Apple Silicon) -- Full support. Detects unified memory via system_profiler. VRAM = system RAM (shared pool). Models run via Metal GPU acceleration.
- macOS (Intel) -- RAM and CPU detection work. Discrete GPU detection works if nvidia-smi is available.
- Windows -- RAM and CPU detection work. NVIDIA GPU detection works via nvidia-smi if it is installed.
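
Probing for an NVIDIA GPU through nvidia-smi can be done with the standard library alone. The sketch below is one plausible approach, not the code in hardware.rs. The function names are hypothetical, and summing memory across multiple GPUs is an assumption. The query flags are standard nvidia-smi options that print one line per GPU with total memory in MiB.

```rust
use std::process::Command;

/// Parse the output of
/// `nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits`,
/// which prints one line per GPU with total memory in MiB.
/// ASSUMPTION: memory is summed across GPUs. Returns None if no line parses.
pub fn parse_total_vram_mib(stdout: &str) -> Option<u64> {
    let values: Vec<u64> = stdout
        .lines()
        .filter_map(|line| line.trim().parse::<u64>().ok())
        .collect();
    if values.is_empty() {
        None
    } else {
        Some(values.iter().sum())
    }
}

/// Run nvidia-smi and return total VRAM in MiB.
/// Returns None when the binary is missing, exits with an error,
/// or prints nothing that parses as a number.
pub fn detect_nvidia_vram_mib() -> Option<u64> {
    let output = Command::new("nvidia-smi")
        .args(&["--query-gpu=memory.total", "--format=csv,noheader,nounits"])
        .output()
        .ok()?;
    if !output.status.success() {
        return None;
    }
    parse_total_vram_mib(&String::from_utf8_lossy(&output.stdout))
}
```

Keeping the parser separate from the process call means the parsing can be tested on machines without a GPU, and a missing nvidia-smi degrades to "no GPU detected" instead of an error.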

Contributing

Contributions are welcome, especially new models.

Adding a model

1. Add the model's HuggingFace repo ID (e.g., meta-llama/Llama-3.1-8B) to the TARGET_MODELS list in scripts/scrape_hf_models.py.
2. If the model is gated (requires HuggingFace authentication to access metadata), add a fallback entry to the FALLBACK dict in the same script with the parameter count and context length.
3. Run python3 scripts/scrape_hf_models.py to regenerate data/hf_models.json.
4. Run cargo build to verify compilation.
5. Open a pull request.

See MODELS.md for the current list and AGENTS.md for architecture details.

License

MIT
