High-speed Large Language Model Serving for Local Deployment
[ICLR'25] Fast Inference of MoE Models with CPU-GPU Orchestration
Modern desktop application (Rust + Tauri v2 + Svelte 5 + Candle (HF)) for communicating with AI models that run completely locally on your computer. No subscriptions, no data sent to the internet; just you and your personal AI assistant.
On-device AI for iOS & Android
Notolog Markdown Editor
Tool for testing different large language models without writing code.
Study Buddy is a desktop application that provides AI tutoring without requiring internet access or accounts.
LLM chatbot example using OpenVINO with RAG (Retrieval Augmented Generation).
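As a sketch of how the generation step of such a chatbot might look with the openvino_genai package (the model path is a stand-in for any LLM exported to OpenVINO IR, and the retrieved context is stubbed rather than fetched from a real vector store):

    import openvino_genai

    # Hypothetical path to an LLM converted to OpenVINO IR format.
    pipe = openvino_genai.LLMPipeline("TinyLlama-1.1B-Chat-ov", "CPU")

    # In a RAG flow, passages retrieved from a vector store are prepended here.
    context = "Local inference runs models on your own hardware."
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: What is local inference?"
    print(pipe.generate(prompt, max_new_tokens=128))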
Lightweight 6GB VRAM Gradio web app with auto-installer for running AuraFlow locally — no cloud, no clutter.
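For reference, a minimal sketch of running AuraFlow locally with the diffusers library; fp16 weights plus CPU offload are what make a roughly 6 GB VRAM budget plausible (the model ID and settings are assumptions, not this app's exact code):

    import torch
    from diffusers import AuraFlowPipeline

    # fp16 weights plus CPU offload keep peak VRAM low on consumer GPUs.
    pipe = AuraFlowPipeline.from_pretrained("fal/AuraFlow", torch_dtype=torch.float16)
    pipe.enable_model_cpu_offload()

    image = pipe("a lighthouse at dawn, oil painting", num_inference_steps=25).images[0]
    image.save("out.png")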
An overfitted Stable Diffusion prompt engine with severe "aesthetic snobbery," forcibly transforming mundane ideas into professional-grade physical rendering instructions.
Verify claims using AI agents that debate using scraped evidence and local language models.
Test app for function calling and agentic frameworks.
MCP server that runs local LLMs (with full access to MCP tools included). Callable from Python to chain MCP tools with local intelligence.
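A minimal sketch of such a server using the official MCP Python SDK's FastMCP helper; the tool body here is a stub standing in for a real local-model call:

    from mcp.server.fastmcp import FastMCP

    mcp = FastMCP("local-llm")

    @mcp.tool()
    def ask_local_llm(prompt: str) -> str:
        """Answer a prompt with a locally hosted model (stubbed here)."""
        # A real server would call llama.cpp, Ollama, or similar at this point.
        return f"[local model reply to: {prompt}]"

    if __name__ == "__main__":
        mcp.run()  # serves over stdio by default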
Audit local LLM function calling and agentic reliability. Visual tool-use benchmarking for quantized models on YOUR hardware.
A lightweight Python implementation of Microsoft's Phi-3 model running locally on CPU.
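A minimal CPU-only sketch with Hugging Face transformers, which ships native Phi-3 support in recent versions; this is an illustration, not the repository's own implementation:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "microsoft/Phi-3-mini-4k-instruct"
    tok = AutoTokenizer.from_pretrained(model_id)
    # float32 avoids half-precision ops that are slow or unsupported on CPU.
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float32)

    inputs = tok("Explain local inference in one sentence.", return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=64)
    print(tok.decode(out[0], skip_special_tokens=True))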
Nexa SDK is a comprehensive toolkit for running ONNX and GGML models. It supports text generation, image generation, vision-language models (VLM), automatic speech recognition (ASR), and text-to-speech (TTS).
Search through thousands of your photos using natural language, locally on your PC.
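One common way to build this kind of search is a joint image-text embedding model such as CLIP; a sketch with sentence-transformers (the folder path and model choice are assumptions):

    from pathlib import Path
    from PIL import Image
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("clip-ViT-B-32")  # shared image/text embedding space

    photos = sorted(Path("photos").glob("*.jpg"))  # hypothetical photo folder
    img_emb = model.encode([Image.open(p) for p in photos])

    # Rank photos by similarity to a free-text query.
    hits = util.semantic_search(model.encode("a dog playing on the beach"), img_emb, top_k=5)[0]
    for hit in hits:
        print(photos[hit["corpus_id"]], round(hit["score"], 3))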
Script that performs RAG and uses a local LLM for Q&A.
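A compact sketch of such a script: embed documents, retrieve the best match, then prompt a local GGUF model via llama-cpp-python (the document texts and model path are placeholders):

    from llama_cpp import Llama
    from sentence_transformers import SentenceTransformer, util

    docs = ["Local inference keeps data on your own machine.",
            "RAG retrieves relevant passages and feeds them to the model."]

    embedder = SentenceTransformer("all-MiniLM-L6-v2")
    doc_emb = embedder.encode(docs)

    question = "How does RAG work?"
    best = util.semantic_search(embedder.encode(question), doc_emb, top_k=1)[0][0]

    llm = Llama(model_path="model.gguf", verbose=False)  # placeholder GGUF path
    out = llm(f"Context: {docs[best['corpus_id']]}\nQuestion: {question}\nAnswer:", max_tokens=64)
    print(out["choices"][0]["text"])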
The Streamlit AI Content Generator + Editor is a fully interactive, web-based tool designed to assist content creators, marketers, and developers in generating high-quality blog posts, articles, and marketing copy.