llama.cpp
High-performance inference engine written in pure C/C++ with no dependencies. The foundational technology powering most local LLM tools.
Discover the best tools for running Large Language Models on your own hardware - from powerful engines to user-friendly applications
14 tools available
High-performance inference engine written in pure C/C++ with no dependencies. The foundational technology powering most local LLM tools.
The "Docker for LLMs" - Developer-friendly tool with Docker-like commands for managing and serving models via OpenAI-compatible API.
High-throughput, memory-efficient inference engine designed for production serving with PagedAttention algorithm. Best for multi-user scenarios.
Free, open-source OpenAI alternative. Multi-modal API gateway supporting text, image, audio, and video generation through various backends.
Docker-native way to run AI models. Distributes models as OCI artifacts through Docker Hub, with familiar Docker CLI commands and OpenAI-compatible API.
Privacy-first, feature-rich desktop and web app with advanced workflows, Knowledge Stacks (RAG), split chats, and parallel model comparisons. Supports both local and cloud models.
Polished, closed-source desktop app with integrated model browser, chat interface, and one-click local API server. Best UI/UX experience.
Open-source, privacy-focused desktop app with unique LocalDocs RAG feature. Runs entirely offline with no data collection.
Modern, open-source desktop app emphasizing extensibility and clean UI. 100% offline with plugin architecture.
Feature-rich, self-hosted web interface with built-in RAG, multi-user support, and plugin system. Most popular web UI for local LLMs.
The "Swiss Army Knife" of local LLM UIs. Massive plugin ecosystem with support for TTS, image generation, and multiple backends.
All-in-one RAG application. Makes it incredibly easy to build and manage private knowledge bases with document chat capabilities.
Polished, open-source ChatGPT clone with enterprise features. Multi-user support with robust authentication methods.
Modern, open-source AI chat framework with sleek UI, PWA support, and plugin marketplace. Strong mobile experience.