Open-source LLMs are large language models whose weights are published for anyone to download, run, and modify — rather than locked behind a company's API. In practice almost all of them are more precisely "open-weight": you get the trained model file and a license, but not necessarily the training data or code. That single difference — owning the weights — is what lets you run these models privately, cheaply, and without vendor lock-in, and in 2026 the best of them sit within a few points of the closed frontier.
This guide is a map of that landscape: who leads right now, what "open" actually means, the three ways to run an open model, and how to pick one for your task and hardware.
The open-weight leaders right now
Rankings move monthly, but as of mid-2026 a clear top tier has formed. The hero above is the short version; here is the detail:
- GLM-5.2 (Z.ai) — released June 13, 2026, this ~744B-parameter model is the current open leader on the Artificial Analysis Intelligence Index, beating GPT-5.5 on several long-horizon coding benchmarks at a fraction of the price.
- DeepSeek V4 — the MIT-licensed mixture-of-experts family that reset price-to-performance expectations; frontier-class and cheap. We cover it in depth in our DeepSeek V4 guide.
- Qwen 3.5 / 3.6 (Alibaba) — the most versatile family, Apache-2.0 licensed, spanning compact mixture-of-experts models that run on a single GPU up to a 397B reasoning flagship, with strong tool-calling and vision.
- Kimi K2.6 (Moonshot) — a coding and math specialist; its "Thinking" variant tops several hard reasoning benchmarks when given tools.
- MiniMax M3 — released June 2026, the first open-weight model to combine frontier coding, a 1M-token context, and native multimodality, leading the open-weight SWE-Bench Pro.
- Llama 4 Scout (Meta) — the long-context champion, with a 10-million-token window that can swallow an entire codebase or a shelf of papers in one prompt.
- Mistral — Europe's contender, with Apache-2.0 models from small, self-hostable sizes up to its larger flagships.
"Open weight" is not the same as "open source"
The word "open" hides three very different things, and the license is what matters:
- Permissive open weights (Apache-2.0, MIT) — DeepSeek, Qwen, and most Chinese labs ship here. You can use them commercially with almost no strings attached, which is why they dominate production self-hosting.
- Restricted open weights — Meta's Llama models come with a community license that adds conditions (notably for very large deployments). The weights are downloadable, but it is not OSI-approved open source.
- Truly open source — weights and training data and code — remains rare. Most "open" models are open-weight, not fully reproducible.
Before you build on a model, read its license: the gap between Apache-2.0 and a bespoke community license can decide whether you are allowed to ship at all.
Three ways to run them
You do not need a GPU cluster to use open models. There are three routes, in rising order of effort and control:
Hosted API. Providers like OpenRouter, Together, and Fireworks serve the open models behind an OpenAI-compatible endpoint — zero setup, pay per token, and the fastest way to try several models against the same prompt. Local. For privacy or offline work, run a smaller model on your own machine; our guide to running LLMs locally with Ollama walks through it. Self-host at scale. When you need a big model under your own control, serve the weights on your own GPUs with vLLM or SGLang.
How to choose
Three questions narrow it down fast:
- What is the task? Coding and agents → GLM-5.2, Kimi K2.6, DeepSeek V4, or Qwen3-Coder. Long documents → Llama 4 Scout. Images and video in the loop → MiniMax M3 or a Qwen vision model. High-volume, cost-sensitive work → DeepSeek Flash or a small Qwen.
- What hardware do you have? A compact mixture-of-experts model can run on a single modern GPU; a 700B-class model means a hosted API or a serious self-hosting setup.
- What does the license allow? For commercial products, prefer Apache-2.0 or MIT and check the restricted licenses carefully.
Open versus closed
Open models win on cost, privacy, and control — you can run them on your own hardware, fine-tune them, and never worry about a price change or a model being retired out from under you. The closed frontier (see our Claude and Gemini guides) still tends to lead on the very hardest reasoning and on polished, governed enterprise features. The pragmatic 2026 answer is rarely "one or the other": route easy, high-volume, or privacy-sensitive work to an open model, and reserve the closed flagships for the few problems that genuinely need the last few points of capability.
Want AI news before everyone else?
The morning's most important AI stories, straight to your inbox. No fluff.