AI Models

Use 300+ models from OpenAI, Anthropic, Google and others through a single API

346 models
30 free
58 providers
inclusionAI: Ling-2.6-1T
inclusionai/ling-2.6-1t
Ling-2.6-1T is an instant (instruct) model from inclusionAI and the company’s trillion-parameter flagship, designed for real-world agents that require fast execution and high efficiency at scale. It uses a “fast...
Input
1 238/M so'm
$0.10
Output
10 319/M so'm
$0.81
262K context
inclusionAI: Ling-2.6-flash
inclusionai/ling-2.6-flash
Ling-2.6-flash is an instant (instruct) model from inclusionAI with 104B total parameters and 7.4B active parameters, designed for real-world agents that require fast responses, strong execution, and high token efficiency....
Input
165/M so'm
$0.01
Output
495/M so'm
$0.04
262K context
inclusionAI: Ring-2.6-1T
inclusionai/ring-2.6-1t
Ring-2.6-1T is a 1T-parameter-scale thinking model with 63B active parameters, built for real-world agent workflows that require both strong capability and operational efficiency. It is optimized for coding agents, tool...
Input
1 238/M so'm
$0.10
Output
10 319/M so'm
$0.81
262K context
Inflection: Inflection 3 Pi
inflection/inflection-3-pi
Inflection 3 Pi powers Inflection's [Pi](https://pi.ai) chatbot, including backstory, emotional intelligence, productivity, and safety. It has access to recent news, and excels in scenarios like customer support and roleplay. Pi...
Input
41 275/M so'm
$3.25
Output
165 100/M so'm
$13.00
8K context
Inflection: Inflection 3 Productivity
inflection/inflection-3-productivity
Inflection 3 Productivity is optimized for following instructions. It is better for tasks requiring JSON output or precise adherence to provided guidelines. It has access to recent news. For emotional...
Input
41 275/M so'm
$3.25
Output
165 100/M so'm
$13.00
8K context
Kwaipilot: KAT-Coder-Pro V2
kwaipilot/kat-coder-pro-v2
Code
KAT-Coder-Pro V2 is the latest high-performance model in KwaiKAT’s KAT-Coder series, designed for complex enterprise-grade software engineering and SaaS integration. It builds on the agentic coding strengths of earlier versions,...
Input
4 953/M so'm
$0.39
Output
19 812/M so'm
$1.56
256K context
LiquidAI: LFM2-24B-A2B
liquid/lfm-2-24b-a2b
LFM2-24B-A2B is the largest model in the LFM2 family of hybrid architectures designed for efficient on-device deployment. Built as a 24B parameter Mixture-of-Experts model with only 2B active parameters per...
Input
495/M so'm
$0.04
Output
1 981/M so'm
$0.16
128K context
LiquidAI: LFM2.5-1.2B-Instruct (free)
liquid/lfm-2.5-1.2b-instruct:free
Free
LFM2.5-1.2B-Instruct is a compact, high-performance instruction-tuned model built for fast on-device AI. It delivers strong chat quality in a 1.2B parameter footprint, with efficient edge inference and broad runtime support.
Input
Free
Output
Free
33K context
LiquidAI: LFM2.5-1.2B-Thinking (free)
liquid/lfm-2.5-1.2b-thinking:free
Free
Reasoning
LFM2.5-1.2B-Thinking is a lightweight reasoning-focused model optimized for agentic tasks, data extraction, and RAG—while still running comfortably on edge devices. It supports long context (up to 32K tokens) and is...
Input
Free
Output
Free
33K context
Magnum v4 72B
anthracite-org/magnum-v4-72b
This is a series of models designed to replicate the prose quality of the Claude 3 models, specifically [Sonnet](https://openrouter.ai/anthropic/claude-3.5-sonnet) and [Opus](https://openrouter.ai/anthropic/claude-3-opus). The model is fine-tuned on top of [Qwen2.5 72B](https://openrouter.ai/qwen/qwen-2.5-72b-instruct).
Input
49 530/M so'm
$3.90
Output
82 550/M so'm
$6.50
33K context
Mancer: Weaver (alpha)
mancer/weaver
An attempt to recreate Claude-style verbosity, but don't expect the same level of coherence or memory. Meant for use in roleplay/narrative situations.
Input
12 383/M so'm
$0.98
Output
16 510/M so'm
$1.30
8K context
Microsoft: Phi 4
microsoft/phi-4
Reasoning
[Microsoft Research](/microsoft) Phi-4 is designed to perform well in complex reasoning tasks and can operate efficiently in situations with limited memory or where quick responses are needed. At 14 billion...
Input
1 073/M so'm
$0.08
Output
2 311/M so'm
$0.18
16K context
Microsoft: Phi 4 Mini Instruct
microsoft/phi-4-mini-instruct
Reasoning
Phi-4-mini-instruct is a lightweight open model built upon synthetic data and filtered publicly available websites, with a focus on high-quality, reasoning-dense data. The model belongs to the Phi-4...
Input
1 321/M so'm
$0.10
Output
5 779/M so'm
$0.46
131K context
MiniMax: MiniMax M1
minimax/minimax-m1
Reasoning
MiniMax-M1 is a large-scale, open-weight reasoning model designed for extended context and high-efficiency inference. It leverages a hybrid Mixture-of-Experts (MoE) architecture paired with a custom "lightning attention" mechanism, allowing it...
Input
6 604/M so'm
$0.52
Output
36 322/M so'm
$2.86
1,000K context
MiniMax: MiniMax M2
minimax/minimax-m2
Reasoning
MiniMax-M2 is a compact, high-efficiency large language model optimized for end-to-end coding and agentic workflows. With 10 billion activated parameters (230 billion total), it delivers near-frontier intelligence across general reasoning,...
Input
4 210/M so'm
$0.33
Output
16 510/M so'm
$1.30
205K context
MiniMax: MiniMax M2-her
minimax/minimax-m2-her
MiniMax M2-her is a dialogue-first large language model built for immersive roleplay, character-driven chat, and expressive multi-turn conversations. Designed to stay consistent in tone and personality, it supports rich message...
Input
4 953/M so'm
$0.39
Output
19 812/M so'm
$1.56
66K context
MiniMax: MiniMax M2.1
minimax/minimax-m2.1
MiniMax-M2.1 is a lightweight, state-of-the-art large language model optimized for coding, agentic workflows, and modern application development. With only 10 billion activated parameters, it delivers a major jump in real-world...
Input
4 788/M so'm
$0.38
Output
15 685/M so'm
$1.24
205K context
MiniMax: MiniMax M2.5
minimax/minimax-m2.5
MiniMax-M2.5 is a SOTA large language model designed for real-world productivity. Trained in a diverse range of complex real-world digital working environments, M2.5 builds upon the coding expertise of M2.1...
Input
2 477/M so'm
$0.20
Output
14 859/M so'm
$1.17
205K context
MiniMax: MiniMax M2.7
minimax/minimax-m2.7
MiniMax-M2.7 is a next-generation large language model designed for autonomous, real-world productivity and continuous improvement. Built to actively participate in its own evolution, M2.7 integrates advanced agentic capabilities through multi-agent...
Input
4 128/M so'm
$0.33
Output
16 510/M so'm
$1.30
205K context
MiniMax: MiniMax M3
minimax/minimax-m3
Vision
MiniMax-M3 is a multimodal foundation model from MiniMax. It supports text, image, and video inputs with text output, a 1M-token context window, and is suited for long-horizon agentic work, coding,...
Input
4 953/M so'm
$0.39
Output
19 812/M so'm
$1.56
1,049K context
MiniMax: MiniMax-01
minimax/minimax-01
Vision
MiniMax-01 combines MiniMax-Text-01 for text generation and MiniMax-VL-01 for image understanding. It has 456 billion parameters, with 45.9 billion parameters activated per inference, and can handle a context...
Input
3 302/M so'm
$0.26
Output
18 161/M so'm
$1.43
1,000K context
Mistral Large 2407
mistralai/mistral-large-2407
Reasoning
This is Mistral AI's flagship model, Mistral Large 2 (version mistral-large-2407). It's a proprietary weights-available model and excels at reasoning, code, JSON, chat, and more. Read the launch announcement [here](https://mistral.ai/news/mistral-large-2407/)....
Input
33 020/M so'm
$2.60
Output
99 060/M so'm
$7.80
131K context
Mistral: Codestral 2508
mistralai/codestral-2508
Code
Mistral's cutting-edge language model for coding, released at the end of July 2025. Codestral specializes in low-latency, high-frequency tasks such as fill-in-the-middle (FIM), code correction and test generation. [Blog Post](https://mistral.ai/news/codestral-25-08)
Input
4 953/M so'm
$0.39
Output
14 859/M so'm
$1.17
256K context
Mistral: Devstral 2 2512
mistralai/devstral-2512
Devstral 2 is a state-of-the-art open-source model by Mistral AI specializing in agentic coding. It is a 123B-parameter dense transformer model supporting a 256K context window. Devstral 2 supports exploring...
Input
6 604/M so'm
$0.52
Output
33 020/M so'm
$2.60
262K context
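
The Input and Output prices in each card can be turned into a rough per-request cost estimate. The sketch below assumes that "/M" means "per million tokens" and that the dollar figures use the same unit; the listing does not state this explicitly. The function name `estimate_cost_usd` and the token counts are hypothetical and only illustrate the arithmetic, using the Devstral 2 2512 prices shown above ($0.52 input, $2.60 output).

```python
from decimal import Decimal

# Assumption: listed prices are per one million tokens.
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens, output_tokens,
                      input_price_per_m, output_price_per_m):
    """Estimate the USD cost of one request from per-million-token prices.

    Prices are passed as strings so Decimal keeps them exact.
    """
    input_cost = Decimal(input_tokens) / TOKENS_PER_MILLION * Decimal(input_price_per_m)
    output_cost = Decimal(output_tokens) / TOKENS_PER_MILLION * Decimal(output_price_per_m)
    return input_cost + output_cost


# Hypothetical request: 200,000 input tokens and 50,000 output tokens,
# priced with the Devstral 2 2512 figures from the listing.
cost = estimate_cost_usd(200_000, 50_000, "0.52", "2.60")
print(f"Estimated cost: ${cost:.4f}")
```

Because output tokens are priced several times higher than input tokens for most models listed here, the output length often dominates the estimate even when the prompt is much longer.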