AI Models — CloudAPI

Anthropic: Claude Fable 5

anthropic/claude-fable-5

Featured Reasoning

Claude Fable 5 is a Mythos-class model from Anthropic, built for autonomous knowledge work and coding. It supports text, image, and file inputs with text output, with reasoning support and...

Input

165 100/M so'm

$13.00

Output

825 500/M so'm

$65.00

1,000K context
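
As a rough illustration of how per-million-token prices translate into a per-request cost, the sketch below uses the Claude Fable 5 USD prices listed above ($13.00 input, $65.00 output per 1M tokens). The helper name `estimate_cost_usd` and the assumption of simple linear billing (no caching discounts, tiers, or minimum charges) are illustrative and are not taken from CloudAPI's documentation.

```python
from decimal import Decimal

# USD prices per 1M tokens, copied from the Claude Fable 5 listing above.
INPUT_USD_PER_M = Decimal("13.00")
OUTPUT_USD_PER_M = Decimal("65.00")


def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_per_m: Decimal = INPUT_USD_PER_M,
    output_per_m: Decimal = OUTPUT_USD_PER_M,
) -> Decimal:
    """Hypothetical helper: assumes cost scales linearly with token counts."""
    million = Decimal(1_000_000)
    return (input_tokens * input_per_m + output_tokens * output_per_m) / million


# Example: a 200,000-token prompt with a 4,000-token reply.
print(estimate_cost_usd(200_000, 4_000))
```

Passing another model's listed prices as `input_per_m` and `output_per_m` gives a like-for-like comparison; actual invoices may differ if the provider applies rules not shown on this page.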

Anthropic: Claude Opus 4.1

anthropic/claude-opus-4.1

Featured Reasoning

Claude Opus 4.1 is an updated version of Anthropic’s flagship model, offering improved performance in coding, reasoning, and agentic tasks. It achieves 74.5% on SWE-bench Verified and shows notable gains...

Input

247 650/M so'm

$19.50

Output

1 238 250/M so'm

$97.50

200K context

Anthropic: Claude Opus 4.5

anthropic/claude-opus-4.5

Featured Reasoning

Claude Opus 4.5 is Anthropic’s frontier reasoning model optimized for complex software engineering, agentic workflows, and long-horizon computer use. It offers strong multimodal capabilities, competitive performance across real-world coding and...

Input

82 550/M so'm

$6.50

Output

412 750/M so'm

$32.50

200K context

Anthropic: Claude Opus 4.8

anthropic/claude-opus-4.8

Featured Reasoning

Claude Opus 4.8 is Anthropic's most capable generally available model in the Opus family. It supports text, image, and file inputs with text output, with reasoning support and a 1M-token...

Input

82 550/M so'm

$6.50

Output

412 750/M so'm

$32.50

1,000K context

Anthropic: Claude Sonnet 4

anthropic/claude-sonnet-4

Featured Reasoning

Claude Sonnet 4 significantly enhances the capabilities of its predecessor, Sonnet 3.7, excelling in both coding and reasoning tasks with improved precision and controllability. Achieving state-of-the-art performance on SWE-bench (72.7%),...

Input

49 530/M so'm

$3.90

Output

247 650/M so'm

$19.50

1,000K context

DeepSeek: DeepSeek V3.1

deepseek/deepseek-chat-v3.1

Featured Reasoning

DeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active) that supports both thinking and non-thinking modes via prompt templates. It extends the DeepSeek-V3 base with a two-phase long-context...

Input

3 467/M so'm

$0.27

Output

13 043/M so'm

$1.03

164K context

DeepSeek: DeepSeek V3.2

deepseek/deepseek-v3.2

Featured Reasoning

DeepSeek-V3.2 is a large language model designed to harmonize high computational efficiency with strong reasoning and agentic tool-use performance. It introduces DeepSeek Sparse Attention (DSA), a fine-grained sparse attention mechanism...

Input

3 777/M so'm

$0.30

Output

5 666/M so'm

$0.45

131K context

DeepSeek: DeepSeek V4 Pro

deepseek/deepseek-v4-pro

Featured Reasoning

DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6T total parameters and 49B activated parameters, supporting a 1M-token context window. It is designed for advanced reasoning, coding,...

Input

7 182/M so'm

$0.57

Output

14 364/M so'm

$1.13

1,049K context

DeepSeek: R1 0528

deepseek/deepseek-r1-0528

Featured Reasoning

May 28th update to the [original DeepSeek R1](/deepseek/deepseek-r1). Performance is on par with [OpenAI o1](/openai/o1), but the model is open-sourced and has fully open reasoning tokens. It's 671B parameters in size, with 37B active...

Input

8 255/M so'm

$0.65

Output

35 497/M so'm

$2.80

164K context

DeepSeek: R1 Distill Llama 70B

deepseek/deepseek-r1-distill-llama-70b

Featured Reasoning

DeepSeek R1 Distill Llama 70B is a distilled large language model based on [Llama-3.3-70B-Instruct](/meta-llama/llama-3.3-70b-instruct), using outputs from [DeepSeek R1](/deepseek/deepseek-r1). The model combines advanced distillation techniques to achieve high performance across...

Input

13 208/M so'm

$1.04

Output

13 208/M so'm

$1.04

128K context

Google: Gemini 2.5 Flash

google/gemini-2.5-flash

Featured Reasoning

Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, coding, mathematics, and scientific tasks. It includes built-in "thinking" capabilities, enabling it to provide responses with greater...

Input

4 953/M so'm

$0.39

Output

41 275/M so'm

$3.25

1,049K context

Google: Gemini 2.5 Flash Lite

google/gemini-2.5-flash-lite

Featured Reasoning

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance...

Input

1 651/M so'm

$0.13

Output

6 604/M so'm

$0.52

1,049K context

Google: Gemini 2.5 Flash Lite Preview 09-2025

google/gemini-2.5-flash-lite-preview-09-2025

Featured Reasoning

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance...

Input

1 651/M so'm

$0.13

Output

6 604/M so'm

$0.52

1,049K context

Google: Gemini 2.5 Pro

google/gemini-2.5-pro

Featured Reasoning

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs “thinking” capabilities, enabling it to reason through responses with enhanced accuracy...

Input

20 638/M so'm

$1.63

Output

165 100/M so'm

$13.00

1,049K context

Google: Gemini 2.5 Pro Preview 05-06

google/gemini-2.5-pro-preview-05-06

Featured Reasoning

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs “thinking” capabilities, enabling it to reason through responses with enhanced accuracy...

Input

20 638/M so'm

$1.63

Output

165 100/M so'm

$13.00

1,049K context

Google: Gemini 2.5 Pro Preview 06-05

google/gemini-2.5-pro-preview

Featured Reasoning

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs “thinking” capabilities, enabling it to reason through responses with enhanced accuracy...

Input

20 638/M so'm

$1.63

Output

165 100/M so'm

$13.00

1,049K context

Google: Gemini 3 Flash Preview

google/gemini-3-flash-preview

Featured Reasoning

Gemini 3 Flash Preview is a high-speed, high-value thinking model designed for agentic workflows, multi-turn chat, and coding assistance. It delivers near-Pro-level reasoning and tool...

Input

8 255/M so'm

$0.65

Output

49 530/M so'm

$3.90

1,049K context

Google: Gemini 3.1 Pro Preview

google/gemini-3.1-pro-preview

Featured Reasoning

Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows. Building on the multimodal foundation...

Input

33 020/M so'm

$2.60

Output

198 120/M so'm

$15.60

1,049K context

Google: Gemini 3.5 Flash

google/gemini-3.5-flash

Featured Reasoning

Gemini 3.5 Flash is Google's high-efficiency multimodal model, bringing near-Pro level coding and reasoning at Flash-tier cost and speed. It is highly optimized for coding proficiency and parallel agentic execution...

Input

24 765/M so'm

$1.95

Output

148 590/M so'm

$11.70

1,049K context

Google: Gemma 3 12B

google/gemma-3-12b-it

Featured Reasoning

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities,...

Input

826/M so'm

$0.07

Output

2 477/M so'm

$0.20

131K context

Google: Gemma 3 27B

google/gemma-3-27b-it

Featured Reasoning

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities,...

Input

1 321/M so'm

$0.10

Output

2 642/M so'm

$0.21

131K context

Google: Gemma 3 4B

google/gemma-3-4b-it

Featured Reasoning

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities,...

Input

826/M so'm

$0.07

Output

1 651/M so'm

$0.13

131K context

Google: Gemma 4 31B

google/gemma-4-31b-it

Featured Reasoning

Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output. It features a 256K-token context window, configurable thinking/reasoning mode, native function...

Input

1 981/M so'm

$0.16

Output

5 779/M so'm

$0.46

262K context

Google: Gemma 4 31B (free)

google/gemma-4-31b-it:free

Free Featured Reasoning

Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output. It features a 256K-token context window, configurable thinking/reasoning mode, native function...

Input

Free

Output

Free

262K context