AI Models - CloudAPI

Mistral: Mistral Small 3

mistralai/mistral-small-24b-instruct-2501

Mistral Small 3 is a 24B-parameter language model optimized for low-latency performance across common AI tasks. Released under the Apache 2.0 license, it features both pre-trained and instruction-tuned versions designed...

Input

826/M so'm

$0.07

Output

1 321/M so'm

$0.10

33K context

Mistral: Mixtral 8x22B Instruct

mistralai/mixtral-8x22b-instruct

Mistral's official instruct fine-tuned version of [Mixtral 8x22B](/models/mistralai/mixtral-8x22b). It uses 39B active parameters out of 141B, offering unparalleled cost efficiency for its size. Its strengths include: - strong math, coding,...

Input

33 020/M so'm

$2.60

Output

99 060/M so'm

$7.80

66K context

Mistral: Saba

mistralai/mistral-saba

Mistral Saba is a 24B-parameter language model specifically designed for the Middle East and South Asia, delivering accurate and contextually relevant responses while maintaining efficient performance. Trained on curated regional...

Input

3 302/M so'm

$0.26

Output

9 906/M so'm

$0.78

33K context

Mistral: Voxtral Small 24B 2507

mistralai/voxtral-small-24b-2507

Voxtral Small is an enhancement of Mistral Small 3, incorporating state-of-the-art audio input capabilities while retaining best-in-class text performance. It excels at speech transcription, translation and audio understanding. Input audio...

Input

1 651/M so'm

$0.13

Output

4 953/M so'm

$0.39

32K context

MoonshotAI: Kimi K2 0711

moonshotai/kimi-k2

Kimi K2 Instruct is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion total parameters with 32 billion active per forward pass. It is optimized for...

Input

9 411/M so'm

$0.74

Output

37 973/M so'm

$2.99

131K context

MoonshotAI: Kimi K2 0905

moonshotai/kimi-k2-0905

Kimi K2 0905 is the September update of [Kimi K2 0711](moonshotai/kimi-k2). It is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion total parameters with 32...

Input

9 906/M so'm

$0.78

Output

41 275/M so'm

$3.25

262K context

MythoMax 13B

gryphe/mythomax-l2-13b

One of the highest performing and most popular fine-tunes of Llama 2 13B, with rich descriptions and roleplay. #merge

Input

991/M so'm

$0.08

Output

991/M so'm

$0.08

4K context

NVIDIA: Nemotron 3 Nano 30B A3B

nvidia/nemotron-3-nano-30b-a3b

NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with highest compute efficiency and accuracy for developers to build specialized agentic AI systems. The model is fully...

Input

826/M so'm

$0.07

Output

3 302/M so'm

$0.26

262K context

NVIDIA: Nemotron 3 Nano 30B A3B (free)

nvidia/nemotron-3-nano-30b-a3b:free

Free

NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with highest compute efficiency and accuracy for developers to build specialized agentic AI systems. The model is fully...

Input

Free

Output

Free

256K context

NVIDIA: Nemotron 3 Super

nvidia/nemotron-3-super-120b-a12b

NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B parameters for maximum compute efficiency and accuracy in complex multi-agent applications. Built on a hybrid Mamba-Transformer...

Input

1 486/M so'm

$0.12

Output

7 430/M so'm

$0.59

1,000K context

NVIDIA: Nemotron 3 Super (free)

nvidia/nemotron-3-super-120b-a12b:free

Free

NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B parameters for maximum compute efficiency and accuracy in complex multi-agent applications. Built on a hybrid Mamba-Transformer...

Input

Free

Output

Free

1,000K context

Prime Intellect: INTELLECT-3

prime-intellect/intellect-3

INTELLECT-3 is a 106B-parameter Mixture-of-Experts model (12B active) post-trained from GLM-4.5-Air-Base using supervised fine-tuning (SFT) followed by large-scale reinforcement learning (RL). It offers state-of-the-art performance for its size across math,...

Input

3 302/M so'm

$0.26

Output

18 161/M so'm

$1.43

131K context

Qwen: Qwen-Plus

qwen/qwen-plus

Qwen-Plus, based on the Qwen2.5 foundation model, is a 131K context model that balances performance, speed, and cost.

Input

4 293/M so'm

$0.34

Output

12 878/M so'm

$1.01

1,000K context

Qwen: Qwen2.5 7B Instruct

qwen/qwen-2.5-7b-instruct

Qwen2.5 7B is the latest series of Qwen large language models. Qwen2.5 brings the following improvements upon Qwen2: - Significantly more knowledge and has greatly improved capabilities in coding and...

Input

660/M so'm

$0.05

Output

1 651/M so'm

$0.13

131K context

Qwen: Qwen3 235B A22B Instruct 2507

qwen/qwen3-235b-a22b-2507

Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-experts language model based on the Qwen3-235B architecture, with 22B active parameters per forward pass. It is optimized for general-purpose text generation, including instruction following,...

Input

1 486/M so'm

$0.12

Output

1 651/M so'm

$0.13

262K context

Qwen: Qwen3 30B A3B Instruct 2507

qwen/qwen3-30b-a3b-instruct-2507

Qwen3-30B-A3B-Instruct-2507 is a 30.5B-parameter mixture-of-experts language model from Qwen, with 3.3B active parameters per inference. It operates in non-thinking mode and is designed for high-quality instruction following, multilingual understanding, and...

Input

795/M so'm

$0.06

Output

3 187/M so'm

$0.25

131K context

Qwen: Qwen3.6 Max Preview

qwen/qwen3.6-max-preview

Qwen3.6-Max-Preview is a proprietary frontier model from Alibaba Cloud built on a sparse mixture-of-experts architecture with approximately 1 trillion total parameters. It is optimized for agentic coding, tool use, and...

Input

17 170/M so'm

$1.35

Output

103 022/M so'm

$8.11

262K context

Qwen: Qwen3.7 Max

qwen/qwen3.7-max

Qwen3.7-Max is the flagship model in Alibaba's Qwen3.7 series. It supports text input and output and is designed for agent-centric workloads, with particular strengths in coding, office and productivity tasks,...

Input

20 638/M so'm

$1.63

Output

61 913/M so'm

$4.88

1,000K context

Reka Flash 3

rekaai/reka-flash-3

Reka Flash 3 is a general-purpose, instruction-tuned large language model with 21 billion parameters, developed by Reka. It excels at general chat, coding tasks, instruction-following, and function calling. Featuring a...

Input

1 651/M so'm

$0.13

Output

3 302/M so'm

$0.26

66K context

Relace: Relace Search

relace/relace-search

The relace-search model uses 4-12 `view_file` and `grep` tools in parallel to explore a codebase and return relevant files to the user request. In contrast to RAG, relace-search performs agentic...

Input

16 510/M so'm

$1.30

Output

49 530/M so'm

$3.90

256K context

ReMM SLERP 13B

undi95/remm-slerp-l2-13b

A recreation trial of the original MythoMax-L2-B13 but with updated models. #merge

Input

7 430/M so'm

$0.59

Output

10 732/M so'm

$0.85

6K context

Sao10K: Llama 3 8B Lunaris

sao10k/l3-lunaris-8b

Lunaris 8B is a versatile generalist and roleplaying model based on Llama 3. It's a strategic merge of multiple models, designed to balance creativity with improved logic and general knowledge....

Input

660/M so'm

$0.05

Output

826/M so'm

$0.07

8K context

Sao10K: Llama 3.1 70B Hanami x1

sao10k/l3.1-70b-hanami-x1

This is [Sao10K](/sao10k)'s experiment over [Euryale v2.2](/sao10k/l3.1-euryale-70b).

Input

49 530/M so'm

$3.90

Output

49 530/M so'm

$3.90

16K context

Sao10K: Llama 3.1 Euryale 70B v2.2

sao10k/l3.1-euryale-70b

Euryale L3.1 70B v2.2 is a model focused on creative roleplay from [Sao10k](https://ko-fi.com/sao10k). It is the successor of [Euryale L3 70B v2.1](/models/sao10k/l3-euryale-70b).

Input

14 034/M so'm

$1.11

Output

14 034/M so'm

$1.11

131K context
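
The prices in this catalog are quoted per "M", which is assumed here to mean per one million tokens. Under that assumption, the cost of a single request can be estimated as in the sketch below. The function name `estimate_cost` and the token counts are hypothetical and chosen only for illustration; the two prices are copied from the Mistral Small 3 entry.

```python
from decimal import Decimal

# Assumption: "/M" in the listing means "per one million tokens".
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Estimate the cost of one request from per-million-token prices.

    Prices are passed as strings (or Decimals) to avoid float rounding.
    The result is in the same currency as the prices.
    """
    input_cost = Decimal(input_tokens) / TOKENS_PER_MILLION * Decimal(str(input_price_per_m))
    output_cost = Decimal(output_tokens) / TOKENS_PER_MILLION * Decimal(str(output_price_per_m))
    return input_cost + output_cost


# Hypothetical request: 10,000 input tokens and 2,000 output tokens,
# priced with the USD figures from the Mistral Small 3 entry.
cost_usd = estimate_cost(10_000, 2_000, "0.07", "0.10")
print(f"Estimated cost: ${cost_usd}")
```

The same function works with the so'm prices if both arguments use that currency. The displayed USD figures appear to be rounded to two decimals, so estimates in the two currencies may differ slightly.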