AI Models

Use 300+ models from OpenAI, Anthropic, Google and others through a single API

346 models

30 free

58 providers

All 346 · Reasoning 136 · Vision 98 · Chat 97 · Code 14 · Moderation 1

All Openai · 64 Qwen · 49 Google · 30 Mistralai · 19 Anthropic · 18 Meta-llama · 13 Z-ai · 12 Deepseek · 12 Nvidia · 11 Minimax · 8 Moonshotai · 6 Cohere · 5 Nousresearch · 5 Perplexity · 5 Amazon · 5

Qwen: Qwen3.5-Flash

qwen/qwen3.5-flash-02-23

Vision

The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. Compared to the...

1,000K context

Qwen: Qwen3.6 27B

qwen/qwen3.6-27b

Vision

Qwen3.6 27B is a dense 27-billion-parameter language model from the Qwen Team at Alibaba, released in April 2026. It features hybrid multimodal capabilities — accepting text, image, and video inputs...

262K context

Qwen: Qwen3.6 35B A3B

qwen/qwen3.6-35b-a3b

Vision

Qwen3.6-35B-A3B is an open-weight multimodal model from Alibaba Cloud with 35 billion total parameters and 3 billion active parameters per token. It uses a hybrid sparse mixture-of-experts architecture combining Gated...

262K context

Qwen: Qwen3.6 Flash

qwen/qwen3.6-flash

Vision

Qwen3.6 Flash is a fast, efficient language model from Alibaba's Qwen 3.6 series. It supports text, image, and video input with a 1M token context window. Tiered pricing kicks in...

1,000K context

Qwen: Qwen3.6 Max Preview

qwen/qwen3.6-max-preview

Qwen3.6-Max-Preview is a proprietary frontier model from Alibaba Cloud built on a sparse mixture-of-experts architecture with approximately 1 trillion total parameters. It is optimized for agentic coding, tool use, and...

262K context

Qwen: Qwen3.6 Plus

qwen/qwen3.6-plus

Vision

Qwen 3.6 Plus builds on a hybrid architecture that combines efficient linear attention with sparse mixture-of-experts routing, enabling strong scalability and high-performance inference. Compared to the 3.5 series, it delivers...

1,000K context

Qwen: Qwen3.7 Max

qwen/qwen3.7-max

Qwen3.7-Max is the flagship model in Alibaba's Qwen3.7 series. It supports text input and output and is designed for agent-centric workloads, with particular strengths in coding, office and productivity tasks,...

1,000K context

Qwen: Qwen3.7 Plus

qwen/qwen3.7-plus

Vision

Qwen3.7-Plus is a cost-effective model in Alibaba's Qwen3.7 series. It supports text and image input with text output, building on the series' text capabilities with a comprehensive upgrade to its...

1,000K context

Reka Edge

rekaai/reka-edge

Vision

Reka Edge is an extremely efficient 7B multimodal vision-language model that accepts image/video+text inputs and generates text outputs. This model is optimized specifically to deliver industry-leading performance in image understanding,...

16K context

Reka Flash 3

rekaai/reka-flash-3

Reka Flash 3 is a general-purpose, instruction-tuned large language model with 21 billion parameters, developed by Reka. It excels at general chat, coding tasks, instruction-following, and function calling. Featuring a...

66K context

Relace: Relace Apply 3

relace/relace-apply-3

Code

Relace Apply 3 is a specialized code-patching LLM that merges AI-suggested edits straight into your source files. It can apply updates from GPT-4o, Claude, and others into your files at...

256K context

Relace: Relace Search

relace/relace-search

The relace-search model makes 4-12 parallel `view_file` and `grep` tool calls to explore a codebase and return the files relevant to the user's request. In contrast to RAG, relace-search performs agentic...

256K context

ReMM SLERP 13B

undi95/remm-slerp-l2-13b

An attempt to recreate the original MythoMax-L2-B13 using updated models. #merge

6K context

Sao10K: Llama 3 8B Lunaris

sao10k/l3-lunaris-8b

Lunaris 8B is a versatile generalist and roleplaying model based on Llama 3. It's a strategic merge of multiple models, designed to balance creativity with improved logic and general knowledge....

8K context

Sao10K: Llama 3.1 70B Hanami x1

sao10k/l3.1-70b-hanami-x1

This is an experimental model by [Sao10K](/sao10k), built on [Euryale v2.2](/sao10k/l3.1-euryale-70b).

16K context

Sao10K: Llama 3.1 Euryale 70B v2.2

sao10k/l3.1-euryale-70b

Euryale L3.1 70B v2.2 is a model focused on creative roleplay from [Sao10k](https://ko-fi.com/sao10k). It is the successor to [Euryale L3 70B v2.1](/models/sao10k/l3-euryale-70b).

131K context

Sao10K: Llama 3.3 Euryale 70B

sao10k/l3.3-euryale-70b

Euryale L3.3 70B is a model focused on creative roleplay from [Sao10k](https://ko-fi.com/sao10k). It is the successor to [Euryale L3.1 70B v2.2](/models/sao10k/l3-euryale-70b).

131K context

StepFun: Step 3.5 Flash

stepfun/step-3.5-flash

Step 3.5 Flash is StepFun's most capable open-source foundation model. Built on a sparse Mixture of Experts (MoE) architecture, it selectively activates only 11B of its 196B parameters per token....

262K context

StepFun: Step 3.7 Flash

stepfun/step-3.7-flash

Vision

Step 3.7 Flash is StepFun's latest high-efficiency multimodal Mixture-of-Experts model. It pairs a 196B-parameter language backbone with a vision encoder for native image and video understanding, activating roughly 11B parameters...

256K context

Switchpoint Router

switchpoint/router

Switchpoint AI's router instantly analyzes your request and directs it to the optimal AI from an ever-evolving library. As the world of LLMs advances, our router gets smarter, ensuring you...

131K context

Tencent: Hunyuan A13B Instruct

tencent/hunyuan-a13b-instruct

Reasoning

Hunyuan-A13B is a Mixture-of-Experts (MoE) language model developed by Tencent, with 13B active parameters out of 80B total and support for reasoning via Chain-of-Thought. It offers competitive benchmark...

131K context

Tencent: Hy3 preview

tencent/hy3-preview

Reasoning

Hy3 preview is a high-efficiency Mixture-of-Experts model from Tencent designed for agentic workflows and production use. It supports configurable reasoning levels across disabled, low, and high modes, allowing it to...

262K context

TheDrummer: Cydonia 24B V4.1

thedrummer/cydonia-24b-v4.1

An uncensored creative-writing model based on Mistral Small 3.2 24B, with good recall, prompt adherence, and intelligence.

131K context

TheDrummer: Rocinante 12B

thedrummer/rocinante-12b

Rocinante 12B is designed for engaging storytelling and rich prose. Early testers have reported:
- Expanded vocabulary with unique and expressive word choices
- Enhanced creativity for vivid narratives
-...

33K context
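
Several cards above quote both a total and an active parameter count for sparse Mixture-of-Experts models (Qwen3.6-35B-A3B, Step 3.5 Flash, Hunyuan-A13B). As a rough illustration of what those figures imply, the sketch below computes the share of parameters used per token. The parameter counts are copied from the card descriptions; the helper names `active_fraction` and `summarize` are hypothetical and not part of any API, and the ratio is only a crude proxy for per-token compute.

```python
# Total and active parameter counts (in billions), as quoted in the cards above.
MOE_MODELS = {
    "qwen/qwen3.6-35b-a3b": (35, 3),
    "stepfun/step-3.5-flash": (196, 11),
    "tencent/hunyuan-a13b-instruct": (80, 13),
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the share of parameters activated per token (hypothetical helper)."""
    if total_b <= 0 or not 0 < active_b <= total_b:
        raise ValueError("expected 0 < active_b <= total_b")
    return active_b / total_b


def summarize(models: dict) -> dict:
    """Map each model slug to its active share, as a percentage rounded to one decimal."""
    return {
        slug: round(100 * active_fraction(total, active), 1)
        for slug, (total, active) in models.items()
    }


if __name__ == "__main__":
    for slug, pct in summarize(MOE_MODELS).items():
        print(f"{slug}: about {pct}% of parameters active per token")
```

A lower active share generally means less compute per token for a given total size, though real throughput also depends on routing overhead, memory bandwidth, and the serving provider.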