AI Models

Use 300+ models from OpenAI, Anthropic, Google and others through a single API

346 models

30 free

58 providers

All 346 Reasoning 136 Vision 98 Chat 97 Code 14 Moderation 1

All Openai · 64 Qwen · 49 Google · 30 Mistralai · 19 Anthropic · 18 Meta-llama · 13 Z-ai · 12 Deepseek · 12 Nvidia · 11 Minimax · 8 Moonshotai · 6 Cohere · 5 Nousresearch · 5 Perplexity · 5 Amazon · 5

NVIDIA: Nemotron 3.5 Content Safety (free)

nvidia/nemotron-3.5-content-safety:free

Free Vision

NVIDIA Nemotron 3.5 Content Safety is a compact 4B-parameter multimodal guardrail model from NVIDIA, fine-tuned from Google Gemma-3-4B. It moderates both inputs to and responses from LLMs and VLMs, accepting...

128K context

OpenAI GPT Latest

~openai/gpt-latest

Vision

This model always redirects to the latest model in the OpenAI GPT family.

1,050K context

OpenAI GPT Mini Latest

~openai/gpt-mini-latest

Vision

This model always redirects to the latest model in the OpenAI GPT Mini family.

400K context
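The identifiers in this listing appear to follow a `provider/model` pattern, with an optional `:free` suffix and a leading `~` on the entries that redirect to the latest model in a family. The sketch below splits an identifier into those parts. It is only an illustration inferred from the identifiers shown on this page: `ModelId` and `parse_model_id` are hypothetical names, and the naming rules are an assumption, not a documented specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelId:
    provider: str            # e.g. "openai"
    name: str                # e.g. "gpt-latest"
    variant: Optional[str]   # e.g. "free"; None when there is no ":suffix"
    is_alias: bool           # True when the identifier starts with "~"


def parse_model_id(raw: str) -> ModelId:
    """Split an identifier such as 'nvidia/nemotron-3.5-content-safety:free'.

    Assumed layout (inferred from the listing, not an official spec):
        [~]provider/model[:variant]
    """
    is_alias = raw.startswith("~")
    body = raw[1:] if is_alias else raw

    # The provider is everything before the first "/".
    provider, sep, rest = body.partition("/")
    if not sep or not provider or not rest:
        raise ValueError(f"expected 'provider/model', got {raw!r}")

    # An optional variant follows the first ":" in the model part.
    name, _, variant = rest.partition(":")
    return ModelId(provider, name, variant or None, is_alias)
```

For example, `parse_model_id("~openai/gpt-mini-latest")` yields provider `openai`, name `gpt-mini-latest`, no variant, and `is_alias=True`.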

Perplexity: Sonar

perplexity/sonar

Vision

Sonar is lightweight, affordable, fast, and simple to use — now featuring citations and the ability to customize sources. It is designed for companies seeking to integrate lightweight question-and-answer features...

127K context

Qwen: Qwen2.5 VL 72B Instruct

qwen/qwen2.5-vl-72b-instruct

Vision

Qwen2.5-VL is proficient in recognizing common objects such as flowers, birds, fish, and insects. It is also highly capable of analyzing texts, charts, icons, graphics, and layouts within images.

131K context

Qwen: Qwen3 VL 235B A22B Instruct

qwen/qwen3-vl-235b-a22b-instruct

Vision

Qwen3-VL-235B-A22B Instruct is an open-weight multimodal model that unifies strong text generation with visual understanding across images and video. The Instruct model targets general vision-language use (VQA, document parsing, chart/table...

262K context

Qwen: Qwen3 VL 30B A3B Instruct

qwen/qwen3-vl-30b-a3b-instruct

Vision

Qwen3-VL-30B-A3B-Instruct is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Instruct variant optimizes instruction-following for general multimodal tasks. It excels in perception...

262K context

Qwen: Qwen3.5 397B A17B

qwen/qwen3.5-397b-a17b

Vision

The Qwen3.5 series 397B-A17B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. It delivers...

256K context

Qwen: Qwen3.5 Plus 2026-02-15

qwen/qwen3.5-plus-02-15

Vision

The Qwen3.5 native vision-language series Plus models are built on a hybrid architecture that integrates linear attention mechanisms with sparse mixture-of-experts models, achieving higher inference efficiency. In a variety of...

1,000K context

Qwen: Qwen3.5 Plus 2026-04-20

qwen/qwen3.5-plus-20260420

Vision

Qwen3.5 Plus (April 2026) is a large-scale multimodal language model from Alibaba. It accepts text, image, and video input and produces text output, with a 1M token context window. This...

1,000K context

Qwen: Qwen3.5-122B-A10B

qwen/qwen3.5-122b-a10b

Vision

The Qwen3.5 122B-A10B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. In terms of...

262K context

Qwen: Qwen3.5-27B

qwen/qwen3.5-27b

Vision

The Qwen3.5 27B native vision-language Dense model incorporates a linear attention mechanism, delivering fast response times while balancing inference speed and performance. Its overall capabilities are comparable to those of...

262K context

Qwen: Qwen3.5-35B-A3B

qwen/qwen3.5-35b-a3b

Vision

The Qwen3.5 Series 35B-A3B is a native vision-language model designed with a hybrid architecture that integrates linear attention mechanisms and a sparse mixture-of-experts model, achieving higher inference efficiency. Its overall...

262K context

Qwen: Qwen3.5-Flash

qwen/qwen3.5-flash-02-23

Vision

The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. Compared to the...

1,000K context

Qwen: Qwen3.6 27B

qwen/qwen3.6-27b

Vision

Qwen3.6 27B is a dense 27-billion-parameter language model from the Qwen Team at Alibaba, released in April 2026. It features hybrid multimodal capabilities — accepting text, image, and video inputs...

262K context

Qwen: Qwen3.6 35B A3B

qwen/qwen3.6-35b-a3b

Vision

Qwen3.6-35B-A3B is an open-weight multimodal model from Alibaba Cloud with 35 billion total parameters and 3 billion active parameters per token. It uses a hybrid sparse mixture-of-experts architecture combining Gated...

262K context

Qwen: Qwen3.6 Flash

qwen/qwen3.6-flash

Vision

Qwen3.6 Flash is a fast, efficient language model from Alibaba's Qwen 3.6 series. It supports text, image, and video input with a 1M token context window. Tiered pricing kicks in...

1,000K context

Qwen: Qwen3.6 Plus

qwen/qwen3.6-plus

Vision

Qwen 3.6 Plus builds on a hybrid architecture that combines efficient linear attention with sparse mixture-of-experts routing, enabling strong scalability and high-performance inference. Compared to the 3.5 series, it delivers...

1,000K context

Qwen: Qwen3.7 Plus

qwen/qwen3.7-plus

Vision

Qwen3.7-Plus is a cost-effective model in Alibaba's Qwen3.7 series. It supports text and image input with text output, building on the series' text capabilities with a comprehensive upgrade to its...

1,000K context

Reka Edge

rekaai/reka-edge

Vision

Reka Edge is an extremely efficient 7B multimodal vision-language model that accepts image/video+text inputs and generates text outputs. This model is optimized specifically to deliver industry-leading performance in image understanding,...

16K context

StepFun: Step 3.7 Flash

stepfun/step-3.7-flash

Vision

Step 3.7 Flash is StepFun's latest high-efficiency multimodal Mixture-of-Experts model. It pairs a 196B-parameter language backbone with a vision encoder for native image and video understanding, activating roughly 11B parameters...

256K context

xAI: Grok 4.20 Multi-Agent

x-ai/grok-4.20-multi-agent

Vision

Grok 4.20 Multi-Agent is a variant of xAI’s Grok 4.20 designed for collaborative, agent-based workflows. Multiple agents operate in parallel to conduct deep research, coordinate tool use, and synthesize information...

2,000K context

xAI: Grok Build 0.1

x-ai/grok-build-0.1

Vision

Grok Build 0.1 is xAI’s fast coding model trained specifically for agentic software engineering workflows. It supports text and image inputs with text output, and is optimized for interactive coding...

256K context

Xiaomi: MiMo-V2.5

xiaomi/mimo-v2.5

Vision

MiMo-V2.5 is a native omnimodal model by Xiaomi. It delivers Pro-level agentic performance at roughly half the inference cost, while surpassing MiMo-V2-Omni in multimodal perception across image and video understanding...

1,049K context