The SMF Works Project — Where AI Meets Humanity
Live Pricing Index

LLM Pricing & Benchmarks

Compare costs, context windows, and benchmark scores across leading models.

Data last verified: 6/14/2026


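| Model | Provider | Input / 1M tokens | Output / 1M tokens | Context | MMLU | Arena |
|---|---|---|---|---|---|---|
| DeepSeek-V3 | DeepSeek | $0.140 | $0.280 | 64K | 89.4 | 1318 |
| gpt-4o-mini | OpenAI | $0.150 | $0.600 | 128K | 82 | 1225 |
| Gemini 2.5 Flash | Google | $0.150 | $0.600 | 1.0M | 86.8 | 1320 |
| Llama 4 Maverick | Meta | $0.200 | $0.600 | 256K | 87.5 | 1330 |
| Grok 3 Mini | xAI | $0.300 | $0.500 | 131K | 85.4 | 1280 |
| DeepSeek-R1 | DeepSeek | $0.550 | $2.19 | 64K | 90.8 | 1354 |
| Claude 4 Haiku | Anthropic | $0.625 | $2.50 | 200K | 85.2 | 1245 |
| Qwen 3 235B A22B | OpenRouter | $0.800 | $1.60 | 128K | 86.5 | 1310 |
| Llama 3.3 70B | Together AI | $0.880 | $0.880 | 131K | 83.5 | 1285 |
| Gemini 2.5 Pro | Google | $1.25 | $10.00 | 1.0M | 91.7 | 1415 |
| GPT-4.1 | OpenAI | $2.00 | $8.00 | 1.0M | 90.2 | 1372 |
| gpt-4o | OpenAI | $2.50 | $10.00 | 128K | 88.7 | 1287 |
| Claude 4 Sonnet | Anthropic | $3.00 | $15.00 | 200K | 90.4 | 1379 |
| Grok 3 | xAI | $3.00 | $15.00 | 131K | 88.9 | 1340 |
| Claude 3.7 Sonnet | Anthropic | $3.00 | $15.00 | 200K | 88.5 | 1350 |
| o3 | OpenAI | $10.00 | $40.00 | 200K | 92.9 | 1365 |
| Claude 4 Opus | Anthropic | $15.00 | $75.00 | 200K | 91.9 | 1398 |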
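To make the per-1M-token prices concrete, here is a minimal Python sketch that turns them into a per-request cost. The prices are copied from the table for a few of the listed models. The function name `request_cost` and the token counts are illustrative assumptions, not part of the index.

```python
# Prices in USD per 1M tokens, copied from the table (input, output).
PRICES = {
    "DeepSeek-V3": (0.14, 0.28),
    "gpt-4o-mini": (0.15, 0.60),
    "Claude 4 Sonnet": (3.00, 15.00),
    "o3": (10.00, 40.00),
}


def request_cost(model, input_tokens, output_tokens):
    """Return the USD cost of one request at the listed per-1M-token rates."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 2,000 prompt tokens and 500 completion tokens.
for name in PRICES:
    print(f"{name}: ${request_cost(name, 2_000, 500):.6f} per request")
```

Because output tokens cost more than input tokens for every model in this sample except none at parity, the input/output mix of a workload can matter as much as the headline input price.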
DeepSeek · deepseek-chat

DeepSeek-V3

Strong open-weight model at very low cost.

Input $0.14/1M · Output $0.28/1M · 64K ctx
OpenAI · gpt-4o-mini-2024-07-18

gpt-4o-mini

Fast, affordable small model for everyday tasks.

Input $0.15/1M · Output $0.6/1M · 128K ctx
Google · gemini-2.5-flash-preview-04-17

Gemini 2.5 Flash

Fast, cost-efficient Gemini with a 1M-token context window.

Input $0.15/1M · Output $0.6/1M · 1.0M ctx
Meta · llama-4-maverick

Llama 4 Maverick

Open-weight multimodal model via API partners.

Input $0.2/1M · Output $0.6/1M · 256K ctx
xAI · grok-3-mini-latest

Grok 3 Mini

Fast, affordable Grok model for everyday tasks.

Input $0.3/1M · Output $0.5/1M · 131K ctx
DeepSeek · deepseek-reasoner

DeepSeek-R1

Reasoning model. Outputs run long because they include chain-of-thought.

Input $0.55/1M · Output $2.19/1M · 64K ctx
Anthropic · claude-haiku-4-20250514
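The note about long output can be illustrated with a rough sketch of how chain-of-thought changes the bill. It uses the DeepSeek-R1 prices listed on this card. The token counts are hypothetical, and the sketch assumes reasoning tokens are billed at the output rate, which should be checked against the provider's billing documentation.

```python
# DeepSeek-R1 prices from the card, in USD per 1M tokens.
R1_INPUT_PRICE = 0.55
R1_OUTPUT_PRICE = 2.19


def reasoning_cost(prompt_tokens, answer_tokens, reasoning_tokens):
    """Estimate the USD cost of one request to a reasoning model."""
    # Assumption: reasoning tokens are billed like ordinary output tokens.
    billed_output = answer_tokens + reasoning_tokens
    total = prompt_tokens * R1_INPUT_PRICE + billed_output * R1_OUTPUT_PRICE
    return total / 1_000_000


# Hypothetical request: 1,000 prompt tokens and a 300-token visible answer.
without_cot = reasoning_cost(1_000, 300, 0)
with_cot = reasoning_cost(1_000, 300, 3_000)

print(f"No reasoning tokens:    ${without_cot:.6f}")
print(f"3,000 reasoning tokens: ${with_cot:.6f}")
print(f"Cost multiplier:        {with_cot / without_cot:.1f}x")
```

Under these assumed token counts the request costs several times more than the visible answer alone would suggest, so a low per-token price does not by itself imply a low per-request cost.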

Claude 4 Haiku

Fast, cost-effective model for high-volume tasks.

Input $0.625/1M · Output $2.5/1M · 200K ctx
OpenRouter · qwen3-235b-a22b

Qwen 3 235B A22B

Mixture-of-experts model available through a unified API.

Input $0.8/1M · Output $1.6/1M · 128K ctx
Together AI · meta-llama/Llama-3.3-70B-Instruct-Turbo

Llama 3.3 70B

Open-weight model hosted on a serverless inference platform.

Input $0.88/1M · Output $0.88/1M · 131K ctx
Google · gemini-2.5-pro-preview-05-06

Gemini 2.5 Pro

1M-token context window. Strong coding and reasoning.

Input $1.25/1M · Output $10/1M · 1.0M ctx
OpenAI · gpt-4.1-2025-04-14

GPT-4.1

Long-context coding model with a 1M-token context window.

Input $2/1M · Output $8/1M · 1.0M ctx
OpenAI · gpt-4o-2024-08-06

gpt-4o

Flagship multimodal model. Pricing per 1M tokens.

Input $2.5/1M · Output $10/1M · 128K ctx
Anthropic · claude-sonnet-4-20250514

Claude 4 Sonnet

Latest Sonnet with extended thinking. Pricing per 1M tokens.

Input $3/1M · Output $15/1M · 200K ctx
xAI · grok-3-latest

Grok 3

xAI flagship with real-time X data access.

Input $3/1M · Output $15/1M · 131K ctx
Anthropic · claude-3-7-sonnet-20250219

Claude 3.7 Sonnet

Prior-generation Claude Sonnet with extended thinking mode.

Input $3/1M · Output $15/1M · 200K ctx
OpenAI · o3-2025-04-16

o3

Reasoning model. Higher latency, best for complex STEM tasks.

Input $10/1M · Output $40/1M · 200K ctx
Anthropic · claude-opus-4-20250514

Claude 4 Opus

Most capable Claude model for complex agentic workflows.

Input $15/1M · Output $75/1M · 200K ctx
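One way to use the index is to filter by context window and benchmark score, then pick the cheapest model that qualifies. The sketch below does this for a handful of the listed models. The `Model` type, the `cheapest` helper, and the `output_share` blend are hypothetical choices for illustration, and the context sizes are rounded from the "64K", "200K", and "1.0M" labels rather than exact token limits.

```python
from typing import NamedTuple, Optional


class Model(NamedTuple):
    name: str
    input_price: float   # USD per 1M input tokens
    output_price: float  # USD per 1M output tokens
    context_tokens: int  # approximate, rounded from the listed label
    mmlu: float


# A subset of the listed models, with values copied from the index.
MODELS = [
    Model("DeepSeek-V3", 0.14, 0.28, 64_000, 89.4),
    Model("Gemini 2.5 Flash", 0.15, 0.60, 1_000_000, 86.8),
    Model("Gemini 2.5 Pro", 1.25, 10.00, 1_000_000, 91.7),
    Model("GPT-4.1", 2.00, 8.00, 1_000_000, 90.2),
    Model("Claude 4 Opus", 15.00, 75.00, 200_000, 91.9),
]


def cheapest(models, min_context, min_mmlu, output_share=0.25) -> Optional[Model]:
    """Return the lowest-cost model meeting both thresholds, or None."""

    def blended_price(m):
        # Weighted price per 1M tokens; output_share is an assumed traffic mix.
        return (1 - output_share) * m.input_price + output_share * m.output_price

    eligible = [
        m for m in models
        if m.context_tokens >= min_context and m.mmlu >= min_mmlu
    ]
    return min(eligible, key=blended_price) if eligible else None


choice = cheapest(MODELS, min_context=500_000, min_mmlu=90.0)
print(choice.name if choice else "no model qualifies")
```

The result depends on the assumed mix: with these inputs, an `output_share` of 0.25 favors a different model than an `output_share` of 0.5, because the two long-context candidates trade off input price against output price.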