Pricing
Model pricing and support matrix for Huddle01 AI inference.
All prices are in USD per 1M tokens.
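As a worked example of per-1M-token pricing, the sketch below estimates the cost of a single request. It assumes billing is strictly linear in token counts, with no minimum charge, caching discount, or long-context surcharge; none of that is confirmed here. The function name `request_cost` is hypothetical and not part of any Huddle01 SDK. The sample prices are the gpt-4.1-mini rates from the Tier 1 table.

```python
from decimal import Decimal

# Prices in the tables are quoted per 1,000,000 tokens.
TOKENS_PER_UNIT = Decimal(1_000_000)


def request_cost(input_tokens, output_tokens, input_price, output_price):
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes simple linear billing: tokens / 1M * listed price.
    Prices are passed as strings so Decimal keeps them exact.
    """
    total = (Decimal(input_tokens) * Decimal(input_price)
             + Decimal(output_tokens) * Decimal(output_price))
    return total / TOKENS_PER_UNIT


# gpt-4.1-mini: $0.40 input, $1.60 output per 1M tokens.
# 12,000 input tokens and 3,000 output tokens:
#   12,000 / 1M * 0.40 = 0.0048
#    3,000 / 1M * 1.60 = 0.0048
cost = request_cost(12_000, 3_000, "0.40", "1.60")
print(cost)  # 0.0096
```

`Decimal` is used instead of `float` so that sub-cent amounts add up exactly when many requests are summed.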
Tier 1 models
| Model | Input | Output | Context | Supports |
|---|---|---|---|---|
| gpt-5.4 | $2.50 | $20.00 | 1M | Text/Chat, Reasoning, Code |
| gpt-5.4-pro | $30.00 | $180.00 | 1M | Text/Chat, Reasoning, Code |
| gpt-4.1 | $2.00 | $8.00 | 1M | Text/Chat, Code |
| gpt-4.1-mini | $0.40 | $1.60 | 1M | Text/Chat, Code |
| gpt-4.1-nano | $0.10 | $0.40 | 1M | Text/Chat |
| o3 | $2.00 | $8.00 | 200K | Reasoning, Text/Chat |
| o4-mini | $1.10 | $4.40 | 200K | Reasoning, Text/Chat |
| gpt-4o-mini | $0.15 | $0.60 | 128K | Text/Chat, Code, Vision |
| claude-opus-4.6 | $5.00 | $25.00 | 1M | Text/Chat, Reasoning, Code |
| claude-sonnet-4.6 | $3.00 | $15.00 | 1M | Text/Chat, Reasoning, Code |
| claude-sonnet-4.5 | $3.00 | $15.00 | 1M | Text/Chat, Reasoning, Code |
| claude-haiku-4.5 | $1.00 | $5.00 | 200K | Text/Chat, Code |
| claude-sonnet-4 | $3.00 | $15.00 | 200K | Text/Chat, Reasoning, Code |
| gemini-3.1-pro | $2.00 | $12.00 | 1M | Text/Chat, Reasoning, Vision |
| gemini-3.1-flash-lite | $0.25 | $1.50 | 1M | Text/Chat, Low latency |
| gemini-2.5-pro | $1.25 | $10.00 | 1M | Text/Chat, Reasoning, Vision |
| gemini-2.5-flash | $0.15 | $0.60 | 1M | Text/Chat, Low latency, Vision |
Tier 2 models
| Model | Input | Output | Supports |
|---|---|---|---|
| deepseek-v3.2 | $0.644 | $1.932 | Text/Chat, Code |
| deepseek-r1 | $0.6325 | $2.5185 | Reasoning, Math, Code |
| grok-4.1-fast | $3.30 | $16.50 | Text/Chat, Reasoning |
| grok-3-mini | $0.34 | $1.70 | Text/Chat, Low latency |
| qwen3-235b | $0.805 | $0.805 | Text/Chat, Reasoning |
| qwen-2.5-coder-32b | $0.1265 | $0.1265 | Code, Text/Chat |
| minimax-m2.5 | $1.20 | $7.20 | Text/Chat |
| kimi-k2.5 | $1.20 | $6.00 | Long context, Text/Chat |
| llama-4-maverick | $0.253 | $1.15 | Text/Chat, Code |
| llama-3.3-70b | $0.132 | $0.132 | Text/Chat, Code |
| mistral-large | $2.20 | $6.60 | Text/Chat, Reasoning |
| codestral | $0.33 | $0.99 | Code generation |
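
Because input and output rates differ by model, the cheapest option depends on the shape of the workload. The sketch below ranks a few models from both tiers for a hypothetical batch of 2M input tokens and 500K output tokens. The `PRICES` dictionary and the `workload_cost` helper are illustrative names, the rates are copied by hand from the tables above, and the calculation assumes linear billing with no discounts or surcharges.

```python
from decimal import Decimal

# (input price, output price) in USD per 1M tokens, copied from the tables.
PRICES = {
    "gpt-4.1-nano": (Decimal("0.10"), Decimal("0.40")),
    "gemini-2.5-flash": (Decimal("0.15"), Decimal("0.60")),
    "llama-3.3-70b": (Decimal("0.132"), Decimal("0.132")),
    "codestral": (Decimal("0.33"), Decimal("0.99")),
}


def workload_cost(model, input_tokens, output_tokens):
    """Estimated USD cost of a workload on one model (hypothetical helper)."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price
            + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 2M input tokens, 500K output tokens.
INPUT_TOKENS = 2_000_000
OUTPUT_TOKENS = 500_000

# Sort model names from cheapest to most expensive for this workload.
ranked = sorted(
    PRICES, key=lambda m: workload_cost(m, INPUT_TOKENS, OUTPUT_TOKENS)
)

for model in ranked:
    cost = workload_cost(model, INPUT_TOKENS, OUTPUT_TOKENS)
    print(f"{model}: ${cost:.4f}")
```

Price is only one axis: the ranking ignores the Context and Supports columns, so a model that is cheapest on paper may still be unsuitable for a given task.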