Guide
Complete guide to Huddle01 AI inference, endpoint setup, supported workflows, and next steps.
Huddle01 AI Inference gives you one OpenAI-compatible endpoint for multiple model providers.
What is inference?
Inference means sending input (messages, prompt, or multimodal data) to a model and getting generated output in response.
With Huddle01:
- You use one API key.
- You hit one base URL.
- You can switch models by changing only the `model` field.
Endpoint and auth
- Base URL: `https://gru.huddle01.io/v1`
- Auth header: `Authorization: Bearer <HUDDLE_API_KEY>`
- Compatibility: OpenAI-compatible SDKs and HTTP APIs
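The pieces above reduce to a base URL and one header. A minimal sketch in Python (the `HUDDLE_API_KEY` environment-variable name is an illustrative assumption, not mandated by this guide):

```python
import os

# Base URL and auth header from this guide; the HUDDLE_API_KEY
# environment-variable name is an assumption for illustration.
BASE_URL = "https://gru.huddle01.io/v1"
HEADERS = {
    "Authorization": f"Bearer {os.environ.get('HUDDLE_API_KEY', '')}",
    "Content-Type": "application/json",
}
```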
Quick navigation
API Examples
Connection setup, OpenAPI-style schema, and SDK request examples.
Pricing and Models
Model-wise input/output pricing and capabilities matrix.
Request Lifecycle
Understand what happens from API request to model response.
Best Practices
Production tips for reliability, security, and cost control.
Quick start flow
- Create or copy your Huddle01 API key from the dashboard.
- Set your client `base_url` to `https://gru.huddle01.io/v1`.
- Call `chat/completions` with one of the supported model IDs.
- Read usage and spend from your existing dashboard/billing views.
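The steps above can be sketched with only the Python standard library; the model ID and env-var name below are placeholders, not values from this guide:

```python
import json
import os
import urllib.request

BASE_URL = "https://gru.huddle01.io/v1"

def build_chat_request(model: str, messages: list) -> urllib.request.Request:
    """Build an OpenAI-compatible chat/completions request (not yet sent)."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            # Env-var name is an assumption; load the key however you manage secrets.
            "Authorization": f"Bearer {os.environ.get('HUDDLE_API_KEY', '')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request(
    "your-model-id",  # placeholder: use a supported model ID from the pricing page
    [{"role": "user", "content": "Hello!"}],
)
# To actually send it:
# with urllib.request.urlopen(req, timeout=30) as resp:
#     data = json.load(resp)
```

Any OpenAI-compatible SDK works the same way: point its `base_url` at the endpoint above and pass your Huddle01 key as the API key.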
Request lifecycle
- Your app sends a request to `POST /chat/completions`.
- Huddle01 authenticates the API key and validates the request shape.
- The gateway routes your request to the selected model/provider.
- The response is normalized into OpenAI-compatible format.
- You receive the generated output and usage metadata.
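Because the response is normalized to the OpenAI chat-completions shape, the fields you read back are predictable. The sample below is hand-written for illustration, not real model output:

```python
# Illustrative sample of a normalized response; the field names follow the
# OpenAI chat-completions format, the values are made up.
sample = {
    "model": "provider/model-id",
    "choices": [
        {"message": {"role": "assistant", "content": "Hello! How can I help?"}}
    ],
    "usage": {"prompt_tokens": 9, "completion_tokens": 7, "total_tokens": 16},
}

text = sample["choices"][0]["message"]["content"]
total_tokens = sample["usage"]["total_tokens"]
```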
What you can build
- AI chat and assistant experiences
- Content generation and summarization
- Code generation and developer tools
- Workflow automations with structured prompting
- Multi-model routing with one integration surface
Compatibility
- Works with OpenAI-compatible SDKs
- Works with direct HTTP calls (`curl`, backend services, serverless functions)
- Supports model switching without client rewrites
Best practices
- Keep API keys on the server side only.
- Use environment variables and secret managers, never hardcode keys.
- Start with lower-cost models for non-critical traffic.
- Add retry + timeout handling in production clients.
- Log `model`, latency, token usage, and request ID for debugging.
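The logging suggestion above can be sketched as one structured record per call. The `request_id` argument is an assumption here; take it from whatever identifier your response or headers actually carry:

```python
import json
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("inference")

def log_request(model: str, started_at: float, usage: dict, request_id: str) -> dict:
    """Emit one structured log record per inference call."""
    record = {
        "model": model,
        "latency_ms": round((time.monotonic() - started_at) * 1000, 1),
        "prompt_tokens": usage.get("prompt_tokens"),
        "completion_tokens": usage.get("completion_tokens"),
        "request_id": request_id,
    }
    log.info(json.dumps(record))
    return record

t0 = time.monotonic()
# ... perform the chat/completions call here ...
record = log_request(
    "provider/model-id", t0,
    {"prompt_tokens": 9, "completion_tokens": 7},  # from the response's usage field
    "req_123",  # hypothetical request identifier
)
```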
Production tip
Use one default model for most traffic and route only high-complexity requests to premium models. This keeps latency and cost predictable.
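One way to sketch that routing rule; the model IDs and the complexity heuristic (prompt length, tool use) are assumptions for illustration:

```python
# Placeholder model IDs; pick real ones from the pricing and models page.
DEFAULT_MODEL = "provider/default-model"
PREMIUM_MODEL = "provider/premium-model"

def pick_model(prompt: str, needs_tools: bool = False) -> str:
    """Route only high-complexity requests to the premium model."""
    # Heuristic is an assumption: long prompts or tool use count as complex.
    complex_request = needs_tools or len(prompt) > 2000
    return PREMIUM_MODEL if complex_request else DEFAULT_MODEL
```

Because only the `model` field changes, the same client and request shape serve both tiers.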
Common mistakes to avoid
- Using the wrong base URL (it must be `https://gru.huddle01.io/v1`)
- Sending the API key in custom headers instead of `Authorization: Bearer ...`
- Assuming every model has identical behavior/capabilities
- Not handling transient upstream/network failures
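A minimal retry-with-backoff sketch for those transient failures; the attempt count, delays, and which exceptions count as transient are assumptions to tune for your client:

```python
import time

def with_retries(call, attempts: int = 3, base_delay: float = 0.5):
    """Retry a callable on transient errors with exponential backoff."""
    for attempt in range(attempts):
        try:
            return call()
        except (TimeoutError, ConnectionError):  # assumed transient error types
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            time.sleep(base_delay * (2 ** attempt))

# Usage sketch:
# result = with_retries(lambda: urllib.request.urlopen(req, timeout=30))
```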
Next steps
- Go to API Examples for concrete API requests and SDK examples.
- Check Pricing to pick the right models for your workloads.