Overview
Introduction to the Huddle GPU API — deploy bare-metal NVIDIA GPUs on demand with per-second billing.
The Huddle GPU API gives you programmatic access to bare-metal NVIDIA GPUs (H100, H200, A100, RTX 6000, and more) across multiple datacenter regions. Deploy GPU clusters in minutes, pay per second, and manage everything through a simple REST API.
Base URL

```
https://gpu.huddleapis.com/api/v1
```

All endpoints are served over HTTPS. Plain HTTP requests are rejected.
Authentication

Every request must include your API key in one of two ways:

```bash
# Option 1: X-API-Key header (recommended)
curl -H "X-API-Key: mk_your_key_here" \
  https://gpu.huddleapis.com/api/v1/billing/balance

# Option 2: Bearer token
curl -H "Authorization: Bearer mk_your_key_here" \
  https://gpu.huddleapis.com/api/v1/billing/balance
```

API keys use the `mk_` prefix. You receive your first key when you register, and you can create additional keys via the API Keys endpoints.
Keep your keys safe
API key secrets are shown once at creation time and cannot be retrieved later. Store them securely — treat them like passwords.
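Client code typically reads the key from an environment variable rather than hard-coding it. Below is a minimal Python sketch, not an official SDK: `build_request`, `BASE_URL`, and the `HUDDLE_API_KEY` variable name are illustrative assumptions.

```python
import os
import urllib.request

BASE_URL = "https://gpu.huddleapis.com/api/v1"

def build_request(path, api_key, method="GET", body=None):
    """Build an authenticated request; body, if given, is JSON-encoded bytes."""
    req = urllib.request.Request(BASE_URL + path, data=body, method=method)
    # X-API-Key is the recommended auth header; "Authorization: Bearer" also works.
    req.add_header("X-API-Key", api_key)
    if body is not None:
        req.add_header("Content-Type", "application/json")
    return req

# Read the key from the environment to keep secrets out of source control.
req = build_request("/billing/balance", os.environ.get("HUDDLE_API_KEY", "mk_your_key"))
```

Sending the request (e.g. with `urllib.request.urlopen(req)`) is left out so the sketch stays side-effect free.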
Quick Start

Here's the end-to-end workflow to go from zero to a running GPU instance:

```bash
# 1. Check your credit balance (minimum $20 required to deploy)
curl -H "X-API-Key: mk_your_key" \
  https://gpu.huddleapis.com/api/v1/billing/balance

# 2. Browse available GPU offers and pricing
curl -H "X-API-Key: mk_your_key" \
  https://gpu.huddleapis.com/api/v1/gpus/available

# 3. Upload your SSH public key
curl -X POST -H "X-API-Key: mk_your_key" \
  -H "Content-Type: application/json" \
  -d '{"name": "my-laptop", "public_key": "ssh-ed25519 AAAA... user@host"}' \
  https://gpu.huddleapis.com/api/v1/ssh-keys

# 4. Pick a compatible OS image
curl -H "X-API-Key: mk_your_key" \
  https://gpu.huddleapis.com/api/v1/images

# 5. Deploy a GPU cluster
curl -X POST -H "X-API-Key: mk_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "cluster_type": "1H100.80S.30V",
    "image": "ubuntu-22.04-cuda-12.4-cluster",
    "hostname": "my-training-run",
    "ssh_key_ids": ["your-ssh-key-id"],
    "location": "FIN-01"
  }' \
  https://gpu.huddleapis.com/api/v1/deployments/clusters

# 6. Once status is "running", SSH in
ssh root@<deployment_ip>

# 7. When done, delete the deployment to stop billing
curl -X DELETE -H "X-API-Key: mk_your_key" \
  https://gpu.huddleapis.com/api/v1/deployments/<deployment_id>
```

Response Format
All API responses use a consistent JSON envelope.

Success:

```json
{
  "code": "OK",
  "data": { ... }
}
```

Error:

```json
{
  "code": "BAD_REQUEST",
  "error": "cluster_type is required"
}
```

Paginated lists include a `meta` object with cursor-based pagination:

```json
{
  "code": "OK",
  "data": [ ... ],
  "meta": {
    "total": 42,
    "cursor": "eyJpZCI6Imxhc3QtaWQifQ==",
    "has_more": true
  }
}
```

Pass `cursor` as a query parameter to fetch the next page of results.
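The cursor-following logic can be wrapped in a small generator. Here is a minimal Python sketch; the actual HTTP call is injected as a callable (`fetch_page` is illustrative, not an SDK function), so only the paging logic is shown:

```python
def iter_all(fetch_page):
    """Yield every item across a cursor-paginated listing.

    `fetch_page(cursor)` must return the JSON envelope for one page;
    pass cursor=None for the first page.
    """
    cursor = None
    while True:
        envelope = fetch_page(cursor)
        yield from envelope["data"]
        meta = envelope.get("meta", {})
        if not meta.get("has_more"):
            break
        # `cursor` from the last response selects the next page
        cursor = meta["cursor"]
```

Injecting the fetcher keeps the generator testable without network access; in real code `fetch_page` would issue the authenticated GET with `?cursor=...` appended.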
Key Concepts
| Concept | Description |
|---|---|
| Offer | A GPU instance type available for deployment, with specs, pricing, and region availability. Think of it as a product listing. |
| Cluster Type | The instance type identifier (e.g., 1H100.80S.30V = 1x H100, 80GB SSD, 30 vCPUs). Used when deploying. |
| Deployment | A running GPU cluster instance. Has a lifecycle: ordered → provisioning → running → deleted. |
| Region | A datacenter location (e.g., FIN-01 for Finland). GPU availability and pricing vary by region. |
| Image | A pre-built OS snapshot with CUDA drivers and GPU toolkits. Required when deploying. |
| Volume | Persistent storage (NVMe or HDD) attached to a deployment. Survives instance deletion for data reuse. |
| SSH Key | A public key uploaded to your account. Injected into deployments at provisioning time for SSH access. |
| Webhook | An HTTP endpoint you register to receive real-time deployment status notifications. |
| Waitlist | A queue system for GPUs that are currently at capacity. Get notified or auto-deploy when one opens up. |
| Credits | Prepaid USD balance used for billing. Per-second charges are deducted automatically while a deployment runs. |
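The deployment lifecycle above (ordered → provisioning → running → deleted) lends itself to a simple polling loop when you are not using webhooks. A minimal Python sketch; `get_status` is any callable you supply that returns the current lifecycle state (e.g. wrapping a GET on the deployment), since the exact status endpoint shape is not specified here:

```python
import time

def wait_until_running(get_status, timeout=600, interval=5):
    """Poll until the deployment reports 'running'; give up after `timeout` seconds.

    `get_status` returns one of: "ordered", "provisioning", "running", "deleted".
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status()
        if status == "running":
            return True
        if status == "deleted":
            # Terminal state: the instance will never come up
            raise RuntimeError("deployment was deleted before reaching 'running'")
        time.sleep(interval)
    return False
```

For production use, registering a webhook avoids polling entirely.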
API Sections
| Section | What it does |
|---|---|
| GPU Billing | Understand credits, billing cycles, minimum charges, and balance management |
| Offers | Browse available GPU types with real-time pricing and specs |
| Deployments | Deploy, manage, and delete GPU clusters |
| Waitlist | Queue for unavailable GPUs and get notified when capacity opens |
| Images | List OS images pre-configured with CUDA drivers |
| Volumes | Create and manage persistent storage |
| SSH Keys | Upload keys for deployment access |
| API Keys | Create and manage authentication keys |
| Webhooks | Subscribe to real-time deployment events |
| Regions | Discover available datacenter locations |
Error Codes
All errors return a standard HTTP status code along with a machine-readable `code` field in the body:
| HTTP Status | Code | Meaning |
|---|---|---|
| 400 | BAD_REQUEST | Invalid request body or missing required fields |
| 401 | UNAUTHORIZED | Missing or invalid API key |
| 402 | INSUFFICIENT_CREDITS | Balance below the $20 minimum required to deploy |
| 404 | NOT_FOUND | Resource not found |
| 409 | CONFLICT | GPU type unavailable (no capacity) or idempotency conflict |
| 500 | INTERNAL | Server error — safe to retry with an idempotency key |
| 502 | BAD_GATEWAY | Upstream provider temporarily unavailable |
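Since 500 INTERNAL and 502 BAD_GATEWAY are transient, a client can retry those codes with exponential backoff. A minimal sketch; `call` is any function of yours that returns the parsed JSON envelope, and the names are illustrative:

```python
import time

RETRYABLE = {"INTERNAL", "BAD_GATEWAY"}

def call_with_retry(call, retries=3, backoff=1.0):
    """Invoke `call` until it returns a non-retryable code, backing off exponentially."""
    for attempt in range(retries + 1):
        result = call()
        if result["code"] not in RETRYABLE:
            return result
        if attempt < retries:
            # backoff, 2*backoff, 4*backoff, ... seconds between attempts
            time.sleep(backoff * (2 ** attempt))
    return result
```

When retrying create or delete calls this way, also send an `Idempotency-Key` header (see Idempotency below) so a retry cannot double-deploy.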
Idempotency

For operations that create or delete resources, pass an `Idempotency-Key` header to enable safe retries:

```bash
curl -X POST -H "X-API-Key: mk_your_key" \
  -H "Idempotency-Key: deploy-training-run-001" \
  -H "Content-Type: application/json" \
  -d '{ ... }' \
  https://gpu.huddleapis.com/api/v1/deployments/clusters
```

If the same key is sent again while the original request is still processing, you'll receive a 409 Conflict. If the original request has completed, you'll receive the cached result; no duplicate resource is created.
When to use idempotency keys
Always use idempotency keys for deploy and delete operations, especially when retrying after network timeouts. This prevents accidental double deployments and double billing.
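In practice this means generating one key per logical operation and reusing that same key on every retry. A Python sketch under stated assumptions: the `send` callable and its signature are illustrative stand-ins for whatever performs the HTTP POST.

```python
import uuid

def deploy_once(send, payload, retries=3):
    """Send a create request with a single Idempotency-Key, retrying on network errors.

    The key is generated once, before the first attempt, so every retry
    targets the same logical operation and cannot double-deploy.
    """
    key = f"deploy-{uuid.uuid4()}"
    last_error = None
    for _ in range(retries + 1):
        try:
            return send(payload, idempotency_key=key)
        except ConnectionError as exc:  # stand-in for a network timeout
            last_error = exc
    raise last_error
```

The important property is that the key is created outside the retry loop; regenerating it per attempt would defeat idempotency.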