Overview
Introduction to the Huddle GPU API — deploy bare-metal NVIDIA GPUs on demand with per-second billing.
The Huddle GPU API gives you programmatic access to bare-metal NVIDIA GPUs (H100, H200, A100, RTX 6000, and more) across multiple datacenter regions. Deploy GPU clusters in minutes, pay per second, and manage everything through a simple REST API.
Base URL

```
https://gpu.huddleapis.com/api/v1
```

All endpoints are served over HTTPS. Plain HTTP requests are rejected.
Authentication

Every request must include your API key in one of two ways:

```bash
# Option 1: X-API-Key header (recommended)
curl -H "X-API-Key: mk_your_key_here" \
  https://gpu.huddleapis.com/api/v1/billing/balance

# Option 2: Bearer token
curl -H "Authorization: Bearer mk_your_key_here" \
  https://gpu.huddleapis.com/api/v1/billing/balance
```

API keys use the `mk_` prefix. You receive your first key when you register, and you can create additional keys via the API Keys endpoints.
Keep your keys safe
API key secrets are shown once at creation time and cannot be retrieved later. Store them securely — treat them like passwords.
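Client code typically reads the key from an environment variable rather than hard-coding it. Below is a minimal Python sketch, not an official SDK: `build_request`, `BASE_URL`, and the `HUDDLE_API_KEY` variable name are illustrative assumptions.

```python
import os
import urllib.request

BASE_URL = "https://gpu.huddleapis.com/api/v1"

def build_request(path, api_key, method="GET", body=None):
    """Build an authenticated request; body, if given, is JSON-encoded bytes."""
    req = urllib.request.Request(BASE_URL + path, data=body, method=method)
    # X-API-Key is the recommended auth header; "Authorization: Bearer" also works.
    req.add_header("X-API-Key", api_key)
    if body is not None:
        req.add_header("Content-Type", "application/json")
    return req

# Read the key from the environment to keep secrets out of source control.
req = build_request("/billing/balance", os.environ.get("HUDDLE_API_KEY", "mk_your_key"))
```

Sending the request (e.g. with `urllib.request.urlopen(req)`) is left out so the sketch stays side-effect free.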
Quick Start

Here's the end-to-end workflow to go from zero to a running GPU instance:

```bash
# 1. Check your credit balance (minimum $20 required to deploy)
curl -H "X-API-Key: mk_your_key" \
  https://gpu.huddleapis.com/api/v1/billing/balance

# 2. Browse available GPU offers and pricing
curl -H "X-API-Key: mk_your_key" \
  https://gpu.huddleapis.com/api/v1/gpus/available

# 3. Upload your SSH public key
curl -X POST -H "X-API-Key: mk_your_key" \
  -H "Content-Type: application/json" \
  -d '{"name": "my-laptop", "public_key": "ssh-ed25519 AAAA... user@host"}' \
  https://gpu.huddleapis.com/api/v1/ssh-keys

# 4. Pick a compatible OS image
curl -H "X-API-Key: mk_your_key" \
  https://gpu.huddleapis.com/api/v1/images

# 5. Deploy a GPU cluster
curl -X POST -H "X-API-Key: mk_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "cluster_type": "1H100.80S.30V",
    "image": "ubuntu-22.04-cuda-12.4-cluster",
    "hostname": "my-training-run",
    "ssh_key_ids": ["your-ssh-key-id"],
    "location": "FIN-01"
  }' \
  https://gpu.huddleapis.com/api/v1/deployments/clusters

# 6. Once status is "running", SSH in
ssh root@<deployment_ip>

# 7. When done, delete the deployment to stop billing
curl -X DELETE -H "X-API-Key: mk_your_key" \
  https://gpu.huddleapis.com/api/v1/deployments/<deployment_id>
```

Response Format
All API responses use a consistent JSON envelope.

Success:

```json
{
  "code": "OK",
  "data": { ... }
}
```

Error:

```json
{
  "code": "BAD_REQUEST",
  "error": "cluster_type is required"
}
```

Paginated lists include a `meta` object with cursor-based pagination:

```json
{
  "code": "OK",
  "data": [ ... ],
  "meta": {
    "total": 42,
    "cursor": "eyJpZCI6Imxhc3QtaWQifQ==",
    "has_more": true
  }
}
```

Pass `cursor` as a query parameter to fetch the next page of results.
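The cursor-following logic can be wrapped in a small generator. Here is a minimal Python sketch; the actual HTTP call is injected as a callable (`fetch_page` is illustrative, not an SDK function), so only the paging logic is shown:

```python
def iter_all(fetch_page):
    """Yield every item across a cursor-paginated listing.

    `fetch_page(cursor)` must return the JSON envelope for one page;
    pass cursor=None for the first page.
    """
    cursor = None
    while True:
        envelope = fetch_page(cursor)
        yield from envelope["data"]
        meta = envelope.get("meta", {})
        if not meta.get("has_more"):
            break
        # `cursor` from the last response selects the next page
        cursor = meta["cursor"]
```

Injecting the fetcher keeps the generator testable without network access; in real code `fetch_page` would issue the authenticated GET with `?cursor=...` appended.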
Key Concepts
| Concept | Description |
|---|---|
| Offer | A GPU instance type available for deployment, with specs, pricing, and region availability. Think of it as a product listing. |
| Cluster Type | The instance type identifier (e.g., 1H100.80S.30V = 1x H100, 80GB SSD, 30 vCPUs). Used when deploying. |
| Deployment | A running GPU cluster instance. Has a lifecycle: ordered → provisioning → running → deleted. |
| Region | A datacenter location (e.g., FIN-01 for Finland). GPU availability and pricing vary by region. |
| Image | A pre-built OS snapshot with CUDA drivers and GPU toolkits. Required when deploying. |
| Volume | Persistent storage (NVMe or HDD) attached to a deployment. Survives instance deletion for data reuse. |
| SSH Key | A public key uploaded to your account. Injected into deployments at provisioning time for SSH access. |
| Webhook | An HTTP endpoint you register to receive real-time deployment status notifications. |
| Waitlist | A queue system for GPUs that are currently at capacity. Get notified or auto-deploy when one opens up. |
| Credits | Prepaid USD balance used for billing. Per-second charges are deducted automatically while a deployment runs. |
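The deployment lifecycle above (ordered → provisioning → running → deleted) lends itself to a simple polling loop when you are not using webhooks. A minimal Python sketch; `get_status` is any callable you supply that returns the current lifecycle state (e.g. wrapping a GET on the deployment), since the exact status endpoint shape is not specified here:

```python
import time

def wait_until_running(get_status, timeout=600, interval=5):
    """Poll until the deployment reports 'running'; give up after `timeout` seconds.

    `get_status` returns one of: "ordered", "provisioning", "running", "deleted".
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status()
        if status == "running":
            return True
        if status == "deleted":
            # Terminal state: the instance will never come up
            raise RuntimeError("deployment was deleted before reaching 'running'")
        time.sleep(interval)
    return False
```

For production use, registering a webhook avoids polling entirely.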
API Sections
| Section | What it does |
|---|---|
| GPU Billing | Understand credits, billing cycles, minimum charges, and balance management |
| Offers | Browse available GPU types with real-time pricing and specs |
| Deployments | Deploy, manage, and delete GPU clusters |
| Waitlist | Queue for unavailable GPUs and get notified when capacity opens |
| Images | List OS images pre-configured with CUDA drivers |
| Volumes | Create and manage persistent storage |
| SSH Keys | Upload keys for deployment access |
| API Keys | Create and manage authentication keys |
| Webhooks | Subscribe to real-time deployment events |
| Regions | Discover available datacenter locations |
Error Codes
All errors return a standard HTTP status code along with a machine-readable `code` field in the body:
| HTTP Status | Code | Meaning |
|---|---|---|
| 400 | BAD_REQUEST | Invalid request body or missing required fields |
| 401 | UNAUTHORIZED | Missing or invalid API key |
| 402 | INSUFFICIENT_CREDITS | Balance below the $20 minimum required to deploy |
| 404 | NOT_FOUND | Resource not found |
| 409 | CONFLICT | GPU type unavailable (no capacity) or idempotency conflict |
| 500 | INTERNAL | Server error — safe to retry with an idempotency key |
| 502 | BAD_GATEWAY | Upstream provider temporarily unavailable |
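Since 500 INTERNAL and 502 BAD_GATEWAY are transient, a client can retry those codes with exponential backoff. A minimal sketch; `call` is any function of yours that returns the parsed JSON envelope, and the names are illustrative:

```python
import time

RETRYABLE = {"INTERNAL", "BAD_GATEWAY"}

def call_with_retry(call, retries=3, backoff=1.0):
    """Invoke `call` until it returns a non-retryable code, backing off exponentially."""
    for attempt in range(retries + 1):
        result = call()
        if result["code"] not in RETRYABLE:
            return result
        if attempt < retries:
            # backoff, 2*backoff, 4*backoff, ... seconds between attempts
            time.sleep(backoff * (2 ** attempt))
    return result
```

When retrying create or delete calls this way, also send an `Idempotency-Key` header (see Idempotency below) so a retry cannot double-deploy.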
Idempotency

For operations that create or delete resources, pass an `Idempotency-Key` header to enable safe retries:

```bash
curl -X POST -H "X-API-Key: mk_your_key" \
  -H "Idempotency-Key: deploy-training-run-001" \
  -H "Content-Type: application/json" \
  -d '{ ... }' \
  https://gpu.huddleapis.com/api/v1/deployments/clusters
```

If the same key is sent again while the original request is still processing, you'll receive a 409 Conflict. If the original request has completed, you'll receive the cached result; no duplicate resource is created.
When to use idempotency keys
Always use idempotency keys for deploy and delete operations, especially when retrying after network timeouts. This prevents accidental double deployments and double billing.
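In practice this means generating one key per logical operation and reusing that same key on every retry. A Python sketch under stated assumptions: the `send` callable and its signature are illustrative stand-ins for whatever performs the HTTP POST.

```python
import uuid

def deploy_once(send, payload, retries=3):
    """Send a create request with a single Idempotency-Key, retrying on network errors.

    The key is generated once, before the first attempt, so every retry
    targets the same logical operation and cannot double-deploy.
    """
    key = f"deploy-{uuid.uuid4()}"
    last_error = None
    for _ in range(retries + 1):
        try:
            return send(payload, idempotency_key=key)
        except ConnectionError as exc:  # stand-in for a network timeout
            last_error = exc
    raise last_error
```

The important property is that the key is created outside the retry loop; regenerating it per attempt would defeat idempotency.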