API Reference
GPUForge Fleet Orchestration API · v1

Authentication

All API endpoints require a Bearer token. Use your API key in every request.

Bearer Token Authentication

Pass your API key in the Authorization header on every request.

Authorization: Bearer YOUR_API_KEY
Step 1

Generate a key

Go to the API Keys tab in the dashboard and click "Generate Key". The raw key is shown once — copy it immediately.

Step 2

Add the header

Include Authorization: Bearer <key> in every API request. Requests that omit the header fall back to demo mode (read-only).

Step 3

Revoke when done

Keys can be revoked any time from the dashboard. Revoked keys return 401 immediately and the audit trail is preserved.
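
As a minimal sketch of step 2, the snippet below builds (but does not send) a request that carries the Bearer header, using only Python's standard library. The helper name build_request and the BASE_URL constant are illustrative assumptions, not part of the GPUForge API; the key value is a placeholder.

```python
import urllib.request

# Base URL taken from the curl examples in this reference.
BASE_URL = "https://www.gpuforg.com"


def build_request(path, api_key, method="GET"):
    """Hypothetical helper: build a request with the Bearer header. Nothing is sent."""
    return urllib.request.Request(
        BASE_URL + path,
        headers={"Authorization": f"Bearer {api_key}"},
        method=method,
    )


# Placeholder key; substitute the raw key copied from the dashboard.
req = build_request("/api/clusters", "YOUR_API_KEY")
```

Passing req to urllib.request.urlopen would perform the call; that step is left out here so the sketch stays free of network access.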


GPU Management

Manage GPU clusters, compute nodes, GPU registration, and metric ingestion.

GET
/api/clusters
List all GPU clusters with live job counts and GPU totals.
curl -H "Authorization: Bearer YOUR_KEY" \
  https://www.gpuforg.com/api/clusters
▸ Response example
[
  {
    "id": 1,
    "name": "us-east-prod",
    "region": "us-east-1",
    "status": "active",
    "node_count": 6,
    "gpu_count": 36,
    "active_jobs": 3,
    "queued_jobs": 2
  }
]
POST
/api/clusters
Create a new GPU cluster. Returns the cluster record with a generated API key.
▸ Request body
Field   Type    Required  Description
name    string  required  Human-readable cluster name
region  string  optional  AWS-style region code, e.g. "us-east-1"
curl -X POST -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name": "my-cluster", "region": "us-west-2"}' \
  https://www.gpuforg.com/api/clusters
GET
/api/clusters/:id
Get full cluster details including nodes and recent jobs.
curl -H "Authorization: Bearer YOUR_KEY" \
  https://www.gpuforg.com/api/clusters/1
POST
/api/clusters/nodes
Register or heartbeat a compute node. Idempotent — updates existing node if hostname matches.
▸ Request body
Field       Type     Required  Description
cluster_id  integer  required  Parent cluster ID
hostname    string   required  Unique node hostname
ip_address  string   optional  Node IP address
total_gpus  integer  optional  Number of GPUs on this node
cpu_cores   integer  optional  CPU core count
ram_gb      integer  optional  RAM in GB
curl -X POST -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"cluster_id": 1, "hostname": "gpu-01", "ip_address": "10.0.1.10", "total_gpus": 8, "cpu_cores": 128, "ram_gb": 1024}' \
  https://www.gpuforg.com/api/clusters/nodes
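
The register-or-heartbeat behavior described for this endpoint can be pictured with a small in-memory model. This is only a sketch of the documented semantics, assuming hostname alone is the match key; the NodeRegistry class and its internal layout are hypothetical and say nothing about how the server is implemented.

```python
class NodeRegistry:
    """Hypothetical in-memory stand-in for the server's node table."""

    def __init__(self):
        self._nodes = {}   # hostname -> node record
        self._next_id = 1

    def register(self, cluster_id, hostname, **fields):
        """Create the node on first sight, otherwise update the existing record."""
        node = self._nodes.get(hostname)
        if node is None:
            node = {"id": self._next_id, "cluster_id": cluster_id, "hostname": hostname}
            self._next_id += 1
            self._nodes[hostname] = node
        node.update(fields)  # a repeat call acts as a heartbeat/update
        return node

    def count(self):
        return len(self._nodes)


registry = NodeRegistry()
first_id = registry.register(1, "gpu-01", total_gpus=8)["id"]
second = registry.register(1, "gpu-01", total_gpus=8, ram_gb=1024)
```

Calling register twice with the same hostname leaves one node record, which is why an agent can safely resend the same payload on every heartbeat.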
POST
/api/clusters/gpus
Register GPUs on a node. Idempotent per gpu_index. Returns only newly created records.
▸ Request body
Field             Type     Required  Description
node_id           integer  required  Parent node ID
gpus              array    required  Array of GPU objects (see below)
gpus[].gpu_index  integer  optional  0-based GPU index on the node
gpus[].model      string   optional  GPU model string, e.g. "NVIDIA H100 80GB"
gpus[].vram_mb    integer  optional  VRAM in MB
curl -X POST -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"node_id": 1, "gpus": [{"gpu_index": 0, "model": "NVIDIA H100 80GB", "vram_mb": 81920}, {"gpu_index": 1, "model": "NVIDIA H100 80GB", "vram_mb": 81920}]}' \
  https://www.gpuforg.com/api/clusters/gpus
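
To illustrate "idempotent per gpu_index" together with "returns only newly created records", here is a small sketch that models one node's GPUs as a dictionary. The function register_gpus is hypothetical, and it assumes every GPU object carries a gpu_index even though the field is listed as optional.

```python
def register_gpus(existing, gpus):
    """Add GPUs keyed by gpu_index; return only the records created by this call."""
    created = []
    for gpu in gpus:
        index = gpu["gpu_index"]
        if index not in existing:
            existing[index] = dict(gpu)
            created.append(existing[index])
    return created


node_gpus = {}  # hypothetical per-node store: gpu_index -> record
batch = [
    {"gpu_index": 0, "model": "NVIDIA H100 80GB", "vram_mb": 81920},
    {"gpu_index": 1, "model": "NVIDIA H100 80GB", "vram_mb": 81920},
]
first_call = register_gpus(node_gpus, batch)   # both GPUs are new
second_call = register_gpus(node_gpus, batch)  # nothing new to report
```

Under this reading, an empty result on a repeated call signals that the GPUs were already registered rather than that the call failed.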
POST
/api/clusters/metrics
Bulk ingest GPU telemetry. Updates GPU status (active/idle) based on utilization threshold (>10%).
▸ Request body
Field                      Type     Required  Description
metrics                    array    required  Array of metric objects
metrics[].gpu_id           integer  required  GPU ID to record metrics for
metrics[].utilization_pct  number   optional  GPU utilization 0–100
metrics[].memory_used_mb   integer  optional  VRAM used in MB
metrics[].memory_total_mb  integer  optional  Total VRAM in MB
metrics[].temperature_c    number   optional  Temperature in Celsius
metrics[].power_draw_w     number   optional  Power draw in Watts
curl -X POST -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"metrics": [{"gpu_id": 1, "utilization_pct": 87.3, "memory_used_mb": 71680, "memory_total_mb": 81920, "temperature_c": 72.1, "power_draw_w": 410.5}]}' \
  https://www.gpuforg.com/api/clusters/metrics
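
The status rule and the request envelope for this endpoint can be sketched as follows. The mapping "above 10% is active, otherwise idle" is an assumed reading of the threshold in the description, and the names gpu_status and build_metrics_payload are hypothetical.

```python
ACTIVE_THRESHOLD_PCT = 10.0  # threshold quoted in the endpoint description


def gpu_status(utilization_pct):
    """Return "active" above the threshold, else "idle" (assumed mapping)."""
    return "active" if utilization_pct > ACTIVE_THRESHOLD_PCT else "idle"


def build_metrics_payload(samples):
    """Wrap samples in the {"metrics": [...]} envelope; gpu_id is the only required field."""
    for sample in samples:
        if "gpu_id" not in sample:
            raise ValueError("each metric object needs a gpu_id")
    return {"metrics": list(samples)}


payload = build_metrics_payload([{"gpu_id": 1, "utilization_pct": 87.3}])
```

How the server treats a reading of exactly 10% is not stated in this reference; the sketch treats it as idle.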
GET
/api/dashboard/gpu-allocation
GPU-to-job assignment map. Shows which GPUs are allocated to which running jobs with live utilization.
curl -H "Authorization: Bearer YOUR_KEY" \
  https://www.gpuforg.com/api/dashboard/gpu-allocation

Tenant Management

Multi-tenant GPU quota management. Each tenant has a GPU quota enforced at job submission and scheduling time.

GET
/api/tenants
List all tenants with live GPU-in-use counts and running/queued job totals.
curl -H "Authorization: Bearer YOUR_KEY" \
  https://www.gpuforg.com/api/tenants
▸ Response example
[
  {
    "id": 1,
    "name": "NovAI Labs",
    "plan_tier": "enterprise",
    "gpu_quota": 10,
    "gpus_in_use": 8,
    "running_jobs": 2,
    "queued_jobs": 1,
    "status": "active"
  }
]
POST
/api/tenants
Create a new tenant with GPU quota.
▸ Request body
Field          Type     Required  Description
name           string   required  Tenant display name
gpu_quota      integer  required  Max GPUs this tenant can run simultaneously (min 1)
contact_email  string   optional  Billing contact email
plan_tier      string   optional  Plan label: starter, pro, growth, enterprise (default: starter)
status         string   optional  active or inactive (default: active)
curl -X POST -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name": "Acme ML", "gpu_quota": 16, "contact_email": "billing@acme.ai", "plan_tier": "enterprise"}' \
  https://www.gpuforg.com/api/tenants
PUT
/api/tenants/:id
Update tenant fields. All fields optional — only provided fields are updated.
curl -X PUT -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"gpu_quota": 20, "plan_tier": "enterprise"}' \
  https://www.gpuforg.com/api/tenants/1
DELETE
/api/tenants/:id
Delete a tenant. Unlinks all associated jobs (sets tenant_id to null) before deletion.
curl -X DELETE -H "Authorization: Bearer YOUR_KEY" \
  https://www.gpuforg.com/api/tenants/1
GET
/api/tenants/:id/usage
Usage events + cost breakdown for a tenant. Aggregated by GPU type and by day.
▸ Query parameters
Field  Type     Required  Description
days   integer  optional  Lookback window in days, 1–90 (default: 30)
curl -H "Authorization: Bearer YOUR_KEY" \
  "https://www.gpuforg.com/api/tenants/1/usage?days=30"
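
A client may want to check the days parameter against the documented 1 to 90 range before calling the endpoint. The sketch below builds the usage URL with that check; usage_url is a hypothetical helper, and how the server itself responds to out-of-range values is not stated in this reference.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.gpuforg.com"  # host used by the curl examples


def usage_url(tenant_id, days=30):
    """Build the tenant usage URL, enforcing the documented 1-90 day window client-side."""
    if not 1 <= days <= 90:
        raise ValueError("days must be between 1 and 90")
    return f"{BASE_URL}/api/tenants/{tenant_id}/usage?" + urlencode({"days": days})


default_url = usage_url(1)
```

The same window and default are documented for the invoice and revenue endpoints, so the check could be shared across them.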

Job Management

Submit and manage GPU workloads. The scheduler runs every 30s to assign idle GPUs and enforce quotas.

POST
/api/jobs
Submit a job. Validates tenant quota and cluster capacity — jobs exceeding limits are inserted as "blocked" (not rejected).
▸ Request body
Field                   Type     Required  Description
cluster_id              integer  required  Target cluster ID
name                    string   required  Job display name
gpu_count               integer  optional  Number of GPUs to request (default: 1)
gpu_type                string   optional  GPU model preference
priority                integer  optional  1 = highest priority (default: 5)
tenant_id               integer  optional  Tenant to bill and quota-check against
submitted_by            string   optional  Submitter identifier for audit trail
estimated_duration_min  integer  optional  Estimated runtime in minutes
curl -X POST -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"cluster_id": 1, "name": "llama-3-70b-finetune", "gpu_count": 8, "gpu_type": "NVIDIA H100 80GB", "priority": 1, "tenant_id": 1}' \
  https://www.gpuforg.com/api/jobs
▸ Response example
{
  "id": 42,
  "name": "llama-3-70b-finetune",
  "status": "queued",
  "gpu_count": 8,
  "priority": 1,
  "block_reason": null,
  "submitted_at": "2026-05-09T00:30:00Z"
}
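
The "blocked, not rejected" behavior at submission time can be sketched as a pure function. The exact checks are assumptions: this version compares the request against the tenant's remaining quota and against the cluster's total GPU count, and the block_reason strings are invented for illustration.

```python
def initial_job_status(gpu_count, tenant_quota, tenant_gpus_in_use, cluster_gpu_total):
    """Return (status, block_reason) for a newly submitted job under the assumed rules."""
    if tenant_gpus_in_use + gpu_count > tenant_quota:
        return "blocked", "tenant GPU quota exceeded"         # hypothetical reason text
    if gpu_count > cluster_gpu_total:
        return "blocked", "request exceeds cluster capacity"  # hypothetical reason text
    return "queued", None


within_limits = initial_job_status(8, tenant_quota=16, tenant_gpus_in_use=4, cluster_gpu_total=36)
over_quota = initial_job_status(8, tenant_quota=16, tenant_gpus_in_use=12, cluster_gpu_total=36)
```

Either way the job is stored, so a client should inspect status and block_reason in the response instead of relying on the HTTP status code alone.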
PATCH
/api/jobs/:id
Update job status. Valid statuses: queued, running, completed, failed, cancelled, blocked.
▸ Request body
Field         Type    Required  Description
status        string  required  New status value
block_reason  string  optional  Reason string when status = "blocked"
curl -X PATCH -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"status": "completed"}' \
  https://www.gpuforg.com/api/jobs/42
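
A client-side guard for the status list above might look like the following. The function status_update_body is hypothetical; dropping block_reason for statuses other than blocked is a choice made for this sketch, not documented server behavior.

```python
VALID_STATUSES = {"queued", "running", "completed", "failed", "cancelled", "blocked"}


def status_update_body(status, block_reason=None):
    """Build the PATCH body, rejecting statuses outside the documented list."""
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    body = {"status": status}
    # block_reason is documented for status = "blocked"; it is omitted otherwise.
    if status == "blocked" and block_reason:
        body["block_reason"] = block_reason
    return body


completed = status_update_body("completed")
blocked = status_update_body("blocked", block_reason="awaiting quota increase")
```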
GET
/api/dashboard/scheduling-metrics
Queue depth, wait times, throughput (24h), and live active queue with per-cluster breakdown.
curl -H "Authorization: Bearer YOUR_KEY" \
  https://www.gpuforg.com/api/dashboard/scheduling-metrics
GET
/api/scheduler/logs
Recent scheduler cycle logs — what was processed, started, blocked, and why.
curl -H "Authorization: Bearer YOUR_KEY" \
  "https://www.gpuforg.com/api/scheduler/logs?limit=20"

Billing

Invoice generation via Stripe, billing summary, and invoice status management.

POST
/api/tenants/:id/invoice
Calculate usage cost for a tenant and generate a Stripe payment link. Idempotent — returns existing pending invoice if one exists for the period.
▸ Query parameters
Field  Type     Required  Description
days   integer  optional  Billing period in days, 1–90 (default: 30)
curl -X POST -H "Authorization: Bearer YOUR_KEY" \
  "https://www.gpuforg.com/api/tenants/1/invoice?days=30"
▸ Response example
{
  "invoice": {
    "id": 7,
    "status": "pending",
    "stripe_payment_url": "https://buy.stripe.com/..."
  },
  "total_gpu_hours": 1842.5,
  "total_amount": 8770.25,
  "payment_url": "https://buy.stripe.com/..."
}
GET
/api/billing/summary
Aggregate billing KPIs — total invoiced, paid, pending, overdue — with per-tenant breakdown and last 20 invoices.
curl -H "Authorization: Bearer YOUR_KEY" \
  https://www.gpuforg.com/api/billing/summary
PATCH
/api/invoices/:id/status
Set the status of an invoice. Valid values: paid, pending, overdue.
curl -X PATCH -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"status": "paid"}' \
  https://www.gpuforg.com/api/invoices/7/status
GET
/api/rates
GPU rate card — $/hr per model. H100=$4.76, A100=$2.21, A6000=$1.28.
curl -H "Authorization: Bearer YOUR_KEY" \
  https://www.gpuforg.com/api/rates
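
Given the rate card above, a usage cost estimate is GPU-hours multiplied by the hourly rate for each model, summed across models. The sketch below assumes that simple formula and uses short model keys (H100, A100, A6000) as dictionary keys; the real response shape of /api/rates and the server's rounding rules are not shown in this reference.

```python
# Rates in $/hr per GPU, copied from the rate card description.
RATES_PER_GPU_HOUR = {"H100": 4.76, "A100": 2.21, "A6000": 1.28}


def usage_cost(gpu_hours_by_model):
    """Sum hours * rate per model, rounded to cents (assumed billing formula)."""
    total = sum(RATES_PER_GPU_HOUR[model] * hours
                for model, hours in gpu_hours_by_model.items())
    return round(total, 2)


estimate = usage_cost({"H100": 100, "A100": 50})  # 100 * 4.76 + 50 * 2.21
```

A model missing from the rate card raises KeyError in this sketch, which makes unpriced usage visible instead of silently billing it at zero.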

Alerts

Configurable fleet alerting. Rules are evaluated every 30s. Three built-in types: utilization low, quota breach, health degraded.

GET
/api/alerts/rules
List all alert rules with active event count per rule.
curl -H "Authorization: Bearer YOUR_KEY" \
  https://www.gpuforg.com/api/alerts/rules
▸ Response example
[
  {
    "id": 1,
    "rule_type": "idle_gpu",
    "threshold": 60,
    "enabled": true,
    "active_events": 0
  },
  {
    "id": 2,
    "rule_type": "quota_breach",
    "threshold": 90,
    "enabled": true,
    "active_events": 1
  }
]
PATCH
/api/alerts/rules/:id
Enable or disable an alert rule.
curl -X PATCH -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"enabled": false}' \
  https://www.gpuforg.com/api/alerts/rules/1
GET
/api/alerts/events
Recent alert events. Active (unresolved) events sorted first.
▸ Query parameters
Field  Type     Required  Description
limit  integer  optional  Max events to return, 1–100 (default: 50)
curl -H "Authorization: Bearer YOUR_KEY" \
  https://www.gpuforg.com/api/alerts/events

API Keys

Manage programmatic access keys. Raw keys are shown exactly once at creation — they are not stored and cannot be retrieved.

POST
/api/api-keys
Generate a new API key. The raw key (gfk_...) is returned once in the response — store it securely immediately.
▸ Request body
Field        Type     Required  Description
label        string   required  Human-readable label for this key
tenant_id    integer  optional  Associate key with a tenant
permissions  array    optional  Permission scopes: ["read", "write", "admin"] (default: ["read", "write"])
curl -X POST -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"label": "prod-metrics-ingestor", "permissions": ["read", "write"]}' \
  https://www.gpuforg.com/api/api-keys
▸ Response example
{
  "id": 3,
  "key": "gfk_a3f8b1c2d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0",
  "key_prefix": "gfk_a3f8b1c2",
  "label": "prod-metrics-ingestor",
  "permissions": ["read", "write"],
  "created_at": "2026-05-09T00:30:00Z",
  "revoked_at": null
}
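
The key_prefix in the response example is the first 12 characters of the raw key, which the sketch below reproduces. The fingerprint function shows one common way a service can verify keys without storing them (keeping only a hash); that part is an assumption about a typical pattern, not a description of how GPUForge handles keys.

```python
import hashlib

# Raw key copied from the response example in this section.
EXAMPLE_KEY = "gfk_a3f8b1c2d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0"


def key_prefix(raw_key, length=12):
    """Return the displayable prefix of a raw key."""
    return raw_key[:length]


def key_fingerprint(raw_key):
    """SHA-256 hex digest of the key; an assumed storage scheme for illustration."""
    return hashlib.sha256(raw_key.encode("utf-8")).hexdigest()


prefix = key_prefix(EXAMPLE_KEY)
fingerprint = key_fingerprint(EXAMPLE_KEY)
```

The prefix is enough to tell keys apart in a listing, while the raw key itself never needs to be shown again.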
GET
/api/api-keys
List all API keys. Shows prefix (first 12 chars), permissions, and last used time. Raw keys are never returned.
curl -H "Authorization: Bearer YOUR_KEY" \
  https://www.gpuforg.com/api/api-keys
DELETE
/api/api-keys/:id
Revoke an API key (soft delete). Revoked keys return 401 immediately. Audit trail is preserved.
curl -X DELETE -H "Authorization: Bearer YOUR_KEY" \
  https://www.gpuforg.com/api/api-keys/3

Dashboard

Executive KPIs, fleet overview, revenue analytics, and revenue leakage detection.

GET
/api/dashboard/kpis
Executive KPIs: Fleet Utilization%, Revenue Leakage%, Quota Utilization (with per-tenant breakdown), Fleet Health, Active Jobs.
curl -H "Authorization: Bearer YOUR_KEY" \
  https://www.gpuforg.com/api/dashboard/kpis
▸ Response example
{
  "fleet_utilization": {
    "percentage": 73.6,
    "active_gpus": 26,
    "total_gpus": 36,
    "status": "good"
  },
  "revenue_leakage": {
    "percentage": 12.4,
    "idle_allocated_gpus": 5,
    "direction": "down"
  },
  "quota_utilization": {
    "percentage": 68.4,
    "gpus_in_use": 26,
    "total_quota": 38
  },
  "fleet_health": {
    "healthy": 26,
    "total": 36,
    "status": "has_issues"
  },
  "active_jobs": {
    "running": 4,
    "queued": 3,
    "blocked": 2
  }
}
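
The quota_utilization figures in the response example are consistent with a plain ratio of gpus_in_use to total_quota, rounded to one decimal place. The sketch below reproduces that number; whether the server computes it exactly this way is an assumption, and the helper name pct is hypothetical. The other KPI percentages may be derived differently and are not modeled here.

```python
def pct(part, whole):
    """Percentage rounded to one decimal; 0.0 when the denominator is zero."""
    return round(100.0 * part / whole, 1) if whole else 0.0


# Values taken from the quota_utilization object in the response example.
quota_utilization = pct(26, 38)
```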
GET
/api/dashboard
Full fleet overview — node/GPU topology with latest metrics, stats summary, and 20 recent jobs.
curl -H "Authorization: Bearer YOUR_KEY" \
  https://www.gpuforg.com/api/dashboard
GET
/api/dashboard/revenue
Revenue analytics — per-tenant, by GPU type, daily trend, and total. Configurable lookback window.
▸ Query parameters
Field  Type     Required  Description
days   integer  optional  Lookback window, 1–90 (default: 30)
curl -H "Authorization: Bearer YOUR_KEY" \
  "https://www.gpuforg.com/api/dashboard/revenue?days=30"
GET
/api/dashboard/leakage
Revenue leakage detection — idle allocated GPUs (<10% util in 30m), metered vs billed gaps (>5% discrepancy flagged), underutilized tenants (<30% avg in 24h).
curl -H "Authorization: Bearer YOUR_KEY" \
  https://www.gpuforg.com/api/dashboard/leakage
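
The three leakage checks named in the endpoint description can be written as simple predicates. The thresholds come from that description; everything else is an assumption for illustration, including the function names, the choice to measure the billing gap relative to metered hours, and the idea that the caller already holds averages for the 30-minute and 24-hour windows.

```python
IDLE_UTIL_PCT = 10.0      # allocated GPU below 10% utilization over 30 minutes
BILLING_GAP_PCT = 5.0     # metered vs billed discrepancy above 5% is flagged
UNDERUSED_AVG_PCT = 30.0  # tenant averaging below 30% utilization over 24 hours


def is_idle_allocated(avg_util_30m, allocated):
    """True when a GPU is allocated to a job but nearly unused."""
    return allocated and avg_util_30m < IDLE_UTIL_PCT


def billing_gap_flagged(metered_hours, billed_hours):
    """True when billed hours differ from metered hours by more than 5% of metered."""
    if metered_hours == 0:
        return billed_hours != 0
    gap_pct = abs(metered_hours - billed_hours) / metered_hours * 100.0
    return gap_pct > BILLING_GAP_PCT


def is_underutilized_tenant(avg_util_24h):
    """True when a tenant's 24-hour average utilization is below 30%."""
    return avg_util_24h < UNDERUSED_AVG_PCT
```

For example, 100 metered hours against 94 billed hours is a 6% gap and would be flagged, while 96 billed hours would not.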