GPUForge API Reference

GPUForge Fleet Orchestration API · v1

Authentication

All API endpoints require a Bearer token. Use your API key in every request.

Authorization: Bearer YOUR_API_KEY

Step 1

Generate a key

Go to the API Keys tab in the dashboard and click "Generate Key". The raw key is shown once — copy it immediately.

Step 2

Add the header

Include Authorization: Bearer <key> in every API request. Requests that omit the header run in demo mode, which is read-only.

Step 3

Revoke when done

Keys can be revoked any time from the dashboard. Revoked keys return 401 immediately and the audit trail is preserved.
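The three steps above come down to attaching one header to each request. The following is a minimal Python sketch, assuming the host used in the curl examples on this page; build_request is a hypothetical helper, not part of any GPUForge SDK. It only constructs the request, so nothing is sent until you pass the result to urllib.request.urlopen.

```python
import json
import urllib.request

BASE_URL = "https://www.gpuforg.com"  # host taken from the curl examples on this page


def build_request(api_key, path, method="GET", body=None):
    """Build (but do not send) an authenticated request. Hypothetical helper."""
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if body is not None:
        # JSON bodies carry an explicit Content-Type, as in the curl examples.
        headers["Content-Type"] = "application/json"
        data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


req = build_request("YOUR_API_KEY", "/api/clusters")
print(req.get_method(), req.full_url)
print(req.get_header("Authorization"))
```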

GPU Management

Manage GPU clusters, compute nodes, GPU registration, and metric ingestion.

GET

/api/clusters

List all GPU clusters with live job counts and GPU totals.

curl -H "Authorization: Bearer YOUR_KEY" \
  https://www.gpuforg.com/api/clusters

▸ Response example

[
  {
    "id": 1,
    "name": "us-east-prod",
    "region": "us-east-1",
    "status": "active",
    "node_count": 6,
    "gpu_count": 36,
    "active_jobs": 3,
    "queued_jobs": 2
  }
]

POST

/api/clusters

Create a new GPU cluster. Returns the cluster record with a generated API key.

▸ Request body

Field	Type	Required	Description
name	string	required	Human-readable cluster name
region	string	optional	AWS-style region code, e.g. "us-east-1"

curl -X POST -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name": "my-cluster", "region": "us-west-2"}' \
  https://www.gpuforg.com/api/clusters

GET

/api/clusters/:id

Get full cluster details including nodes and recent jobs.

curl -H "Authorization: Bearer YOUR_KEY" \
  https://www.gpuforg.com/api/clusters/1

POST

/api/clusters/nodes

Register a compute node under an existing cluster.

▸ Request body

Field	Type	Required	Description
cluster_id	integer	required	Parent cluster ID
hostname	string	required	Unique node hostname
ip_address	string	optional	Node IP address
total_gpus	integer	optional	Number of GPUs on this node
cpu_cores	integer	optional	CPU core count
ram_gb	integer	optional	RAM in GB

curl -X POST -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"cluster_id": 1, "hostname": "gpu-01", "ip_address": "10.0.1.10", "total_gpus": 8, "cpu_cores": 128, "ram_gb": 1024}' \
  https://www.gpuforg.com/api/clusters/nodes

POST

/api/clusters/gpus

Register one or more GPUs on an existing node.

▸ Request body

Field	Type	Required	Description
node_id	integer	required	Parent node ID
gpus	array	required	Array of GPU objects (see below)
gpus[].gpu_index	integer	optional	0-based GPU index on the node
gpus[].model	string	optional	GPU model string, e.g. "NVIDIA H100 80GB"
gpus[].vram_mb	integer	optional	VRAM in MB

curl -X POST -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"node_id": 1, "gpus": [{"gpu_index": 0, "model": "NVIDIA H100 80GB", "vram_mb": 81920}, {"gpu_index": 1, "model": "NVIDIA H100 80GB", "vram_mb": 81920}]}' \
  https://www.gpuforg.com/api/clusters/gpus
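For nodes with many identical GPUs, writing the gpus array by hand gets repetitive. A small Python sketch that generates the request body, assuming every GPU on the node shares one model and VRAM size; gpu_registration_payload is a hypothetical helper name.

```python
import json


def gpu_registration_payload(node_id, gpu_count, model, vram_mb):
    """Build a POST /api/clusters/gpus body for a node with identical GPUs."""
    gpus = [
        {"gpu_index": index, "model": model, "vram_mb": vram_mb}
        for index in range(gpu_count)  # gpu_index is 0-based, per the field table
    ]
    return {"node_id": node_id, "gpus": gpus}


payload = gpu_registration_payload(1, 8, "NVIDIA H100 80GB", 81920)
print(json.dumps(payload, indent=2))
```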

POST

/api/clusters/metrics

Bulk ingest GPU telemetry. Each GPU's status is updated from its reported utilization: above the 10% threshold it is marked active, otherwise idle.

▸ Request body

Field	Type	Required	Description
metrics	array	required	Array of metric objects
metrics[].gpu_id	integer	required	GPU ID to record metrics for
metrics[].utilization_pct	number	optional	GPU utilization 0–100
metrics[].memory_used_mb	integer	optional	VRAM used in MB
metrics[].memory_total_mb	integer	optional	Total VRAM in MB
metrics[].temperature_c	number	optional	Temperature in Celsius
metrics[].power_draw_w	number	optional	Power draw in Watts

curl -X POST -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"metrics": [{"gpu_id": 1, "utilization_pct": 87.3, "memory_used_mb": 71680, "memory_total_mb": 81920, "temperature_c": 72.1, "power_draw_w": 410.5}]}' \
  https://www.gpuforg.com/api/clusters/metrics
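To predict how an ingested sample will affect GPU status, the threshold rule can be mirrored on the client. This Python sketch assumes that exactly 10% counts as idle (the description only says ">10%") and that status depends on the single submitted sample; the server's actual logic may differ.

```python
ACTIVE_THRESHOLD_PCT = 10.0  # threshold stated in the endpoint description


def classify_gpu(utilization_pct):
    """Return "active" above the threshold, else "idle" (boundary handling assumed)."""
    if not 0 <= utilization_pct <= 100:
        raise ValueError("utilization_pct must be between 0 and 100")
    return "active" if utilization_pct > ACTIVE_THRESHOLD_PCT else "idle"


samples = [
    {"gpu_id": 1, "utilization_pct": 87.3},
    {"gpu_id": 2, "utilization_pct": 4.0},
]
for sample in samples:
    print(sample["gpu_id"], classify_gpu(sample["utilization_pct"]))
```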

GET

/api/dashboard/gpu-allocation

GPU-to-job assignment map. Shows which GPUs are allocated to which running jobs with live utilization.

curl -H "Authorization: Bearer YOUR_KEY" \
  https://www.gpuforg.com/api/dashboard/gpu-allocation

Tenant Management

Multi-tenant GPU quota management. Each tenant has a GPU quota enforced at job submission and scheduling time.

GET

/api/tenants

List all tenants with live GPU-in-use counts and running/queued job totals.

curl -H "Authorization: Bearer YOUR_KEY" \
  https://www.gpuforg.com/api/tenants

▸ Response example

[
  {
    "id": 1,
    "name": "NovAI Labs",
    "plan_tier": "enterprise",
    "gpu_quota": 10,
    "gpus_in_use": 8,
    "running_jobs": 2,
    "queued_jobs": 1,
    "status": "active"
  }
]

POST

/api/tenants

Create a new tenant with GPU quota.

▸ Request body

Field	Type	Required	Description
name	string	required	Tenant display name
gpu_quota	integer	required	Max GPUs this tenant can run simultaneously (min 1)
contact_email	string	optional	Billing contact email
plan_tier	string	optional	Plan label: starter, pro, growth, enterprise (default: starter)
status	string	optional	active or inactive (default: active)

curl -X POST -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name": "Acme ML", "gpu_quota": 16, "contact_email": "billing@acme.ai", "plan_tier": "enterprise"}' \
  https://www.gpuforg.com/api/tenants

PUT

/api/tenants/:id

Update tenant fields. All fields optional — only provided fields are updated.

curl -X PUT -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"gpu_quota": 20, "plan_tier": "enterprise"}' \
  https://www.gpuforg.com/api/tenants/1
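Because only the provided fields are updated, a client should leave out anything it does not intend to change. A Python sketch of building such a partial body, assuming the updatable fields are the same ones accepted by POST /api/tenants (the page does not list them for PUT); tenant_update_body is a hypothetical helper.

```python
# Assumed to match the POST /api/tenants field table.
ALLOWED_FIELDS = ("name", "gpu_quota", "contact_email", "plan_tier", "status")


def tenant_update_body(**changes):
    """Return a body containing only the fields that were explicitly set."""
    unknown = set(changes) - set(ALLOWED_FIELDS)
    if unknown:
        raise ValueError(f"unknown tenant fields: {sorted(unknown)}")
    # None means "leave unchanged", so those keys are omitted entirely.
    return {field: value for field, value in changes.items() if value is not None}


print(tenant_update_body(gpu_quota=20, plan_tier="enterprise", status=None))
```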

DELETE

/api/tenants/:id

Delete a tenant. Unlinks all associated jobs (sets tenant_id to null) before deletion.

curl -X DELETE -H "Authorization: Bearer YOUR_KEY" \
  https://www.gpuforg.com/api/tenants/1

GET

/api/tenants/:id/usage

Usage events and cost breakdown for a tenant, aggregated by GPU type and by day.

▸ Query parameters

Field	Type	Required	Description
days	integer	optional	Lookback window in days, 1–90 (default: 30)

curl -H "Authorization: Bearer YOUR_KEY" \
  "https://www.gpuforg.com/api/tenants/1/usage?days=30"
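The days parameter is limited to 1–90 here and on the invoice and revenue endpoints, so it can be worth validating before a request goes out. A Python sketch, assuming the host from the curl examples; with_days is a hypothetical helper, and how the server treats out-of-range values is not documented on this page.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.gpuforg.com"  # host taken from the curl examples on this page


def with_days(path, days=30):
    """Return the full URL for `path` with a validated `days` query parameter."""
    if not isinstance(days, int) or not 1 <= days <= 90:
        raise ValueError("days must be an integer between 1 and 90")
    query = urlencode({"days": days})
    return f"{BASE_URL}{path}?{query}"


print(with_days("/api/tenants/1/usage"))
print(with_days("/api/dashboard/revenue", days=7))
```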

Job Management

Submit and manage GPU workloads. The scheduler runs every 30s to assign idle GPUs and enforce quotas.

POST

/api/jobs

Submit a job. Validates tenant quota and cluster capacity — jobs exceeding limits are inserted as "blocked" (not rejected).

▸ Request body

Field	Type	Required	Description
cluster_id	integer	required	Target cluster ID
name	string	required	Job display name
gpu_count	integer	optional	Number of GPUs to request (default: 1)
gpu_type	string	optional	GPU model preference
priority	integer	optional	1 = highest priority (default: 5)
tenant_id	integer	optional	Tenant to bill and quota-check against
submitted_by	string	optional	Submitter identifier for audit trail
estimated_duration_min	integer	optional	Estimated runtime in minutes

curl -X POST -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"cluster_id": 1, "name": "llama-3-70b-finetune", "gpu_count": 8, "gpu_type": "NVIDIA H100 80GB", "priority": 1, "tenant_id": 1}' \
  https://www.gpuforg.com/api/jobs

▸ Response example

{
  "id": 42,
  "name": "llama-3-70b-finetune",
  "status": "queued",
  "gpu_count": 8,
  "priority": 1,
  "block_reason": null,
  "submitted_at": "2026-05-09T00:30:00Z"
}
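The submission check described above can be pictured as two comparisons: one against the tenant quota and one against cluster capacity. The Python sketch below is illustrative only. It assumes "capacity" means the cluster's total GPU count, and the block_reason strings are invented for the example; the server's real rules and messages are not documented here.

```python
def admission_status(gpu_count, tenant_gpus_in_use, tenant_gpu_quota, cluster_total_gpus):
    """Return (status, block_reason) for a new job. Illustrative logic only."""
    if tenant_gpus_in_use + gpu_count > tenant_gpu_quota:
        return "blocked", "tenant quota exceeded"  # hypothetical reason string
    if gpu_count > cluster_total_gpus:
        return "blocked", "exceeds cluster capacity"  # hypothetical reason string
    # Within limits: the job waits in the queue for the next scheduler cycle.
    return "queued", None


# A tenant with quota 10 and 8 GPUs in use asks for 4 more on a 36-GPU cluster.
print(admission_status(4, 8, 10, 36))
print(admission_status(2, 8, 10, 36))
```

Either way the job is stored rather than rejected, so a client should inspect status and block_reason in the response instead of relying on the HTTP status code alone.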

PATCH

/api/jobs/:id

Update job status. Valid statuses: queued, running, completed, failed, cancelled, blocked.

▸ Request body

Field	Type	Required	Description
status	string	required	New status value
block_reason	string	optional	Reason string when status = "blocked"

curl -X PATCH -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"status": "completed"}' \
  https://www.gpuforg.com/api/jobs/42
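Checking the status value before sending the PATCH avoids a round trip for an obvious typo. A Python sketch using the valid statuses listed above; job_status_body is a hypothetical helper, and dropping block_reason for non-blocked statuses is an assumption about what the server expects.

```python
VALID_STATUSES = {"queued", "running", "completed", "failed", "cancelled", "blocked"}


def job_status_body(status, block_reason=None):
    """Build a PATCH /api/jobs/:id body, rejecting unknown statuses early."""
    if status not in VALID_STATUSES:
        raise ValueError(f"invalid status: {status!r}")
    body = {"status": status}
    # block_reason is only meaningful when the job is being blocked.
    if status == "blocked" and block_reason:
        body["block_reason"] = block_reason
    return body


print(job_status_body("completed"))
print(job_status_body("blocked", block_reason="waiting for maintenance window"))
```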

GET

/api/dashboard/scheduling-metrics

Queue depth, wait times, throughput (24h), and live active queue with per-cluster breakdown.

curl -H "Authorization: Bearer YOUR_KEY" \
  https://www.gpuforg.com/api/dashboard/scheduling-metrics

GET

/api/scheduler/logs

Recent scheduler cycle logs — what was processed, started, blocked, and why.

curl -H "Authorization: Bearer YOUR_KEY" \
  "https://www.gpuforg.com/api/scheduler/logs?limit=20"

Billing

Invoice generation via Stripe, billing summary, and invoice status management.

POST

/api/tenants/:id/invoice

Calculate usage cost for a tenant and generate a Stripe payment link. Idempotent — returns existing pending invoice if one exists for the period.

▸ Query parameters

Field	Type	Required	Description
days	integer	optional	Billing period in days, 1–90 (default: 30)

curl -X POST -H "Authorization: Bearer YOUR_KEY" \
  "https://www.gpuforg.com/api/tenants/1/invoice?days=30"

▸ Response example

{
  "invoice": {
    "id": 7,
    "status": "pending",
    "stripe_payment_url": "https://buy.stripe.com/..."
  },
  "total_gpu_hours": 1842.5,
  "total_amount": 8770.25,
  "payment_url": "https://buy.stripe.com/..."
}

GET

/api/billing/summary

Aggregate billing KPIs — total invoiced, paid, pending, overdue — with per-tenant breakdown and last 20 invoices.

curl -H "Authorization: Bearer YOUR_KEY" \
  https://www.gpuforg.com/api/billing/summary

PATCH

/api/invoices/:id/status

Set the status of an invoice. Valid values: paid, pending, overdue.

curl -X PATCH -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"status": "paid"}' \
  https://www.gpuforg.com/api/invoices/7/status

GET

/api/rates

GPU rate card — $/hr per model. H100=$4.76, A100=$2.21, A6000=$1.28.

curl -H "Authorization: Bearer YOUR_KEY" \
  https://www.gpuforg.com/api/rates
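With the rate card, a rough cost estimate is GPU-hours multiplied by the hourly rate, summed over models. The Python sketch below hard-codes the three rates quoted above and assumes the short model names as keys; the actual response shape of /api/rates is not shown on this page, and real invoices may apply rounding or adjustments not modeled here.

```python
from decimal import Decimal

# Rates quoted in the endpoint description, in $/hr per GPU model.
RATES_PER_HOUR = {
    "H100": Decimal("4.76"),
    "A100": Decimal("2.21"),
    "A6000": Decimal("1.28"),
}


def estimate_cost(gpu_hours_by_model):
    """Sum hours * rate per model and round to cents. Estimate only."""
    total = Decimal("0")
    for model, hours in gpu_hours_by_model.items():
        # Convert via str so float inputs such as 12.5 stay exact in Decimal.
        total += RATES_PER_HOUR[model] * Decimal(str(hours))
    return total.quantize(Decimal("0.01"))


print(estimate_cost({"H100": 100, "A100": 50}))  # 476.00 + 110.50 = 586.50
```

Decimal is used instead of float so that cent values do not pick up binary rounding noise.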

Alerts

Configurable fleet alerting. Rules are evaluated every 30s. There are three built-in rule types: low utilization, quota breach, and degraded health.

GET

/api/alerts/rules

List all alert rules with active event count per rule.

curl -H "Authorization: Bearer YOUR_KEY" \
  https://www.gpuforg.com/api/alerts/rules

▸ Response example

[
  {
    "id": 1,
    "rule_type": "idle_gpu",
    "threshold": 60,
    "enabled": true,
    "active_events": 0
  },
  {
    "id": 2,
    "rule_type": "quota_breach",
    "threshold": 90,
    "enabled": true,
    "active_events": 1
  }
]

PATCH

/api/alerts/rules/:id

Enable or disable an alert rule.

curl -X PATCH -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"enabled": false}' \
  https://www.gpuforg.com/api/alerts/rules/1

GET

/api/alerts/events

Recent alert events. Active (unresolved) events sorted first.

▸ Query parameters

Field	Type	Required	Description
limit	integer	optional	Max events to return, 1–100 (default: 50)

curl -H "Authorization: Bearer YOUR_KEY" \
  https://www.gpuforg.com/api/alerts/events

API Keys

Manage programmatic access keys. Raw keys are shown exactly once at creation — they are not stored and cannot be retrieved.

POST

/api/api-keys

Generate a new API key. The raw key (gfk_...) is returned once in the response — store it securely immediately.

▸ Request body

Field	Type	Required	Description
label	string	required	Human-readable label for this key
tenant_id	integer	optional	Associate key with a tenant
permissions	array	optional	Permission scopes: ["read", "write", "admin"] (default: ["read", "write"])

curl -X POST -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"label": "prod-metrics-ingestor", "permissions": ["read", "write"]}' \
  https://www.gpuforg.com/api/api-keys

▸ Response example

{
  "id": 3,
  "key": "gfk_a3f8b1c2d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0",
  "key_prefix": "gfk_a3f8b1c2",
  "label": "prod-metrics-ingestor",
  "permissions": ["read", "write"],
  "created_at": "2026-05-09T00:30:00Z",
  "revoked_at": null
}
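Since the raw key cannot be retrieved later, it is safer to log only its prefix. In the example response, key_prefix is the first 12 characters of key, which the Python sketch below reproduces; mask_key is a hypothetical helper for client-side logging, not an API feature.

```python
PREFIX_LENGTH = 12  # length of key_prefix in the example response


def mask_key(raw_key):
    """Return a log-safe form of an API key: the prefix followed by an ellipsis."""
    return raw_key[:PREFIX_LENGTH] + "..."


raw = "gfk_a3f8b1c2d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0"
print(mask_key(raw))  # gfk_a3f8b1c2...
```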

GET

/api/api-keys

List all API keys. Shows prefix (first 12 chars), permissions, and last used time. Raw keys are never returned.

curl -H "Authorization: Bearer YOUR_KEY" \
  https://www.gpuforg.com/api/api-keys

DELETE

/api/api-keys/:id

Revoke an API key (soft delete). Revoked keys return 401 immediately. Audit trail is preserved.

curl -X DELETE -H "Authorization: Bearer YOUR_KEY" \
  https://www.gpuforg.com/api/api-keys/3

Dashboard

Executive KPIs, fleet overview, revenue analytics, and revenue leakage detection.

GET

/api/dashboard/kpis

Executive KPIs: Fleet Utilization%, Revenue Leakage%, Quota Utilization (with per-tenant breakdown), Fleet Health, Active Jobs.

curl -H "Authorization: Bearer YOUR_KEY" \
  https://www.gpuforg.com/api/dashboard/kpis

▸ Response example

{
  "fleet_utilization": {
    "percentage": 73.6,
    "active_gpus": 26,
    "total_gpus": 36,
    "status": "good"
  },
  "revenue_leakage": {
    "percentage": 12.4,
    "idle_allocated_gpus": 5,
    "direction": "down"
  },
  "quota_utilization": {
    "percentage": 68.4,
    "gpus_in_use": 26,
    "total_quota": 38
  },
  "fleet_health": {
    "healthy": 26,
    "total": 36,
    "status": "has_issues"
  },
  "active_jobs": {
    "running": 4,
    "queued": 3,
    "blocked": 2
  }
}
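A client can derive a few summary numbers from this payload. The Python sketch below works on a trimmed copy of the example response and assumes quota_utilization.percentage is gpus_in_use divided by total_quota, which matches the example values; how the other percentages are computed is not documented here.

```python
import json

# Trimmed copy of the response example above.
kpis = json.loads("""
{
  "quota_utilization": {"percentage": 68.4, "gpus_in_use": 26, "total_quota": 38},
  "active_jobs": {"running": 4, "queued": 3, "blocked": 2}
}
""")


def percentage(part, total):
    """Return part/total as a percentage rounded to one decimal place."""
    return round(100 * part / total, 1) if total else 0.0


quota = kpis["quota_utilization"]
print(percentage(quota["gpus_in_use"], quota["total_quota"]))  # 68.4
print(sum(kpis["active_jobs"].values()))  # 9 jobs across all three states
```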

GET

/api/dashboard

Full fleet overview — node/GPU topology with latest metrics, stats summary, and 20 recent jobs.

curl -H "Authorization: Bearer YOUR_KEY" \
  https://www.gpuforg.com/api/dashboard

GET

/api/dashboard/revenue

Revenue analytics — per-tenant, by GPU type, daily trend, and total. Configurable lookback window.

▸ Query parameters

Field	Type	Required	Description
days	integer	optional	Lookback window, 1–90 (default: 30)

curl -H "Authorization: Bearer YOUR_KEY" \
  "https://www.gpuforg.com/api/dashboard/revenue?days=30"

GET

/api/dashboard/leakage

Revenue leakage detection. Flags three conditions: allocated GPUs that are idle (<10% utilization over 30 minutes), gaps between metered and billed usage (discrepancy >5%), and underutilized tenants (<30% average utilization over 24 hours).

curl -H "Authorization: Bearer YOUR_KEY" \
  https://www.gpuforg.com/api/dashboard/leakage
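The three leakage checks are simple threshold comparisons. The Python sketch below restates them using the thresholds from the description. It assumes the billing discrepancy is measured relative to metered hours and that averages are already computed over the stated windows; the function and field names are hypothetical.

```python
IDLE_UTIL_PCT = 10.0          # allocated GPU below this over 30 minutes is idle
BILLING_GAP_PCT = 5.0         # metered vs billed discrepancy above this is flagged
UNDERUTILIZED_PCT = 30.0      # tenant 24h average below this is underutilized


def billing_gap_pct(metered_hours, billed_hours):
    """Discrepancy as a percentage of metered hours (assumed definition)."""
    if metered_hours == 0:
        return 0.0
    return abs(metered_hours - billed_hours) / metered_hours * 100


def leakage_flags(gpu_avg_util_30m, metered_hours, billed_hours, tenant_avg_util_24h):
    """Evaluate the three documented leakage conditions for one set of inputs."""
    return {
        "idle_allocated": gpu_avg_util_30m < IDLE_UTIL_PCT,
        "billing_gap": billing_gap_pct(metered_hours, billed_hours) > BILLING_GAP_PCT,
        "underutilized_tenant": tenant_avg_util_24h < UNDERUTILIZED_PCT,
    }


print(leakage_flags(4.0, 100, 90, 45.0))
```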
