Authentication
All API endpoints require a Bearer token. Pass your API key as the Bearer token in the Authorization header of every request.
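
As a minimal sketch of what Bearer authentication looks like in practice, the snippet below builds (without sending) a request that carries the key in the Authorization header. The base URL, the `/clusters` path, and the `build_request` helper are placeholders for illustration, since the endpoint paths are not listed here.

```python
import urllib.request

# Hypothetical base URL; substitute the host your deployment actually uses.
BASE_URL = "https://api.example.com"


def build_request(path: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request carrying the Bearer token."""
    if not api_key:
        raise ValueError("api_key must be non-empty")
    return urllib.request.Request(
        BASE_URL + path,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


# "/clusters" is an assumed path used only to show the header being attached.
req = build_request("/clusters", "YOUR_API_KEY")
print(req.get_header("Authorization"))
```

Passing the returned object to `urllib.request.urlopen` would send it; the same header applies to every other call in this reference.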
GPU Management
Manage GPU clusters, compute nodes, GPU registration, and metric ingestion.
▸ Request body
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | required | Human-readable cluster name |
| region | string | optional | AWS-style region code, e.g. "us-east-1" |
▸ Request body
| Field | Type | Required | Description |
|---|---|---|---|
| cluster_id | integer | required | Parent cluster ID |
| hostname | string | required | Unique node hostname |
| ip_address | string | optional | Node IP address |
| total_gpus | integer | optional | Number of GPUs on this node |
| cpu_cores | integer | optional | CPU core count |
| ram_gb | integer | optional | RAM in GB |
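
A node registration body built from the fields above might look like the following sketch. The `node_payload` helper and the sample values are illustrative assumptions; only the field names come from the table.

```python
import json

# Optional fields listed in the request body table.
OPTIONAL_NODE_FIELDS = {"ip_address", "total_gpus", "cpu_cores", "ram_gb"}


def node_payload(cluster_id: int, hostname: str, **optional) -> str:
    """Serialize a node registration body, rejecting misspelled field names."""
    unknown = set(optional) - OPTIONAL_NODE_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    body = {"cluster_id": cluster_id, "hostname": hostname, **optional}
    return json.dumps(body)


# Sample values are made up for the example.
print(node_payload(1, "gpu-node-01", total_gpus=8, cpu_cores=96, ram_gb=512))
```

Checking field names on the client side catches typos such as `gpu_total` before the request is sent.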
▸ Request body
| Field | Type | Required | Description |
|---|---|---|---|
| node_id | integer | required | Parent node ID |
| gpus | array | required | Array of GPU objects (see below) |
| gpus[].gpu_index | integer | optional | 0-based GPU index on the node |
| gpus[].model | string | optional | GPU model string, e.g. "NVIDIA H100 80GB" |
| gpus[].vram_mb | integer | optional | VRAM in MB |
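
For a node whose GPUs are all the same model, the `gpus` array can be generated rather than typed by hand. This is a sketch under that assumption; `gpu_registration_body` is a hypothetical helper and the VRAM figure is an illustrative value.

```python
import json


def gpu_registration_body(node_id: int, model: str, vram_mb: int, count: int) -> dict:
    """Build a registration body for `count` identical GPUs on one node."""
    if count < 1:
        raise ValueError("count must be at least 1")
    # gpu_index is 0-based, matching the field description.
    gpus = [
        {"gpu_index": i, "model": model, "vram_mb": vram_mb}
        for i in range(count)
    ]
    return {"node_id": node_id, "gpus": gpus}


# 81920 MB is an illustrative value for an 80 GB card.
body = gpu_registration_body(node_id=7, model="NVIDIA H100 80GB", vram_mb=81920, count=8)
print(json.dumps(body, indent=2))
```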
▸ Request body
| Field | Type | Required | Description |
|---|---|---|---|
| metrics | array | required | Array of metric objects |
| metrics[].gpu_id | integer | required | GPU ID to record metrics for |
| metrics[].utilization_pct | number | optional | GPU utilization 0–100 |
| metrics[].memory_used_mb | integer | optional | VRAM used in MB |
| metrics[].memory_total_mb | integer | optional | Total VRAM in MB |
| metrics[].temperature_c | number | optional | Temperature in Celsius |
| metrics[].power_draw_w | number | optional | Power draw in Watts |
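
Because metrics arrive in batches, it can help to check each object before sending it. The sketch below validates the documented constraints (required `gpu_id`, utilization within 0–100). The comparison of used against total memory is an extra client-side sanity check assumed here, not a rule stated by the API, and `validate_metric` is a hypothetical helper.

```python
def validate_metric(metric: dict) -> list[str]:
    """Return a list of problems found in one metric object (empty if none)."""
    errors = []
    if not isinstance(metric.get("gpu_id"), int):
        errors.append("gpu_id is required and must be an integer")
    util = metric.get("utilization_pct")
    if util is not None and not 0 <= util <= 100:
        errors.append("utilization_pct must be between 0 and 100")
    # Assumed sanity check: the API reference does not state this rule.
    used, total = metric.get("memory_used_mb"), metric.get("memory_total_mb")
    if used is not None and total is not None and used > total:
        errors.append("memory_used_mb exceeds memory_total_mb")
    return errors


# Sample batch with made-up readings; the second entry is deliberately invalid.
batch = {
    "metrics": [
        {"gpu_id": 12, "utilization_pct": 87.5, "memory_used_mb": 40960,
         "memory_total_mb": 81920, "temperature_c": 64.0, "power_draw_w": 410.0},
        {"gpu_id": 13, "utilization_pct": 140},
    ]
}
problems = {m["gpu_id"]: validate_metric(m) for m in batch["metrics"]}
print(problems)
```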
Tenant Management
Multi-tenant GPU quota management. Each tenant has a GPU quota enforced at job submission and scheduling time.
▸ Request body
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | required | Tenant display name |
| gpu_quota | integer | required | Max GPUs this tenant can run simultaneously (min 1) |
| contact_email | string | optional | Billing contact email |
| plan_tier | string | optional | Plan label: starter, pro, growth, enterprise (default: starter) |
| status | string | optional | active or inactive (default: active) |
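
The constraints in this table (minimum quota of 1, a fixed set of plan tiers and statuses) can be checked before a tenant is created. The following is a sketch with a hypothetical `tenant_body` helper; optional fields that are left out are simply omitted so the documented defaults apply.

```python
PLAN_TIERS = {"starter", "pro", "growth", "enterprise"}
STATUSES = {"active", "inactive"}


def tenant_body(name, gpu_quota, contact_email=None, plan_tier=None, status=None) -> dict:
    """Build a tenant request body, validating the documented constraints."""
    if gpu_quota < 1:
        raise ValueError("gpu_quota must be at least 1")
    if plan_tier is not None and plan_tier not in PLAN_TIERS:
        raise ValueError(f"plan_tier must be one of {sorted(PLAN_TIERS)}")
    if status is not None and status not in STATUSES:
        raise ValueError(f"status must be one of {sorted(STATUSES)}")
    body = {"name": name, "gpu_quota": gpu_quota}
    optional = {"contact_email": contact_email, "plan_tier": plan_tier, "status": status}
    # Leave unset fields out so the defaults (starter, active) apply.
    body.update({k: v for k, v in optional.items() if v is not None})
    return body


# Tenant name and quota are made-up sample values.
print(tenant_body("Research Lab", gpu_quota=16, plan_tier="pro"))
```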
▸ Query parameters
| Field | Type | Required | Description |
|---|---|---|---|
| days | integer | optional | Lookback window in days, 1–90 (default: 30) |
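
A small sketch of building the `days` query string with the documented 1–90 range enforced on the client. The `days_query` helper is hypothetical; append its result to whichever endpoint accepts this parameter.

```python
from urllib.parse import urlencode


def days_query(days: int = 30) -> str:
    """Return the encoded query string for a lookback window of `days`."""
    if not 1 <= days <= 90:
        raise ValueError("days must be between 1 and 90")
    return urlencode({"days": days})


print(days_query(7))  # days=7
```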
Job Management
Submit and manage GPU workloads. The scheduler runs every 30s to assign idle GPUs and enforce quotas.
▸ Request body
| Field | Type | Required | Description |
|---|---|---|---|
| cluster_id | integer | required | Target cluster ID |
| name | string | required | Job display name |
| gpu_count | integer | optional | Number of GPUs to request (default: 1) |
| gpu_type | string | optional | GPU model preference |
| priority | integer | optional | 1 = highest priority (default: 5) |
| tenant_id | integer | optional | Tenant to bill and quota-check against |
| submitted_by | string | optional | Submitter identifier for audit trail |
| estimated_duration_min | integer | optional | Estimated runtime in minutes |
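
The sketch below builds job submission bodies and then sorts them to illustrate the priority convention, where a lower number means higher priority and an omitted value counts as 5. The `job_body` helper and job names are made up, and the sort only demonstrates the numbering; it is not a description of how the scheduler actually orders work.

```python
OPTIONAL_JOB_FIELDS = {
    "gpu_count", "gpu_type", "priority", "tenant_id",
    "submitted_by", "estimated_duration_min",
}


def job_body(cluster_id: int, name: str, **optional) -> dict:
    """Build a job submission body from the documented fields."""
    unknown = set(optional) - OPTIONAL_JOB_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    return {"cluster_id": cluster_id, "name": name, **optional}


jobs = [
    job_body(1, "nightly-eval", priority=3),
    job_body(1, "scratch-run"),  # priority omitted, so the default of 5 applies
    job_body(1, "prod-finetune", gpu_count=4, priority=1),
]

# 1 = highest priority, so an ascending sort puts the most urgent job first.
by_priority = sorted(jobs, key=lambda job: job.get("priority", 5))
print([job["name"] for job in by_priority])
```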
▸ Request body
| Field | Type | Required | Description |
|---|---|---|---|
| status | string | required | New status value |
| block_reason | string | optional | Reason string when status = "blocked" |
Billing
Invoice generation via Stripe, billing summary, and invoice status management.
▸ Query parameters
| Field | Type | Required | Description |
|---|---|---|---|
| days | integer | optional | Billing period in days, 1–90 (default: 30) |
Alerts
Configurable fleet alerting. Rules are evaluated every 30s. Three built-in types: utilization low, quota breach, health degraded.
▸ Query parameters
| Field | Type | Required | Description |
|---|---|---|---|
| limit | integer | optional | Max events to return, 1–100 (default: 50) |
API Keys
Manage programmatic access keys. Raw keys are shown exactly once at creation — they are not stored and cannot be retrieved.
The raw key (gfk_...) is returned once in the response; store it securely immediately.
▸ Request body
| Field | Type | Required | Description |
|---|---|---|---|
| label | string | required | Human-readable label for this key |
| tenant_id | integer | optional | Associate key with a tenant |
| permissions | array | optional | Permission scopes: ["read", "write", "admin"] (default: ["read", "write"]) |
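
A key creation body can be checked against the three documented scopes before it is sent. The `api_key_body` helper and the label below are illustrative assumptions; when `permissions` is omitted, the documented default of ["read", "write"] applies.

```python
SCOPES = ("read", "write", "admin")


def api_key_body(label: str, tenant_id=None, permissions=None) -> dict:
    """Build an API key creation body, rejecting unknown permission scopes."""
    if not label:
        raise ValueError("label is required")
    body = {"label": label}
    if tenant_id is not None:
        body["tenant_id"] = tenant_id
    if permissions is not None:
        unknown = [p for p in permissions if p not in SCOPES]
        if unknown:
            raise ValueError(f"unknown permission scopes: {unknown}")
        body["permissions"] = list(permissions)
    return body


# A read-only key for a metrics exporter (label is a made-up example).
print(api_key_body("metrics-exporter", permissions=["read"]))
```

Requesting only the scopes a client needs limits the damage if the key leaks, which matters here because the raw key cannot be retrieved again after creation.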
Dashboard
Executive KPIs, fleet overview, revenue analytics, and revenue leakage detection.
▸ Query parameters
| Field | Type | Required | Description |
|---|---|---|---|
| days | integer | optional | Lookback window, 1–90 (default: 30) |