Workload API

FEATURE STATE: Kubernetes v1.35 [alpha](disabled by default)

The Workload API resource allows you to describe the scheduling requirements and structure of a multi-Pod application. While workload controllers provide the runtime behavior for workloads, the Workload API expresses scheduling constraints for the actual workload objects, such as Job and others.

What is a Workload?

The Workload API resource is part of the scheduling.k8s.io/v1alpha1 API group (and your cluster must have that API group enabled, as well as the GenericWorkload feature gate, before you can benefit from this API). This resource acts as a structured, machine-readable definition of the scheduling requirements of a multi-Pod application. While user-facing workloads like Jobs define what to run, the Workload resource determines how a group of Pods should be scheduled and how its placement should be managed throughout its lifecycle.

API structure

A Workload allows you to define a group of Pods and apply a scheduling policy to them. It consists of two sections: a list of pod groups and a reference to a controller.

Pod groups

The podGroups list defines the distinct components of your workload. For example, a machine learning job might have a driver group and a worker group.

Each entry in podGroups must have:

  1. A unique name that can be used in the Pod's Workload reference.
  2. A scheduling policy (basic or gang).

apiVersion: scheduling.k8s.io/v1alpha1
kind: Workload
metadata:
  name: training-job-workload
  namespace: some-ns
spec:
  controllerRef:
    apiGroup: batch
    kind: Job
    name: training-job
  podGroups:
  - name: workers
    policy:
      gang:
        # The gang is schedulable only if 4 pods can run at once
        minCount: 4
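
Individual Pods join a pod group by referencing the Workload and the group name from their spec. The sketch below is illustrative only: the `workloadRef` field name and its layout are assumptions here, not confirmed API, and the image is a placeholder; check the API reference for your cluster's Kubernetes version.

apiVersion: v1
kind: Pod
metadata:
  name: worker-0
  namespace: some-ns
spec:
  # Hypothetical reference field; the actual field name and shape
  # may differ in your Kubernetes version.
  workloadRef:
    name: training-job-workload
    podGroup: workers
  containers:
  - name: trainer
    image: registry.example/trainer:latest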

Referencing a workload controlling object

The controllerRef field links the Workload back to the specific high-level object that defines the application, such as a Job or a custom resource. This reference is useful for observability and tooling; it is not used to schedule or manage the Workload.
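
A custom workload controller can point controllerRef at its own API type. The fragment below is a sketch; the example.com group, TrainingJob kind, and object name are made-up placeholders, not real APIs.

apiVersion: scheduling.k8s.io/v1alpha1
kind: Workload
metadata:
  name: llm-finetune-workload
  namespace: some-ns
spec:
  controllerRef:
    # Placeholder custom resource; substitute your controller's API type.
    apiGroup: example.com
    kind: TrainingJob
    name: llm-finetune
  podGroups:
  - name: workers
    policy:
      gang:
        minCount: 8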

Requesting DRA devices for a PodGroup

FEATURE STATE: Kubernetes v1.36 [alpha](disabled by default)

A PodGroup can request devices that are available through Dynamic Resource Allocation (DRA) using its spec.resourceClaims field:

apiVersion: scheduling.k8s.io/v1alpha2
kind: PodGroup
metadata:
  name: training-group
  namespace: some-ns
spec:
  ...
  resourceClaims:
  - name: pg-claim
    resourceClaimName: my-pg-claim
  - name: pg-claim-template
    resourceClaimTemplateName: my-pg-template

ResourceClaims associated with PodGroups can be shared by more than 256 Pods. ResourceClaims can also be generated from ResourceClaimTemplates for each PodGroup, allowing the devices allocated to each generated ResourceClaim to be shared by the Pods in each PodGroup.

For more details and a more complete example, see the DRA documentation.

What's next