Comparing changes

base repository: databricks/cli
base: main
head repository: databricks/cli
compare: air-cli
  • 8 commits
  • 69 files changed
  • 2 contributors

Commits on Jun 17, 2026

  1. AIR CLI Integration: Scaffold experimental AIR CLI command package (#5564)
    
    ## Changes
    - New `cmd/experimental/air/` package containing an `air` parent command
    plus six stub subcommands: run, status, list, logs, cancel,
    register-image.
    - Currently, every subcommand returns an
    `air <cmd> is not implemented yet` error and carries representative
    flags mapped from the Python CLI. All are registered under the hidden
    experimental group.
    - tools/list_embeds.py: text=True was changed to universal_newlines=True
    so the acceptance harness runs on Python 3.6. General tooling fix.
    
    ## Why
    The AI runtime CLI ships today as a separately installed Python wheel
    with its own auth, output, and packaging. Folding it into the main Go
    CLI gives users one databricks install with consistent profiles,
    authentication, and -o json output, and removes a parallel toolchain to
    maintain. Landing the package scaffold first lets the individual
    commands be ported in small, reviewable PRs (status is next) instead of
    one large drop. Every stub is wired and navigable, so the command tree
    and registration are reviewable now without functional code.
    
    ## Tests
    - Unit (cmd/experimental/air/): New() registers all six subcommands;
    each stub returns the not-implemented error.
    - Acceptance (acceptance/experimental/air/unimplemented/): runs every
    stub end-to-end and asserts the message + non-zero exit.
    
    test with:
    `go test ./cmd/experimental/air/...`
    `go test ./acceptance -run 'TestAccept/experimental/air'`
    riddhibhagwat-db authored Jun 17, 2026
    Commit 997f6bb

Commits on Jun 18, 2026

  1. AIR CLI Integration: Implement the air get command (#5600)

    ## Changes
    Implements `databricks experimental air get RUN_ID`, the Go port of the
    Python `air get` command. It fetches the run via `Jobs.GetRun` and
    renders:
    - Core fields: run ID, status, submitted time, duration, retries,
    experiment, accelerators, creator (`User`), and the run's dashboard URL.
    - An MLflow deep-link, built from `jobs/runs/get-output` (the
    `gen_ai_compute_output` field is not modeled by the typed SDK, so it's
    fetched via a direct REST call).
    - For foreach/sweep runs, an iteration summary (counts + per-iteration
    table) instead of the single-run view.
    - The run's training-config YAML, downloaded from the workspace and
    printed before the status (text mode only).
    
    ## Why
    `get` is the first real command integrated from the air cli and it sets
    the conventions the rest of the CLI will follow. The `{v, ts, data}`
    envelope mirrors the Python CLI so existing machine consumers keep
    working. The implementation is a faithful port of `handle_status` + the
    `cli_display` helpers, verified field-by-field against the Python
    source:
    - The text view shows the foreach branch
    (`_display_foreach_sweep_status`) and the training-config panel
    (`_fetch_and_display_yaml_config`); JSON output omits both, exactly
    matching `air get <run> --json`.
    - MLflow IDs live under the unmodeled `gen_ai_compute_output` field, so
    they are read via a direct REST call. The MLflow link and YAML fetch
    are best-effort, matching the Python CLI.
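
    As an illustration of the `{v, ts, data}` envelope mentioned above, here
    is a minimal sketch. The struct name, the integer version, and the
    RFC 3339 timestamp format are assumptions made for the example; the
    source only names the three keys.

    ```go
    package main

    import (
    	"encoding/json"
    	"fmt"
    	"time"
    )

    // envelope is a hypothetical Go shape for the {v, ts, data} wrapper.
    // The field types are assumptions, not taken from the real CLI.
    type envelope struct {
    	V    int    `json:"v"`
    	TS   string `json:"ts"`
    	Data any    `json:"data"`
    }

    // wrap places an arbitrary payload inside the envelope.
    func wrap(data any, now time.Time) envelope {
    	return envelope{V: 1, TS: now.UTC().Format(time.RFC3339), Data: data}
    }

    func main() {
    	env := wrap(map[string]any{"run_id": 123, "status": "RUNNING"}, time.Now())
    	out, err := json.MarshalIndent(env, "", "  ")
    	if err != nil {
    		panic(err)
    	}
    	fmt.Println(string(out))
    }
    ```

    Keeping the payload under a single `data` key means a consumer can check
    `v` before touching anything command-specific.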
    
    ## Tests
    - Unit tests cover every formatting/extraction helper, `buildGetData`,
    and all template branches (single-run minimal/all-fields, sweep,
    sweep-with-no-tasks).
    - Mock-backed unit tests (mirroring the Python `unittest.mock` suite)
    cover `buildSweepInfo`, `printConfigYAML`, `mlflowURL` (over `httptest`,
    since it bypasses the typed SDK), and the `RunE` invalid-id / not-found
    branches.
    - An acceptance test (`acceptance/experimental/air/get`) runs the
    command end-to-end against a stubbed Jobs API: text output, `-o json`,
    and an invalid run ID.
    
    
    Manual verification outputs: 
    
    Successful run: 
    <img width="1529" height="74" alt="Screenshot 2026-06-17 at 1 17 30 PM"
    src="https://github.com/user-attachments/assets/ee10167e-52b2-4998-98af-4e9bb169b010"
    />
    <img width="1529" height="215" alt="Screenshot 2026-06-17 at 1 16 48 PM"
    src="https://github.com/user-attachments/assets/888fd89e-2e5b-450e-8d45-a87afef3b005"
    />
    
    <img width="1517" height="362" alt="Screenshot 2026-06-17 at 11 56
    00 AM"
    src="https://github.com/user-attachments/assets/008c90a4-f753-4646-b995-a9cbc40176fe"
    />
    <img width="1529" height="295" alt="Screenshot 2026-06-17 at 2 05 21 PM"
    src="https://github.com/user-attachments/assets/37da6e6c-efe9-494e-96df-dbcf392f7a17"
    />
    
    Failed run: 
    <img width="1529" height="212" alt="Screenshot 2026-06-17 at 1 11 31 PM"
    src="https://github.com/user-attachments/assets/0f15bb4d-8c89-42d4-808e-b432a7f317e4"
    />
    <img width="1529" height="59" alt="Screenshot 2026-06-17 at 1 13 22 PM"
    src="https://github.com/user-attachments/assets/d3fa5390-9e3b-4b42-9a71-e1eb1a7d4975"
    />
    
    <img width="1529" height="403" alt="Screenshot 2026-06-17 at 1 15 52 PM"
    src="https://github.com/user-attachments/assets/b8c3eb62-1ef6-4633-9104-3e99d34340d0"
    />
    <img width="1529" height="338" alt="Screenshot 2026-06-17 at 2 04 48 PM"
    src="https://github.com/user-attachments/assets/1a34ce4f-025b-4139-8f0a-0f40e16bba6c"
    />
    riddhibhagwat-db authored Jun 18, 2026
    Commit b952417
  2. AIR CLI Integration: air run Command Pt. 1 - Add GPU accelerator type and compute config model (#5602)
    
    ## Changes
    Adds `experimental/air/cmd/compute.go`, which holds the `gpuType` model
    and the `compute` block validation that the `air run` configuration
    layer depends on.
    Specifically: 
    - Adds the training service accelerator types (`GPU_1xA10`,
    `GPU_8xH100`, `GPU_1xH100`).
    - `parseGPUType` resolves a YAML accelerator type string to a `gpuType`.
    - `gpusPerNode` returns the per-node accelerator count, derived from the
    type name.
    - `computeConfig` and `validate()` port the Python `ComputeConfig`
    validators.
    
    ## Why
    This is the first, leaf-most piece of the `air run` port and the root of
    the config validation layer's dependencies. The compute model depends on
    nothing else, so it lands first as a small, fully unit-tested unit.
    Parsing is exact and case-sensitive, since a typo in the user's YAML
    could otherwise misroute the run. Only `GPU_*` training service types
    are supported; legacy MAPI types (e.g. `h100_80gb`) are intentionally
    dropped in this port, so no new run can use the MAPI path. Rendering
    historical runs in `get` is unaffected, because format.go keeps its own
    display map for the legacy types.
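
    The parsing and validation rules described above could look roughly like
    this. It is a sketch, not the contents of compute.go: the constant
    names, struct fields, and error wording are assumptions, and only the
    count checks are shown.

    ```go
    package main

    import (
    	"fmt"
    	"strconv"
    	"strings"
    )

    // gpuType names a training-service accelerator type.
    type gpuType string

    const (
    	gpu1xA10  gpuType = "GPU_1xA10"
    	gpu1xH100 gpuType = "GPU_1xH100"
    	gpu8xH100 gpuType = "GPU_8xH100"
    )

    // parseGPUType matches the YAML string exactly (case-sensitive), so
    // "gpu_8xh100" and legacy names such as "h100_80gb" are rejected.
    func parseGPUType(s string) (gpuType, error) {
    	switch t := gpuType(s); t {
    	case gpu1xA10, gpu1xH100, gpu8xH100:
    		return t, nil
    	}
    	return "", fmt.Errorf("unsupported accelerator_type %q", s)
    }

    // gpusPerNode reads the per-node count from a "GPU_<n>x<model>" name.
    func gpusPerNode(t gpuType) (int, error) {
    	name := string(t)
    	if !strings.HasPrefix(name, "GPU_") {
    		return 0, fmt.Errorf("invalid accelerator type %q", name)
    	}
    	count, _, found := strings.Cut(strings.TrimPrefix(name, "GPU_"), "x")
    	n, err := strconv.Atoi(count)
    	if !found || err != nil || n <= 0 {
    		return 0, fmt.Errorf("invalid accelerator type %q", name)
    	}
    	return n, nil
    }

    // computeConfig is a hypothetical two-field view of the compute block.
    type computeConfig struct {
    	AcceleratorType string
    	NumAccelerators int
    }

    // validate checks the type, then that the count is a positive multiple
    // of the per-node count.
    func (c computeConfig) validate() error {
    	t, err := parseGPUType(c.AcceleratorType)
    	if err != nil {
    		return err
    	}
    	perNode, err := gpusPerNode(t)
    	if err != nil {
    		return err
    	}
    	if c.NumAccelerators <= 0 {
    		return fmt.Errorf("compute.num_accelerators must be positive, got %d", c.NumAccelerators)
    	}
    	if c.NumAccelerators%perNode != 0 {
    		return fmt.Errorf("compute.num_accelerators for %s must be a multiple of %d, got %d",
    			t, perNode, c.NumAccelerators)
    	}
    	return nil
    }

    func main() {
    	fmt.Println(computeConfig{"GPU_8xH100", 16}.validate())
    	fmt.Println(computeConfig{"GPU_8xH100", 3}.validate())
    	fmt.Println(computeConfig{"gpu_8xh100", 8}.validate())
    }
    ```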
    
    ## Tests
    Table-driven unit tests in compute_test.go: parseGPUType for valid types
    and rejected inputs (wrong casing, legacy types, unknown, empty);
    gpusPerNode counts plus its invalid-type error; and
    computeConfig.validate across valid configs and every failure mode
    (unknown/legacy type, non-positive count, non-multiple count, dual-pool
    conflict). go build, go test, and golangci-lint are clean.
    riddhibhagwat-db authored Jun 18, 2026
    Commit f1601b2

Commits on Jun 23, 2026

  1. AIR CLI Integration: render air get run as styled boxes (#5654)

    ## What
    
    Replaces the plain-text view of `air get run <id>` with a one-shot,
    styled terminal renderer built on **lipgloss** (layout/styling) and
    **termenv** (hyperlinks + color-profile detection). It builds the full
    string and writes it once — no streaming, spinner, or redraw.
    
    The view is two boxes:
    
    - **Configuration** — the resolved run config YAML (inline
    `yaml_parameters`, the downloaded `yaml_parameters_file_path`, or a
    synthesized fallback), colorized line by line.
    - **Metadata** — Run ID, Status, Submitted, Retries, Max Retries,
    Duration, Experiment, MLflow Run, User, Accelerators, Environment. Run
    ID and MLflow Run are OSC 8 hyperlinks.
    
    ## Look & feel
    
    - Boxes share a light-purple border/title, warm Oat neutrals, and a
    restrained accent palette (blue for keys/links; green/amber/red reserved
    for the status dot).
    - Honors `--no-color` / `NO_COLOR` / non-TTY via `termenv.Ascii`: no
    escape codes, and links degrade to the bare label (the URLs remain
    available in `-o json` as `dashboard_url` / `mlflow_url`).
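
    To make the link fallback concrete, here is a standard-library sketch of
    an OSC 8 hyperlink that degrades to the bare label. The PR itself uses
    termenv for this; the helper names below are hypothetical, and the
    sketch checks only `NO_COLOR`, not `--no-color` or TTY detection.

    ```go
    package main

    import (
    	"fmt"
    	"os"
    )

    // hyperlink wraps label in an OSC 8 escape sequence pointing at url.
    // With styling disabled (or no URL) it returns the bare label.
    func hyperlink(label, url string, styled bool) string {
    	if !styled || url == "" {
    		return label
    	}
    	// ESC ] 8 ; ; URL ESC \  label  ESC ] 8 ; ; ESC \
    	return "\x1b]8;;" + url + "\x1b\\" + label + "\x1b]8;;\x1b\\"
    }

    // colorDisabled reports whether NO_COLOR is set to a non-empty value.
    func colorDisabled() bool {
    	return os.Getenv("NO_COLOR") != ""
    }

    func main() {
    	styled := !colorDisabled()
    	fmt.Println(hyperlink("Run 123", "https://example.com/jobs/runs/123", styled))
    }
    ```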
    
    ## Scope
    
    - Sweep (foreach) runs and JSON output are unchanged.
    - `termenv` becomes a direct dependency (annotated `// MIT` in `go.mod`,
    added to `NOTICE`).
    
    ## Testing
    
    - Unit tests in `render_test.go` / `mlflow_test.go` cover the box, field
    list, link fallback, config sourcing, and the MLflow run-name fetch.
    - Acceptance output regenerated (`acceptance/experimental/air/get`).
    - `go build ./...`, `./task lint-q` (0 issues), and the air + acceptance
    suites pass.
    
    This pull request and its description were written by Isaac.
    
    ---------
    
    Co-authored-by: Maggie Wang <141875985+maggiewang-db@users.noreply.github.com>
    riddhibhagwat-db and maggiewang-db authored Jun 23, 2026
    Commit e118e67
  2. AIR CLI Integration: air list Functionality & UI (Interfacing with Training Service) (#5684)
    
    ## Changes
    Add `air list` as a browsable view of the caller's recent AIR training
    runs.
    - Data source: the `AiWorkflowService.ListTrainingWorkflows` RPC (`GET
    /api/2.0/ai-training/workflows`), called directly via `client.Do` since
    the endpoint is `PUBLIC_UNDOCUMENTED` and not modeled by the SDK. The
    server does the AIR filtering, creator scoping, MLflow-ID resolution,
    and pagination, so no Jobs-API logic lives in the CLI.
    - Interactive table: in a terminal `air list` renders an inline,
    navigable table (Bubble Tea + Lip Gloss + termenv): `↑/↓` move a row,
    `←/→` page (20 rows/page), `Enter` opens the run's MLflow page, `q`
    quits. Status is colored by state and the MLflow column is a short
    clickable hyperlink.
    - Non-interactive: piped output, an explicit `--limit`, and empty
    results print the table once; `-o json` emits the air `{v,ts,data}`
    envelope unchanged.
    - Flags: `--limit` (default: all), `--active`, `--all-users`, and
    client-side `--filter` keys (`experiment`, `accelerator_type`,
    `num_accelerators`). Gateway timeouts (e.g. HTTP 504 on `--all-users`)
    return an actionable message.
    - Adds `cmdio.IsPagerSupported`; promotes `termenv` to a direct
    dependency
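
    The 20-rows-per-page behavior boils down to a little slice arithmetic.
    The sketch below is independent of Bubble Tea; `pageCount` and
    `pageRows` are hypothetical helper names, and clamping out-of-range
    pages is an assumption about how the table behaves at the edges.

    ```go
    package main

    import "fmt"

    const pageSize = 20 // rows per page, as stated in the description

    // pageCount returns how many pages n rows need (at least one).
    func pageCount(n int) int {
    	if n <= 0 {
    		return 1
    	}
    	return (n + pageSize - 1) / pageSize
    }

    // pageRows returns the rows shown on the zero-based page index,
    // clamping out-of-range pages to the first or last page.
    func pageRows(rows []string, page int) []string {
    	if page < 0 {
    		page = 0
    	}
    	if last := pageCount(len(rows)) - 1; page > last {
    		page = last
    	}
    	start := page * pageSize
    	end := start + pageSize
    	if end > len(rows) {
    		end = len(rows)
    	}
    	return rows[start:end]
    }

    func main() {
    	rows := make([]string, 45)
    	for i := range rows {
    		rows[i] = fmt.Sprintf("run-%d", i)
    	}
    	fmt.Println(pageCount(len(rows)), len(pageRows(rows, 2)))
    }
    ```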
    
    ## Why
    The `ai-training` service now owns the AIR-specific run logic
    server-side, so `air list` should call its RPC rather than
    reimplementing run discovery against the Jobs API. The interactive table
    gives a browsable run list on par with the Python `air` CLI and
    `databricks jobs list-runs`.
    
    ## Tests
    - Unit: RPC transport, `TrainingWorkflow`→row mapping, `--filter`
    matching, status/accelerator/timestamp helpers, and the TUI model
    (navigation, paging, 20-row page cap, window scroll, quit, static
    render).
    - Acceptance: `acceptance/experimental/air/list` (text + JSON) plus
    `help` updates; `unimplemented` no longer covers `air list`
    
    
    Manual verification output: 
    <img width="1444" height="596" alt="Screenshot 2026-06-22 at 11 52
    41 AM"
    src="https://github.com/user-attachments/assets/2e4a5917-8562-44ed-bb1d-a1cb1398731c"
    />
    riddhibhagwat-db authored Jun 23, 2026
    Commit bd3f934

Commits on Jun 24, 2026

  1. AIR CLI Integration: collapse air get run back to `air get JOB_RUN_ID` (#5685)
    
    ## Why
    
    We decided to cut the `get run` sub-resource. The run-status command is
    now just `air get <id>` — flat, with no `run` subcommand.
    
    ## Changes
    
    - Removed the `get` parent group and its `run` subcommand;
    `newGetCommand` is the run-status command itself (`Use: "get
    JOB_RUN_ID"`, `ExactArgs(1)`).
    - No change to output behavior — the styled config box, `JOB_RUN_ID`
    naming, `Job Link` header, status table, and sweep view are all
    unchanged.
    - Regenerated the `experimental/air/get` and `experimental/air/help`
    acceptance outputs; updated doc comments and tests that referenced `air
    get run`.
    
    ## Tests
    
    - Added `TestGetCommandShape`: asserts `Use == "get JOB_RUN_ID"`, no
    registered subcommands, and exactly one arg required.
    - Updated the existing `get` unit tests (invalid id, not-found
    text/JSON, templates, `buildGetData`) to the new entry point.
    - `experimental/air/{get,help}` acceptance regenerated; full air unit +
    acceptance suites pass.
    
    This pull request and its description were written by Isaac.
    riddhibhagwat-db authored Jun 24, 2026
    Commit ca7c0f3

Commits on Jun 30, 2026

  1. AIR CLI Integration: Adding support for air run configuration (#5657)

    ## Changes
    Ports the air run YAML config schema and its structural validation from
    the Python CLI (cli/sdk/config.py) to Go, under experimental/air/cmd/.
    
    - Schema (runconfig.go): the top-level runConfig plus the nested
    environment (with docker_image), code_source/snapshot/git, and
    permission blocks. Reuses the compute model from the parent branch.
    Includes custom YAML unmarshalers for the three polymorphic fields that
    don't map to a single Go type: environment.dependencies (string path or
    inline list), environment.version (string or int), and git.remote (bool
    or remote-name string).
    - Loader (runconfig_load.go): loadRunConfig decodes a YAML file with
    KnownFields(true) — mirroring pydantic's extra="forbid" so unknown keys
    are rejected — then runs the validation pass.
    - Validation: every structural rule from the Python schema — required
    fields, the experiment_name/mlflow_run_name task-key regex and length
    caps, secret-ref scope/key format, the environment
    docker-image/dependencies/version exclusivity rules, git
    branch-xor-commit and remote-requires-branch rules, code_source snapshot
    requirements, and include_paths relative/no-traversal checks.
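
    A few of the structural rules listed above can be sketched with the
    standard library alone. The function names, the exact regex, and the
    error wording are assumptions; length caps and the remaining rules are
    left out because the source does not state their values.

    ```go
    package main

    import (
    	"errors"
    	"fmt"
    	"path"
    	"regexp"
    	"strings"
    )

    // nameRE allows alphanumerics, hyphens, and underscores (assumed pattern).
    var nameRE = regexp.MustCompile(`^[A-Za-z0-9_-]+$`)

    // validateExperimentName rejects names with other characters.
    func validateExperimentName(name string) error {
    	if !nameRE.MatchString(name) {
    		return fmt.Errorf("invalid experiment_name %q", name)
    	}
    	return nil
    }

    // validateSecretRef expects exactly "scope/key" with both parts non-empty.
    func validateSecretRef(ref string) error {
    	scope, key, ok := strings.Cut(ref, "/")
    	if !ok || scope == "" || key == "" || strings.Contains(key, "/") {
    		return fmt.Errorf("invalid secret reference %q: expected scope/key", ref)
    	}
    	return nil
    }

    // validateGitRef enforces branch-xor-commit: exactly one must be set.
    func validateGitRef(branch, commit string) error {
    	if (branch == "") == (commit == "") {
    		return errors.New("git: set exactly one of branch or commit")
    	}
    	return nil
    }

    // validateIncludePath requires a relative path that does not escape
    // the root through ".." segments.
    func validateIncludePath(p string) error {
    	clean := path.Clean(p)
    	if path.IsAbs(p) || clean == ".." || strings.HasPrefix(clean, "../") {
    		return fmt.Errorf("include_paths entry %q must be relative without traversal", p)
    	}
    	return nil
    }

    func main() {
    	fmt.Println(validateExperimentName("bad.name"))
    	fmt.Println(validateSecretRef("my_scope/hf_token"))
    	fmt.Println(validateGitRef("main", "abc123"))
    	fmt.Println(validateIncludePath("a/../../b"))
    }
    ```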
    
    Two deliberate divergences from the Python schema, both following from
    the training-service-only port:
    - The compute.node_pool_id / compute.pool_name fields were already
    dropped on the parent branch.
    - The top-level priority field is dropped here: it's a node-pool
    queue-ordering knob (it requires a pool in Python) with no meaning for
    serverless workloads.
    
    ## Why
    "Structural" validation (types, required fields, format/cross-field
    rules) needs no workspace access, so it's a self-contained, fully
    unit-testable unit that's worth landing on its own ahead of the launch
    logic. Splitting it out keeps the upcoming handle_run PR focused on
    orchestration rather than mixing in ~900 lines of schema.
    
    The extra="forbid" / KnownFields behavior is load-bearing: it's what
    turns a typo'd or stale config key into an actionable error instead of a
    silently-ignored field, so it's preserved faithfully. This is stacked on
    air-integration-m2-1 (the compute model).
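
    The strict-decoding idea can be shown without a YAML dependency.
    This sketch swaps in `encoding/json`, whose `DisallowUnknownFields` is
    the closest standard-library analogue of `KnownFields(true)`; it is not
    the loader from this PR, and the two-field struct is a hypothetical
    subset of the schema.

    ```go
    package main

    import (
    	"encoding/json"
    	"fmt"
    	"strings"
    )

    // runConfig is a deliberately tiny, hypothetical subset of the schema.
    type runConfig struct {
    	ExperimentName string `json:"experiment_name"`
    	Command        string `json:"command"`
    }

    // decodeStrict rejects any key that is not declared on runConfig.
    func decodeStrict(doc string) (runConfig, error) {
    	var cfg runConfig
    	dec := json.NewDecoder(strings.NewReader(doc))
    	dec.DisallowUnknownFields()
    	err := dec.Decode(&cfg)
    	return cfg, err
    }

    func main() {
    	// "priority" is not part of the struct, so decoding fails loudly
    	// instead of silently ignoring the key.
    	_, err := decodeStrict(`{"experiment_name": "air-cuj", "command": "python train.py", "priority": 5}`)
    	fmt.Println("error:", err)
    }
    ```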
    
    ## Tests
    New unit tests in runconfig_test.go (62 subtests, table-driven),
    covering:
    - Loading a minimal config and a full-featured config (all blocks
    populated).
    - Each polymorphic union decoding both of its forms (dependencies string
    vs list, git.remote bool vs string, default-unset).
    - Unknown-field rejection at top level and nested — including explicit
    cases asserting the dropped priority field and the not-yet-ported
    _bases_ key surface as errors.
    - Every validation rule's failure mode, plus file-level errors (missing
    file, empty file).
    
    go test ./experimental/air/... passes; ./task lint-q reports 0 issues.
    riddhibhagwat-db authored Jun 30, 2026
    Commit 60adcaa
  2. AIR CLI Integration: air run end to end command (#5710)

    ## Changes
    Implements the `air run` happy path on top of the config schema (#5657),
    submitting a one-time training run through the Jobs API. Five commits,
    one per phase:
    
    1. run config launch accessors: flatten the validated config into launch
    values (timeout seconds, retry default, requirements file-vs-inline,
    runtime version).
    2. wire run command (load, validate, dry-run): air run -f <config> loads
    + structurally validates the YAML; `--dry-run` validates offline (no
    workspace/auth) and returns; `--override/--watch` are rejected for now
    with clear errors (ported in future PR).
    3. pre-submit resolution: resolve current user / workspace home / a
    unique cli_launch dir, and ensure a custom `experiment_directory`
    exists.
    4. upload launch artifacts: write training_config.yaml (1 MB cap),
    command.sh, requirements.yaml (file or synthesized from inline deps),
    `env_vars.json` / `secret_env_vars.json`, and hyperparameters.yaml into
    the launch dir via a workspace filer.
    5. assemble + submit: build the native `ai_runtime_task` payload and
    `POST /api/2.2/jobs/runs/submit` directly, then print the run id +
    dashboard URL (or a JSON envelope).
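
    Two of the smaller pieces from the phases above, the launch accessors
    and the size cap on training_config.yaml, might be sketched as follows.
    All names are hypothetical, the retry default is passed in because the
    source does not state it, and treating "1 MB" as 1 MiB is an assumption.

    ```go
    package main

    import "fmt"

    // maxConfigBytes is the assumed byte value of the 1 MB cap.
    const maxConfigBytes = 1 << 20

    // timeoutSeconds converts timeout_minutes into seconds.
    func timeoutSeconds(timeoutMinutes int) int {
    	return timeoutMinutes * 60
    }

    // retriesOrDefault returns the configured max_retries, or def when the
    // field was left unset (nil).
    func retriesOrDefault(maxRetries *int, def int) int {
    	if maxRetries == nil {
    		return def
    	}
    	return *maxRetries
    }

    // checkConfigSize rejects a serialized config larger than the cap.
    func checkConfigSize(data []byte) error {
    	if len(data) > maxConfigBytes {
    		return fmt.Errorf("training_config.yaml is %d bytes, above the %d byte cap",
    			len(data), maxConfigBytes)
    	}
    	return nil
    }

    func main() {
    	two := 2
    	fmt.Println(timeoutSeconds(120), retriesOrDefault(&two, 0), retriesOrDefault(nil, 0))
    	fmt.Println(checkConfigSize(make([]byte, maxConfigBytes+1)))
    }
    ```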
    
    Submission uses the **native `ai_runtime_task`** task (the BYOT task
    type). It talks only to the Jobs API, which routes internally to the
    training service endpoint, and does no genai-mapi forwarding (the MAPI
    path is deprecated). The task isn't modeled by the typed Go SDK, so the
    payload is a custom struct posted to the raw endpoint. The proto is
    lean: env vars and secrets ship as co-located `env_vars.json` /
    `secret_env_vars.json` files rather than inline, and
    `requirements.yaml` / `hyperparameters.yaml` are derived server-side
    from the command directory.
    
    **Deferred, with explicit "not yet supported" errors (no silent
    drops):** `code_source` snapshot packaging, `--watch` log streaming, and
    `usage_policy_name`. `environment.docker_image` is accepted by the
    schema as scaffolding but not conveyed in the payload (the native path
    has no docker field). `node_pool_id` / `pool_name` / `priority` remain
    dropped (new AIR CLI does not support pool placement).
    
    ## Why
    `air run` is the core of the migration for AIR CLI. Splitting it into
    per-phase commits keeps each reviewable in isolation, and stacking on
    the schema PR keeps that PR focused. Regarding some specific decisions:
    - The native `ai_runtime_task` (not the `gen_ai_compute_task` that
    interfaces with MAPI) is kept as a hand-built struct posted to the raw
    endpoint. The typed `jobs.SubmitTask` only knows `gen_ai_compute_task`,
    and that typed struct omits the env-vars/secrets/requirements fields
    the run needs. Posting the struct directly lets us talk to Jobs and
    stay off the deprecated genai-mapi forwarding path.
    - `--dry-run` is decoupled from auth. It validates the config locally
    and returns before any workspace call, so config validation works fully
    offline (matching the Python CLI). Only actual submission requires an
    authenticated workspace client.
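
    The ordering that keeps `--dry-run` offline can be illustrated with a
    small sketch: validate, return early on dry-run, and only then build a
    client. The `submitter` interface, `runOptions`, and the messages are
    hypothetical stand-ins, not the types used in this PR.

    ```go
    package main

    import (
    	"errors"
    	"fmt"
    )

    // submitter is a hypothetical stand-in for an authenticated client.
    type submitter interface {
    	Submit(experiment string) (runID int64, err error)
    }

    // runOptions holds the inputs of the sketch.
    type runOptions struct {
    	Experiment string
    	DryRun     bool
    }

    // run validates locally, returns before any client is created when
    // DryRun is set, and otherwise authenticates and submits.
    func run(opts runOptions, newClient func() (submitter, error)) (string, error) {
    	if opts.Experiment == "" {
    		return "", errors.New("experiment_name is required")
    	}
    	if opts.DryRun {
    		return "config is valid (dry run)", nil
    	}
    	client, err := newClient()
    	if err != nil {
    		return "", fmt.Errorf("auth: %w", err)
    	}
    	id, err := client.Submit(opts.Experiment)
    	if err != nil {
    		return "", err
    	}
    	return fmt.Sprintf("submitted run %d", id), nil
    }

    // fakeClient always succeeds with a fixed run ID.
    type fakeClient struct{}

    func (fakeClient) Submit(string) (int64, error) { return 42, nil }

    func main() {
    	noAuth := func() (submitter, error) { return nil, errors.New("no credentials") }
    	fmt.Println(run(runOptions{Experiment: "air-cuj", DryRun: true}, noAuth))
    	fmt.Println(run(runOptions{Experiment: "air-cuj"}, noAuth))
    }
    ```

    Passing the client constructor as a function makes the "no workspace
    call on dry-run" property easy to assert in a unit test.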
    
    ## Tests
    - Unit tests for every phase: launch accessors, pre-submit resolution
    (incl. ensureExperimentDirectory create/exists/not-a-directory),
    artifact assembly + upload, payload assembly, and submitWorkload
    end-to-end against a fake workspace.
    - New acceptance/experimental/air/run test covering --dry-run (text +
    JSON), the --override/--watch guards, an invalid config, and missing
    --file.
    - Updated the unimplemented acceptance test (removed run, now
    implemented).
    
    `go test ./experimental/air/...`, `go test ./acceptance -run
    TestAccept/experimental/air`, and `./task lint-q` all pass.
    
    **Manual verification tests (all pass):** 
    - Dry run (offline, no auth) 
    > - command only 
    > - full run config 
    > - json output 
    
    - actual run submission 
    > - throws error when profile is not set 
    > - submission loop: the run is submitted, appears in `air list` and
    `air get`, and the MLflow experiment is created
    > - the same run ID is returned when the run is resubmitted with the
    SAME idempotency key
    > - a new run is created when the SAME config is submitted with a
    DIFFERENT idempotency key
    
    - `--watch` and `--override` return an informative error message (since
    they are not supported yet, but are valid flags)
    - usage_policy_name set in config throws error: usage_policy_name is not
    yet supported
    - code_source set in config throws error: code_source is not yet
    supported
    - missing --file throws informative error: required flag(s) "file" not
    set
    - invalid config (e.g. experiment_name: bad.name, or num_accelerators
    not a multiple of the per-node count) throws field-specific validation
    error
    
    
    **How to test locally for manual verification:**
    
    Checkout & build:
    ```bash
    git fetch origin
    git checkout air-integration-m2-3        # this PR (stacked on air-integration-m2-2)
    ./task build
    ```
    
    Sample configs:
    
    ```bash
    cat > /tmp/min.yaml <<'YAML'
    experiment_name: air-cuj
    command: python train.py
    compute: {accelerator_type: GPU_1xH100, num_accelerators: 1}
    YAML
    ```
    ```bash
    cat > /tmp/full.yaml <<'YAML'
    experiment_name: full-run
    command: |
      pip install -r requirements.txt
      python train.py
    compute: {accelerator_type: GPU_8xH100, num_accelerators: 16}
    environment: {dependencies: [torch==2.3.0], version: 5}
    env_variables: {WANDB_PROJECT: demo}
    secrets: {HF_TOKEN: my_scope/hf_token}
    parameters: {lr: 0.001, epochs: 3}
    mlflow_run_name: full-run-v2
    max_retries: 2
    timeout_minutes: 120
    YAML
    ```
    
    Automated tests
    
    ```bash
    go test ./experimental/air/...                      # unit (incl. submitWorkload vs a fake workspace)
    go test ./acceptance -run TestAccept/experimental/air   # acceptance (run + unimplemented)
    ./task lint-q                                        # lint changed files
    ```
    
    Dry run: 
    ```bash
    ./cli experimental air run -f /tmp/min.yaml --dry-run   
    # note that this command will, in the final version, be databricks experimental air run 
    ./cli experimental air run -f /tmp/full.yaml --dry-run
    ./cli experimental air run -f /tmp/min.yaml --dry-run -o json
    
    ```
    
    Actual run submission: 
    ```bash
    PROFILE=<your-dev-profile>
    
    # no auth configured → fails fast (exit 1)
    env -u DATABRICKS_HOST -u DATABRICKS_TOKEN ./cli experimental air run -f /tmp/min.yaml
    #> Error: ... (cannot configure default credentials / auth)
    
    # submit → prints run_id + dashboard URL
    ./cli experimental air run -f /tmp/min.yaml -p $PROFILE -o json
    #> { "data": { "status":"SUBMITTED", "run_id":"<id>", "dashboard_url":"<host>/jobs/runs/<id>" } }
    
    # verify in the workspace: open dashboard_url (run exists), and the MLflow experiment was created.
    ./cli experimental air get <run_id> -p $PROFILE         # run state
    ./cli experimental air list -p $PROFILE                 # run appears in the list
    
    # idempotency — SAME key returns the SAME run_id (no new run)
    ./cli experimental air run -f /tmp/min.yaml -p $PROFILE --idempotency-key demo-key-1 -o json   # run_id = X
    ./cli experimental air run -f /tmp/min.yaml -p $PROFILE --idempotency-key demo-key-1 -o json   # run_id = X (same)
    
    # idempotency — DIFFERENT key creates a NEW run
    ./cli experimental air run -f /tmp/min.yaml -p $PROFILE --idempotency-key demo-key-2 -o json   # run_id = Y (new)
    ```
    
    Unsupported flags (asserting that error is thrown): 
    ```bash
    ./cli experimental air run -f /tmp/min.yaml --dry-run --watch
    #> Error: --watch is not yet supported
    ./cli experimental air run -f /tmp/min.yaml --dry-run --override compute.num_accelerators=8
    #> Error: --override is not yet supported
    
    # usage_policy_name (needs a workspace to reach the submit guard)
    printf 'experiment_name: t\ncommand: x\ncompute: {accelerator_type: GPU_1xH100, num_accelerators: 1}\nusage_policy_name: my-policy\n' > /tmp/policy.yaml
    ./cli experimental air run -f /tmp/policy.yaml -p $PROFILE
    #> Error: usage_policy_name is not yet supported
    
    # code_source
    printf 'experiment_name: t\ncommand: x\ncompute: {accelerator_type: GPU_1xH100, num_accelerators: 1}\ncode_source: {type: snapshot, snapshot: {root_path: .}}\n' > /tmp/code.yaml
    ./cli experimental air run -f /tmp/code.yaml -p $PROFILE
    #> Error: code_source is not yet supported
    
    ```
    
    Validation errors with field-specific messages (exit 1, offline):
    ```bash
    # missing --file
    ./cli experimental air run --dry-run
    #> Error: required flag(s) "file" not set
    
    # invalid experiment_name + num_accelerators not a multiple of the per-node count
    printf 'experiment_name: bad.name\ncommand: x\ncompute: {accelerator_type: GPU_8xH100, num_accelerators: 3}\n' > /tmp/bad.yaml
    ./cli experimental air run -f /tmp/bad.yaml --dry-run
    #> Error: invalid experiment_name "bad.name": only alphanumeric characters, hyphens (-), and underscores (_) are allowed
    #  (and, once the name is fixed: compute.num_accelerators for GPU_8xH100 must be a multiple of 8, got 3)
    ```
    riddhibhagwat-db authored Jun 30, 2026
    Commit fc0ba3e