Comparing main...air-cli · databricks/cli

Commits on Jun 17, 2026

AIR CLI Integration: Scaffold experimental AIR CLI command package (#5564)

## Changes
- New `cmd/experimental/air/` package containing an `air` parent command
plus 6 stub subcommands: run, status, list, logs, cancel, register-image.
- Currently, every subcommand returns an `air <cmd> is not implemented yet`
error and carries representative flags mapped from the Python CLI. The
package is registered under the hidden experimental group.
- tools/list_embeds.py: text=True was changed to universal_newlines=True
so the acceptance harness runs on Python 3.6. General tooling fix.
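
As a rough illustration of the stub behavior described above, the sketch below models it with the standard library only. The real package is built on the CLI's command framework; `dispatch` and `stubNames` are hypothetical stand-ins, and the exact error wording in the package may differ.

```go
package main

import "fmt"

// stubNames lists the subcommands named in the description above.
var stubNames = []string{"run", "status", "list", "logs", "cancel", "register-image"}

// notImplemented builds the error a stub returns. The wording follows the
// description; the real message may differ.
func notImplemented(name string) error {
	return fmt.Errorf("air %s is not implemented yet", name)
}

// dispatch is a hypothetical stand-in for the command framework: it looks up
// a subcommand by name and runs its stub.
func dispatch(name string) error {
	for _, n := range stubNames {
		if n == name {
			return notImplemented(n)
		}
	}
	return fmt.Errorf("unknown command %q", name)
}

func main() {
	// Every stub is reachable and returns the same shape of error.
	for _, n := range stubNames {
		fmt.Println("Error:", dispatch(n))
	}
}
```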

## Why
The AI runtime CLI ships today as a separately installed Python wheel
with its own auth, output, and packaging. Folding it into the main Go
CLI gives users one databricks install with consistent profiles,
authentication, and -o json output, and removes a parallel toolchain to
maintain. Landing the package scaffold first lets the individual
commands be ported in small, reviewable PRs (status is next) instead of
one large drop. Every stub is wired and navigable, so the command tree
and registration are reviewable now without functional code.

## Tests
- Unit (cmd/experimental/air/): New() registers all six subcommands;
each stub returns the not-implemented error.
- Acceptance (acceptance/experimental/air/unimplemented/): runs every
stub end-to-end and asserts the message + non-zero exit.

test with:
`go test ./cmd/experimental/air/...`
`go test ./acceptance -run 'TestAccept/experimental/air'`

riddhibhagwat-db authored Jun 17, 2026

997f6bb

Commits on Jun 18, 2026

AIR CLI Integration: Implement the air get command (#5600)

## Changes
Implements `databricks experimental air get RUN_ID`, the Go port of the
Python `air get` command. It fetches the run via `Jobs.GetRun` and
renders:
- Core fields: run ID, status, submitted time, duration, retries,
experiment, accelerators, creator (`User`), and the run's dashboard URL.
- An MLflow deep-link, built from `jobs/runs/get-output` (the
`gen_ai_compute_output` field is not modeled by the typed SDK, so it's
fetched via a direct REST call).
- For foreach/sweep runs, an iteration summary (counts + per iteration
table) instead of the single-run view.
- The run's training-config YAML, downloaded from the workspace and
printed before the status (text mode only).

## Why
`get` is the first real command ported from the AIR CLI, and it sets
the conventions the rest of the CLI will follow. The `{v, ts, data}`
envelope mirrors the Python CLI so existing machine consumers keep
working. The implementation is a faithful port of `handle_status` + the
`cli_display` helpers, verified field-by-field against the Python
source:
- The text view shows the foreach branch
(`_display_foreach_sweep_status`) and the training-config panel
(`_fetch_and_display_yaml_config`); JSON output omits both, exactly
matching `air get <run> --json`.
- MLflow IDs live under an unmodeled `gen_ai_compute_output` field
(direct REST call), and the MLflow link and YAML fetch are best-effort,
matching the Python CLI's logic.
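
A minimal sketch of what a `{v, ts, data}` envelope could look like in Go. Only the three key names come from the description; the field types, the version value `1`, the RFC 3339 timestamp format, and the payload keys are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// envelope is a guess at the {v, ts, data} shape. Only the key names are
// taken from the description; the Go types are assumptions.
type envelope struct {
	V    int    `json:"v"`
	TS   string `json:"ts"`
	Data any    `json:"data"`
}

// wrap puts a payload into the envelope, stamping it with the given Unix
// time formatted as RFC 3339 in UTC (an assumed format).
func wrap(data any, unixSec int64) envelope {
	ts := time.Unix(unixSec, 0).UTC().Format(time.RFC3339)
	return envelope{V: 1, TS: ts, Data: data}
}

func main() {
	// Hypothetical payload; the real field set is built by buildGetData.
	payload := map[string]any{"run_id": "123", "status": "SUCCESS"}
	out, err := json.MarshalIndent(wrap(payload, time.Now().Unix()), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```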

## Tests
- Unit tests cover every formatting/extraction helper, `buildGetData`,
and all template branches (single-run minimal/all-fields, sweep,
sweep-with-no-tasks).
- Mock-backed unit tests (mirroring the Python `unittest.mock` suite)
cover `buildSweepInfo`, `printConfigYAML`, `mlflowURL` (over `httptest`,
since it bypasses the typed SDK), and the `RunE` invalid-id / not-found
branches.
- An acceptance test (`acceptance/experimental/air/get`) runs the
command end-to-end against a stubbed Jobs API: text output, `-o json`,
and an invalid run ID.


Manual verification outputs: 

Successful run: 
<img width="1529" height="74" alt="Screenshot 2026-06-17 at 1 17 30 PM"
src="https://github.com/user-attachments/assets/ee10167e-52b2-4998-98af-4e9bb169b010"
/>
<img width="1529" height="215" alt="Screenshot 2026-06-17 at 1 16 48 PM"
src="https://github.com/user-attachments/assets/888fd89e-2e5b-450e-8d45-a87afef3b005"
/>

<img width="1517" height="362" alt="Screenshot 2026-06-17 at 11 56
00 AM"
src="https://github.com/user-attachments/assets/008c90a4-f753-4646-b995-a9cbc40176fe"
/>
<img width="1529" height="295" alt="Screenshot 2026-06-17 at 2 05 21 PM"
src="https://github.com/user-attachments/assets/37da6e6c-efe9-494e-96df-dbcf392f7a17"
/>

Failed run: 
<img width="1529" height="212" alt="Screenshot 2026-06-17 at 1 11 31 PM"
src="https://github.com/user-attachments/assets/0f15bb4d-8c89-42d4-808e-b432a7f317e4"
/>
<img width="1529" height="59" alt="Screenshot 2026-06-17 at 1 13 22 PM"
src="https://github.com/user-attachments/assets/d3fa5390-9e3b-4b42-9a71-e1eb1a7d4975"
/>

<img width="1529" height="403" alt="Screenshot 2026-06-17 at 1 15 52 PM"
src="https://github.com/user-attachments/assets/b8c3eb62-1ef6-4633-9104-3e99d34340d0"
/>
<img width="1529" height="338" alt="Screenshot 2026-06-17 at 2 04 48 PM"
src="https://github.com/user-attachments/assets/1a34ce4f-025b-4139-8f0a-0f40e16bba6c"
/>

riddhibhagwat-db authored Jun 18, 2026

b952417

AIR CLI Integration: air run Command Pt. 1 - Add GPU accelerator type and compute config model (#5602)

## Changes
Adds `experimental/air/cmd/compute.go`, which holds the `gpuType` model
and the `compute` block validation that the `air run` configuration
layer depends on.
Specifically:
- Adds the training service accelerator types (`GPU_1xA10`,
`GPU_8xH100`, `GPU_1xH100`).
- `parseGPUType` resolves a YAML accelerator type string.
- `gpusPerNode` returns the per-node partition count, derived from the
type name.
- `computeConfig` and `validate()` port the Python `ComputeConfig`
validators.

## Why
This is the first, leaf-most piece of the `air run` port for the AIR CLI
and the root of the config validation layer's dependencies. The compute
model depends on nothing else, so it lands first as a small, fully
unit-tested unit.
Parsing is exact and case-sensitive, since a typo in the user's YAML
could misroute the run. Only `GPU_*` training service types are
supported; legacy MAPI types (e.g. `h100_80gb`) are intentionally
dropped in this port, so no new runs can use the MAPI path. Rendering
legacy types in `get` is unaffected, since format.go keeps its own
display map for historical runs.
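
The sketch below shows one way the pieces described above could fit together. The function and type names follow the description, but the bodies are assumptions, not the code in `compute.go`; the struct field names are hypothetical, and the dual-pool check is left out. The multiple-of error wording is taken from the manual-verification output quoted later in this log.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

type gpuType string

// The three accelerator types listed in the description.
var knownGPUTypes = map[gpuType]bool{
	"GPU_1xA10":  true,
	"GPU_8xH100": true,
	"GPU_1xH100": true,
}

// parseGPUType matches exactly and case-sensitively, so "gpu_8xh100" and
// legacy names such as "h100_80gb" are rejected.
func parseGPUType(s string) (gpuType, error) {
	t := gpuType(s)
	if !knownGPUTypes[t] {
		return "", fmt.Errorf("unsupported accelerator_type %q", s)
	}
	return t, nil
}

// gpusPerNode reads the count out of a "GPU_<n>x<model>" name.
func gpusPerNode(t gpuType) (int, error) {
	name := string(t)
	if !strings.HasPrefix(name, "GPU_") {
		return 0, fmt.Errorf("invalid accelerator type %q", name)
	}
	countStr, _, found := strings.Cut(strings.TrimPrefix(name, "GPU_"), "x")
	n, err := strconv.Atoi(countStr)
	if !found || err != nil || n <= 0 {
		return 0, fmt.Errorf("invalid accelerator type %q", name)
	}
	return n, nil
}

// computeConfig uses hypothetical field names for the compute block.
type computeConfig struct {
	AcceleratorType string
	NumAccelerators int
}

func (c computeConfig) validate() error {
	t, err := parseGPUType(c.AcceleratorType)
	if err != nil {
		return err
	}
	per, err := gpusPerNode(t)
	if err != nil {
		return err
	}
	if c.NumAccelerators <= 0 {
		return fmt.Errorf("compute.num_accelerators must be positive, got %d", c.NumAccelerators)
	}
	if c.NumAccelerators%per != 0 {
		return fmt.Errorf("compute.num_accelerators for %s must be a multiple of %d, got %d", t, per, c.NumAccelerators)
	}
	return nil
}

func main() {
	fmt.Println(computeConfig{AcceleratorType: "GPU_8xH100", NumAccelerators: 16}.validate())
	fmt.Println(computeConfig{AcceleratorType: "GPU_8xH100", NumAccelerators: 3}.validate())
}
```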

## Tests
Table-driven unit tests in compute_test.go: parseGPUType for valid types
and rejected inputs (wrong casing, legacy types, unknown, empty);
gpusPerNode counts plus its invalid-type error; and
computeConfig.validate across valid configs and every failure mode
(unknown/legacy type, non-positive count, non-multiple count, dual-pool
conflict). go build, go test, and golangci-lint are clean.

riddhibhagwat-db authored Jun 18, 2026

f1601b2

Commits on Jun 23, 2026

AIR CLI Integration: render air get run as styled boxes (#5654)

## What

Replaces the plain-text view of `air get run <id>` with a one-shot,
styled terminal renderer built on **lipgloss** (layout/styling) and
**termenv** (hyperlinks + color-profile detection). It builds the full
string and writes it once — no streaming, spinner, or redraw.

The view is two boxes:

- **Configuration** — the resolved run config YAML (inline
`yaml_parameters`, the downloaded `yaml_parameters_file_path`, or a
synthesized fallback), colorized line by line.
- **Metadata** — Run ID, Status, Submitted, Retries, Max Retries,
Duration, Experiment, MLflow Run, User, Accelerators, Environment. Run
ID and MLflow Run are OSC 8 hyperlinks.

## Look & feel

- Boxes share a light-purple border/title, warm Oat neutrals, and a
restrained accent palette (blue for keys/links; green/amber/red reserved
for the status dot).
- Honors `--no-color` / `NO_COLOR` / non-TTY via `termenv.Ascii`: no
escape codes, and links degrade to the bare label (the URLs remain
available in `-o json` as `dashboard_url` / `mlflow_url`).
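
To make the link fallback concrete, here is a standard-library sketch of an OSC 8 hyperlink that degrades to the bare label. The actual renderer relies on termenv for profile and TTY detection; `hyperlink` and `noColorRequested` are hypothetical helpers, and this sketch checks only `NO_COLOR`, not whether stdout is a terminal.

```go
package main

import (
	"fmt"
	"os"
)

// hyperlink wraps label in an OSC 8 escape sequence
// (ESC ] 8 ; ; URL ST label ESC ] 8 ; ; ST). When styling is disabled or
// there is no URL, it returns the bare label with no escape codes.
func hyperlink(label, url string, styled bool) string {
	if !styled || url == "" {
		return label
	}
	return "\x1b]8;;" + url + "\x1b\\" + label + "\x1b]8;;\x1b\\"
}

// noColorRequested reports whether NO_COLOR is set to a non-empty value.
// The lookup function is injected so the check can be exercised in tests.
func noColorRequested(lookup func(string) (string, bool)) bool {
	v, ok := lookup("NO_COLOR")
	return ok && v != ""
}

func main() {
	styled := !noColorRequested(os.LookupEnv)
	// Hypothetical run ID and URL, for illustration only.
	fmt.Println(hyperlink("12345", "https://example.com/jobs/runs/12345", styled))
}
```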

## Scope

- Sweep (foreach) runs and JSON output are unchanged.
- `termenv` becomes a direct dependency (annotated `// MIT` in `go.mod`,
added to `NOTICE`).

## Testing

- Unit tests in `render_test.go` / `mlflow_test.go` cover the box, field
list, link fallback, config sourcing, and the MLflow run-name fetch.
- Acceptance output regenerated (`acceptance/experimental/air/get`).
- `go build ./...`, `./task lint-q` (0 issues), and the air + acceptance
suites pass.

This pull request and its description were written by Isaac.

---------

Co-authored-by: Maggie Wang <141875985+maggiewang-db@users.noreply.github.com>

riddhibhagwat-db and maggiewang-db authored Jun 23, 2026

e118e67

AIR CLI Integration: air list Functionality & UI (Interfacing with Training Service) (#5684)

## Changes
Add `air list` as a browsable view of the caller's recent AIR training
runs.
- Data source: the `AiWorkflowService.ListTrainingWorkflows` RPC (`GET
/api/2.0/ai-training/workflows`), called directly via `client.Do` since
the endpoint is `PUBLIC_UNDOCUMENTED` and not modeled by the SDK. The
server does the AIR filtering, creator scoping, MLflow-ID resolution,
and pagination, so no Jobs-API logic lives in the CLI.
- Interactive table: in a terminal `air list` renders an inline,
navigable table (Bubble Tea + Lip Gloss + termenv): `↑/↓` move a row,
`←/→` page (20 rows/page), `Enter` opens the run's MLflow page, `q`
quits. Status is colored by state and the MLflow column is a short
clickable hyperlink.
- Non-interactive: piped output, an explicit `--limit`, and empty
results print the table once; `-o json` emits the air `{v,ts,data}`
envelope unchanged.
- Flags: `--limit` (default: all), `--active`, `--all-users`, and
client-side `--filter` keys (`experiment`, `accelerator_type`,
`num_accelerators`). Gateway timeouts (e.g. HTTP 504 on `--all-users`)
return an actionable message.
- Adds `cmdio.IsPagerSupported`; promotes `termenv` to a direct
dependency
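
The 20-rows-per-page behavior can be sketched as a small bounds helper. This is an assumption about how the paging arithmetic might work, not the TUI model's actual code; `pageCount` and `pageBounds` are hypothetical names.

```go
package main

import "fmt"

// pageSize is the rows-per-page figure from the description.
const pageSize = 20

// pageCount returns how many pages total rows need; an empty list still
// has one (empty) page.
func pageCount(total int) int {
	if total <= 0 {
		return 1
	}
	return (total + pageSize - 1) / pageSize
}

// pageBounds returns the half-open row range [start, end) for a zero-based
// page, clamping the page so left/right presses cannot run off either end.
func pageBounds(total, page int) (start, end int) {
	if total < 0 {
		total = 0
	}
	last := pageCount(total) - 1
	if page < 0 {
		page = 0
	}
	if page > last {
		page = last
	}
	start = page * pageSize
	end = start + pageSize
	if end > total {
		end = total
	}
	return start, end
}

func main() {
	for page := 0; page < pageCount(45); page++ {
		start, end := pageBounds(45, page)
		fmt.Printf("page %d: rows [%d, %d)\n", page, start, end)
	}
}
```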

## Why
The `ai-training` service now owns the AIR-specific run logic
server-side, so `air list` should call its RPC rather than
reimplementing run discovery against the Jobs API. The interactive table
gives a browsable run list on par with the Python `air` CLI and
`databricks jobs list-runs`.

## Tests
- Unit: RPC transport, `TrainingWorkflow`→row mapping, `--filter`
matching, status/accelerator/timestamp helpers, and the TUI model
(navigation, paging, 20-row page cap, window scroll, quit, static
render).
- Acceptance: `acceptance/experimental/air/list` (text + JSON) plus
`help` updates; `unimplemented` no longer covers `air list`


Manual verification output: 
<img width="1444" height="596" alt="Screenshot 2026-06-22 at 11 52
41 AM"
src="https://github.com/user-attachments/assets/2e4a5917-8562-44ed-bb1d-a1cb1398731c"
/>

riddhibhagwat-db authored Jun 23, 2026

bd3f934

Commits on Jun 24, 2026

AIR CLI Integration: collapse air get run back to `air get JOB_RUN_ID` (#5685)

## Why

We decided to cut the `get run` sub-resource. The run-status command is
now just `air get <id>` — flat, with no `run` subcommand.

## Changes

- Removed the `get` parent group and its `run` subcommand;
`newGetCommand` is the run-status command itself (`Use: "get
JOB_RUN_ID"`, `ExactArgs(1)`).
- No change to output behavior — the styled config box, `JOB_RUN_ID`
naming, `Job Link` header, status table, and sweep view are all
unchanged.
- Regenerated the `experimental/air/get` and `experimental/air/help`
acceptance outputs; updated doc comments and tests that referenced `air
get run`.

## Tests

- Added `TestGetCommandShape`: asserts `Use == "get JOB_RUN_ID"`, no
registered subcommands, and exactly one arg required.
- Updated the existing `get` unit tests (invalid id, not-found
text/JSON, templates, `buildGetData`) to the new entry point.
- `experimental/air/{get,help}` acceptance regenerated; full air unit +
acceptance suites pass.

This pull request and its description were written by Isaac.

riddhibhagwat-db authored Jun 24, 2026

ca7c0f3

Commits on Jun 30, 2026

AIR CLI Integration: Adding support for air run configuration (#5657)

## Changes
Ports the air run YAML config schema and its structural validation from
the Python CLI (cli/sdk/config.py) to Go, under experimental/air/cmd/.

- Schema (runconfig.go): the top-level runConfig plus the nested
environment (with docker_image), code_source/snapshot/git, and
permission blocks. Reuses the compute model from the parent branch.
Includes custom YAML unmarshalers for the three polymorphic fields that
don't map to a single Go type: environment.dependencies (string path or
inline list), environment.version (string or int), and git.remote (bool
or remote-name string).
- Loader (runconfig_load.go): loadRunConfig decodes a YAML file with
KnownFields(true) — mirroring pydantic's extra="forbid" so unknown keys
are rejected — then runs the validation pass.
- Validation: every structural rule from the Python schema — required
fields, the experiment_name/mlflow_run_name task-key regex and length
caps, secret-ref scope/key format, the environment
docker-image/dependencies/version exclusivity rules, git
branch-xor-commit and remote-requires-branch rules, code_source snapshot
requirements, and include_paths relative/no-traversal checks.
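
A few of the structural rules above can be sketched with the standard library. These are illustrative approximations, not the code in `runconfig.go`: the name pattern is inferred from the validation error quoted in the manual-verification section, length caps are omitted because their values are not stated here, "branch-xor-commit" is read as "exactly one of the two", and the traversal check rejects only paths that escape the root. All helper names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"path"
	"regexp"
	"strings"
)

// nameRE allows alphanumerics, hyphens, and underscores.
var nameRE = regexp.MustCompile(`^[A-Za-z0-9_-]+$`)

// validateName checks a task-key style field such as experiment_name.
func validateName(field, v string) error {
	if !nameRE.MatchString(v) {
		return fmt.Errorf("invalid %s %q: only alphanumeric characters, hyphens (-), and underscores (_) are allowed", field, v)
	}
	return nil
}

// validateSecretRef expects "scope/key" with both parts non-empty.
func validateSecretRef(ref string) error {
	scope, key, ok := strings.Cut(ref, "/")
	if !ok || scope == "" || key == "" || strings.Contains(key, "/") {
		return fmt.Errorf("invalid secret reference %q: expected scope/key", ref)
	}
	return nil
}

// validateGitRef requires exactly one of branch or commit.
func validateGitRef(branch, commit string) error {
	if (branch == "") == (commit == "") {
		return errors.New("git: set exactly one of branch or commit")
	}
	return nil
}

// validateIncludePath rejects absolute paths and paths that climb out of
// the root through "..".
func validateIncludePath(p string) error {
	if p == "" || path.IsAbs(p) {
		return fmt.Errorf("include path %q must be relative", p)
	}
	clean := path.Clean(p)
	if clean == ".." || strings.HasPrefix(clean, "../") {
		return fmt.Errorf("include path %q must not traverse outside the root", p)
	}
	return nil
}

func main() {
	fmt.Println(validateName("experiment_name", "bad.name"))
	fmt.Println(validateSecretRef("my_scope/hf_token"))
	fmt.Println(validateGitRef("main", "abc123"))
	fmt.Println(validateIncludePath("../secrets"))
}
```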

Two deliberate divergences from the Python schema, both following from
the training-service-only port:
- The compute.node_pool_id / compute.pool_name fields were already
dropped on the parent branch.
- The top-level priority field is dropped here: it's a node-pool
queue-ordering knob (it requires a pool in Python) with no meaning for
serverless workloads.

## Why
"Structural" validation (types, required fields, format/cross-field
rules) needs no workspace access, so it's a self-contained, fully
unit-testable unit that's worth landing on its own ahead of the launch
logic. Splitting it out keeps the upcoming handle_run PR focused on
orchestration rather than mixing in ~900 lines of schema.

The extra="forbid" / KnownFields behavior is load-bearing: it's what
turns a typo'd or stale config key into an actionable error instead of a
silently-ignored field, so it's preserved faithfully. This is stacked on
air-integration-m2-1 (the compute model).
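
The effect of strict decoding can be shown with a standard-library analogue. The loader itself decodes YAML with `KnownFields(true)`; since the Go standard library has no YAML decoder, this sketch swaps in `encoding/json` with `DisallowUnknownFields`, which behaves similarly for unknown keys. The two-field `runConfig` here is a hypothetical cut-down struct.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// runConfig is a cut-down, hypothetical stand-in for the real schema.
type runConfig struct {
	ExperimentName string `json:"experiment_name"`
	Command        string `json:"command"`
}

// decodeStrict rejects any key that does not map to a struct field, so a
// typo'd or dropped key becomes an error instead of being ignored.
func decodeStrict(src string) (runConfig, error) {
	var cfg runConfig
	dec := json.NewDecoder(strings.NewReader(src))
	dec.DisallowUnknownFields()
	if err := dec.Decode(&cfg); err != nil {
		return runConfig{}, err
	}
	return cfg, nil
}

func main() {
	// "priority" is not a field of runConfig, so decoding fails.
	_, err := decodeStrict(`{"experiment_name":"t","command":"x","priority":5}`)
	fmt.Println("with unknown key:", err)

	cfg, err := decodeStrict(`{"experiment_name":"t","command":"x"}`)
	fmt.Println("known keys only:", cfg, err)
}
```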

## Tests
New unit tests in runconfig_test.go (62 subtests, table-driven),
covering:
- Loading a minimal config and a full-featured config (all blocks
populated).
- Each polymorphic union decoding both of its forms (dependencies string
vs list, git.remote bool vs string, default-unset).
- Unknown-field rejection at top level and nested — including explicit
cases asserting the dropped priority field and the not-yet-ported
_bases_ key surface as errors.
- Every validation rule's failure mode, plus file-level errors (missing
file, empty file).

go test ./experimental/air/... passes; ./task lint-q reports 0 issues.

riddhibhagwat-db authored Jun 30, 2026

60adcaa

AIR CLI Integration: air run end to end command (#5710)

## Changes
Implements the `air run` happy path on top of the config schema (#5657),
submitting a one-time training run through the Jobs API. Five commits,
one per phase:

1. run config launch accessors: flatten the validated config into launch
values (timeout seconds, retry default, requirements file-vs-inline,
runtime version).
2. wire run command (load, validate, dry-run): air run -f <config> loads
+ structurally validates the YAML; `--dry-run` validates offline (no
workspace/auth) and returns; `--override` and `--watch` are rejected for
now with clear errors (to be ported in a future PR).
3. pre-submit resolution: resolve current user / workspace home / a
unique cli_launch dir, and ensure a custom `experiment_directory`
exists.
4. upload launch artifacts: write training_config.yaml (1 MB cap),
command.sh, requirements.yaml (file or synthesized from inline deps),
`env_vars.json` / `secret_env_vars.json`, and hyperparameters.yaml into
the launch dir via a workspace filer.
5. assemble + submit: build the native `ai_runtime_task` payload and
`POST /api/2.2/jobs/runs/submit` directly, then print the run id +
dashboard URL (or a JSON envelope).

Submission uses the **native `ai_runtime_task`** task (BYOT task type).
It talks only to the Jobs API, which internally routes to the training
service endpoint, and has no genai-mapi forwarding (the MAPI path is
deprecated). The task isn't modeled by the typed Go SDK, so the payload
is a custom struct posted to the raw endpoint. The proto is lean: env
vars and secrets ship as co-located `env_vars.json` /
`secret_env_vars.json` files rather than inline, and `requirements.yaml` /
`hyperparameters.yaml` are derived server-side from the command
directory.

**Deferred, with explicit "not yet supported" errors (no silent
drops):** `code_source` snapshot packaging, `--watch` log streaming, and
`usage_policy_name`. `environment.docker_image` is accepted by the
schema as scaffolding but not conveyed in the payload (the native path
has no docker field). `node_pool_id` / `pool_name` / `priority` remain
dropped (new AIR CLI does not support pool placement).

## Why
`air run` is the core of the migration for AIR CLI. Splitting it into
per-phase commits keeps each reviewable in isolation, and stacking on
the schema PR keeps that PR focused. Regarding some specific decisions:
- We build the native `ai_runtime_task` (not the `gen_ai_compute_task`
that interfaces with MAPI) as a hand-built struct posted to the raw
endpoint. `jobs.SubmitTask` only knows `gen_ai_compute_task`, and that
typed struct also omits the env-vars/secrets/requirements fields the run
needs. Posting the raw payload lets us talk to Jobs directly and stay
off the deprecated genai-mapi forwarding path.
- `--dry-run` is decoupled from auth. It validates the config locally
and returns before any workspace call, so config validation works fully
offline (matching the Python CLI). Only actual submission requires an
authenticated workspace client.
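
The ordering that keeps `--dry-run` offline can be sketched as below. This is an illustration of the control flow only, under the assumption that validation, client construction, and submission are separable steps; the `steps` struct and `run` function are hypothetical and not the command's real wiring.

```go
package main

import (
	"errors"
	"fmt"
)

// steps bundles the phases as injectable functions so the ordering can be
// exercised without a workspace. All names here are hypothetical.
type steps struct {
	validate  func(path string) error
	newClient func() error
	submit    func(path string) (string, error)
}

// run validates first and returns early on dry-run, so the workspace client
// is only built when a submission will actually happen.
func run(path string, dryRun bool, s steps) (string, error) {
	if path == "" {
		return "", errors.New(`required flag(s) "file" not set`)
	}
	if err := s.validate(path); err != nil {
		return "", err
	}
	if dryRun {
		return "", nil // no auth, no network
	}
	if err := s.newClient(); err != nil {
		return "", err
	}
	return s.submit(path)
}

func main() {
	s := steps{
		validate:  func(string) error { return nil },
		newClient: func() error { return errors.New("cannot configure default credentials") },
		submit:    func(string) (string, error) { return "", nil },
	}
	_, err := run("/tmp/min.yaml", true, s)
	fmt.Println("dry run error:", err)
	_, err = run("/tmp/min.yaml", false, s)
	fmt.Println("submit error:", err)
}
```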

## Tests
- Unit tests for every phase: launch accessors, pre-submit resolution
(incl. ensureExperimentDirectory create/exists/not-a-directory),
artifact assembly + upload, payload assembly, and submitWorkload
end-to-end against a fake workspace.
- New acceptance/experimental/air/run test covering --dry-run (text +
JSON), the --override/--watch guards, an invalid config, and missing
--file.
- Updated the unimplemented acceptance test (removed run, now
implemented).

`go test ./experimental/air/...`, `go test ./acceptance -run
TestAccept/experimental/air`, and `./task lint-q` all pass.

**Manual verification tests (all pass):** 
- Dry run (offline, no auth) 
> - command only 
> - full run config 
> - json output 

- actual run submission 
> - throws error when profile is not set 
> - submission loop: submitted, can see the run in `air list` and `air
get` and mlflow environment was created
> - the same run id is returned when the run is submitted with the SAME
idempotency key
> - new run gets created when run submitted with SAME config but
DIFFERENT idempotency key

- `--watch` and `--override` return an informative error message (since
they are not supported yet, but are valid flags)
- usage_policy_name set in config throws error: usage_policy_name is not
yet supported
- code_source set in config throws error: code_source is not yet
supported
- missing --file throws informative error: required flag(s) "file" not
set
- invalid config (e.g. experiment_name: bad.name, or num_accelerators
not a multiple of the per-node count) throws field-specific validation
error


**How to test locally for manual verification:**

Checkout & build:
```bash
git fetch origin
git checkout air-integration-m2-3        # this PR (stacked on air-integration-m2-2)
./task build
```

Sample configs:

```bash
cat > /tmp/min.yaml <<'YAML'
experiment_name: air-cuj
command: python train.py
compute: {accelerator_type: GPU_1xH100, num_accelerators: 1}
YAML
```
```bash
cat > /tmp/full.yaml <<'YAML'
experiment_name: full-run
command: |
  pip install -r requirements.txt
  python train.py
compute: {accelerator_type: GPU_8xH100, num_accelerators: 16}
environment: {dependencies: [torch==2.3.0], version: 5}
env_variables: {WANDB_PROJECT: demo}
secrets: {HF_TOKEN: my_scope/hf_token}
parameters: {lr: 0.001, epochs: 3}
mlflow_run_name: full-run-v2
max_retries: 2
timeout_minutes: 120
YAML
```

Automated tests

```bash
go test ./experimental/air/...                      # unit (incl. submitWorkload vs a fake workspace)
go test ./acceptance -run TestAccept/experimental/air   # acceptance (run + unimplemented)
./task lint-q                                        # lint changed files
```

Dry run: 
```bash
./cli experimental air run -f /tmp/min.yaml --dry-run   
# note that this command will, in the final version, be databricks experimental air run 
./cli experimental air run -f /tmp/full.yaml --dry-run
./cli experimental air run -f /tmp/min.yaml --dry-run -o json

```

Actual run submission: 
```bash
PROFILE=<your-dev-profile>

# no auth configured → fails fast (exit 1)
env -u DATABRICKS_HOST -u DATABRICKS_TOKEN ./cli experimental air run -f /tmp/min.yaml
#> Error: ... (cannot configure default credentials / auth)

# submit → prints run_id + dashboard URL
./cli experimental air run -f /tmp/min.yaml -p $PROFILE -o json
#> { "data": { "status":"SUBMITTED", "run_id":"<id>", "dashboard_url":"<host>/jobs/runs/<id>" } }

# verify in the workspace: open dashboard_url (run exists), and the MLflow experiment was created.
./cli experimental air get <run_id> -p $PROFILE         # run state
./cli experimental air list -p $PROFILE                 # run appears in the list

# idempotency — SAME key returns the SAME run_id (no new run)
./cli experimental air run -f /tmp/min.yaml -p $PROFILE --idempotency-key demo-key-1 -o json   # run_id = X
./cli experimental air run -f /tmp/min.yaml -p $PROFILE --idempotency-key demo-key-1 -o json   # run_id = X (same)

# idempotency — DIFFERENT key creates a NEW run
./cli experimental air run -f /tmp/min.yaml -p $PROFILE --idempotency-key demo-key-2 -o json   # run_id = Y (new)
```

Unsupported flags (asserting that error is thrown): 
```bash
./cli experimental air run -f /tmp/min.yaml --dry-run --watch
#> Error: --watch is not yet supported
./cli experimental air run -f /tmp/min.yaml --dry-run --override compute.num_accelerators=8
#> Error: --override is not yet supported

# usage_policy_name (needs a workspace to reach the submit guard)
printf 'experiment_name: t\ncommand: x\ncompute: {accelerator_type: GPU_1xH100, num_accelerators: 1}\nusage_policy_name: my-policy\n' > /tmp/policy.yaml
./cli experimental air run -f /tmp/policy.yaml -p $PROFILE
#> Error: usage_policy_name is not yet supported

# code_source
printf 'experiment_name: t\ncommand: x\ncompute: {accelerator_type: GPU_1xH100, num_accelerators: 1}\ncode_source: {type: snapshot, snapshot: {root_path: .}}\n' > /tmp/code.yaml
./cli experimental air run -f /tmp/code.yaml -p $PROFILE
#> Error: code_source is not yet supported

```

Validation errors for field-specific message (exit 1, offline):
```bash
# missing --file
./cli experimental air run --dry-run
#> Error: required flag(s) "file" not set

# invalid experiment_name + num_accelerators not a multiple of the per-node count
printf 'experiment_name: bad.name\ncommand: x\ncompute: {accelerator_type: GPU_8xH100, num_accelerators: 3}\n' > /tmp/bad.yaml
./cli experimental air run -f /tmp/bad.yaml --dry-run
#> Error: invalid experiment_name "bad.name": only alphanumeric characters, hyphens (-), and underscores (_) are allowed
#  (and, once the name is fixed: compute.num_accelerators for GPU_8xH100 must be a multiple of 8, got 3)
```

riddhibhagwat-db authored Jun 30, 2026

fc0ba3e
