Skip to content

Conversation

@win5923
Copy link
Member

@win5923 win5923 commented Jan 31, 2026

Description

As title.

Context:

  • Push-based shuffle isn't used by default, and we haven't touched the code in years.
  • Ray Data has premerge (must pass before you can land a PR) and postmerge CI. To make it easier to ship, we can move unimportant tests to postmerge.

Related issues

Closes #60417

Additional information

Optional: Add implementation details, API changes, usage examples, screenshots, etc.

@win5923 win5923 requested a review from a team as a code owner January 31, 2026 11:17
@win5923 win5923 requested a review from aslonnie January 31, 2026 11:18
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request moves the test_push_based_shuffle test to a post-merge CI job, likely to improve pre-merge CI speed. The changes correctly tag the test in the Bazel file and update the CI configuration to run it in new post-merge jobs. My review includes a suggestion to refactor the new CI jobs to improve maintainability by reducing code duplication.

Comment on lines +196 to +217
- label: ":database: data: postmerge arrow v9 tests"
tags:
- data
- skip-on-premerge
instance_type: medium
commands:
- bazel run //ci/ray_ci:test_in_docker -- //python/ray/data/... //python/ray/air/... data
--build-name data9build-py3.10 --python-version 3.10
--only-tags skip-on-premerge
depends_on: data9build-multipy

- label: ":database: data: postmerge arrow v23 tests"
tags:
- data
- skip-on-premerge
instance_type: medium
commands:
- bazel run //ci/ray_ci:test_in_docker -- //python/ray/data/... //python/ray/air/... data
--build-name datalbuild-py3.10 --python-version 3.10
--only-tags skip-on-premerge
depends_on: datalbuild-multipy

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

These two new post-merge test steps are nearly identical. To improve maintainability and reduce code duplication, you can combine them into a single matrix step. This is consistent with how other tests are configured in this file.

- label: ":database: data: postmerge arrow {{matrix.arrow_version}} tests"
  matrix:
    setup:
      arrow_version:
        - "v9"
        - "v23"
    adjustments:
      - with:
          arrow_version: "v9"
        build_name: "data9build-py3.10"
        depends_on: "data9build-multipy"
      - with:
          arrow_version: "v23"
        build_name: "datalbuild-py3.10"
        depends_on: "datalbuild-multipy"
  tags:
    - data
    - skip-on-premerge
  instance_type: medium
  commands:
    - >
      bazel run //ci/ray_ci:test_in_docker -- //python/ray/data/... //python/ray/air/... data
      --build-name {{matrix.build_name}} --python-version 3.10
      --only-tags skip-on-premerge
  depends_on: "{{matrix.depends_on}}"
Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

@win5923 win5923 force-pushed the test_push_based_shuffle branch from 33e7846 to ca41fb0 Compare January 31, 2026 11:54
Signed-off-by: win5923 <ken89@kimo.com>
@win5923 win5923 force-pushed the test_push_based_shuffle branch from ca41fb0 to ee4864a Compare January 31, 2026 12:14
@ray-gardener ray-gardener bot added data Ray Data-related issues devprod community-contribution Contributed by the community labels Jan 31, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

community-contribution Contributed by the community data Ray Data-related issues devprod

1 participant