-
Notifications
You must be signed in to change notification settings - Fork 7.2k
[Data][Test] Fix flaky test_map_fusion_with_concurrency_arg due to non-deterministic ordering #60637
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
[Data][Test] Fix flaky test_map_fusion_with_concurrency_arg due to non-deterministic ordering #60637
Conversation
…n-deterministic ordering The test `test_map_fusion_with_concurrency_arg` fails intermittently when concurrency=2 is used with 2 blocks. The two blocks are processed concurrently, and there's no guarantee about the order they complete. Block 2 (values 5-9) can complete before block 1 (values 0-4), resulting in [5,6,7,8,9,0,1,2,3,4] instead of [0,1,2,3,4,5,6,7,8,9]. Since this test validates operator fusion behavior rather than output ordering, sort the results before comparison to make it robust against concurrent execution order variations. Signed-off-by: slfan1989 <slfan1989@apache.org>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request fixes a flaky test test_map_fusion_with_concurrency_arg by sorting the output before making an assertion. This correctly handles the non-deterministic block ordering from concurrent execution. My review includes a suggestion to use a set comparison instead of sorted, which would align with the pattern used in other tests in this file for order-independent comparisons.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: slfan1989 <55643692+slfan1989@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
| down_name = "Map(Map)" | ||
|
|
||
| assert extract_values("id", ds.take_all()) == list(range(10)) | ||
| assert set(extract_values("id", ds.take_all())) == set(range(10)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using set instead of sorted loses duplicate detection
Low Severity
The test uses set() for comparison instead of sorted() as stated in the PR description. Using set() cannot detect duplicate values in the output - if concurrency bugs caused the same item to be processed twice (e.g., returning 11 values with one duplicate), the set() comparison would still pass while sorted() would correctly fail. This weakens the test's ability to catch real concurrency-related bugs in the map fusion logic.


Description
Fix a flaky test in
test_operator_fusion.pywheretest_map_fusion_with_concurrency_argfails intermittently due to non-deterministic block ordering when using concurrency=2.The test fails when the two data blocks (containing values 0-4 and 5-9) are processed concurrently. Due to race conditions, block 2 may complete before block 1, resulting in output order [5, 6, 7, 8, 9, 0, 1, 2, 3, 4] instead of the expected [0, 1, 2, 3, 4, 5, 6, 7, 8, 9].
Since this test validates operator fusion behavior rather than output ordering, this PR sorts the results before comparison to make the test robust against concurrent execution order variations.
Related issues
Related to flaky test failures in CI when running test_map_fusion_with_concurrency_arg with parameterized concurrency configurations.
https://buildkite.com/ray-project/microcheck/builds/37063/steps/canvas?sid=019c130a-0406-436a-b12f-1f4cb5842e19
Additional information
Test failure example:
Change:
The test still validates the core functionality (operator fusion logic with different concurrency settings) while being resilient to non-deterministic execution ordering.