@machichima
Contributor

Description

When backfilling missing fields in a struct column, try casting a field's array to the unified field type if the field types mismatch.

Related issues

Closes #60628

Signed-off-by: machichima <nary12321@gmail.com>
@machichima requested a review from a team as a code owner on February 1, 2026, 11:07
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request addresses an ArrowInvalid error that occurs when backfilling missing fields in struct columns, particularly from map tasks. The issue arises from type mismatches between struct fields in different blocks (e.g., int64 vs. float64) that are not handled after schema unification. The proposed change correctly identifies these type discrepancies and explicitly casts the array to the unified field type. The implementation is clean, includes robust error handling with an informative ValueError, and effectively resolves the bug. The changes look good to me.

Signed-off-by: machichima <nary12321@gmail.com>
@machichima
Contributor Author

@bveeramani PTAL, thank you!

@ray-gardener (bot) added the `data` (Ray Data-related issues) and `community-contribution` (Contributed by the community) labels on Feb 1, 2026