Support native advanced queries in LightningStore #318
Conversation
Pull Request Overview
This PR introduces native support for advanced queries in LightningStore, enabling filtering, sorting, and pagination across rollouts, attempts, spans, workers, and resources. The changes modernize the query interface while maintaining backward compatibility through deprecated parameter aliases.
Key changes:
- Introduced `PaginatedResult`, a type that behaves like a sequence while carrying pagination metadata (`limit`, `offset`, `total`)
- Added comprehensive filtering with `_must` groups and AND/OR logic combinators
- Extended all query methods with `sort_by`, `sort_order`, `limit`, and `offset` parameters
- Refactored adapters to accept `Sequence[Span]` instead of `List[Span]` for better type flexibility
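For orientation, the sketch below shows how a call against the reworked query surface might look. It is a hypothetical usage example: `query_rollouts`, `status_in`, `sort_by`, `sort_order`, `limit`, and `offset` are named in this PR and its review comments, but the concrete field names, status values, and the exact signature are assumptions.

```python
# Hypothetical usage sketch: field names ("start_time") and status values
# ("running", "queued") are assumptions, not taken from the PR.
result = await store.query_rollouts(
    status_in=["running", "queued"],  # replaces the deprecated `status`
    sort_by="start_time",             # assumed sortable field
    sort_order="desc",
    limit=20,
    offset=0,
)

# PaginatedResult behaves like a sequence while carrying pagination metadata.
for rollout in result:
    print(rollout)
print(result.limit, result.offset, result.total)
```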
Reviewed Changes
Copilot reviewed 27 out of 27 changed files in this pull request and generated 15 comments.
| File | Description |
|---|---|
| agentlightning/types/core.py | Defines new PaginatedResult, FilterOptions, SortOptions, and FilterField types for query interface |
| agentlightning/store/base.py | Updates LightningStore interface with expanded query method signatures |
| agentlightning/store/collection_based.py | Implements filtering/sorting/pagination using collection primitives, adds deprecation warnings |
| agentlightning/store/collection/memory.py | Refactors filter matching logic to support _must groups, moves to module-level functions |
| agentlightning/store/collection/base.py | Moves type definitions to core.py for broader visibility |
| agentlightning/store/client_server.py | Updates server routes and client methods to handle new query parameters |
| agentlightning/store/threading.py | Pass-through wrapper updated for new signatures |
| agentlightning/store/memory.py | Adapts get_running_rollouts to convert set to list for within filter |
| agentlightning/adapter/ | Updates adapters to accept Sequence[Span] for broader compatibility |
| tests/store/ | Comprehensive test coverage for new filtering, sorting, pagination features |
| docs/ | Updates documentation to reflect new query capabilities |
| examples/ | Migrates from deprecated status/rollout_ids to status_in/rollout_id_in |
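To make the `examples/` row above concrete, here is a hedged before/after sketch of the parameter migration it describes; the call site, the `query_rollouts` method, and the argument values are illustrative assumptions rather than code taken from the examples.

```python
# Before: deprecated keyword arguments (still accepted, but emit a DeprecationWarning).
rollouts = await store.query_rollouts(status=["running"], rollout_ids=["r-1", "r-2"])

# After: the *_in spellings introduced by this PR.
rollouts = await store.query_rollouts(status_in=["running"], rollout_id_in=["r-1", "r-2"])
```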
agentlightning/types/core.py (Outdated)
```python
def __repr__(self) -> str:
    first_item_repr = repr(self.items[0]) if self.items else "empty"
    items_repr = f"[{first_item_repr}, ...]" if len(self.items) > 1 else first_item_repr
    slice_repr = f"{self.offset}:{self.offset + self.limit}" if self.limit >= 0 else f"{self.offset}:"
```
Copilot AI (Nov 17, 2025)
[nitpick] The __repr__ implementation relies on `self.limit >= 0` to detect the bounded case, which obscures the intent for the unlimited path. When limit is -1 (unlimited), the slice f"{self.offset}:" is displayed correctly, but the condition would read more clearly as self.limit == -1 instead of self.limit >= 0.
For example, the current code shows (1: of 5) when limit=-1 and offset=1, which correctly represents [1:], i.e. "from offset 1 to end"; the representation is technically correct but the condition could be clearer.
Suggested change:

```diff
-slice_repr = f"{self.offset}:{self.offset + self.limit}" if self.limit >= 0 else f"{self.offset}:"
+slice_repr = f"{self.offset}:" if self.limit == -1 else f"{self.offset}:{self.offset + self.limit}"
```
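For reference, a minimal, self-contained sketch of a `PaginatedResult`-like class with the suggested condition applied, so the `(1: of 5)` example above can be reproduced. The real class is more general and its exact `__repr__` template is not shown in this diff, so the class name and output format here are assumptions.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class PaginatedResultSketch:  # hypothetical stand-in for the real PaginatedResult
    items: List[Any]
    limit: int  # -1 means "no limit"
    offset: int
    total: int

    def __repr__(self) -> str:
        first_item_repr = repr(self.items[0]) if self.items else "empty"
        items_repr = f"[{first_item_repr}, ...]" if len(self.items) > 1 else first_item_repr
        # Treat -1 explicitly as the "unlimited" sentinel instead of relying on `limit >= 0`.
        slice_repr = f"{self.offset}:" if self.limit == -1 else f"{self.offset}:{self.offset + self.limit}"
        return f"PaginatedResult({items_repr}, {slice_repr} of {self.total})"


# PaginatedResultSketch(items=["b", "c", "d", "e"], limit=-1, offset=1, total=5)
# -> PaginatedResult(['b', ...], 1: of 5)
```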
```python
FilterOptions = Mapping[
    Union[str, Literal["_aggregate", "_must"]],
    Union[FilterField, Literal["and", "or"], Mapping[str, FilterField]],
]
"""A mapping of field name -> operator dict.

Each operator dict can contain:

- "exact": value for exact equality.
- "within": iterable of allowed values.
- "contains": substring to search for in string fields.

The filter can also have a special field called "_aggregate" that can be used to specify the logic
to combine the results of the filters:

- "and": all conditions must match. This is the default value if not specified.
- "or": at least one condition must match.

All conditions within a field and between different fields are
stored in a unified pool and combined using `_aggregate`.
"""
```
Copilot AI (Nov 17, 2025)
The FilterOptions type uses Union[FilterField, Literal["and", "or"], Mapping[str, FilterField]] for values, but this allows _aggregate to be mapped to a FilterField or Mapping[str, FilterField], which doesn't make semantic sense. The _aggregate key should only map to Literal["and", "or"], and _must should only map to Mapping[str, FilterField] or a sequence of such mappings.
Consider using a more precise type definition with separate TypedDict for the structure, or add validation to catch incorrect usage at runtime.
Suggested change (replace the `FilterOptions` alias and docstring above with a `TypedDict`):

```python
class FilterOptions(TypedDict, total=False):
    """A mapping of field name -> operator dict, with special keys for aggregation and must conditions.

    Each field name maps to a FilterField.

    Special keys:
    - "_aggregate": specifies the logic to combine the results of the filters ("and" or "or").
    - "_must": a mapping or sequence of mappings for required conditions.
    """

    _aggregate: Literal["and", "or"]
    _must: Union[Mapping[str, FilterField], Sequence[Mapping[str, FilterField]]]
    # Other fields map to FilterField.
    # Use NotRequired for optional fields (Python 3.11+); otherwise total=False covers it.
    # Example: field_name: FilterField
    # For Python <3.11, we can't use NotRequired, so total=False is used above.
```
Let's do this in the future.
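For readers following the `FilterOptions` discussion above, here is a hedged example of a filter mapping that fits the current alias and the documented operators. The field names are illustrative, and the reading of `_must` as a group of always-required conditions is inferred from the PR overview rather than stated in the snippet.

```python
# Illustrative only: "status", "worker_id", and "rollout_id" are example field names.
filters = {
    "_aggregate": "or",                           # at least one pooled condition must match
    "status": {"within": ["running", "failed"]},  # membership filter
    "worker_id": {"contains": "gpu"},             # substring filter
    # Inferred semantics: conditions under "_must" are required regardless of "_aggregate".
    "_must": {"rollout_id": {"exact": "r-42"}},
}
```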
/ci
🚀 CI Watcher for correlation id-3543099762-mi3fhx1d triggered by comment 3543099762
✅ All runs completed.
Pull Request Overview
Copilot reviewed 28 out of 28 changed files in this pull request and generated 4 comments.
```python
    resolved_rollout_ids = rollout_id_in
elif rollout_ids is not None:
    warnings.warn("rollout_ids is deprecated, use rollout_id_in instead", DeprecationWarning, stacklevel=3)
    resolved_rollout_ids = rollout_ids
```
Copilot AI (Nov 18, 2025)
The deprecation warning for rollout_ids is missing the stacklevel parameter. It should be stacklevel=3 to match the pattern used for the status parameter on line 409, ensuring the warning points to the caller's code rather than this implementation.
| filters["status"] = {"within": list(status_in)} | ||
| if worker_id_contains is not None: | ||
| filters["worker_id"] = {"contains": worker_id_contains} | ||
| filters["_aggregate"] = filter_logic |
Copilot AI (Nov 18, 2025)
The filter_logic is always added to filters, even when no actual filter conditions are present. This differs from query_rollouts (line 423-424) which only adds _aggregate when filter_logic == 'or'. This inconsistency could cause unexpected behavior if the collection implementation treats an empty filter dict with _aggregate differently than a truly empty filter dict.
| filters["_aggregate"] = filter_logic | |
| if filters: | |
| filters["_aggregate"] = filter_logic |
```python
return await self.collections.workers.query(
    filter=filters or None,
```
Copilot AI (Nov 18, 2025)
The expression filters or None will evaluate to None when filters is {'_aggregate': 'and'}, but this still represents a filter with aggregate logic. This could lead to incorrect behavior. Consider checking if not filters or filters == {'_aggregate': 'and'} or only setting _aggregate when there are actual filter conditions, similar to the pattern in query_rollouts.
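One possible shape of the fix being suggested here, mirroring the `query_rollouts` pattern the comment refers to. This is a sketch: the standalone helper, its name, and its parameters are hypothetical, assembled from the snippets above rather than taken from the PR.

```python
from typing import Any, Dict, Iterable, Literal, Optional


def build_worker_filters(
    status_in: Optional[Iterable[str]] = None,
    worker_id_contains: Optional[str] = None,
    filter_logic: Literal["and", "or"] = "and",
) -> Optional[Dict[str, Any]]:
    """Hypothetical helper: return None when no real conditions exist, and only
    attach "_aggregate" alongside actual conditions (and only for the non-default "or")."""
    filters: Dict[str, Any] = {}
    if status_in is not None:
        filters["status"] = {"within": list(status_in)}
    if worker_id_contains is not None:
        filters["worker_id"] = {"contains": worker_id_contains}
    if filters and filter_logic == "or":  # "and" is the default, so it can be omitted
        filters["_aggregate"] = filter_logic
    return filters or None


# build_worker_filters() -> None, never {"_aggregate": "and"}.
# build_worker_filters(worker_id_contains="gpu", filter_logic="or")
#   -> {"worker_id": {"contains": "gpu"}, "_aggregate": "or"}
```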
```python
logger.warning(f"Contradictory exact filters for field '{pk}': {exact_value} != {candidate}")
return ()
```
Copilot AI (Nov 18, 2025)
[nitpick] The warning for contradictory exact filters is only implemented in _iter_matching_items for primary key prefix optimization. Consider whether this validation should also be performed in the general _item_matches_filters function to catch contradictory filters on non-primary-key fields as well.
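If that generalization were pursued, it might look roughly like the sketch below. The real `_item_matches_filters` signature is not shown in this diff, so this standalone helper and the assumed shape of the `_must` payload are illustrative only.

```python
import logging
from typing import Any, Dict, Mapping, Optional

logger = logging.getLogger(__name__)


def find_contradictory_exact(filters: Mapping[str, Any]) -> Optional[str]:
    """Return the first field whose pooled "exact" constraints disagree, else None.

    Hypothetical helper sketching how the primary-key check could be generalized
    to any field before items are evaluated against the filters.
    """
    must = filters.get("_must") or {}
    # "_must" may be a single mapping or a sequence of mappings (per the review above).
    groups = [filters] + ([must] if isinstance(must, Mapping) else list(must))
    seen: Dict[str, Any] = {}
    for group in groups:
        for field, spec in group.items():
            if field.startswith("_") or not isinstance(spec, Mapping) or "exact" not in spec:
                continue
            if field in seen and seen[field] != spec["exact"]:
                logger.warning(
                    "Contradictory exact filters for field '%s': %r != %r",
                    field, seen[field], spec["exact"],
                )
                return field
            seen[field] = spec["exact"]
    return None
```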