-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Support OTLP in LightningStore #313
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
/ci |
|
🚀 CI Watcher for correlation id-3533819908-mhz4uekm triggered by comment 3533819908
✅ All runs completed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds support for the OpenTelemetry Protocol (OTLP) in LightningStore, enabling standardized trace collection and export. The implementation introduces OTLP/HTTP endpoint handling, trace ingestion from protobuf format, and integration with existing tracing infrastructure.
Key Changes
- Added OTLP protobuf parsing and span conversion utilities with support for gzip compression
- Integrated native OTLP exporters into OtelTracer and AgentOpsTracer with automatic endpoint switching
- Extended LightningStore capabilities to expose OTLP/HTTP trace endpoints
Reviewed Changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 15 comments.
Show a summary per file
| File | Description |
|---|---|
| agentlightning/utils/otlp.py | New module implementing OTLP request handling, protobuf conversion, and LightningStoreOTLPExporter |
| agentlightning/tracer/otel.py | Refactored to support OTLP exporters, moved LightningSpanProcessor from agentops, added enable/disable OTLP methods |
| agentlightning/tracer/agentops.py | Updated to inherit from OtelTracer, removed duplicate LightningSpanProcessor implementation |
| agentlightning/store/base.py | Extended LightningStoreCapabilities with otlp_traces field and added otlp_traces_endpoint method |
| agentlightning/store/client_server.py | Implemented OTLP traces endpoint handler, updated capability checks to use .get() |
| agentlightning/store/memory.py | Added otlp_traces capability (false), fixed typo "Manully" → "Manually" |
| agentlightning/llm_proxy.py | Updated LightningSpanExporter to use OTLP when supported, changed RLock to Lock, made _maybe_flush synchronous |
| agentlightning/instrumentation/agentops.py | Enhanced BypassableAuthenticatedOTLPExporter to inherit from LightningStoreOTLPExporter |
| agentlightning/types/tracer.py | Added ROLLOUT_ID, ATTEMPT_ID, SPAN_SEQUENCE_ID constants to SpanNames enum |
| agentlightning/tracer/base.py | Added lifespan() context manager for tracer lifecycle management |
| examples/minimal/write_traces.py | New example demonstrating OTLP and AgentOps trace writing |
| examples/minimal/vllm_server.py | New example for programmatically managing vLLM servers |
| examples/minimal/llm_proxy.py | New example showing LLM proxy setup with vLLM and OpenAI |
| agentlightning/cli/store.py | Added --log-level argument for configurable logging |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
agentlightning/tracer/otel.py
Outdated
| tracer_provider.resource.merge( | ||
| Resource.create( | ||
| { | ||
| SpanNames.ROLLOUT_ID: rollout_id, | ||
| SpanNames.ATTEMPT_ID: attempt_id, | ||
| } | ||
| ) | ||
| ) |
Copilot
AI
Nov 14, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The resource is being merged but the result is not assigned back. Resource.merge() returns a new Resource object rather than modifying in place. This line should be: tracer_provider.resource = tracer_provider.resource.merge(...) to actually update the resource.
| @property | ||
| def disable_store_submission(self) -> bool: | ||
| return self._disable_store_submission | ||
|
|
||
| @disable_store_submission.setter | ||
| def disable_store_submission(self, value: bool) -> None: | ||
| self._disable_store_submission = value |
Copilot
AI
Nov 14, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The disable_store_submission property setter and getter on lines 202-207 should have type hints for clarity. Add -> bool to the getter and : bool to the setter parameter for consistency with the rest of the codebase.
| def __enter__(_): # type: ignore | ||
| with self._lock: | ||
| self._store, self._rollout_id, self._attempt_id = store, rollout_id, attempt_id | ||
| self._last_trace = None | ||
| self._spans = [] | ||
| return self | ||
|
|
||
| def __exit__(_, exc_type, exc, tb): # type: ignore |
Copilot
AI
Nov 14, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Normal methods should have 'self', rather than '_', as their first parameter.
| def __enter__(_): # type: ignore | |
| with self._lock: | |
| self._store, self._rollout_id, self._attempt_id = store, rollout_id, attempt_id | |
| self._last_trace = None | |
| self._spans = [] | |
| return self | |
| def __exit__(_, exc_type, exc, tb): # type: ignore | |
| def __enter__(self): # type: ignore | |
| with self._lock: | |
| self._store, self._rollout_id, self._attempt_id = store, rollout_id, attempt_id | |
| self._last_trace = None | |
| self._spans = [] | |
| return self | |
| def __exit__(self, exc_type, exc, tb): # type: ignore |
| def __enter__(_): # type: ignore | ||
| with self._lock: | ||
| self._store, self._rollout_id, self._attempt_id = store, rollout_id, attempt_id | ||
| self._last_trace = None | ||
| self._spans = [] | ||
| return self | ||
|
|
||
| def __exit__(_, exc_type, exc, tb): # type: ignore |
Copilot
AI
Nov 14, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Normal methods should have 'self', rather than '_', as their first parameter.
| def __enter__(_): # type: ignore | |
| with self._lock: | |
| self._store, self._rollout_id, self._attempt_id = store, rollout_id, attempt_id | |
| self._last_trace = None | |
| self._spans = [] | |
| return self | |
| def __exit__(_, exc_type, exc, tb): # type: ignore | |
| def __enter__(self): # type: ignore | |
| with self._lock: | |
| self._store, self._rollout_id, self._attempt_id = store, rollout_id, attempt_id | |
| self._last_trace = None | |
| self._spans = [] | |
| return self | |
| def __exit__(self, exc_type, exc, tb): # type: ignore |
|
/ci |
|
🚀 CI Watcher for correlation id-3535423523-mhznx596 triggered by comment 3535423523
✅ All runs completed. |
|
/ci |
|
🚀 CI Watcher for correlation id-3535486356-mhzpl4c2 triggered by comment 3535486356
✅ All runs completed. |
|
/ci |
|
🚀 CI Watcher for correlation id-3535748507-mhzs2znt triggered by comment 3535748507
✅ All runs completed. |
|
/ci |
|
🚀 CI Watcher for correlation id-3536155776-mhzz0zfi triggered by comment 3536155776
✅ All runs completed. |
OTLP Specification: https://opentelemetry.io/docs/specs/otlp/