EMSG version 0 metadata events fire immediately instead of at correct presentation time in DASH VOD playback #3024

@mebhupendra

Description

Version

Media3 1.8.0

More version details

EMSG version 0 boxes in DASH VOD streams are being rendered immediately upon parsing instead of at their correct presentation time. This happens for all segments after the first one.

This bug only affects EMSG version 0 boxes. EMSG version 1 is not affected because it uses absolute timestamps read directly from the emsg box.

Root Cause: In FragmentedMp4Extractor.onLeafAtomRead(), the field segmentIndexEarliestPresentationTimeUs is only updated when haveOutputSeekMap is false, i.e. for the first sidx box only. For subsequent segments, the field keeps the value from the first segment (0 here) instead of taking the earliest presentation time of the current segment.

When processing EMSG v0 boxes in onEmsgLeafAtomRead(), the sample time is calculated as:

presentationTimeDeltaUs = Util.scaleLargeTimestamp(atom.readUnsignedInt(), C.MICROS_PER_SECOND, timescale);
if (segmentIndexEarliestPresentationTimeUs != C.TIME_UNSET) {
  sampleTimeUs = segmentIndexEarliestPresentationTimeUs + presentationTimeDeltaUs;
}

Since segmentIndexEarliestPresentationTimeUs is not updated for segments after the first (remains 0), sampleTimeUs effectively equals just presentationTimeDeltaUs (the relative offset within the segment). This results in a much smaller timestamp than the actual playback position, causing MetadataRenderer to render these events immediately.
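To make the arithmetic concrete, here is a minimal standalone sketch of the calculation. It is not the Media3 source: the class name, the helper toMicros (a simplified stand-in for Util.scaleLargeTimestamp), and all numbers (90 kHz timescale, a segment starting at 12 s, an event 1.5 s into it) are assumptions chosen for illustration.

```java
// Illustrative model only; not the Media3 source. Names and numbers are hypothetical.
final class EmsgV0Timing {
  static final long MICROS_PER_SECOND = 1_000_000L;

  // Simplified stand-in for Util.scaleLargeTimestamp; ignores overflow handling.
  static long toMicros(long ticks, long timescale) {
    return ticks * MICROS_PER_SECOND / timescale;
  }

  // Mirrors the v0 calculation quoted above: segment start plus relative delta.
  static long sampleTimeUs(long earliestPresentationTimeUs, long deltaTicks, long timescale) {
    return earliestPresentationTimeUs + toMicros(deltaTicks, timescale);
  }

  public static void main(String[] args) {
    long timescale = 90_000L;         // assumed emsg timescale
    long deltaTicks = 135_000L;       // 1.5 s into the segment
    long staleStartUs = 0L;           // value left over from the first segment
    long actualStartUs = 12_000_000L; // assumed start of the current segment (12 s)

    System.out.println("stale:   " + sampleTimeUs(staleStartUs, deltaTicks, timescale));
    System.out.println("correct: " + sampleTimeUs(actualStartUs, deltaTicks, timescale));
  }
}
```

With the stale value the event is stamped at 1.5 s, well behind a playback position of roughly 12 s, which is consistent with it being rendered as soon as it is parsed.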

Devices that reproduce the issue

All devices. This appears to be a logic bug in FragmentedMp4Extractor.

Devices that do not reproduce the issue

No response

Reproducible in the demo app?

Yes

Reproduction steps

  1. Play a DASH VOD stream with multiple segments containing EMSG version 0 boxes (e.g., ID3 metadata with presentation_time_delta)

  2. Observe that metadata events from the first segment fire at correct times

  3. Observe that metadata events from subsequent segments fire immediately when the segment is loaded, not at their intended presentation time

  4. Note: EMSG version 1 boxes (with absolute presentation_time) work correctly

Expected result

EMSG metadata events should fire at their correct presentation time:

sampleTimeUs = segmentIndexEarliestPresentationTimeUs + presentationTimeDeltaUs

Where segmentIndexEarliestPresentationTimeUs is the earliest presentation time from the current segment's sidx box.
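One way to picture the expected behaviour is the small model below. It is a hypothetical sketch, not a proposed patch: SegmentTimeTracker, onSidx and emsgV0SampleTimeUs are names invented for this example, and it assumes that only the seek map output needs to stay guarded by haveOutputSeekMap while the earliest presentation time is refreshed on every sidx box.

```java
// Hypothetical sketch of the expected behaviour; not the Media3 implementation.
final class SegmentTimeTracker {
  static final long TIME_UNSET = Long.MIN_VALUE + 1; // same value as C.TIME_UNSET

  private long segmentIndexEarliestPresentationTimeUs = TIME_UNSET;
  private boolean haveOutputSeekMap;

  // Called for every sidx box. The time is refreshed each time; only the
  // one-off seek map output is guarded by the flag.
  void onSidx(long earliestPresentationTimeUs) {
    segmentIndexEarliestPresentationTimeUs = earliestPresentationTimeUs;
    if (!haveOutputSeekMap) {
      haveOutputSeekMap = true; // a real extractor would output the seek map here
    }
  }

  // Returns the absolute time for a v0 emsg, or TIME_UNSET if no sidx was seen yet.
  long emsgV0SampleTimeUs(long presentationTimeDeltaUs) {
    if (segmentIndexEarliestPresentationTimeUs == TIME_UNSET) {
      return TIME_UNSET;
    }
    return segmentIndexEarliestPresentationTimeUs + presentationTimeDeltaUs;
  }
}
```

In this model, an event 1.5 s into a segment whose sidx reports an earliest presentation time of 12 s resolves to 13.5 s, regardless of how many segments were parsed before it.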

Actual result

EMSG metadata events fire immediately because segmentIndexEarliestPresentationTimeUs remains 0 (or the value from the first segment), making sampleTimeUs much smaller than the current playback position.

Media

https://nonprd-hybrik-output.s3.us-east-1.amazonaws.com/creative/clear_dash/720p/20260122/dash/main.mpd

Bug Report
