Description
Summary
Implement Server-Sent Events (SSE) Last-Event-ID functionality to enable the sidecar to resume event streaming from a specific point when the overlay reconnects. This will prevent event loss during disconnections and allow clients to catch up on missed events.
Motivation
Currently, when the overlay UI disconnects and reconnects to the sidecar's /stream endpoint, it loses all events that occurred during the disconnection. With SSE Last-Event-ID support, the overlay can automatically resume from where it left off, so no telemetry data is missed as long as the missed events are still in the sidecar's buffer.
Current Architecture
Sidecar (Backend)
- SSE Streaming: Located in packages/sidecar/src/routes/stream/index.ts and packages/sidecar/src/routes/stream/streaming.ts
- Event Storage: Uses a circular MessageBuffer (500 events by default) in packages/sidecar/src/messageBuffer.ts
- Event IDs: Each envelope gets a unique __spotlight_envelope_id generated using uuidv7() in packages/sidecar/src/parser/processEnvelope.ts (line 47)
- Current Stream Implementation: Subscribers receive all new events as they arrive, but no mechanism exists to replay historical events
Overlay (Frontend)
- Connection: Uses native EventSource API in packages/overlay/src/sidecar.ts
- Storage: Events stored in Zustand store (packages/overlay/src/telemetry/store/)
- Current Behavior: No tracking of last received event ID; complete state loss on reconnection
Implementation Requirements
1. Sidecar Changes
a) Modify SSE Stream Handler (packages/sidecar/src/routes/stream/index.ts)
- Read Last-Event-ID header from incoming requests
- When Last-Event-ID is provided:
  - Query the MessageBuffer for events after the provided ID
  - Send historical events first (with their original IDs)
  - Then continue streaming new events
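The replay-then-stream flow above can be sketched roughly as follows. This is a minimal illustration, not the sidecar's actual handler: `StoredEvent`, `BufferLike`, `SseWrite`, and `attachClient` are hypothetical names, the header value is assumed to be passed in as a plain string, and replay is assumed to run synchronously so no new event can slip in between replay and subscription.

```typescript
// Hypothetical shapes for illustration; the sidecar's real types may differ.
interface StoredEvent {
  id: string; // stands in for __spotlight_envelope_id
  event: string; // SSE event name, e.g. the content type
  data: string; // serialized envelope
}

interface BufferLike {
  snapshot(): StoredEvent[]; // buffered events, oldest first
  subscribe(listener: (e: StoredEvent) => void): () => void; // returns unsubscribe
}

type SseWrite = (message: StoredEvent) => void;

// Replays buffered events after `lastEventId` (if it is still buffered),
// then subscribes the client to new events. Returns the unsubscribe function.
function attachClient(
  lastEventId: string | undefined,
  buffer: BufferLike,
  write: SseWrite,
): () => void {
  if (lastEventId) {
    const events = buffer.snapshot();
    const index = events.findIndex((e) => e.id === lastEventId);
    // index === -1 means the ID is unknown or already evicted:
    // nothing is replayed and the client is treated as a new connection.
    if (index !== -1) {
      for (const e of events.slice(index + 1)) {
        write(e); // historical events keep their original IDs
      }
    }
  }
  return buffer.subscribe(write);
}
```

Keeping the function independent of the HTTP framework makes the replay logic easy to unit test; the real handler would only need to pass in the header value and a writer.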
b) Update SSE Message Format (packages/sidecar/src/routes/stream/streaming.ts)
- Add id field to SSE messages using the __spotlight_envelope_id
- Current format:
  stream.writeSSE({
    event: `${container.getContentType()}${base64Indicator}`,
    data: JSON.stringify(parsedEnvelope.envelope),
  });
- New format should include:
  stream.writeSSE({
    id: parsedEnvelope.envelope[0].__spotlight_envelope_id, // Add this
    event: `${container.getContentType()}${base64Indicator}`,
    data: JSON.stringify(parsedEnvelope.envelope),
  });
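For reference, this is roughly what an SSE message with an `id` field looks like on the wire. The sketch below only illustrates the text format defined for server-sent events; `SseMessage` and `formatSseMessage` are hypothetical names and this is not how `stream.writeSSE` is implemented.

```typescript
interface SseMessage {
  id?: string;
  event?: string;
  data: string;
}

// Serializes one message in the text/event-stream format:
// optional "id:" and "event:" lines, one "data:" line per line of payload,
// and a blank line that terminates the message.
function formatSseMessage({ id, event, data }: SseMessage): string {
  const lines: string[] = [];
  if (id !== undefined) lines.push(`id: ${id}`);
  if (event !== undefined) lines.push(`event: ${event}`);
  for (const line of data.split(/\r\n|\r|\n/)) {
    lines.push(`data: ${line}`);
  }
  return lines.join("\n") + "\n\n";
}
```

The browser remembers the most recent `id:` value it received and echoes it back as the Last-Event-ID header when it reconnects, which is what makes the replay possible.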
c) Extend MessageBuffer (packages/sidecar/src/messageBuffer.ts)
- Add method to read events starting from a specific envelope ID:
readFrom(lastEventId: string): T[]
- This should leverage the existing circular buffer (currently stores 500 events)
- Consider buffer overflow scenarios (when requested ID has been evicted)
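A minimal sketch of what `readFrom` could look like, under stated assumptions: `ReplayBuffer` is a hypothetical stand-in for `MessageBuffer`, it uses a plain array with `shift()` instead of a true ring for brevity, and it returns `null` (rather than the `T[]` in the signature above) so the caller can tell "ID not in buffer" apart from "nothing new since that ID".

```typescript
// Hypothetical stand-in for MessageBuffer; the real class may differ.
class ReplayBuffer<T> {
  private items: Array<{ id: string; value: T }> = [];
  private readonly capacity: number;

  constructor(capacity: number = 500) {
    this.capacity = capacity;
  }

  put(id: string, value: T): void {
    this.items.push({ id, value });
    if (this.items.length > this.capacity) {
      this.items.shift(); // evict the oldest entry once the buffer is full
    }
  }

  // Returns the values stored after `lastEventId`, oldest first.
  // Returns null when the ID is not in the buffer (evicted or never seen).
  readFrom(lastEventId: string): T[] | null {
    const index = this.items.findIndex((item) => item.id === lastEventId);
    if (index === -1) return null;
    return this.items.slice(index + 1).map((item) => item.value);
  }
}
```

An empty array means the client is already up to date, while `null` leaves it to the stream handler to decide between a full reset and treating the request as a new connection.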
2. Overlay Changes
a) EventSource Connection (packages/overlay/src/sidecar.ts)
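- The browser's EventSource API automatically sends the Last-Event-ID header when it reconnects on its own after a dropped connection
- No explicit changes needed for that reconnection path; note that a newly constructed EventSource (for example after a page reload) starts without a last event ID and does not send the header
- However, may want to track last event ID for debugging/logging purposes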
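Tracking the last event ID for logging could look something like the sketch below. It relies on the `lastEventId` property that message events carry, and on the fact that named SSE events (those sent with an `event:` field) are only delivered to listeners registered for that event name. `SseEventLike`, `SseSourceLike`, and `trackLastEventId` are hypothetical names; in the overlay, `source` would be the EventSource instance.

```typescript
// Minimal structural types so the sketch does not depend on browser typings.
interface SseEventLike {
  lastEventId: string;
  data: string;
}

interface SseSourceLike {
  addEventListener(type: string, listener: (event: SseEventLike) => void): void;
}

// Registers a listener per event type and returns a getter for the most
// recently seen event ID (undefined until an event with an ID arrives).
function trackLastEventId(
  source: SseSourceLike,
  eventTypes: string[],
): () => string | undefined {
  let lastId: string | undefined;
  for (const type of eventTypes) {
    source.addEventListener(type, (event) => {
      if (event.lastEventId) {
        lastId = event.lastEventId;
      }
    });
  }
  return () => lastId;
}
```

The returned getter is only for diagnostics here; the browser keeps its own copy of the last ID for the automatic reconnect.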
b) State Management (packages/overlay/src/telemetry/store/)
- Track the last received envelope ID in the store
- Handle potential duplicate events during reconnection (envelope IDs can be used for deduplication)
- Current stores already use Maps with IDs, making deduplication straightforward
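As a rough illustration of ID-based deduplication, the sketch below shows an immutable update that ignores envelopes already present and records the last accepted ID. `EnvelopeRecord`, `EnvelopeState`, and `addEnvelope` are hypothetical names and do not reflect the actual shape of the overlay's store slices.

```typescript
interface EnvelopeRecord {
  id: string; // stands in for __spotlight_envelope_id
  payload: unknown;
}

interface EnvelopeState {
  envelopes: Map<string, EnvelopeRecord>;
  lastEnvelopeId: string | undefined;
}

// Returns the same state object for a duplicate, so subscribers that compare
// by reference do not re-render; otherwise returns a new state with the
// envelope added and lastEnvelopeId updated.
function addEnvelope(state: EnvelopeState, record: EnvelopeRecord): EnvelopeState {
  if (state.envelopes.has(record.id)) {
    return state; // duplicate delivered during replay: ignore
  }
  const envelopes = new Map(state.envelopes);
  envelopes.set(record.id, record);
  return { envelopes, lastEnvelopeId: record.id };
}
```

A pure function like this could be called from inside a store's update callback, which keeps the deduplication rule testable without the store itself.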
c) Connection State Handling
- Consider showing a "catching up" state in the UI when replaying historical events
- Track metrics: number of events caught up, time to sync, etc.
3. Edge Cases to Handle
- Buffer Overflow: Requested Last-Event-ID is older than the oldest event in the buffer
  - Send an error event or specific marker
  - Client should handle full state reset
- Invalid Event ID: Client sends an ID that never existed
  - Treat as new connection, start from current position
- Duplicate Events: During replay, client might receive events it already has
  - Use __spotlight_envelope_id for deduplication
  - Overlay's Map-based storage already prevents duplicates by ID
- Multiple Content Types: Current implementation supports different event types
  - Ensure ID tracking works across all content types
  - Currently: application/x-sentry-envelope, application/x-sentry-envelope;base64
- Performance: Replaying large numbers of events on reconnection
  - Consider batch size limits
  - May need throttling for replay
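One way the first two edge cases could be told apart is sketched below, together with a small batching helper for replay. It assumes IDs are canonical UUIDv7 strings, so that comparing lowercase strings follows creation time at millisecond granularity; ordering within the same millisecond depends on the generator. `ReplayDecision`, `decideReplay`, and `inBatches` are hypothetical names.

```typescript
type ReplayDecision =
  | { kind: "fresh" } // no header: start from the current position
  | { kind: "replay"; fromIndex: number } // ID found: replay everything after it
  | { kind: "reset" } // ID older than the buffer: client should reset its state
  | { kind: "unknown" }; // ID never seen: treat as a new connection

const UUID_V7 = /^[0-9a-f]{8}-[0-9a-f]{4}-7[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/;

// bufferedIds must be ordered oldest first.
function decideReplay(lastEventId: string | undefined, bufferedIds: string[]): ReplayDecision {
  if (!lastEventId) return { kind: "fresh" };
  const id = lastEventId.toLowerCase();
  const index = bufferedIds.indexOf(id);
  if (index !== -1) return { kind: "replay", fromIndex: index + 1 };
  const oldest = bufferedIds[0];
  // Assumption: a well-formed UUIDv7 that sorts before the oldest buffered ID
  // was issued earlier and has since been evicted.
  if (oldest !== undefined && UUID_V7.test(id) && id < oldest) {
    return { kind: "reset" };
  }
  return { kind: "unknown" };
}

// Splits a replay into batches so the handler can pause between them.
function* inBatches<T>(items: T[], size: number): Generator<T[]> {
  const step = Math.max(1, Math.floor(size));
  for (let i = 0; i < items.length; i += step) {
    yield items.slice(i, i + step);
  }
}
```

Whether a "reset" is signalled with a dedicated SSE event or some other marker is left open here, as it is in the list above.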
Implementation Steps
Phase 1: Basic SSE ID Support
- Add id field to SSE messages in sidecar
- Verify EventSource receives and tracks IDs
- Test manual reconnection
Phase 2: Historical Event Replay
- Implement readFrom() in MessageBuffer
- Add Last-Event-ID header handling in stream endpoint
- Test with simulated disconnections
Phase 3: Edge Cases & Polish
- Handle buffer overflow scenarios
- Add deduplication in overlay
- Add UI indicators for catch-up state
- Performance testing with large event counts
Phase 4: Testing
- Unit tests for MessageBuffer.readFrom()
- Integration tests for reconnection scenarios
- E2E tests simulating network interruptions
Technical Details
SSE ID Format: Use existing uuidv7() generated __spotlight_envelope_id
- Format: UUIDv7 (time-ordered, suitable for sequential tracking)
- Example: 0190d9a7-0b52-7000-a000-000000000000
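To illustrate why UUIDv7 suits sequential tracking: the first 48 bits of the ID are a Unix timestamp in milliseconds, so IDs created in different milliseconds sort by creation time. The helper below is a small sketch with a hypothetical name (`uuidv7TimestampMs`); it assumes a well-formed UUIDv7 string and does no validation.

```typescript
// Extracts the millisecond timestamp encoded in the first 48 bits
// (the first 12 hex digits) of a UUIDv7 string.
function uuidv7TimestampMs(id: string): number {
  const hex = id.replace(/-/g, "").slice(0, 12);
  return parseInt(hex, 16);
}

// Example: the creation time of the ID shown above, as a Date.
const exampleCreatedAt = new Date(
  uuidv7TimestampMs("0190d9a7-0b52-7000-a000-000000000000"),
);
```

The ordering of IDs generated within the same millisecond depends on the UUID library, so replay logic is safer looking IDs up by position in the buffer than relying on string comparison alone.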
Buffer Retention: Current circular buffer holds 500 events
- Consider making this configurable
- Document that events older than buffer size cannot be replayed
Alternative ID Strategy Considerations:
- Current: Envelope ID (one per envelope, which may contain multiple events)
- Alternative: Per-event IDs (more granular but more complex)
- Recommendation: Start with envelope IDs for simplicity
Related Files
Sidecar:
- packages/sidecar/src/routes/stream/index.ts - Stream endpoint
- packages/sidecar/src/routes/stream/streaming.ts - SSE utilities
- packages/sidecar/src/messageBuffer.ts - Event buffer
- packages/sidecar/src/parser/processEnvelope.ts - Envelope ID generation
- packages/sidecar/src/utils/eventContainer.ts - Event container wrapper
Overlay:
- packages/overlay/src/sidecar.ts - EventSource connection
- packages/overlay/src/telemetry/store/slices/envelopesSlice.ts - Envelope storage
- packages/overlay/src/telemetry/store/slices/eventsSlice.ts - Event storage
This implementation would significantly improve the reliability of Spotlight's real-time telemetry streaming, especially in development environments where network conditions may be unstable or when developers need to restart services frequently.