docs(otel): Update README with provider transformers and Mastra support
- Document provider transformers pattern for framework-specific data transformations
- Add Mastra to architecture diagram and v1 instrumentation section
- Explain v1 framework detection via instrumentation scope name
- Document event type determination for v1 frameworks ($ai_span for root spans)
- Add section on adding new provider transformers
- Update design decisions with provider transformers and v1 event type logic
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
- Metadata: `gen_ai.request.model`, `gen_ai.usage.input_tokens`, etc.
-**Processing**: When a span contains `prompt` or `completion` attributes (after extraction), the transformer recognizes it as v1 and sends the event immediately without caching. This works because v1 spans are self-contained.
+**Processing**: The transformer recognizes v1 in two ways:
- `$ai_trace`: Root spans (no parent) for v2 frameworks
+- `$ai_span`: All other spans, including root spans from v1 frameworks
+
+**v1 Detection**: Checks for `prompt` or `completion` attributes OR framework scope name (e.g., `@mastra/otel`). v1 spans bypass the event merger.
-Detects v1 vs v2 by checking for `prompt` or `completion` in extracted attributes. v1 spans bypass the event merger.
+**Event Type Logic**: For v1 frameworks like Mastra, root spans are marked as `$ai_span` (not `$ai_trace`) to ensure they appear in the tree hierarchy. This is necessary because `TraceQueryRunner` filters out `$ai_trace` events from the events array.
**posthog_native.py**: Extracts PostHog-specific attributes prefixed with `posthog.ai.*`. These take precedence in the waterfall.
-**genai.py**: Extracts OpenTelemetry GenAI semantic convention attributes (`gen_ai.*`). Handles indexed message fields by collecting attributes like `gen_ai.prompt.0.role` into structured message arrays.
+**genai.py**: Extracts OpenTelemetry GenAI semantic convention attributes (`gen_ai.*`). Handles indexed message fields by collecting attributes like `gen_ai.prompt.0.role` into structured message arrays. Supports provider-specific transformations for frameworks that use custom OTEL formats.
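The indexed-field handling described for `genai.py` can be pictured with a minimal sketch. The function name `collect_indexed_messages` and the regular expression are illustrative assumptions, not the actual implementation; only the `gen_ai.prompt.<index>.<field>` key shape comes from the README.

```python
import re
from typing import Any

# Matches flattened keys such as "gen_ai.prompt.0.role" (assumed key shape).
_INDEXED_PROMPT = re.compile(r"^gen_ai\.prompt\.(\d+)\.(.+)$")


def collect_indexed_messages(attributes: dict[str, Any]) -> list[dict[str, Any]]:
    """Group flattened indexed attributes into an ordered list of messages."""
    by_index: dict[int, dict[str, Any]] = {}
    for key, value in attributes.items():
        match = _INDEXED_PROMPT.match(key)
        if match is None:
            continue  # not an indexed prompt attribute, e.g. gen_ai.request.model
        index, field = int(match.group(1)), match.group(2)
        by_index.setdefault(index, {})[field] = value
    # Sort numerically so that index 10 comes after index 9.
    return [by_index[i] for i in sorted(by_index)]
```

Sorting on the integer index rather than the raw key keeps message order stable even when attributes arrive unordered.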
+
+**providers/**: Framework-specific transformers for handling custom OTEL formats:
+
+- **base.py**: Abstract base class defining the provider transformer interface (`can_handle()`, `transform_prompt()`, `transform_completion()`)
+- **mastra.py**: Transforms Mastra's wrapped message format (e.g., `{"messages": [...]}` for input, `{"text": "...", "files": [], ...}` for output) into standard PostHog format. Detected by instrumentation scope name `@mastra/otel`.
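One way to picture the interface in `base.py` is the sketch below. Only the three method names come from the README; the class names, the argument lists, and the `find_transformer` helper are assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any, Optional, Sequence


class ProviderTransformer(ABC):
    """Hypothetical interface; the real signatures in base.py may differ."""

    @abstractmethod
    def can_handle(self, scope_name: Optional[str], attributes: dict[str, Any]) -> bool:
        """Return True if this transformer recognizes the span's framework."""

    @abstractmethod
    def transform_prompt(self, prompt: Any) -> Any:
        """Convert a framework-specific prompt into the standard format."""

    @abstractmethod
    def transform_completion(self, completion: Any) -> Any:
        """Convert a framework-specific completion into the standard format."""


class ScopeNameTransformer(ProviderTransformer):
    """Illustrative subclass: matches on scope name, passes data through."""

    def __init__(self, scope_name: str) -> None:
        self._scope_name = scope_name

    def can_handle(self, scope_name: Optional[str], attributes: dict[str, Any]) -> bool:
        return scope_name == self._scope_name

    def transform_prompt(self, prompt: Any) -> Any:
        return prompt

    def transform_completion(self, completion: Any) -> Any:
        return completion


def find_transformer(
    transformers: Sequence[ProviderTransformer],
    scope_name: Optional[str],
    attributes: dict[str, Any],
) -> Optional[ProviderTransformer]:
    """Return the first transformer that claims the span, or None."""
    for transformer in transformers:
        if transformer.can_handle(scope_name, attributes):
            return transformer
    return None
```

For example, `find_transformer([ScopeNameTransformer("@mastra/otel")], "@mastra/otel", {})` returns the registered instance, while an unknown scope name yields `None`, leaving the span to the standard path.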
## Event Schema
@@ -237,14 +254,74 @@ v2 can send multiple log events in a single HTTP request. The ingestion layer gr
### v1/v2 Detection
-Rather than requiring explicit configuration, the transformer auto-detects instrumentation version by checking for `prompt` or `completion` attributes. This allows both patterns to coexist without configuration.
+Rather than requiring explicit configuration, the transformer auto-detects instrumentation version by:
+
+1. Checking for `prompt` or `completion` attributes (after extraction)
+2. Detecting framework via instrumentation scope name (e.g., `@mastra/otel`)
+
+This allows both patterns to coexist without configuration, and supports frameworks that don't follow standard attribute conventions.
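The two detection checks could be combined roughly as follows. This is a sketch under assumptions: the function name `is_v1_span`, the `V1_SCOPE_NAMES` set, and the shape of the extracted-attributes dictionary are hypothetical; only the `prompt`/`completion` keys and the `@mastra/otel` scope name come from the README.

```python
from typing import Any, Optional

# Scope names treated as v1 frameworks (assumed registry; only Mastra is documented).
V1_SCOPE_NAMES = frozenset({"@mastra/otel"})


def is_v1_span(extracted: dict[str, Any], scope_name: Optional[str] = None) -> bool:
    """Return True if a span should be handled as v1 instrumentation."""
    # Check 1: prompt or completion is present after attribute extraction.
    if "prompt" in extracted or "completion" in extracted:
        return True
    # Check 2: the instrumentation scope identifies a known v1 framework.
    return scope_name in V1_SCOPE_NAMES
```

The scope-name check matters for spans that carry no `prompt` or `completion` themselves but still belong to a v1 framework.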
+
+### Provider Transformers
+
+Some frameworks (like Mastra) wrap OTEL data in custom structures that don't match standard GenAI conventions. Provider transformers detect these frameworks (via instrumentation scope or attribute prefixes) and unwrap their data into standard format. This keeps framework-specific logic isolated while maintaining compatibility with the core transformer pipeline.
+
+**Example**: Mastra wraps prompts as `{"messages": [{"role": "user", "content": [...]}]}` where content is an array of `{"type": "text", "text": "..."}` objects. The Mastra transformer unwraps this into standard `[{"role": "user", "content": "..."}]` format.
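The unwrapping in this example might look like the sketch below. The function name `unwrap_mastra_prompt` is hypothetical, and two behaviours are assumptions rather than documented facts: text parts are concatenated without a separator, and non-text parts are dropped.

```python
import json
from typing import Any


def unwrap_mastra_prompt(raw: Any) -> list[dict[str, Any]]:
    """Flatten a Mastra-style wrapped prompt into [{"role": ..., "content": ...}]."""
    # Span attributes are often JSON strings, so accept both str and parsed data.
    data = json.loads(raw) if isinstance(raw, str) else raw
    messages = data.get("messages", []) if isinstance(data, dict) else data
    if not isinstance(messages, list):
        return []

    result: list[dict[str, Any]] = []
    for message in messages:
        if not isinstance(message, dict):
            continue
        content = message.get("content")
        if isinstance(content, list):
            # Assumption: join the text parts and ignore other part types.
            content = "".join(
                part.get("text", "")
                for part in content
                if isinstance(part, dict) and part.get("type") == "text"
            )
        result.append({"role": message.get("role"), "content": content})
    return result
```

Messages whose content is already a plain string pass through unchanged, so the function is safe to apply to prompts that are only partly wrapped.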
+
+### Event Type Determination for v1 Frameworks
+
+v1 frameworks create root spans that should appear in the tree hierarchy alongside their children. These root spans are marked as `$ai_span` (not `$ai_trace`) because `TraceQueryRunner` filters out `$ai_trace` events from the events array. This ensures v1 framework traces display correctly with proper parent-child relationships in the UI.
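The rule can be summarized in a few lines. This sketch considers only the two event types discussed here and uses a hypothetical function name and parameters; the real transformer may distinguish further event types.

```python
from typing import Optional


def determine_event_type(parent_span_id: Optional[str], is_v1: bool) -> str:
    """Pick between $ai_trace and $ai_span for a span."""
    is_root = not parent_span_id  # None or empty string means no parent
    if is_root and not is_v1:
        # v2 root spans become the trace-level event.
        return "$ai_trace"
    # Child spans, and root spans from v1 frameworks, stay in the span tree.
    return "$ai_span"
```

A v1 root span therefore yields `$ai_span`, which keeps it in the events array that `TraceQueryRunner` returns.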
### TTL-Based Cleanup
The event merger uses a 60-second TTL on cache entries. This automatically cleans up orphaned data from incomplete traces (e.g., lost log packets) without requiring background jobs or manual cleanup.
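To illustrate why a TTL removes the need for a cleanup job, here is a minimal in-memory sketch. The `TTLCache` class is hypothetical, and how the real event merger stores its entries is not shown here; a cache backend with native key expiry would give the same effect.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """Toy cache whose entries are ignored once they are older than the TTL."""

    def __init__(
        self,
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: dict[str, tuple[float, Any]] = {}

    def put(self, key: str, value: Any) -> None:
        """Store a value together with its expiry time."""
        self._entries[key] = (self._clock() + self._ttl, value)

    def pop(self, key: str) -> Optional[Any]:
        """Remove and return the value, or None if it is missing or expired."""
        entry = self._entries.pop(key, None)
        if entry is None:
            return None
        expires_at, value = entry
        return value if self._clock() < expires_at else None
```

If the matching half of a trace never arrives, its partner entry simply expires and is never merged.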
## Extending the System
+### Adding New Provider Transformers
+
+Create a new transformer in `conventions/providers/`: