Implement performance tracing API for SDK #1203

Open
11 tasks done
abhaysood opened this issue Sep 8, 2024 · 0 comments · May be fixed by #1405
abhaysood commented Sep 8, 2024

This issue is still being worked on, and any of the details can change until it is finalized.

Summary

Provide the ability to collect traces organized as a DAG, as seen in OpenTelemetry and many other tracing tools.

[Image: screenshot from Honeycomb]

Introduction

Tracing helps understand how long certain operations take to complete, from the moment they begin until they finish, including all the intermediate steps, dependencies, and parallel activities that occur during execution.

A trace represents an entire operation, such as a complete user journey like onboarding, which can be further divided into steps like login, create profile, etc. A trace is identified by a trace_id.

A span is the fundamental building block of a trace. A span represents a single unit of work. This could be an HTTP request, a database query, a function call, etc. Each span contains information about the operation: when it started, how long it took, and whether it completed successfully. A span is identified by a span_id and a user-defined name.

Imagine being able to see the following when trying to debug cold launch time:

[Image: an example cold launch trace. This is not the actual design for the dashboard.]

To achieve this, spans in a trace are organized as a Directed Acyclic Graph (DAG): each span can have a parent span and multiple children. This is done by adding a parent_id to each span, whose value is the span_id of its parent.
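
As a rough illustration of how parent_id links spans together, the sketch below groups spans by parent and renders them as an indented outline. `SpanNode`, `childrenByParent` and `renderTrace` are hypothetical names used only for this example and are not part of the SDK. With a single parent_id per span, the resulting graph is a tree, which is a special case of a DAG.

```kotlin
// Hypothetical minimal span record (not an SDK type); only the fields
// needed to show the parent-child hierarchy are included.
data class SpanNode(val spanId: String, val parentId: String?, val name: String)

// Groups spans by parent_id so each span's children can be looked up directly.
// Root spans (no parent) end up under the null key.
fun childrenByParent(spans: List<SpanNode>): Map<String?, List<SpanNode>> =
    spans.groupBy { it.parentId }

// Renders the trace as an indented outline, starting from the root spans.
fun renderTrace(spans: List<SpanNode>): List<String> {
    val children = childrenByParent(spans)
    val lines = mutableListOf<String>()

    fun visit(span: SpanNode, depth: Int) {
        lines += "  ".repeat(depth) + span.name
        children[span.spanId].orEmpty().forEach { visit(it, depth + 1) }
    }

    children[null].orEmpty().forEach { visit(it, 0) }
    return lines
}
```

Calling `renderTrace` on a list containing a root span and its descendants returns one line per span, indented by depth.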

Span

A span has the following structure (field names are not final):

{
  "trace_id": "---",
  "span_id": "---",
  "parent_id": "---",
  "session_id": "---",
  "name": "---",
  "status": "---",
  "start_time": "",
  "duration": ---,
  "events": ["---"],
  "attributes": {},
  "user_defined_attributes": {},
  "attachments": []
}

trace_id

Created when a new root span is created.
A root span is created when no parent is set explicitly and no parent context is available.
16 bytes (128-bit) represented as 32 lowercase hex characters.

span_id

Unique identifier for a span. Created when a new span is started.
8 bytes (64-bit) represented as 16 lowercase hex characters.
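
A minimal sketch of generating IDs with these sizes, assuming random bytes from `java.security.SecureRandom`. The function names are hypothetical and the SDK's actual ID provider may work differently. All-zero IDs are retried because the W3C trace context specification treats them as invalid.

```kotlin
import java.security.SecureRandom

private val random = SecureRandom()

// Encodes bytes as lowercase hex, two characters per byte.
fun toLowerHex(bytes: ByteArray): String =
    bytes.joinToString("") { "%02x".format(it) }

// Generates `byteCount` random bytes as lowercase hex, retrying in the
// unlikely all-zero case, which W3C trace context treats as invalid.
fun randomHexId(byteCount: Int): String {
    val bytes = ByteArray(byteCount)
    do {
        random.nextBytes(bytes)
    } while (bytes.all { it == 0.toByte() })
    return toLowerHex(bytes)
}

// 16 bytes (128-bit) -> 32 lowercase hex characters.
fun newTraceId(): String = randomHexId(16)

// 8 bytes (64-bit) -> 16 lowercase hex characters.
fun newSpanId(): String = randomHexId(8)
```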

parent_id

Optional parent span id, used to show a parent-child relationship between spans.

name

The name of the span follows OpenTelemetry Semantic Conventions wherever possible. User-defined names that are not part of the OTel semantic conventions are also allowed.

status

One of unset, ok or error. Signifies whether the operation performed as part of the span was successful or not.

start_time

Milliseconds since epoch when the span was started.

duration

The duration of the span, calculated using a monotonic clock.
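
The split between start_time and duration can be sketched as follows: the wall clock supplies the timestamp, while a monotonic clock measures elapsed time so that wall-clock adjustments do not distort the duration. `SpanTimer` is a hypothetical helper, not an SDK class. Reporting the duration in milliseconds and using `System.nanoTime()` as the monotonic source are assumptions made for this example.

```kotlin
// Hypothetical helper (not part of the SDK). Clock sources are injectable
// so that the behavior can be exercised without real time passing.
class SpanTimer(
    private val wallClockMillis: () -> Long = { System.currentTimeMillis() },
    private val monotonicNanos: () -> Long = { System.nanoTime() }
) {
    // Milliseconds since epoch at the moment the timer was created.
    val startTime: Long = wallClockMillis()

    // Monotonic reading taken at the same moment; only differences between
    // two such readings are meaningful.
    private val startNanos: Long = monotonicNanos()

    // Elapsed time in milliseconds (assumed unit), unaffected by changes
    // to the wall clock.
    fun durationMillis(): Long = (monotonicNanos() - startNanos) / 1_000_000
}
```

If the device clock is changed while a span is running, start_time still reflects the original wall-clock reading and the duration still reflects the elapsed monotonic time.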

events

Spans can contain events, each with attributes expressed as key-value pairs, similar to a custom event.

attributes

Attributes are key-value pairs that add more context to the span. The keys follow OpenTelemetry Semantic Conventions wherever possible, but user-defined keys that are not part of the OTel semantic conventions are also allowed.

SDK API Reference

Start a span

A span can be started using the startSpan function.

val span: Span = Measure.startSpan("span-name")

A span can also be started with an explicit start time. This is useful when an operation has already started but the Measure APIs were not accessible from that part of the code.

val span: Span = Measure.startSpan("span-name", timestamp = System.currentTimeMillis())

End a span

A span can be ended using the end function. A status must be provided when ending a span.

val span: Span = Measure.startSpan("span-name")
span.end(Status.Ok)

A span can also be ended with an explicit end time. This is useful when an operation has already ended but the Measure APIs were not accessible from that part of the code.

val span: Span = Measure.startSpan("span-name")
span.end(Status.Ok, timestamp = System.currentTimeMillis())

Set parent span

val parentSpan: Span = Measure.startSpan("parent-span")
val childSpan: Span = Measure.startSpan("child-span").setParent(parentSpan)

Deferred span start

The span builder API allows pre-configuring a span without starting it immediately.

val spanBuilder: SpanBuilder = Measure.createSpan("span-name")
val span: Span = spanBuilder.startSpan()

The span builder also allows setting a parent, or explicitly setting no parent:

// Builder configured with no parent
val spanBuilder: SpanBuilder = Measure.createSpan("span-name")
  .setNoParent()
val span: Span = spanBuilder.startSpan()

// Builder configured with an explicit parent
val spanBuilder: SpanBuilder = Measure.createSpan("span-name")
  .setParent(parentSpan)
val span: Span = spanBuilder.startSpan()

Add checkpoint to a span

val span: Span = Measure.startSpan("span-name").setCheckpoint("name")

Distributed tracing

To maintain the trace context from the SDK across all HTTP requests, the SDK can automatically inject a W3C trace context header.

traceparent header

An HTTP header used to propagate trace context across different systems.

It follows the W3C trace-context rules: version-traceId-spanId-traceflags.

version — 2 hex digits, always 00 for now.
traceId — 32 hex digits
spanId — 16 hex digits
traceflags — 2 hex digits (01 = sampled, 00 = not sampled)
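
The header layout above can be sketched as a pair of format and parse functions. `TraceParent`, `formatTraceParent` and `parseTraceParent` are hypothetical names for this example and do not describe how the SDK injects the header. The sketch handles version 00 only, and rejects all-zero IDs because the W3C specification treats them as invalid.

```kotlin
// Hypothetical holder for the traceparent fields (version is fixed at 00).
data class TraceParent(val traceId: String, val spanId: String, val sampled: Boolean)

// version-traceId-spanId-traceflags, lowercase hex only.
private val traceParentRegex = Regex("00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")

// Builds the header value, e.g. for an outgoing HTTP request.
fun formatTraceParent(tp: TraceParent): String {
    val flags = if (tp.sampled) "01" else "00"
    return "00-${tp.traceId}-${tp.spanId}-$flags"
}

// Parses a header value, returning null if it does not match the layout.
fun parseTraceParent(header: String): TraceParent? {
    val match = traceParentRegex.matchEntire(header) ?: return null
    val (traceId, spanId, flags) = match.destructured
    // All-zero trace or span IDs are invalid under W3C trace context.
    if (traceId.all { it == '0' } || spanId.all { it == '0' }) return null
    // The sampled flag is the least significant bit of traceflags.
    val sampled = (flags.toInt(16) and 1) == 1
    return TraceParent(traceId, spanId, sampled)
}
```

For example, the value `00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01` from the W3C specification parses into a sampled trace context and formats back to the same string.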

Tasks

@abhaysood abhaysood self-assigned this Sep 8, 2024
@abhaysood abhaysood converted this from a draft issue Sep 8, 2024
@abhaysood abhaysood moved this from Todo to In Progress in Measure Roadmap Sep 8, 2024
@abhaysood abhaysood added docs user facing documentation feature new features labels Sep 8, 2024
@abhaysood abhaysood changed the title from "Write spec for performance tracing in Measure" to "Implement performance tracing API for SDK" Oct 24, 2024
@abhaysood abhaysood added android android related and removed docs user facing documentation labels Oct 24, 2024
abhaysood added a commit that referenced this issue Dec 12, 2024
* Implement core APIs for span collection
* Implement W3C trace context compatible span id & trace id
* Process and store spans
* Implement span checkpoints
* Modify batching and export to include spans in the payload
* Add limits to span name, checkpoint name and number of checkpoints per span
* Rename EventProcessor, EventStore, EventExporter with
SignalProcessor, SignalStore, Exporter as they now handle
both span and event data
* Add documentation
* Update public API
* Modify IdProvider to also handle span and trace ID creation

closes #1203
abhaysood added a commit that referenced this issue Dec 13, 2024
* Implement core APIs for span collection
* Implement W3C trace context compatible span id & trace id
* Process and store spans
* Implement span checkpoints
* Modify batching and export to include spans in the payload
* Add limits to span name, checkpoint name and number of checkpoints per span
* Rename EventProcessor, EventStore, EventExporter with
SignalProcessor, SignalStore, Exporter as they now handle
both span and event data
* Add documentation
* Update public API
* Modify IdProvider to also handle span and trace ID creation
* Refactor launch tracker to use elapsed realtime

closes #1203
abhaysood added a commit that referenced this issue Dec 16, 2024
* Implement core APIs for span collection
* Implement W3C trace context compatible span id & trace id
* Process and store spans
* Implement span checkpoints
* Modify batching and export to include spans in the payload
* Add limits to span name, checkpoint name and number of checkpoints per span
* Rename EventProcessor, EventStore, EventExporter with
SignalProcessor, SignalStore, Exporter as they now handle
both span and event data
* Add documentation
* Update public API
* Modify IdProvider to also handle span and trace ID creation
* Refactor launch tracker to use elapsed realtime

closes #1203