
Auto-create tracing spans for log groups #2309

Open · goodspark wants to merge 2 commits into main from trace-log-groups

Conversation

@goodspark (Contributor) commented Aug 21, 2023

This will make it easy for CI users to trace and analyze their CI jobs without requiring additional work or scripting on their part. They can simply view the tracing information in their tracing provider.

Example with Datadog tracing: [screenshot of the resulting trace in Datadog]

Associated CI job: [screenshot of the CI job, 2023-08-21 17:30]

Comment on lines 546 to 553
if bytes.HasPrefix(p, []byte("~~~ ")) || bytes.HasPrefix(p, []byte("--- ")) || bytes.HasPrefix(p, []byte("+++ ")) {
	s.Close()
	// Use everything after the 4-byte prefix, up to the first newline, as the span name.
	operationName := bytes.SplitN(p[4:], []byte("\n"), 2)[0]
	s.span, s.ctx = opentracing.StartSpanFromContext(s.ctx, string(operationName))
}
@goodspark (Contributor, Author):

Something I'm not clear about: can we assume each Write call corresponds to a single line? Or will we get subsets of a line and/or multiple lines inside the given byte array?

@goodspark (Contributor, Author):

In testing, it seems there will at least be multiple lines per call. I'm not sure whether Write can be called mid-line. It'd be simpler to assume it can't, since then we won't have to maintain a buffer across writes.
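For reference, a minimal sketch (not part of this PR; the helper name is illustrative) of what handling several lines inside one Write payload could look like under that assumption, i.e. that a Write never splits a line mid-way:

package main

import (
	"bytes"
	"fmt"
)

// groupHeaders scans a single Write payload for group-header lines,
// assuming a line is never split across Write calls. Illustrative only.
func groupHeaders(p []byte) []string {
	var names []string
	for _, line := range bytes.Split(p, []byte("\n")) {
		if bytes.HasPrefix(line, []byte("~~~ ")) ||
			bytes.HasPrefix(line, []byte("--- ")) ||
			bytes.HasPrefix(line, []byte("+++ ")) {
			names = append(names, string(line[4:]))
		}
	}
	return names
}

func main() {
	// Two group headers arriving in a single Write call.
	fmt.Println(groupHeaders([]byte("+++ build\nmake all\n--- tests\n")))
	// Prints: [build tests]
}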

@goodspark force-pushed the trace-log-groups branch 2 times, most recently from bcbb7b6 to 4ee6553 on August 22, 2023 at 00:27

func (s *spanMakerWriter) FinishIfActive() {
	if s.span != nil {
		s.span.Finish()
	}
}
@goodspark (Contributor, Author):

One thing missing is being able to mark a span as an error. I think we'd need to pass in the return of p.WaitResult here to know the process' exit code.
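A hedged sketch of what that could look like (not part of this PR; the method name and the exit_code tag are illustrative, and it assumes the opentracing-go ext package is imported):

// FinishIfActiveWithExit is a hypothetical variant of FinishIfActive that
// marks the span as errored when the process exits non-zero.
// Requires: "github.com/opentracing/opentracing-go/ext"
func (s *spanMakerWriter) FinishIfActiveWithExit(exitCode int) {
	if s.span == nil {
		return
	}
	if exitCode != 0 {
		ext.Error.Set(s.span, true)          // standard OpenTracing error tag
		s.span.SetTag("exit_code", exitCode) // illustrative custom tag
	}
	s.span.Finish()
	s.span = nil
}

The caller would pass in whatever exit code p.WaitResult reports once the process finishes.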

@triarius (Contributor) left a comment:

This is a great idea @goodspark, but we foresee some issues with the implementation.

We would also really appreciate an implementation for OpenTelemetry. It should not be that much more difficult.
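For comparison, a rough sketch of what the same start/finish logic might look like with the OpenTelemetry Go SDK (go.opentelemetry.io/otel); the package, type, and method names here are illustrative, not a concrete proposal:

package tracedlog

import (
	"context"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/trace"
)

// otelSpanWriter is an illustrative OpenTelemetry counterpart to the
// OpenTracing-based spanMakerWriter in this PR.
type otelSpanWriter struct {
	ctx  context.Context
	span trace.Span
}

// startGroup finishes any open span and starts a new one named after the log group.
func (w *otelSpanWriter) startGroup(name string) {
	w.finishIfActive()
	w.ctx, w.span = otel.Tracer("agent").Start(w.ctx, name)
}

// finishIfActive ends the current span, if any.
func (w *otelSpanWriter) finishIfActive() {
	if w.span != nil {
		w.span.End()
		w.span = nil
	}
}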

@@ -532,6 +535,32 @@ func round(d time.Duration) time.Duration {
}
}

// spanMakerWriter is an io.Writer shim that captures logs and automatically creates trace spans for the log group.
@triarius (Contributor):

This file is getting pretty big. Can you put spanMakerWriter and its methods in a new file?

And I think its name should reflect that it creates spans from log groups. spansFromLogGroupWriter perhaps?

		s.span, _ = opentracing.StartSpanFromContext(s.ctx, operationName)
	}
	return s.w.Write(p)
}
@triarius (Contributor):

So the complete line is not guaranteed to be contained in a single Write. The +++, etc. could occur in the middle of p. Even worse, it could be split across consecutive calls to Write.

I think what we need here is a Writer with a state machine that captures the input between (for example) +++ and the next \n and creates a new span as appropriate. It will also need an upper bound on how much it captures, to prevent an adversary from crafting an input that causes the agent to run out of memory. Of course, a limit is also useful because the tracing backend will have a limit on how long a span's name can be.
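Something along these lines, perhaps: a rough sketch of such a state-machine Writer (not a concrete proposal; the names, the prefix set, and the 256-byte limit are illustrative, and for brevity it assumes the 4-byte prefix itself is not split across Write calls, which a full implementation would also need to handle):

package tracedlog

import (
	"bytes"
	"io"
)

// maxOperationName bounds how many header bytes are buffered across Write
// calls, so a crafted input cannot make the agent buffer without limit
// (and tracing backends cap span-name length anyway). Illustrative value.
const maxOperationName = 256

var groupPrefixes = [][]byte{[]byte("~~~ "), []byte("--- "), []byte("+++ ")}

// groupHeaderWriter forwards everything to w and, whenever a line starts
// with a group prefix, captures the text up to the next newline (possibly
// spanning several Write calls) and hands it to onGroup.
type groupHeaderWriter struct {
	w           io.Writer
	onGroup     func(name string)
	atLineStart bool   // the next byte begins a new line
	capturing   bool   // currently buffering a header name
	buf         []byte // partial header name carried between Write calls
}

func newGroupHeaderWriter(w io.Writer, onGroup func(string)) *groupHeaderWriter {
	return &groupHeaderWriter{w: w, onGroup: onGroup, atLineStart: true}
}

func (g *groupHeaderWriter) Write(p []byte) (int, error) {
	for i := 0; i < len(p); i++ {
		switch {
		case g.capturing:
			if p[i] == '\n' {
				g.onGroup(string(g.buf))
				g.buf = g.buf[:0]
				g.capturing = false
				g.atLineStart = true
			} else if len(g.buf) < maxOperationName {
				g.buf = append(g.buf, p[i])
			} // past the limit, extra header bytes are silently dropped
		case g.atLineStart && hasGroupPrefix(p[i:]):
			g.capturing = true
			g.atLineStart = false
			i += 3 // skip the rest of the 4-byte prefix; the loop adds 1 more
		default:
			g.atLineStart = p[i] == '\n'
		}
	}
	return g.w.Write(p)
}

func hasGroupPrefix(p []byte) bool {
	for _, prefix := range groupPrefixes {
		if bytes.HasPrefix(p, prefix) {
			return true
		}
	}
	return false
}

The onGroup callback is where the existing span logic would plug in, e.g. finishing the previous span and calling opentracing.StartSpanFromContext with the captured name.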
