Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(profiling): shrink the size of ZendFrame #2709

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

morrisonlevi
Copy link
Collaborator

@morrisonlevi morrisonlevi commented Jun 11, 2024

PROF-9925

Description

Previously, each frame was 7 * 64 bits. They are now 3 * 64 bits, at the cost that there are slightly more String copies than before. This PR is an attempt to see if this trade-off is worth it.

The current status is that when things are fetched from cache, or if the inserting thing returns a borrowed Cow<str> such as for filenames, then we have the same number of copies as before. Function names are going to return an owned Cow<str>, though, and in this case we have an extra copy compared to before. The benchmarks that the PR bot posts do not capture this case.

Reviewer checklist

  • Test coverage seems ok.
  • Appropriate labels assigned.

@github-actions github-actions bot added the profiling Relates to the Continuous Profiler label Jun 11, 2024
@pr-commenter
Copy link

pr-commenter bot commented Jun 11, 2024

Benchmarks

Benchmark execution time: 2024-06-12 23:26:42

Comparing candidate commit a8fc7ff in PR branch levi/smaller-frames with baseline commit 9f4a6a5 in branch master.

Found 5 performance improvements and 0 performance regressions! Performance is the same for 24 metrics, 7 unstable metrics.

scenario:walk_stack/1

  • 🟩 wall_time [-556.277ns; -528.070ns] or [-2.347%; -2.228%]

scenario:walk_stack/99

  • 🟩 wall_time [-725.636ns; -696.953ns] or [-2.885%; -2.771%]

scenario:walk_stack_instructions/1

  • 🟩 instructions [-6.2K instructions; -6.2K instructions] or [-5.991%; -5.980%]

scenario:walk_stack_instructions/50

  • 🟩 instructions [-5.5K instructions; -5.4K instructions] or [-5.260%; -5.231%]

scenario:walk_stack_instructions/99

  • 🟩 instructions [-4.3K instructions; -4.3K instructions] or [-4.142%; -4.114%]

@morrisonlevi morrisonlevi force-pushed the levi/smaller-frames branch 4 times, most recently from 43f7779 to d06f75e Compare June 12, 2024 00:54
Copy link
Collaborator Author

@morrisonlevi morrisonlevi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few notes about how the total number of copies doesn't change, despite using the ThinString type which allocates a str copy.

Some(str.to_string())
let thin_str: ThinStr<'a> = unsafe { core::mem::transmute(non_null) };
let str: &'a str = thin_str.into();
Some(Cow::Borrowed(str))
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This reworked section allows us to lend the string from the string cache to the caller. The caller then allocates and copies it into a ThinString, which is equivalent to the str.to_string() it did before. This means the number of string copies is a wash from the previous version to this version (for this branch).

}
None => {
let string = f()?;
let string = f(self.frame)?;
Copy link
Collaborator Author

@morrisonlevi morrisonlevi Jun 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This now returns a Cow<str>. For file names, if the files are already utf-8, then the copy is avoided but we'll later make a copy into a ThinString anyway. So for usual cases, the number of copies compared to the previous version is the same (for this branch).

We have an extra copy compared to before for two cases:

  1. If the filename is not valid utf-8.
  2. The function name is built for the first time (not from cache).

The benchmarks have high hit-rates in the cache (perfect hit rates), so neither of these show up in the benchmark.

@morrisonlevi morrisonlevi force-pushed the levi/smaller-frames branch from d06f75e to df75a9b Compare June 12, 2024 16:18
Base automatically changed from levi/cache-longer to master June 12, 2024 23:11
Previously, each frame was 7 * 64 bits. They are now 3 * 64 bits, at
the cost that every `String` gets copied to a `ThinString`.
@codecov-commenter
Copy link

codecov-commenter commented Jun 12, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 79.40%. Comparing base (9f4a6a5) to head (a8fc7ff).

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #2709      +/-   ##
============================================
+ Coverage     79.12%   79.40%   +0.28%     
  Complexity     2212     2212              
============================================
  Files           201      201              
  Lines         22451    22544      +93     
============================================
+ Hits          17764    17901     +137     
+ Misses         4687     4643      -44     
Flag Coverage Δ
tracer-extension 78.63% <ø> (+0.48%) ⬆️
tracer-php 80.52% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

see 12 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9f4a6a5...a8fc7ff. Read the comment docs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
profiling Relates to the Continuous Profiler tracing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants