Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

parallel runners with matrix #36

Closed
wants to merge 1 commit into from
Closed

Conversation

mntlty
Copy link

@mntlty mntlty commented Mar 12, 2023

Leveraging matrix strategies we can parallelize the build process for each platform. To me these are breaking changes, and deserves a version bump of the Action from v1 to v2.

To test this approach:

  • I created a test repo using gh extension create
  • I copied build_and_release.sh from this repo into my test one and modified it
  • I modified the release.yml workflow file to invoke the modified build_and_release.sh
  • confirmed I was able to install and run version 0.3.0 of the test extension with gh
  • Applied the relevant changes in this PR

This approach has a number of benefits:

  1. The overall process is faster. In my test repo I measured the time it took with the current behavior at 3m 45s to the changes at 37s. This is far from scientific but does indicate that there are likely to be substantial improvements in execution time.
  2. For a single workflow run each matrix value for platform creates an individual job , which makes it easier to see the results for each platform.
  3. In the event of job failures, individual platform builds can be retried, rather than all of them.
  4. By moving the platforms list out build_and_release.sh it is more accessible to users, as they can modify the values in YAML, and more extensible as they can add or remove platforms that they would like to support in their workflow file.

Other considerations:

a. I am not very familiar with Actions billing, and it is possible that having more runners each performing less work is more expensive than a single runner which works for longer.
b. The template files in the cli, for example https://github.com/cli/cli/blob/78ffa73f4442e3ebfb1121dc13931e45a14a96e2/pkg/cmd/extension/ext_tmpls/goBinWorkflow.yml would need to be updated to take advantage of this change.
c. I'm not sure if there's a better way to test custom Actions than the steps I outlined above.
d. I am not a bash expert by any means, the script looks correct to me and works 🤞.
e. I learned that the list in go tool dist list includes platforms that are not in the current platforms list. I don't know if more should be included in the template or if this covers all of the major platforms and is sufficient as is.
f. I don't believe that this breaks any assumptions for users who provide their own build_and_release.sh but please sanity check that for me!

@mntlty mntlty requested review from vilmibm and mislav March 13, 2023 13:39
@vilmibm
Copy link
Contributor

vilmibm commented Mar 15, 2023

I think this is a solid step forward for the action. My only concern would be billing implications; to that end, it might be wise to have this be opt-in just so users can decide if they want parallel builds or not.

@mislav I think you had some feedback for this work?

@mntlty
Copy link
Author

mntlty commented Mar 16, 2023

I think this is a solid step forward for the action. My only concern would be billing implications; to that end, it might be wise to have this be opt-in just so users can decide if they want parallel builds or not.

@mislav I think you had some feedback for this work?

@vilmibm that makes sense! I took a quick look at the billing implications of this change.

according to https://docs.github.com/en/billing/managing-billing-for-github-actions/about-billing-for-github-actions#about-billing-for-github-actions

GitHub Actions usage is free for standard GitHub-hosted runners in public repositories, and for self-hosted runners. For private repositories, each GitHub account receives a certain amount of free minutes and storage for use with GitHub-hosted runners, depending on the product used with the account. Any usage beyond the included amounts is controlled by spending limits.

my understanding is that for extensions in a public repo or using self-hosted runners, this would be a net win as there are not billing implications. for private extensions, this could make the precompile more expensive.

The current behavior clocked in at 3m33s
Screenshot 2023-03-15 at 5 30 11 PM

while the changed behavior consumed more aggregate minutes
Screenshot 2023-03-15 at 5 32 24 PM

Would releasing this Action as V2 be considered opt in, or would it be better to continue with a single version and have users opt in with an input value to the Action?

Copy link
Contributor

@mislav mislav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, thanks for the suggestion!

Personally, I'm not convinced that this problem is worth solving. A 4-minute overhead to a release at most several times a week (I don't imagine anyone updates their extension more often that that) should not be a dealbreaker for anyone. For addressing what I see as a non-issue, I feel that the approach proposed here is heavy-handed, especially because it entails a backwards-incompatible change, because it introduces a race condition (more on that below), and because it breaks generating the checksums file.

My secondary concern is that this explodes the amount of GitHub Actions computing resources per each extension and that the Go development environment (and all necessary Go modules) will now need to be fetched >14 times instead of just once, leading to much more network I/O than with the previous solution which was more efficient in that manner.

However, if you really want to speed up builds for your extension, I don't see why defining a build matrix in your own workflow file couldn't be an opt-in approach. We could make changes to the build_and_release.sh script to support that along the lines that you've proposed here. Here's the gist of what I'm thinking:

  1. If platform input wasn't specified, make a build for each of the default platforms in a single job. This is the approach we already implement today.
  2. If platform input was specified as per your matrix approach, make a build for only that platform.

I believe with this approach we can keep backwards compatibility but also give an option to extension authors to specify their own matrix.

However, in the matrix mode, a few more things would have to be ironed out:

  • Right now, build_and_release.sh creates or updates a GitHub Release associated with the git tag that was pushed. The way it does that is by checking for an existing release and then creating one if it doesn't exist. However, if several parallel jobs are competing to be the first to create a release, then the existing logic presents a race condition. It's likely that two or more CI runs might both try to create a Release for the same tag because at the time they've checked, one didn't exist.

  • In GPG_FINGERPRINT mode, a checksums.txt file is generated and signed containing information about all the assets in the release. However, if each run creates only a single asset, there will need to be a separate CI run (after the whole matrix has completed) to calculate and store checksums for each of the assets.

I'm open to collaborating on overcoming these hurdles, should you still be interested in pursuing this. Does my assessment make sense?

@mntlty mntlty closed this Sep 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants