Alert the providers team when the registry is not able to publish #5683

Open
iwahbe opened this issue Oct 11, 2024 · 3 comments
Labels
kind/enhancement Improvements or new features

Comments

@iwahbe
Member

iwahbe commented Oct 11, 2024

Hello!

  • Vote on this issue by adding a 👍 reaction
  • If you want to implement this feature, comment to let us know (we'll work with you on design, scheduling, etc.)

Issue details

The providers team does not have the bandwidth to monitor docs update PRs after the provider upgrade has finished. There needs to be a mechanism to alert us when docs fail to publish.

Ideally that mechanism would distinguish between "the release is bad, and must be adjusted" vs "the registry failed to publish, please try again".

Affected area/feature

@iwahbe iwahbe added kind/enhancement Improvements or new features needs-triage Needs attention from the triage team labels Oct 11, 2024
@github-project-automation github-project-automation bot moved this to 🤔 Triage in Docs 📚 Oct 11, 2024
@thoward thoward moved this from 🤔 Triage to 🧳 Backlog in Docs 📚 Oct 11, 2024
@thoward thoward removed the needs-triage Needs attention from the triage team label Oct 11, 2024
@thoward
Contributor

thoward commented Oct 11, 2024

@sean1588 Can we use the #docs-ops alerting system for this?

@sean1588
Member

@thoward, alerting for registry publishing failures has always reported into the #docs-ops channel. I actually merged a PR to move that alerting over to #registry-ops, a channel I configured a couple of days ago to handle this so that it can be shared with the providers side of the house.

see: #5636

Ideally that mechanism would distinguish between "the release is bad, and must be adjusted" vs "the registry failed to publish, please try again".

@iwahbe, @guineveresaenger - currently this handles the case where the registry failed to publish, which could be for various reasons, including the release being bad and needing to be adjusted. I need to think about how to distinguish between those causes. Also, let me know if you want to explore alerting options other than Slack. This alerting was already in place, so I just ported it over to a channel that is more specific to the registry so that these alerts don't keep getting lost in the noise of #docs-ops.

@iwahbe
Member Author

iwahbe commented Oct 11, 2024

Reiterating a message from Pulumi's internal Slack:

The providers team has moved away from Slack alerts to issue creation, since that tracks resolution of the issue. Our ideal case is that registry build failures create P1 issues (if possible, labeled so that we can filter only for issues caused by invalid provider builds). We can then materialize these issues in our ops dashboards.
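
A minimal sketch of what the issue-creation approach could look like, assuming the publishing job already knows which of the two failure cases it hit. It only builds the JSON body (`title`, `body`, `labels`) that GitHub's "create an issue" REST endpoint accepts; it does not send anything. The names `FailureKind`, `PublishFailure`, `build_issue_payload`, and every label string are hypothetical and do not come from the registry codebase.

```python
from dataclasses import dataclass
from enum import Enum


class FailureKind(Enum):
    # The two cases this issue asks the alert to distinguish.
    BAD_RELEASE = "bad-release"        # the release is bad and must be adjusted
    PUBLISH_FAILED = "publish-failed"  # the registry failed to publish; try again


@dataclass
class PublishFailure:
    provider: str
    version: str
    kind: FailureKind
    log_url: str


# Hypothetical label names, chosen only for illustration. The point is that
# invalid provider builds get their own label so they can be filtered on.
LABELS = {
    FailureKind.BAD_RELEASE: ["p1", "registry/invalid-provider-build"],
    FailureKind.PUBLISH_FAILED: ["p1", "registry/publish-retry"],
}


def build_issue_payload(failure: PublishFailure) -> dict:
    """Build a JSON-serializable issue body for one publish failure."""
    if failure.kind is FailureKind.BAD_RELEASE:
        action = "The release is bad and must be adjusted."
    else:
        action = "The registry failed to publish; please try again."
    return {
        "title": f"Registry failed to publish {failure.provider} {failure.version}",
        "body": f"{action}\n\nLogs: {failure.log_url}",
        "labels": LABELS[failure.kind],
    }
```

How the job decides between `BAD_RELEASE` and `PUBLISH_FAILED` is left open here, since the thread has not settled on a way to tell the two apart.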

Projects
Status: 🧳 Backlog
Development

No branches or pull requests

3 participants