Skip to content
This repository has been archived by the owner on Dec 7, 2023. It is now read-only.

Add dataflow integration test #226

Merged
merged 67 commits into from
Feb 28, 2023
Merged

Add dataflow integration test #226

merged 67 commits into from
Feb 28, 2023

Conversation

cisaacstern
Copy link
Member

Towards debugging #220
Closes #225 when complete

As outlined in #225 (comment), the first step towards this testing is gaining manual (PR-label-based) control over when Heroku Review Apps are built. To recap, without this we would either:

  1. Automatically build a review app for every commit of every PR (therefore triggering a dataflow integration test at every commit of every PR) - too costly / messy
  2. Require a Heroku team member to manually click Create on the Heroku Dashboard when a review app is desired - too toilsome

@cisaacstern cisaacstern added build-review-app Add this label to PRs to build a Review App and trigger a Dataflow integration test against it. and removed build-review-app Add this label to PRs to build a Review App and trigger a Dataflow integration test against it. labels Jan 24, 2023
@cisaacstern
Copy link
Member Author

🧠 ⛈️ Brainstorm: Heroku Review Apps + Heroku container stack requires building Dockerfile on Heroku at each push. IIUC, there is no way to cache Docker layers for Heroku Review Apps (open to being shown some feature I'm missing, of course).

Building our current Dockerfile on Heroku seems to take about 10-13 minutes. When you add about 7 minutes for the subsequent Dataflow integration test (based on timing observed from working on pangeo-forge/pangeo-forge-runner#52), that makes the full integration testing cycle 20 minutes. That's a long time for iterating a solution to something like #220! 🙀

What's more, Heroku Container Registry doesn't work with Review Apps:

Screen Shot 2023-01-24 at 10 40 13 AM

As a workaround, perhaps we can:

  • Pre-build the Dockerfile and push it to a public container registry (e.g. Docker Hub)
  • Add a new Dockerfile.heroku which is just:
    FROM pangeo/forge-orchestrator:${SOME_TAG}  # pushed from `Dockerfile.prebuild`
  • And in heroku.yml say:
    build:
      docker:
        web: Dockerfile.heroku

This would make our Review App "build" time just a matter of however long it takes to pull from Docker Hub (presumably shorter than 10-13 minutes).

I wouldn't want this to introduce another layer of brittleness/toil, requiring manual builds which could fall out of sync with the PR. So this model could follow the example of https://github.com/pangeo-forge/pangeo-forge-orchestrator/blob/main/.github/workflows/push-dataflow-image.yaml and automatically build-and-push images when PRs change the Dockerfile.

@cisaacstern cisaacstern added build-review-app Add this label to PRs to build a Review App and trigger a Dataflow integration test against it. and removed build-review-app Add this label to PRs to build a Review App and trigger a Dataflow integration test against it. labels Jan 24, 2023
@cisaacstern cisaacstern added build-review-app Add this label to PRs to build a Review App and trigger a Dataflow integration test against it. and removed build-review-app Add this label to PRs to build a Review App and trigger a Dataflow integration test against it. labels Jan 25, 2023
@cisaacstern cisaacstern removed the build-review-app Add this label to PRs to build a Review App and trigger a Dataflow integration test against it. label Jan 25, 2023
@cisaacstern cisaacstern added build-review-app Add this label to PRs to build a Review App and trigger a Dataflow integration test against it. and removed build-review-app Add this label to PRs to build a Review App and trigger a Dataflow integration test against it. labels Jan 25, 2023
@cisaacstern cisaacstern added the build-review-app Add this label to PRs to build a Review App and trigger a Dataflow integration test against it. label Jan 25, 2023
@cisaacstern cisaacstern added build-review-app Add this label to PRs to build a Review App and trigger a Dataflow integration test against it. and removed build-review-app Add this label to PRs to build a Review App and trigger a Dataflow integration test against it. labels Jan 25, 2023
@cisaacstern cisaacstern temporarily deployed to pforge-pr-226 January 25, 2023 01:42 Inactive
@cisaacstern cisaacstern added build-review-app Add this label to PRs to build a Review App and trigger a Dataflow integration test against it. and removed build-review-app Add this label to PRs to build a Review App and trigger a Dataflow integration test against it. labels Jan 25, 2023
@cisaacstern
Copy link
Member Author

cisaacstern commented Jan 25, 2023

Good progress here! What works:

  • Adding build-review-app label to this PR initiates build of Heroku Review App
  • Removing build-review-app label deletes the Heroku Review App

What needs to be fixed:

  • The review app build is succeeding but release is failing due to a terraform state issue
  • The SHA being passed to the Heroku API call looks wrong. I think I'm probably not using head.sha.
  • This sha vs head.sha issue might be why we're not seeing the deployment registered on the PR discussion thread here. Generally, we want the Heroku API call made by the build workflow to successfully register a deployment environment.
  • The new Dockerfile.heroku definitely seems to speed up Review App builds, but the image it is pulling from Docker Hub was pushed there manually. Add a workflow that automates this to reduce toil/brittleness.

@cisaacstern cisaacstern removed the build-review-app Add this label to PRs to build a Review App and trigger a Dataflow integration test against it. label Jan 25, 2023
@cisaacstern cisaacstern temporarily deployed to pforge-pr-226 February 14, 2023 20:44 Inactive
@cisaacstern cisaacstern temporarily deployed to pforge-pr-226 February 14, 2023 21:55 Inactive
@cisaacstern cisaacstern temporarily deployed to pforge-pr-226 February 14, 2023 22:21 Inactive
@cisaacstern cisaacstern temporarily deployed to pforge-pr-226 February 14, 2023 22:59 Inactive
@cisaacstern cisaacstern temporarily deployed to pforge-pr-226 February 15, 2023 00:53 Inactive
@cisaacstern cisaacstern temporarily deployed to pforge-pr-226 February 15, 2023 01:07 Inactive
@cisaacstern cisaacstern temporarily deployed to pforge-pr-226 February 15, 2023 01:16 Inactive
@cisaacstern cisaacstern temporarily deployed to pforge-pr-226 February 15, 2023 01:23 Inactive
@cisaacstern
Copy link
Member Author

cisaacstern commented Feb 15, 2023

Parametrize source files for test PR and see if we can reproduce errors discussed in #220

This is almost complete. Just making a few notes on it here so I remember where to pick up with it tomorrow (which can also be useful for forthcoming documentation).

  • Dataflow integration test now assumes all source PRs are active against pforgetest/test-staged-recipes

  • The test uses a syntax {pr_number}::{recipe_id} to figure out what pr numbers on pforgetest/test-staged-recipes to replicate for the test, and within that replicated pr, what specific recipe to run.

  • By default, the test runs against 22::gpcp-from-gcs

  • Additional test targets can be specified by adding a label to the orchestrator PR (e.g. this current PR) in the format {also-test:{pr_number}::{recipe_id}.

  • Currently the jobs are being spawned correctly, but there is a crash occurring where they are trying to use the same ref name to make their duplicate PRs, causing the test session which gets to the PR creation step second to error out with Reference already exists
    Screen Shot 2023-02-14 at 5 28 55 PM

  • This is because the duplicated PRs attempt to use the workflow run number for the ref name, so I just need to add one more layer of specificity here, tied to the job name somehow.

One last thought: it's possible certain PRs will require a longer Dataflow timeout, which if so, I can add as part of the label syntax, making it, e.g. {pr_number}::{recipe_id}::{timeout}s.

@cisaacstern cisaacstern temporarily deployed to pforge-pr-226 February 15, 2023 22:55 Inactive
@cisaacstern cisaacstern temporarily deployed to pforge-pr-226 February 16, 2023 02:19 Inactive
@cisaacstern cisaacstern marked this pull request as ready for review February 16, 2023 02:22
@cisaacstern cisaacstern temporarily deployed to pforge-pr-226 February 16, 2023 02:22 Inactive
@cisaacstern
Copy link
Member Author

While not every single task listed above is complete here, I'm going to merge this now, because the core tests all pass, and also because some enhancements made here to review app builds will be very useful for starting to work on #233.

@cisaacstern cisaacstern merged commit aa5ce7d into main Feb 28, 2023
@cisaacstern cisaacstern deleted the test-dataflow branch February 28, 2023 18:17
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
also-test:15::noaa-oisst-avhrr-only build-review-app Add this label to PRs to build a Review App and trigger a Dataflow integration test against it. test-dataflow Add this label to PRs to trigger Dataflow integration test.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Integration testing
1 participant