Improved outputs to analyze integration test results #445

Open: aspeake wants to merge 4 commits into master from ci_outputs_2

@aspeake (Collaborator) commented Nov 16, 2024

Fixes #415

Introduces a class that compares the integration test results on a branch with the results stored on master. A previous PR (#440) added all integration test results to master. This PR provides a way to evaluate the differences between the working branch and master. The changes include:

  • Store plot PDFs as CI artifacts so that they can be visually compared against master
  • Add a new script, compare_results.py, to:
    • Output differences in keys between the branches' agg_results and ecm_results (output to agg_results_key_diffs.csv and ecm_results_key_diffs.csv)
    • Output percent differences in values between the branches' agg_results and ecm_results, as long as values meet an absolute threshold and the differences meet a percent threshold (output to agg_results_value_diffs.csv and ecm_results_value_diffs.csv)
    • Output percent differences in values between branches' Summary_Data-MAP.xlsx and Summary_Data-TP.xlsx (output to Summary_Data-MAP_percent_diffs.csv and Summary_Data-TP_percent_diffs.csv)
  • Update the GitHub Actions workflow so that, when the branch's agg_results.json or ecm_results.json differs from master's, it will:
    • Commit new results and plots (same as before)
    • Pull down agg_results.json, ecm_results.json, Summary_Data-TP.xlsx, Summary_Data-MAP.xlsx from master, store in tests/integration_tests/results_base
    • Run tests/integration_tests/compare_results.py
    • Store the output CSVs described above as CI artifacts
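
For illustration, the key comparison described above could be sketched roughly as follows. This is not the code in compare_results.py; the function name `find_key_diffs`, the "/"-joined key paths, and the "base only" / "branch only" labels are all assumptions made for this example.

```python
def find_key_diffs(base, branch, path=""):
    """Return (key_path, where) rows for keys present on only one side.

    Hypothetical helper: `base` and `branch` are nested dicts such as the
    parsed contents of agg_results.json from master and from the branch.
    """
    rows = []
    # Walk the union of keys so that additions and removals are both seen.
    for key in sorted(base.keys() | branch.keys(), key=str):
        key_path = f"{path}/{key}" if path else str(key)
        if key not in branch:
            rows.append((key_path, "base only"))
        elif key not in base:
            rows.append((key_path, "branch only"))
        elif isinstance(base[key], dict) and isinstance(branch[key], dict):
            # Key exists on both sides: recurse into the nested level.
            rows.extend(find_key_diffs(base[key], branch[key], key_path))
    return rows


if __name__ == "__main__":
    master = {"a": {"x": 1, "y": 2}, "b": 3}
    working = {"a": {"x": 1, "z": 5}, "c": 4}
    for row in find_key_diffs(master, working):
        print(row)
```

Each returned row could then be written out as one line of a key-diffs CSV.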

Example Outputs
Example CI artifacts are found at https://github.com/trynthink/scout/actions/runs/11943859504

Example *_results_key_diffs.csv:
[2 images]

Example *_results_value_diffs.csv:
[2 images]

Example Summary_Data-*_percent_diffs.xlsx:
Same format as original xlsx files, but values are the percent differences
image

@aspeake force-pushed the ci_outputs_2 branch 2 times, most recently from 1e36329 to 9e38c4e, November 18, 2024 22:45
@aspeake added this to the v1.1.0 milestone Nov 19, 2024

return key_diffs

def compare_dict_values(self, dict1, dict2, percent_threshold=10, abs_threshold=1000):
Collaborator Author

What should the thresholds be when deciding whether to report percent changes in JSON values? Should the thresholds for agg_results differ from those for ecm_results?

Note: the percent threshold means that only differences greater than or equal to it are reported. The absolute threshold means that a difference is reported only if the original values exceed it, which prevents small numbers from producing large percent differences in the output.
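
To make the question concrete, here is a minimal sketch of how the two thresholds might interact. It is not the body of compare_dict_values from this PR; the function name `percent_diffs`, the use of the base (master) value as the "original" value, and the "/"-joined key paths are assumptions for illustration. The defaults mirror the signature above.

```python
def percent_diffs(base, branch, percent_threshold=10, abs_threshold=1000, path=""):
    """Return {key_path: percent_diff} for numeric leaves passing both thresholds.

    Hypothetical helper. Assumption: the absolute threshold is checked
    against the base (master) value only.
    """
    diffs = {}
    for key in sorted(base.keys() & branch.keys(), key=str):
        key_path = f"{path}/{key}" if path else str(key)
        b, n = base[key], branch[key]
        if isinstance(b, dict) and isinstance(n, dict):
            diffs.update(
                percent_diffs(b, n, percent_threshold, abs_threshold, key_path)
            )
            continue
        numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in (b, n)
        )
        if not numeric:
            continue
        # Absolute threshold: skip small base values, which would otherwise
        # yield huge percent changes (also guards against division by zero).
        if b == 0 or abs(b) <= abs_threshold:
            continue
        pct = 100 * (n - b) / abs(b)
        # Percent threshold: report only changes at least this large.
        if abs(pct) >= percent_threshold:
            diffs[key_path] = pct
    return diffs


if __name__ == "__main__":
    master = {"a": {"x": 2000.0, "y": 5.0}, "b": 10000}
    working = {"a": {"x": 2500.0, "y": 50.0}, "b": 10500}
    print(percent_diffs(master, working))
```

In this example only a/x is reported: a/y changes by 900% but its base value is below the absolute threshold, and b changes by 5%, which is below the percent threshold. Passing different threshold values per file would be one way to treat agg_results and ecm_results differently.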


!tests/integration_testing/results/plots/tech_potential/*.xlsx
!tests/integration_testing/results/plots/max_adopt_potential/*.xlsx
Collaborator Author

These patterns override the ignore rule for the .xlsx files specified above, so that the files are tracked.


Successfully merging this pull request may close these issues.

Output visualizations/metrics for changes of integration test results