-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add visual regression tests with playwright #41
Conversation
291eef6
to
405fa27
Compare
9d22386
to
e58a8f2
Compare
@CAM-Gerlach this is a first draft of the visual regression testing. Please tell me what you think so far. I am currently checking the diff between the actual screenshot and the expected result. We should probably check that diff with a possible percentage of error (there seems to be a slight difference in rendering fonts between Windows/Mac/Ubuntu). I am not sure which is the best, or if there is another more sensible approach? |
Thanks, looking great so far! Some general comments:
I'd think per-OS screenshots, if you can manage that, since we'd need to pick and test a threshold very carefully to avoid false negatives, and there's the risk that it could miss significant changes. |
Thanks for the review, those are all very good points I'll finish this on Monday 😊 |
e58a8f2
to
a67fdb1
Compare
Are you suggesting this for the build artifacts? Or would you like this to be done for references that are stored in the repo as well? |
5f0c09e
to
0e90d0d
Compare
That does not work |
b66e806
to
44cb1b6
Compare
FYI, I mean to suggest this when I replied before, but you might want to remove
Ideally for both, but most importantly for the reference screenshots, since they will be stored in the repo and each version will take up space for all time, so the smaller we can get them, the better. Note that this should be done as part of the GitHub workflow that generates them, rather than inside the test code itself. |
Yeah, I'm sure that's what happened. We can do a best effort on this PR, merge it and then test it and follow up if we need to fix anything with that part. Or, you can merge it on your fork and test it there,, since that should work. BTW, you can get rid of all build and install steps and replace all that with just Let me know if you need anything! |
Right, thanks! I was a bit confused why it was behaving this way but didn't look into it.
👍🏽
Yes, I was doing this yesterday and it does not work properly. I'm looking into it. |
I am getting there: martinRenou#8 Commenting "Please generate reference screenshots" in the PR properly triggers a commit of the references :) I still need to polish a bit and fix some remaining issues, but this is taking shape |
db1199b
to
c44b772
Compare
4986787
to
23d1039
Compare
I resolved all of your comments :) As you can see from the references, we can reproduce this issue #38. I'll try to fix it in a separate PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks so much for all your hard work on this, @martinRenou ! This looks really great. I just have some relatively minor comments, and then this should be good to go.
Top level comments:
- I suggest running
optipng
on the reference PNGs output by the CI (as well as those committed here); since they will be committed irrevocably to the repo once merged, and will inexorably increase its size, the smaller we can make them, the better. - To be more clear, consistent and descriptive about what we're referring to, follow idiomatic English convention and avoid confusion with more typical meanings of "references" in English ("bibliography", people that will vouch for your character, etc), could you rename all instances of
references
toreference_screenshots
or similar (as appropriate)?
To note, I'm not super comfortable with asyncio beyond the very basics (since we don't really use it in my main lines of work) and this doesn't seem to be too performance-critical given the complexity, but I suppose this is acceptable since it is only in the tests and mostly isolated to the one fixture...so long as you're willing to help fix any async-related issues that may come up.
Thanks again!
366cd64
to
5ecb849
Compare
I've implemented the compression with Pillow instead: https://github.com/spyder-ide/docrepr/pull/41/files#diff-a31c7ed5d35f5ed8233994868c54d625b18e6bacb6794344c4531e62bd9dde59R97-R99 this logic also works for the reference screenshots (it has been used for the references on this PR). I am not sure optipng can be easily installed on all platforms? Maybe Pillow's compression should be enough? I can decrease the quality even more if you want.
At first I implemented it without asyncio, as playwright provides a sync API: https://github.com/microsoft/playwright-python#example The issue was that the sync API was not cleaning up resources and shutting down the browser properly (it was generating many warnings that needed to be filtered). We can roll-back to the sync API but we'll have to add filters. Also there was a warning that I was not able to filter for some reason, that's why I gave up and used the async API. |
Thanks for the review! |
Thanks for the quick and through response!
Yep, I'd noticed that—thanks. Unfortunately, Pillow's optimization still appears to be leaving a lot on the table, assuming its what you're using for the current reference screenshots—running It actually shouldn't be that tough to install optipng on all GHA platforms—on Windows, its just
Makes sense, thanks 👍 I saw you used the non-async API before, but I figured there was a reason you switched (and noticed most of the warnings were gone). |
0bbb0d8
to
f3f87ae
Compare
Just added You can see that it runs properly here: Which was triggered by martinRenou#24 |
@martinRenou LGTM, thanks! Now that #40 is merged, the output should change for a number of tests; to avoid an extra copy of the images and them getting out of sync, do you mind rebasing this and using the CIs on your fork to generate the updated reference screenshots for each OS (make sure you use the CI-optimized ones, since different |
f3f87ae
to
f0d8f29
Compare
f0d8f29
to
5ede725
Compare
Rebased! :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @martinRenou !
Fix #39