add tests that verify behavior of generated code + generator errors/warnings #1156

Open
eli-bl wants to merge 15 commits into main from live-generated-code-tests

Conversation

@eli-bl (Collaborator) commented Oct 31, 2024

This creates two new categories of end-to-end tests that I'm calling "functional" tests:

  1. Tests that run the generator against a valid spec (defined inline in the test code, specific to each test class, rather than in large spec files) and then import and execute pieces of the generated code. Unlike the "golden record"-based tests, these don't make assertions about the exact implementation of the generated code; they treat it as a black box and verify its behavior. These are in functional_tests/generated_code_execution.
  2. Tests that run the generator against an invalid spec (again defined inline) and make assertions about its error/warning output. These are in functional_tests/generator_failure_cases. The reason I've put these in a separate directory is that they do not involve executing generated code; they're basically black-box tests of the generator.

I've added a bunch of tests in both categories, to cover generated model classes. There aren't currently any tests of endpoint behavior, but I think those could be pretty easily implemented using a mock HTTP client.
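
To make the first category concrete, here is a rough sketch of what such a test could look like. The names are illustrative rather than taken from this PR: `generate_and_import_model` is a hypothetical fixture standing in for whatever helper actually runs the generator against the inline spec and imports the result.

```python
# Illustrative sketch of a "generated code execution" test (names are
# hypothetical, not the actual helpers in this PR). The fixture is assumed to
# run the generator against the inline spec in a temporary project and return
# the generated model class.
MODEL_SPEC = """
openapi: "3.0.0"
info: {title: test, version: "1.0"}
paths: {}
components:
  schemas:
    MyModel:
      type: object
      properties:
        name: {type: string}
        count: {type: integer}
"""


def test_model_round_trip(generate_and_import_model):
    MyModel = generate_and_import_model(MODEL_SPEC, "MyModel")

    instance = MyModel.from_dict({"name": "a", "count": 3})
    assert instance.name == "a"
    assert instance.count == 3
    # behavior, not implementation: we only care that round-tripping works
    assert instance.to_dict() == {"name": "a", "count": 3}
```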

The intended benefits are:

  • These tests prove that the generated code behaves as intended from the point of view of an application using the client. That's unlike the existing end-to-end tests, where we verify what the code looks like and only infer what it will do, and unlike the unit tests of the generator, where we verify the logic of the internal data structures that will be used to generate the code.
  • They're more convenient for rapid development of a feature, as tests are lighter-weight and self-contained.
  • They can replace many (most?) of the current low-level unit tests, which are too tightly coupled to internal implementation details and difficult to maintain.
  • In case of a regression in the generator, failures in these tests can be more specific and easier to interpret. The "golden record"-based tests would also fail, but the output from those is often quite verbose and it can be hard to see which part of the diff is relevant, especially if the error is duplicated in many model/endpoint files.

This PR removes quite a few low-level unit tests that are now redundant; the coverage check in CI confirms that the relevant code paths are still covered.

I believe we could also greatly reduce the size of the "golden record"-based tests, shrinking the current very large specs so they cover a narrower range of things, like the overall code structure of the generated client. I think it is safe to say that tests for questions like "does a property with such-and-such schema get correctly translated into a model class attribute, with the right serialization/deserialization logic?" would be better handled in the new functional test style. But I haven't removed any of those tests in this PR, since that's a bigger task.
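
As an illustration of the second category (generator failure cases), a test might look roughly like the following. Again the names are hypothetical: `run_generator_expecting_errors` stands in for a helper that invokes the generator against the inline spec and returns the captured error/warning messages.

```python
# Illustrative sketch of a "generator failure case" test (names are
# hypothetical). The generator is run against an intentionally broken spec and
# the test asserts on its error output; no generated code is executed.
BROKEN_SPEC = """
openapi: "3.0.0"
info: {title: test, version: "1.0"}
paths: {}
components:
  schemas:
    BadModel:
      type: object
      properties:
        value:
          $ref: "#/components/schemas/DoesNotExist"
"""


def test_invalid_reference_is_reported(run_generator_expecting_errors):
    errors = run_generator_expecting_errors(BROKEN_SPEC)
    assert any("DoesNotExist" in message for message in errors)
```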

@@ -168,18 +145,17 @@ def test_literal_enums_end_to_end():
)
)
def test_meta(meta: str, generated_file: Optional[str], expected_file: Optional[str]):
@eli-bl (Collaborator, Author) commented:

I've rewritten this and other functions in this file to use the same generate_client context manager, even though the logic in these tests is fairly simple, for two reasons:

  1. It enforces consistent state cleanup. Previously, failures in some of these tests would leave generated files behind in the project directory.
  2. I think it makes the tests a bit easier to read by reducing boilerplate, so that what's left is the logic specific to each test (see the usage sketch after this list).
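
A minimal sketch of the pattern, assuming a shape for generate_client that may differ from the real helper in end_to_end_tests:

```python
# Minimal sketch of the generate_client pattern described above; the real
# context manager's signature and return value may differ. The point is that
# the generated project is always cleaned up, even if an assertion fails.
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def generate_client(extra_args):
    output_path = Path(tempfile.mkdtemp())
    try:
        _run_generator(extra_args, output_path)  # hypothetical helper that invokes the CLI
        yield output_path
    finally:
        # cleanup runs on failure too, so no generated files are left behind
        shutil.rmtree(output_path, ignore_errors=True)


def test_meta_poetry():
    with generate_client(["--meta", "poetry"]) as project_dir:
        assert (project_dir / "pyproject.toml").exists()
```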

eli-bl requested a review from dbanty on October 31, 2024 at 22:44
@eli-bl (Collaborator, Author) commented Oct 31, 2024

By the way, I think this mechanism would be especially useful for testing something like my discriminator PR, #1156, since that is ultimately about validation behavior.

eli-bl force-pushed the live-generated-code-tests branch 2 times, most recently from 58b3950 to 27478a5 on November 8, 2024 21:18
eli-bl force-pushed the live-generated-code-tests branch 2 times, most recently from 2370ba2 to e57aa39 on November 8, 2024 22:43
eli-bl force-pushed the live-generated-code-tests branch 5 times, most recently from e87dca8 to b21d2e6 on November 11, 2024 23:13
eli-bl changed the title from "add tests that verify actual behavior of generated code" to "add tests that verify behavior of generated code + generator errors/warnings" on Nov 11, 2024
eli-bl force-pushed the live-generated-code-tests branch 2 times, most recently from 52a235c to e369b2e on November 12, 2024 02:29
```diff
@@ -11,7 +11,7 @@ jobs:
   test:
     strategy:
       matrix:
-        python: [ "3.8", "3.9", "3.10", "3.11", "3.12", "3.13" ]
+        python: [ "3.9", "3.10", "3.11", "3.12", "3.13" ]
```
@eli-bl (Collaborator, Author) commented:

I know there's another open PR for dropping 3.8, but I had to remove it from the testing matrix here because 3.8 has some differences in the typing module that would make some of the new test assertions a pain to implement.
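
For illustration, this is the kind of 3.8-vs-3.9 gap involved (an assumption on my part, not necessarily the exact issue these tests hit): PEP 585 built-in generics such as list[str] only exist at runtime on 3.9+, so assertions that compare generated type annotations would need version-specific branches on 3.8.

```python
# Hypothetical illustration of a typing difference between 3.8 and 3.9+; an
# assumed example, not necessarily the one that affected these tests.
# On 3.9+ a test can build built-in generics directly; on 3.8, evaluating
# list[str] raises TypeError, so the typing.List spelling is required.
import sys


def expected_list_of_str():
    if sys.version_info >= (3, 9):
        return list[str]  # PEP 585 generic, available from Python 3.9
    from typing import List

    return List[str]  # the runtime-valid spelling on 3.8
```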
