Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enforcing the not_valid: true option when creating constraints, including foreign keys #36

Draft
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

numbata
Copy link
Contributor

@numbata numbata commented Mar 6, 2024

This pull request introduces a new rule that requires the use of the not_valid: true option when creating general and foreign key constraints. By using the not_valid: true option, we can enhance database performance and ensure stability during schema changes that involve foreign keys and other constraints.

The Cop also includes test scenarios to ensure accurate detection when the not_valid: true option is omitted from both add_constraint and add_foreign_key methods. The tests confirm that no offenses occur when the option is used correctly.

Background and Motivation

Introducing foreign key constraints on large tables can adversely affect database operations due to prolonged locking and the need for immediate validation. These effects are magnified when multiple constraints are added or during periods of high database activity. By encouraging the use of the NOT VALID option, this rule update makes it easier to apply constraints without immediate validation, thereby reducing lock duration and improving overall performance.

@numbata numbata changed the title Enforcing the `not_valid: true' option when creating constraints, including foreign keys Enforcing the not_valid: true option when creating constraints, including foreign keys Mar 6, 2024
@numbata
Copy link
Contributor Author

numbata commented Mar 21, 2024

@cyberdelia Is there any chance for this PR to get some attention?

@numbata
Copy link
Contributor Author

numbata commented Apr 3, 2024

@bbatsov can I ask you to look at this PR and see if you can maybe have it merged?

@cyberdelia
Copy link
Collaborator

This seems odd to require your constraints to not be valid by default across all your migrations. Feels like we should maybe promote the reverse and explicitly allow not_valid: true to exists by forcing people to disable the cop when they use it?

@bbatsov
Copy link
Contributor

bbatsov commented Apr 4, 2024

I'll defer to @cyberdelia here.

@numbata
Copy link
Contributor Author

numbata commented Apr 5, 2024

@cyberdelia Could you clarify what you mean by "promoting the reverse"? Specifically, I'm interested in understanding the benefits you see in preferring immediate constraint validation (i.e., not using not_valid: true) across all migrations, and why we should encourage disabling the cop for its use.

By setting not_valid: false (the default value), there's a risk that tables with a large number of rows will be locked for both reads and writes during migration, potentially causing application-wide halts. This can cause background jobs to fail due to timeouts, lost connections, and 5xx errors, among other things. This occurs because the database must sequentially scan and validate each row before committing changes.

Conversely, not_valid: true avoids this by not immediately enforcing the constraint on existing rows. Instead, it applies only to newly created or updated rows, allowing validation of pre-existing rows to be handled separately and in parallel without locking the table.

For example, this approach is also recommended by strong_migrations gem, emphasizing its importance in preventing migration-related breakage.

However, I am open to modifying the cop to only check for the presence of the not_valid option, regardless of its value. This adjustment would ensure that migrations explicitly consider whether to apply constraints immediately or defer validation, promoting more informed decision making during migrations without strictly enforcing one approach over the other.

@teoljungberg
Copy link

Speaking from the same team as @numbata is on - we're happy making revisions to this PR to upstream this functionality, or have this live as a custom rubocop rule in our project.

We were bitten by this very configuration option, and thought others could benefit from that learning too.

❤️

@cyberdelia
Copy link
Collaborator

Promoting the use of NOT VALID for most use cases is a bit too heavy handed for most use case, I understand it fit the way you and your team work.

So here is what I suggest, we can add this cop but not enable it by default. So anyone can pick and choose what fits their team process and database.

@numbata
Copy link
Contributor Author

numbata commented Apr 23, 2024

@cyberdelia The step in have a clear and transparent default settings #41

If the PR with default works for you, then after merging, I can explicitly set Enabled: false by default for my cop in the config/default.yml settings.

…tion

Scanning a large table to verify a new foreign key or check constraint can take a
long time, and other updates to the table are locked out until the
`ALTER TABLE ADD CONSTRAINT` command is committed. Using `NOT VALID` constraint
option is reduce the imact of adding a new constraint on concurrent updates. With
`NOT VALID`, the `ADD CONSTRAINT` command does not scan the table and can be
committed immediately.
@cyberdelia
Copy link
Collaborator

@numbata I'd love to get this merged, if you still have some time to work on this.

@numbata
Copy link
Contributor Author

numbata commented Oct 24, 2024

@cyberdelia Thanks for the ping! We have reimplemented the cop internally, so I would be happy to check what could be updated here and ping you back.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants