Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add zstd compression support to Barman Cloud #898

Closed
wants to merge 1 commit into from

Conversation

Jellyfrog
Copy link

@Jellyfrog Jellyfrog commented Feb 8, 2024

This implements zstandard support to Barman cloud

Quick benchmark (from my laptop over internet to S3) shows it's the fastest and second smallest size:

Type Size Time
bz2 187.7 MB 81,26s
gz 250.5 MB 137,36s
snappy 398.1 MB 8,41s
tar 945.9 MB 9,05s
zstd 242.2 MB 8,24s

Fixes #850

@Jellyfrog
Copy link
Author

Jellyfrog commented Feb 8, 2024

I'll leave this in draft until I tested it more

Tested backup and restore with basebackup and Wal files.

@Jellyfrog Jellyfrog marked this pull request as ready for review February 9, 2024 18:15
@Jellyfrog
Copy link
Author

Ping @martinmarques

@martinmarques
Copy link
Contributor

Ping @martinmarques

A little patience, we have some deadlines with products, but this is on our radar and we do want to fully analyse this with the CNPG team.

More later.

@VoVAllen
Copy link

VoVAllen commented Oct 8, 2024

Is there any plan for this to be merged?

@gcalacoci gcalacoci requested a review from a team as a code owner October 28, 2024 12:22
zstd = _try_import_zstd()
self.compressor = zstd.ZstdCompressor()
self.decompressor = zstd.ZstdDecompressor().decompressobj(
read_across_frames=True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Writing this here but is valid in other places of the code...
decompressobj doesn't support read_across_frames for versions of the zstandard lib that are older than 0.22, as you can see here: https://github.com/indygreg/python-zstandard/blob/0.21.0/zstandard/backend_cffi.py#L3903

Unfortunately, this means that this implementation is not compatible with Python 3.6 and 3.7, as you can also see in the unit tests run for this PR.

@Jellyfrog we cannot accept and merge this implementation as it is now, but would be nice if you could be able to make this patch compatible with older python versions that barman still needs to support.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @gcalacoci I'm interested in finish this. zstd library doesn't seem to have a breaking change with 3.6 and 3.7. Just no longer build the binaries for 3.6 and 3.7. To solve this, we may need a fork of zstd, with some new releases. Does this work with barman? Is EDB interested in holding the fork as well?

Copy link
Author

@Jellyfrog Jellyfrog Oct 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @gcalacoci I'm interested in finish this. zstd library doesn't seem to have a breaking change with 3.6 and 3.7. Just no longer build the binaries for 3.6 and 3.7. To solve this, we may need a fork of zstd, with some new releases. Does this work with barman? Is EDB interested in holding the fork as well?

Feel free to take over. I’m not going to work on this (not using Barman anymore, and especially not for EOL python versions)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that's something we want to do or encourage people to do so. There are other solutions to this. Giulio is trying to encourage @Jellyfrog to address those issues.

In short, Barman has requirements, and we have to be careful about those requirements because there are users running Barman on systems that still have python 3.6. This PR would require the deprecation of python 3.6 and 3.7, and that's not something we want to do without some deep thinking.

@martinmarques
Copy link
Contributor

I'm closing this PR. We plan to add zstd and lz4 support for WAL compression, but we feel the implementation has to cover the whole Barman ecosystem, not just the cloud scripts.

@jmealo
Copy link

jmealo commented Nov 12, 2024

@martinmarques: I'm looking to roll out CNPG in production, and we'd really prefer to use lz4 or zstd for our backups/archiving. I work at a Python shop and would be happy to contribute work towards this if I have a sponsor to help review the pull request. If you're open to outside contributors, please reach out. I have added support for lz4 and zstd to other streaming compression use cases we had, so I'm somewhat familiar with the process.

I'm assuming the biggest pain here would be including the native dependencies for the Python packages and extending any test harnesses to support it?

@martinmarques
Copy link
Contributor

@jmealo You have that now, as of yesterday

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[feature] add Zstd support to cloud cli tools
5 participants