Update the protobuf dependency? #37

Open
NeilGirdhar opened this issue Jul 14, 2022 · 13 comments

Comments

@NeilGirdhar commented Jul 14, 2022

Could the protobuf dependency be incremented by two major versions please?

@chhabrakadabra

To make a case for this ticket: many libraries now require protobuf>4, which means tensorflow-metadata is blocking codebases from upgrading all of those other libraries.
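
For illustration, here is a minimal sketch of the conflict using the `packaging` library; the specific pins are assumptions taken from elsewhere in this thread (tensorflow-metadata 1.14.0 pins `protobuf>=3.20.3,<4.21`, and the grpc Python packages from 1.49 onward require `protobuf>=4.21.3`):

```python
# Sketch only: shows that no single protobuf release satisfies both the
# tensorflow-metadata pin and the newer grpc tooling requirement.
from packaging.specifiers import SpecifierSet
from packaging.version import Version

tfmd_pin = SpecifierSet(">=3.20.3,<4.21")  # tensorflow-metadata 1.14.0 pin
grpc_pin = SpecifierSet(">=4.21.3")        # grpc python packages >= 1.49

candidate = Version("4.21.3")
print(candidate in grpc_pin)  # True
print(candidate in tfmd_pin)  # False -> the two ranges do not intersect
```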

chhabrakadabra added a commit to chhabrakadabra/feast that referenced this issue Sep 30, 2022
- As of version `1.49`, the various python packages in the [grpc
  repo](https://github.com/grpc/grpc/tree/master/src/python) require
  `protobuf>=4.21.3`. Unfortunately, this is incompatible with all
  versions of `tensorflow-metadata` (see [this
  issue](tensorflow/metadata#37)). And since
  `piptools` doesn't backtrack during dependency resolution, the
  requirement files cannot be regenerated without adding an upper limit
  on these grpc libraries directly in `setup.py`.
- The previous attempt to upgrade usages of the `mock_dynamodb2`
  decorator to the newest version failed. Since I'm not an expert in
  dynamodb, it made sense to just cap the test tool to the version
  already being used in CI.

Signed-off-by: Abhin Chhabra <[email protected]>
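
The first bullet in the commit message above describes capping the grpc libraries directly in `setup.py` so that `piptools` never resolves a grpc release that needs `protobuf>=4.21.3`. A hedged sketch of what such a cap could look like follows; the project name, the lower bounds, and the exact set of grpc packages are illustrative, not Feast's actual `setup.py`:

```python
# Illustrative setup.py fragment (not the actual Feast configuration):
# cap the grpc packages below 1.49 so their protobuf>=4.21.3 requirement
# never clashes with tensorflow-metadata's protobuf<4.21 pin.
from setuptools import setup

setup(
    name="example-project",  # hypothetical project
    install_requires=[
        "tensorflow-metadata",            # transitively pins protobuf<4.21
        "grpcio>=1.47,<1.49",             # cap before the protobuf 4 requirement
        "grpcio-reflection>=1.47,<1.49",
        "grpcio-health-checking>=1.47,<1.49",
    ],
)
```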
feast-ci-bot pushed a commit to feast-dev/feast that referenced this issue Oct 5, 2022
* Initial implementation of BigTable online store.

Signed-off-by: Abhin Chhabra <[email protected]>

* Attempt to run bigtable integration tests.

Currently focusing on just getting the tests running locally. I've only
built the Python 3.8 requirements.

Signed-off-by: Abhin Chhabra <[email protected]>

* Got the BigTable tests running in local containers

Signed-off-by: Abhin Chhabra <[email protected]>

* Set serialization version when computing entity ID

Signed-off-by: Abhin Chhabra <[email protected]>

* Switch to the recommended layout in bigtable.

This was recommended by the BigTable dev team. Details of this layout
will be added to the documentation in a future commit.

Signed-off-by: Abhin Chhabra <[email protected]>

* Minor bugfixes.

- If a row is empty when fetching data, don't process it further.
- If a task in the threadpool fails, bubble up that failure.
- If a `created_ts` is not available, use an empty string. `None` does
  not automatically serialize to bytes.

Signed-off-by: Abhin Chhabra <[email protected]>
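
To make the `created_ts` bullet above concrete, here is a minimal sketch (a hypothetical helper, not Feast's actual code): Bigtable cell values must be bytes, and `None` has no implicit bytes encoding, so the write path falls back to an empty string.

```python
# Hypothetical helper illustrating the fallback described above.
from datetime import datetime
from typing import Optional

def encode_created_ts(created_ts: Optional[datetime]) -> bytes:
    # None cannot be serialized to bytes directly, so use an empty string.
    return b"" if created_ts is None else created_ts.isoformat().encode("utf-8")
```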

* Move BigTable online store out of contrib

As per feedback on the PR.

Signed-off-by: Abhin Chhabra <[email protected]>

* Attempt to run integration tests in CI.

Provide the GCP project and the bigtable instance ID for the tests to
connect to.

Signed-off-by: Abhin Chhabra <[email protected]>

* Delete tables for entity-less feature views.

Signed-off-by: Abhin Chhabra <[email protected]>

* Table names should be shorter than 50 characters

This is Bigtable's table-name length limit, and exceeding it was causing test failures.

Signed-off-by: Abhin Chhabra <[email protected]>

* Optimize bigtable reads.

- Fetch all the rows in one bigtable fetch.
- Get only the columns that are necessary (using a column regex filter).

Signed-off-by: Abhin Chhabra <[email protected]>
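
A rough sketch of the two optimizations above, assuming the `google-cloud-bigtable` client; the project, instance, table, row keys, and feature names are placeholders:

```python
# Sketch: fetch all requested row keys in a single read_rows call and restrict
# the returned columns with a qualifier regex filter. All names are placeholders.
from google.cloud import bigtable
from google.cloud.bigtable import row_filters
from google.cloud.bigtable.row_set import RowSet

client = bigtable.Client(project="my-gcp-project")
table = client.instance("my-instance").table("my-table")

row_set = RowSet()
for key in (b"entity-1", b"entity-2", b"entity-3"):
    row_set.add_row_key(key)

# Only pull the columns that are actually needed.
only_needed = row_filters.ColumnQualifierRegexFilter(b"^(feature_a|feature_b)$")

for row in table.read_rows(row_set=row_set, filter_=only_needed):
    print(row.row_key, row.cells)  # non-existent row keys are simply absent
```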

* dynamodb: switch to `mock_dynamodb`

The latest rebuilding of requirements has upgraded the `moto` library
past the `4.0.0` release, which has a couple of breaking changes.
Specifically, the `mock_dynamodb2` decorator has been deprecated. See
https://github.com/spulec/moto/blob/master/CHANGELOG.md#400 for more
details.

The actual PR (getmoto/moto#4919) mentions that
it's because the `mock_dynamodb` decorator is now equivalent to the
`mock_dynamodb2` decorator.

Signed-off-by: Abhin Chhabra <[email protected]>
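
For context, the decorator change described above is essentially a one-line rename in test code. A minimal sketch, assuming moto 4.x and boto3; the table definition is illustrative:

```python
# Minimal sketch of the moto 4.x change: mock_dynamodb replaces the deprecated
# mock_dynamodb2 decorator; the test body itself is unchanged.
import boto3
from moto import mock_dynamodb  # was: from moto import mock_dynamodb2

@mock_dynamodb  # was: @mock_dynamodb2
def test_create_table():
    client = boto3.client("dynamodb", region_name="us-east-1")
    client.create_table(
        TableName="example",
        KeySchema=[{"AttributeName": "id", "KeyType": "HASH"}],
        AttributeDefinitions=[{"AttributeName": "id", "AttributeType": "S"}],
        BillingMode="PAY_PER_REQUEST",
    )
    assert "example" in client.list_tables()["TableNames"]
```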

* minor: rename `BigTable` to `Bigtable`

This matches the GCP docs.

Signed-off-by: Abhin Chhabra <[email protected]>

* Wrote some Bigtable documentation.

Closely mirrors the docs for the other online stores.

Signed-off-by: Abhin Chhabra <[email protected]>

* Bugfix: Deal with missing row keys.

It looks like the bigtable client will just skip over non-existent row
keys.

Signed-off-by: Abhin Chhabra <[email protected]>

* Fix linting issues.

Signed-off-by: Abhin Chhabra <[email protected]>

* Generate requirements files.

- As of version `1.49`, the various python packages in the [grpc
  repo](https://github.com/grpc/grpc/tree/master/src/python) require
  `protobuf>=4.21.3`. Unfortunately, this is incompatible with all
  versions of `tensorflow-metadata` (see [this
  issue](tensorflow/metadata#37)). And since
  `piptools` doesn't backtrack during dependency resolution, the
  requirement files cannot be regenerated without adding an upper limit
  on these grpc libraries directly in `setup.py`.
- The previous attempt to upgrade usages of the `mock_dynamodb2`
  decorator to the newest version failed. Since I'm not an expert in
  dynamodb, it made sense to just cap the test tool to the version
  already being used in CI.

Signed-off-by: Abhin Chhabra <[email protected]>

* Don't bother materializing created timestamp.

I had a discussion with Danny about whether it's useful to copy this
column. He agreed that there's no value in storing it in the online
store.

Signed-off-by: Abhin Chhabra <[email protected]>

* Remove `tensorflow-metadata`.

It turns out that this dependency is not required. We removed all
references to it in [this
PR](#2063) but did not remove it
from `setup.py`. Removing it makes many of the restrictions imposed
in previous commits unnecessary.

Signed-off-by: Abhin Chhabra <[email protected]>

* Minor fix to Bigtable documentation.

Feedback from Danny mentioned that Bigtable should be able to store
multiple versions of the same key and fetch the latest at read time.
This makes sense and means that concurrent writes should work just fine.

Signed-off-by: Abhin Chhabra <[email protected]>

* update roadmap docs

Signed-off-by: Danny Chiao <[email protected]>

* Fix roadmap doc

Signed-off-by: Danny Chiao <[email protected]>

* Change link to point to roadmap page

Signed-off-by: Danny Chiao <[email protected]>

* change order in roadmap

Signed-off-by: Danny Chiao <[email protected]>

Signed-off-by: Abhin Chhabra <[email protected]>
Signed-off-by: Abhin Chhabra <[email protected]>
Signed-off-by: Danny Chiao <[email protected]>
Co-authored-by: Danny Chiao <[email protected]>
franciscojavierarceo pushed a commit to franciscojavierarceo/feast that referenced this issue Oct 18, 2022
@mattb-zip

It looks like this is holding us back from upgrading to TensorFlow 2.12.0 and using protobuf 4 in any use case that requires tensorflow-metadata, or tensorflow-datasets (which itself requires tensorflow-metadata).

@coreyhu commented Apr 4, 2023

Any word on this?

@elgalu commented Oct 4, 2023

Any updates?

@NeilGirdhar (Author) commented Nov 3, 2023

@rtg0795 Would you mind taking a look at this?

@masonkirchner commented Nov 27, 2023

Any updates on this?

EDIT: If you could publish any changes that loosen the Protobuf dependency as tensorflow-metadata==1.14.1, that would be appreciated. A 1.15.0 release with the loosened Protobuf dependency would be worthless without re-releasing all of the TensorFlow libraries that depend on tensorflow-metadata, because they are all scoped to tensorflow-metadata<1.15.0.

@rtg0795 commented Jan 2, 2024

@NeilGirdhar Hi. I see that `protobuf>=3.20.3,<4.21` has been used for 1.14.0. Can you please explain or provide more details on why this is not sufficient?

@NeilGirdhar (Author)

@rtg0795 I'll let others comment more on this, but if I remember correctly, this was preventing upgrades to Python 3.12. See also: #37 (comment) and #37 (comment)

@NeilGirdhar (Author)

@rtg0795 What are your thoughts? It's been almost two years since this was requested.

@jeffpicard

@rtg0795 `protobuf>=3.20.3,<4.21` is not sufficient because 4.21 is the first released 4.x version.

This is blocking us as well.

@NeilGirdhar (Author) commented Mar 15, 2024

Thank you for fixing this on master! For everyone watching this, we just need to wait for the next release or use the master version.

Also, just noting that Protobuf 5 is now out 😄

@rclough commented Aug 16, 2024

Is this fixed on master? The upper bound is still <4.21, which does not include any valid protobuf 4 versions. This is impacting my organization as well. We have packages that depend on protobuf>4, and the current bound does not allow that.

Note: this is for Python 3.9/3.10. The newer protobuf versions are supported for Python 3.11, but not even TFX supports 3.11 yet, and we will probably need 3.9/3.10 support for a while anyway.

@dbalabka commented Nov 28, 2024

As the previous commenter mentioned, the simplest way to fix the dependency issue is to jump to Python 3.11, which allows the tensorflow-metadata package to work with the latest version of protobuf:

metadata/setup.py, lines 132 to 133 at f440b43:

'protobuf>=4.25.2,<6.0.0dev;python_version>="3.11"',
'protobuf>=3.20.3,<4.21;python_version<"3.11"',

It worked for me. Thanks to @rclough
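
For anyone applying this workaround, here is a hedged sketch of a downstream `setup.py` (hypothetical project name and pins) that relies on the environment markers quoted above by requiring Python 3.11 or newer:

```python
# Hypothetical downstream setup.py: requiring Python >= 3.11 lets the resolver
# take the 'protobuf>=4.25.2,<6.0.0dev' branch of tensorflow-metadata's markers
# instead of the <4.21 pin. Name and pins are illustrative.
from setuptools import setup

setup(
    name="my-pipeline",       # hypothetical package
    python_requires=">=3.11",
    install_requires=[
        "tensorflow-metadata",
        "protobuf>=4.25.2,<6",
    ],
)
```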
