Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add IVF bench for PgVector extension #512

Closed
wants to merge 1 commit into from

Conversation

C0rWin
Copy link
Contributor

@C0rWin C0rWin commented Apr 16, 2024

Currently, the pgvector benchmark looks only at the HNSW index, missing the second index type, IVFFlat. This commit also extends the pgvector benchmark and the ability to execute the IVFFlat index. Though I understand HNSW might have superior performance, it's still interesting to have the ability to compare it with another index type.

@@ -5,7 +5,7 @@ float:
disabled: false
docker_tag: ann-benchmarks-pgvector
module: ann_benchmarks.algorithms.pgvector
name: pgvector
name: pgvector_hnsw
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed only to make it clear while comparing with ivfflat.

@C0rWin
Copy link
Contributor Author

C0rWin commented Apr 16, 2024

_________________________________ test_hamming _________________________________

    def test_hamming():
        dist = metrics["hamming"].distance
    
        p = numpy.array([1, 1, 0, 0], dtype=numpy.bool_)
        q = numpy.array([1, 0, 0, 1], dtype=numpy.bool_)
>       assert dist(p, q) == pytest.approx(2)
E       assert 0.5 == 2 ± 2.0e-06
E         comparison failed
E         Obtained: 0.5
E         Expected: 2 ± 2.0e-06

test/distance_test.py:[21](https://github.com/erikbern/ann-benchmarks/actions/runs/8704604263/job/23873209353?pr=512#step:5:22): AssertionError
=========================== short test summary info ============================

not sure whenever and how is PR related to the UT failure.

@C0rWin
Copy link
Contributor Author

C0rWin commented Apr 16, 2024

_________________________________ test_hamming _________________________________

    def test_hamming():
        dist = metrics["hamming"].distance
    
        p = numpy.array([1, 1, 0, 0], dtype=numpy.bool_)
        q = numpy.array([1, 0, 0, 1], dtype=numpy.bool_)
>       assert dist(p, q) == pytest.approx(2)
E       assert 0.5 == 2 ± 2.0e-06
E         comparison failed
E         Obtained: 0.5
E         Expected: 2 ± 2.0e-06

test/distance_test.py:[21](https://github.com/erikbern/ann-benchmarks/actions/runs/8704604263/job/23873209353?pr=512#step:5:22): AssertionError
=========================== short test summary info ============================

not sure whenever and how is PR related to the UT failure.

tested with the previous commit, getting the same error.... I assume there are some issues with the main branch.

@C0rWin
Copy link
Contributor Author

C0rWin commented Apr 16, 2024

#513 should expected to fix the UT issue.

@jkatz
Copy link
Contributor

jkatz commented Apr 16, 2024

cc @ankane

The IVFFlat testing for pgvector was explicitly removed in #463 -- Andrew & I had discussed this quite a bit and opted for the simpler approach of only having HNSW present in the ANN Benchmarks suite. Given this is the primary indexing method + de facto choice for pgvector, it makes sense to just have the single method present.

If we did decide to do too, I would opt to break it out as "pgvector_ivfflat" and keep the HNSW implementation as "pgvector"

@@ -62,3 +62,55 @@ def get_memory_usage(self):

def __str__(self):
return f"PGVector(m={self._m}, ef_construction={self._ef_construction}, ef_search={self._ef_search})"


class PGVectorIVF(BaseANN):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is to be added, I'd recommend using the original implementation, though adding the binary load found in #488:

https://github.com/erikbern/ann-benchmarks/tree/a884247a83db918d4f19c82048293c8fce65b30b/ann_benchmarks/algorithms/pgvector

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't aware it was part of the ann_benchmark workload. It makes sense to bring back the original implementation, but I don't know why it was removed in the first place; I am trying to learn the tradeoffs presented by both algorithm implementations and find it extremely useful.

@C0rWin
Copy link
Contributor Author

C0rWin commented Apr 16, 2024

cc @ankane

The IVFFlat testing for pgvector was explicitly removed in #463 -- Andrew & I had discussed this quite a bit and opted for the simpler approach of only having HNSW present in the ANN Benchmarks suite. Given this is the primary indexing method + de facto choice for pgvector, it makes sense to just have the single method present.

If we did decide to do too, I would opt to break it out as "pgvector_ivfflat" and keep the HNSW implementation as "pgvector"

Here is the thing: I do not think this is a de facto method since there might be workloads where you'd like to compromise the accuracy in favor of speed; having this benchmark might help to shed some light on differences between these two. Obviously, this is only IMHO.

@jkatz
Copy link
Contributor

jkatz commented Apr 16, 2024

cc @ankane
The IVFFlat testing for pgvector was explicitly removed in #463 -- Andrew & I had discussed this quite a bit and opted for the simpler approach of only having HNSW present in the ANN Benchmarks suite. Given this is the primary indexing method + de facto choice for pgvector, it makes sense to just have the single method present.
If we did decide to do too, I would opt to break it out as "pgvector_ivfflat" and keep the HNSW implementation as "pgvector"

Here is the thing: I do not think this is a de facto method since there might be workloads where you'd like to compromise the accuracy in favor of speed; having this benchmark might help to shed some light on differences between these two. Obviously, this is only IMHO.

I'm in a position where I can make some more data-driven opinions on this, and from the data and conversations with pgvector users, the typical approach now is to go "HNSW-first." There are cases where I talk to users who are using IVFFlat and are very happy with their results, their workloads tend to be more static (all vectors are available at indexing time), whereas most use I'm seeing these days tends to be iterative (vectors are continuosuly added to the data set).

While we were evaluating pgvector's HNSW implementation, I did publish benchmarks (using an adapted ANN Benchmark) on the two methods. I just happened to be doing some comparative analysis on the upcoming pgvector release (which I'll publish), but the performance/recall ratio gap between pgvector's HNSW + IVFFlat implementation has grown, and IVFFlat's biggest advantage (index build time) has dwindled or been superceded. In the coming days I'll have additional data on this point.

I can understand from an evaluation standpoint where it may be helpful to compare the two methods, but from a simplicity and usage standpoint (and where the referenced changes to the module game), it makes more sense to compare the HNSW implementation. Maybe the compromise is including the code to execute an evaluation with IVFFlat, but have it disabled by default?

@C0rWin C0rWin closed this Apr 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants