Performance impact of large number of collections #37594

jubingc · 2024-11-11T21:18:44Z

jubingc
Nov 11, 2024

A Milvus instance allows up to 65,536 collections. However, too many collections may result in performance issues.

I would like to better understand how the number of collections and partitions affects performance. Could you clarify what constitutes "too many" collections in this context?

As an example, the calculation below multiplies the number of collections, shards, and partitions. However, shards are primarily for data writing, while partitions and segments are used for data reading. Why are these elements multiplied together?

60 (collections) x 2 (shards) x 4 (partitions) + 40 (collections) x 1 (shard) x 12 (partitions) = 960
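For reference, the arithmetic above can be reproduced with a short sketch. The function name `general_capacity` and the tuple layout are illustrative assumptions, not part of any Milvus API; the code only sums collections x shards x partitions over each group of identically configured collections.

```python
def general_capacity(groups):
    """Sum collections * shards * partitions over groups of collections.

    Each group is a (collections, shards, partitions) tuple. This mirrors the
    example calculation only; it is not a Milvus API.
    """
    return sum(c * s * p for c, s, p in groups)


# 60 collections with 2 shards and 4 partitions, plus
# 40 collections with 1 shard and 12 partitions.
groups = [(60, 2, 4), (40, 1, 12)]
total = general_capacity(groups)
print(total)  # 480 + 480 = 960
```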

Additionally, per the documentation, the maximum number of partitions in a collection is 4,096 (with a default of 1,024, controlled by rootCoord.maxPartitionNum). Given a shared rootCoord.maxGeneralCapacity, which of the following configurations would likely yield better performance?

1,024 collections, 2 shards per collection, 16 partitions per collection = 32,768 general capacity
16 collections, 2 shards per collection, 1,024 partitions per collection = 32,768 general capacity

Beyond performance, I’d also appreciate insights into the pros and cons of each setup. Some drawbacks of the second setup I’m aware of include:
a. The recommended size for a partition is up to 1 billion items (reference).
b. There is currently no way to filter data within a partition quickly.

Are there additional pros or cons to consider for each of these configurations?

yhmo · 2024-11-12T02:41:54Z

yhmo
Nov 12, 2024
Collaborator

Take a look at this chart to understand how Milvus manages data in shards, partitions, collections, and segments:

yhmo Nov 12, 2024
Collaborator

Assume a collection has 2 shards and 2 partitions. To maintain the data in this collection, Milvus maintains 2 virtual channels for each partition, so there are 2 * 2 = 4 virtual channels to manage. If there are 60 collections, each with 2 shards and 4 partitions, there are 60 * 2 * 4 = 480 virtual channel objects to manage. Every row of data passed through a virtual channel needs to be carefully recorded.
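One way to picture why shards and partitions are multiplied is to enumerate one tracked object per (shard, partition) pair, as described above. This is a minimal sketch of that counting model only; `channel_objects` and the names in it are hypothetical and do not reflect Milvus internals.

```python
from itertools import product


def channel_objects(collection, num_shards, partitions):
    """List one (collection, shard, partition) entry per shard/partition pair."""
    return [
        (collection, shard, partition)
        for shard, partition in product(range(num_shards), partitions)
    ]


# A collection with 2 shards and 2 partitions, as in the example above.
objs = channel_objects("demo", 2, ["p0", "p1"])
for obj in objs:
    print(obj)
print(len(objs))  # 2 * 2 = 4
```

Each additional partition adds one entry per shard, which is why the count grows as a product rather than a sum.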

yhmo Nov 12, 2024
Collaborator

1,024 collections, 2 shards per collection, 16 partitions
16 collections, 2 shards per collection, 1,024 partitions

Not much difference between the two cases, since either way 1024 * 2 * 16 = 32,768 virtual channels need to be maintained.

Typically, we recommend keeping the total number of virtual channels under 1,000. In v2.4.x, 5,000 to 10,000 v-channels may also work. Either way, we don't recommend a high number of v-channels.
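As a rough illustration, the two layouts can be compared against the numbers mentioned above. The thresholds below are taken from this reply (under 1,000 recommended, up to about 10,000 possibly workable on v2.4.x); the function names and the wording of the labels are assumptions made for this sketch, not Milvus settings.

```python
RECOMMENDED_MAX = 1_000   # "under 1000" from the reply above
V24_UPPER = 10_000        # upper end of the "5000 ~ 10000" range for v2.4.x


def vchannel_count(collections, shards, partitions):
    """Count v-channels as collections * shards * partitions."""
    return collections * shards * partitions


def assess(count):
    """Label a v-channel count against the rough guidance in this thread."""
    if count < RECOMMENDED_MAX:
        return "within recommended range"
    if count <= V24_UPPER:
        return "may work on v2.4.x, not recommended"
    return "beyond the range mentioned in this thread"


layouts = {
    "1,024 collections x 2 shards x 16 partitions": (1024, 2, 16),
    "16 collections x 2 shards x 1,024 partitions": (16, 2, 1024),
    "60 collections x 2 shards x 4 partitions": (60, 2, 4),
}
for name, layout in layouts.items():
    count = vchannel_count(*layout)
    print(f"{name}: {count} -> {assess(count)}")
```

Under this counting model both 32,768 layouts land in the same bucket, which matches the "not much difference" answer above.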

jubingc Nov 12, 2024
Author

@yhmo Thanks. That diagram is helpful. Is it available somewhere in the docs? Suppose we have more than 10K v-channels; which component would become the bottleneck first?

jubingc Nov 14, 2024
Author

@yhmo v-channels are for write performance, right? What about read performance? Do the following two options have the same read performance?

1,024 collections, 2 shards per collection, 16 partitions
16 collections, 2 shards per collection, 1,024 partitions

I'm confused because of this sentence from the docs:

The search performance of partition-oriented multi-tenancy is much better than collection-oriented multi-tenancy.

Why does partition-oriented multi-tenancy have better search performance than collection-oriented multi-tenancy? @xiaofan-luan

yhmo Nov 15, 2024
Collaborator

As @xiaofan-luan mentioned, "partition is considered to be more light weight than collection". So partition-oriented multi-tenancy is better.

xiaofan-luan · 2024-11-12T19:31:05Z

xiaofan-luan
Nov 12, 2024
Maintainer

@yanliang567 is working on the effect of a large number of collections/partitions.

The goal here is to support 10,000 collections with 4,096 partitions in one cluster.

This could be part of Milvus 2.5.x.

xiaofan-luan Nov 12, 2024
Maintainer

2.5 is going to be released soon, this week or next. We don't have those improvements yet; this is currently under evaluation and needs to be improved.

jubingc Nov 14, 2024
Author

@xiaofan-luan Thank you for the suggestion.

Beyond performance, we're also evaluating strategies for multi-tenancy. Our use case involves managing Milvus as a service for internal teams. We use collections for multi-tenancy, assigning each user their own collection because usage patterns vary and we can’t predict collection sizes (users may ingest data unpredictably). We currently have around 150 collections, each with only one partition. Not all collections share the same schema, so partition-based multi-tenancy might not be an option.

If we were to use partitions for multi-tenancy, though, we’d likely encounter some limitations sooner, including:

All partitions in the same collection must share the same schema.
The maximum partition limit per collection (4,096) is significantly lower than the maximum number of collections (65,536).
Data filtering by partition would no longer be an option (a tenant could have multiple partitions, which adds management overhead).
Partition size limitations (1B records) could cause users to reach size limits earlier than if they used collections.

Any suggestions would be helpful. Thanks!

xiaofan-luan Nov 15, 2024
Maintainer

How many tenants are you trying to work with?
If your tenant count is definitely less than 10,000, then I would prefer to create multiple collections.
Right now we believe creating 10,000 collections is actually doable. We do have a campaign to improve support for large numbers of collections.

xiaofan-luan Nov 15, 2024
Maintainer

Would like to get more info about what you are building and how we can help. Ping me at [email protected].

Also want to share some of the experiments we did during testing.

jubingc Nov 15, 2024
Author

@xiaofan-luan Thank you. I will be contacting you via email for further discussion.

We don't expect to have more than 4,000 tenants/collections in a cluster. In this case, we can adjust the number of partitions for specific tenants who require higher throughput.
