This is a condensed version of my article. You can read the full story on Medium.
1,000 Users ≠ 1,000 Collections
Building a multi-user AI application with thousands of tenants? Creating a separate collection for each user might seem clean, but it leads to high costs and cluster instability.
The solution is Multitenancy: letting tenants share the same database infrastructure while keeping their data private. Think of it as an apartment building—everyone has their own private space, but they share the same foundation.
The Simple Solution: Payload-based Multitenancy
Instead of separate collections, use one collection and tag each vector with a tenant identifier (e.g., group_id).
Implementation
Upserting Data:
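from qdrant_client import QdrantClient, models

client = QdrantClient(url="http://localhost:6333")  # assumed local instance

client.upsert(
    collection_name="multitenant_collection",
    points=[
        # Each point carries its tenant ID in the payload
        models.PointStruct(id=1, vector=[...], payload={"group_id": "user_1"}),
        models.PointStruct(id=2, vector=[...], payload={"group_id": "user_2"}),
    ],
)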
Querying Data:
client.search(
    collection_name="multitenant_collection",
    query_vector=[...],
    # Restrict the search to points that belong to user_1
    query_filter=models.Filter(
        must=[
            models.FieldCondition(key="group_id", match=models.MatchValue(value="user_1"))
        ]
    ),
)
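To make the effect of the filter concrete, here is a minimal pure-Python sketch of a tenant-filtered search. It is a toy model, not how Qdrant executes queries: `POINTS`, `cosine`, and `search` are hypothetical names, and the two-dimensional vectors are made up for illustration.

```python
import math

# Toy in-memory "collection": every point carries its tenant ID in the payload.
POINTS = [
    {"id": 1, "vector": [1.0, 0.0], "payload": {"group_id": "user_1"}},
    {"id": 2, "vector": [0.9, 0.1], "payload": {"group_id": "user_2"}},
    {"id": 3, "vector": [0.0, 1.0], "payload": {"group_id": "user_1"}},
]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def search(points, query_vector, group_id, limit=10):
    """Rank only the points whose payload matches the given tenant."""
    candidates = [p for p in points if p["payload"].get("group_id") == group_id]
    candidates.sort(key=lambda p: cosine(p["vector"], query_vector), reverse=True)
    return [p["id"] for p in candidates[:limit]]


print(search(POINTS, [1.0, 0.0], "user_1"))  # [1, 3]
print(search(POINTS, [1.0, 0.0], "user_2"))  # [2]
```

Point 2 is closer to the query than point 3, yet it never shows up in user_1's results, because the filter removes it before ranking.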
Optimization: Set is_tenant=true when you create the payload index on group_id. Qdrant then co-locates each tenant's vectors in storage, so a tenant's queries become fast, mostly sequential disk reads.
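As a rough sketch, the tenant flag can be set through Qdrant's REST API by creating a keyword index whose `field_schema` includes `"is_tenant": true`. The helper below only builds that request and does not send it. `tenant_index_request` is a hypothetical name, and the host and HTTP client are left out on purpose.

```python
import json


def tenant_index_request(collection_name, field_name="group_id"):
    """Build (method, path, body) for creating a tenant-aware keyword index."""
    path = f"/collections/{collection_name}/index"
    body = {
        "field_name": field_name,
        # "keyword" index type with the tenant flag switched on
        "field_schema": {"type": "keyword", "is_tenant": True},
    }
    return "PUT", path, json.dumps(body)


method, path, body = tenant_index_request("multitenant_collection")
print(method, path)
print(body)
```

Send the resulting body to your Qdrant instance with whichever HTTP client you already use.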

The “Noisy Neighbor” Problem
Payload-based multitenancy works great until one tenant becomes a “Whale”, a high-volume user that dominates shared resources, degrading performance for everyone else.
The Advanced Solution: Tiered Multitenancy
Qdrant 1.16 introduces Tiered Multitenancy, a hybrid architecture where:
- Small tenants share a “fallback” shard efficiently.
- Large tenants (Whales) get promoted to their own dedicated shards.
How it works
- Enable Custom Sharding: Create a collection with sharding_method=Custom.
- Create a Fallback Shard: A shared space for all small tenants.
- Tenant Promotion: Seamlessly move a growing tenant from the fallback shard to a dedicated shard without downtime.
# Routing requests
client.upsert(...,
    shard_key_selector=models.ShardKeyWithFallback(
        target="user_1",     # Try the dedicated shard first
        fallback="default",  # Use the shared shard if no dedicated shard exists
    ),
)
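The routing rule and the promotion step can be modeled in a few lines of plain Python. This is a conceptual sketch only: `TieredRouter` and its methods are hypothetical, and the real promotion is performed by Qdrant itself, which moves the data without downtime.

```python
class TieredRouter:
    """Toy model of shard routing with a shared fallback shard."""

    def __init__(self, fallback="default"):
        self.fallback = fallback
        self.shards = {fallback: []}  # shard key -> list of (tenant, point_id)

    def resolve(self, target):
        """Return the dedicated shard key if it exists, else the fallback key."""
        return target if target in self.shards else self.fallback

    def upsert(self, tenant, point_id):
        """Store a point in whichever shard the tenant currently resolves to."""
        self.shards[self.resolve(tenant)].append((tenant, point_id))

    def promote(self, tenant):
        """Move a tenant's points from the fallback shard to a dedicated shard."""
        if tenant in self.shards:
            return  # already promoted (or it is the fallback key itself)
        shared = self.shards[self.fallback]
        self.shards[tenant] = [p for p in shared if p[0] == tenant]
        self.shards[self.fallback] = [p for p in shared if p[0] != tenant]


router = TieredRouter()
router.upsert("user_1", 1)   # no dedicated shard yet, lands in "default"
router.upsert("user_2", 2)
router.promote("user_1")     # user_1 has grown into a Whale
router.upsert("user_1", 3)   # now routed to the dedicated shard
print(router.shards)
```

After the promotion, user_1's existing and new points live in their own shard, while user_2 stays in the shared one. Callers never change how they address a tenant, which is the point of the target-plus-fallback selector.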
Conclusion
Start with standard, payload-based multitenancy: it's simple and cost-effective. As your application grows and “power users” emerge, adopt Tiered Multitenancy to promote them to dedicated resources.