AI/ML Infra Meetup | Scaling Vector Databases for E-Commerce Visual Search: Architectural Strategies for Millions of Products

Alluxio 361 views 16 slides Sep 23, 2024

About This Presentation

AI/ML Infra Meetup
Aug. 29, 2024
Organized by Alluxio

For more Alluxio Events: https://www.alluxio.io/events/

Speaker:
- Mahesh Pasupuleti (VP of DS, ML & Data Infra @ Poshmark)

In the rapidly evolving world of e-commerce, visual search has become a game-changing technology. Poshmark, a lea...


Slide Content

POSHLENS

Scaling Vector Databases for E-Commerce Visual Search: Architectural Strategies for Millions of Products
Mahesh Pasupuleti
Poshmark
VP - DS, ML & Data Eng

Poshmark is a leading fashion resale marketplace powered by a vibrant, highly engaged community of buyers and sellers and real-time social experiences.
Designed to make online selling fun, more social and easier than ever, Poshmark empowers its sellers to turn their closet into a thriving business and share their style with the world.
Since its founding in 2011, Poshmark has grown its community to over 130 million users, helping sellers realize billions in earnings, delighting buyers with deals and one-of-a-kind items, and building a more sustainable future for fashion.

Data & ML Architecture

Posh Lens
●Visual search feature that allows shoppers to search for desired merchandise using just a photo

●Enhances product discovery by using image data from listings, adding a powerful search method beyond text-based options

Embedding
●An image embedding is a numerical representation that captures the key visual features of an image, allowing it to be compared and matched with other images in a multi-dimensional space.
●Computing an embedding is like turning an image into a 3D point based on its features (such as color, shape, and texture), so that similar images end up close together in that space. The difference is that real embeddings have on the order of 250-1000 dimensions instead of just 3.
●Poshmark uses a proprietary model to compute image embeddings.
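
The "close together in that space" idea can be made concrete with a similarity function. The sketch below uses cosine similarity on toy 3-dimensional vectors; the deck does not say which metric Posh Lens uses, so the metric, the vector values, and the item names are all assumptions for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" (made-up values); real ones have
# hundreds of dimensions.
red_dress = [0.9, 0.1, 0.3]
crimson_dress = [0.8, 0.2, 0.3]
blue_jeans = [0.1, 0.9, 0.6]

similar = cosine_similarity(red_dress, crimson_dress)
dissimilar = cosine_similarity(red_dress, blue_jeans)
print(similar > dissimilar)
```

Visually similar items score closer to 1, which is what lets a nearest-neighbor search over embeddings act as a visual search.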

Posh Lens Architecture

Creating product
●Listing creations and updates are captured via an event bus system

●CLIP model and category classifier are hosted on GPU-enabled machines

●A queue worker converts each image to an embedding

●Embeddings are then indexed into the Milvus database, each in the index for its category
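
The indexing flow above can be sketched as a small queue-worker handler. Everything here is a hypothetical stand-in: `fake_embed` and `fake_classify` replace the GPU-hosted models, `InMemoryIndexes` replaces the per-category Milvus indexes, and the event shape is assumed. None of it reflects Poshmark's actual code.

```python
from collections import defaultdict

def fake_embed(image_bytes, dim=4):
    """Hypothetical stand-in for the GPU-hosted embedding model."""
    # Deterministic toy vector derived from the bytes; not a real embedding.
    return [((sum(image_bytes) + i) % 10) / 10.0 for i in range(dim)]

def fake_classify(image_bytes):
    """Hypothetical stand-in for the category classifier."""
    return "shoes" if len(image_bytes) % 2 == 0 else "dresses"

class InMemoryIndexes:
    """Stand-in for one vector index per category."""

    def __init__(self):
        self.by_category = defaultdict(dict)

    def upsert(self, category, listing_id, vector):
        # An update may move a listing to a different category,
        # so drop any stale copy before writing the new vector.
        for vectors in self.by_category.values():
            vectors.pop(listing_id, None)
        self.by_category[category][listing_id] = vector

def handle_event(event, indexes):
    """Process one listing create/update event taken from the queue."""
    image = event["image_bytes"]
    indexes.upsert(fake_classify(image), event["listing_id"], fake_embed(image))

indexes = InMemoryIndexes()
handle_event({"listing_id": "l1", "image_bytes": b"ab"}, indexes)   # create
handle_event({"listing_id": "l1", "image_bytes": b"abc"}, indexes)  # update
```

Treating creates and updates as the same upsert keeps the worker idempotent, which matters when events can be redelivered.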

Listing Create / Update Events
Image Embedding

Querying
●App server sends the image to the ML search API
●ML search API calls:
a.Object detection API: the biggest bounding box (B) whose category is supported is picked for search.
b.The image is cropped to B's dimensions, and the retrieval API is called to generate the embedding.
c.A Milvus search is performed with the generated embedding.
●App server receives the listing_ids and similarity scores, then queries Elasticsearch to apply filtering and business logic.
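
Step (a) above, choosing the biggest bounding box among supported categories, might look like the following sketch. The `Detection` type, the category names, and the pixel coordinates are assumptions; the deck does not show the object detection API's response format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Detection:
    """One detected object: a category plus pixel corner coordinates."""
    category: str
    x1: int
    y1: int
    x2: int
    y2: int

    @property
    def area(self):
        return max(0, self.x2 - self.x1) * max(0, self.y2 - self.y1)

# Hypothetical category list, for illustration only.
SUPPORTED_CATEGORIES = {"dresses", "shoes", "bags"}

def pick_search_box(detections, supported=SUPPORTED_CATEGORIES):
    """Return the largest detection in a supported category, or None."""
    candidates = [d for d in detections if d.category in supported]
    return max(candidates, key=lambda d: d.area, default=None)

detections = [
    Detection("sofa", 0, 0, 500, 400),        # largest, but unsupported
    Detection("shoes", 50, 300, 150, 380),
    Detection("dresses", 100, 20, 300, 320),
]
box = pick_search_box(detections)
print(box.category, box.area)
```

The chosen box's category also tells the search API which per-category index to query, and its coordinates define the crop passed on for embedding.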

Sample Request
Sample response

Why Milvus?
●High-Performance Vector Search
○Milvus is designed to handle large-scale vector searches efficiently. It can index and search through millions or even billions of vectors rapidly.
●High Availability & Fault Tolerance
○Milvus supports data replication across multiple nodes.
○In case of a node failure, Milvus automatically reroutes queries and tasks to other available nodes.
●Load Balancing
○Milvus can be deployed with load balancers that distribute incoming requests across multiple nodes.
●Horizontal Scaling
○Milvus employs a distributed architecture in which compute and storage are separated.
○The separation of query, data, and index nodes allows for better scaling.
●Open Source
○Apache License Version 2.0, backed by the Linux Foundation
●Rolling upgrades of each component allow for zero downtime

Milvus architecture

Milvus Configuration
●Indexes
○10: one per category
●Index Type
○HNSW 64
●Capacity
○150 M vectors of 256 dimensions
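
As a rough sanity check on the capacity figure, the raw size of the vectors alone can be estimated as below. This assumes each dimension is stored as a 4-byte float32, which the deck does not state, and it ignores the extra memory an HNSW index needs for its graph links.

```python
NUM_VECTORS = 150_000_000
DIMENSIONS = 256
BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as float32

raw_bytes = NUM_VECTORS * DIMENSIONS * BYTES_PER_FLOAT32
raw_gb = raw_bytes / 1e9       # decimal gigabytes
raw_gib = raw_bytes / 2**30    # binary gibibytes

print(f"{raw_gb:.1f} GB ({raw_gib:.1f} GiB) of raw vector data")
```

Under that assumption the vectors alone occupy about 153.6 GB, so the data has to be spread across several query nodes rather than held on one machine.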

●Coordinator
○2 nodes × c6i.xlarge
●Data Nodes
○2 nodes × m6i.large
●Etcd
○3 nodes × c6i.xlarge
●Index Nodes
○4 nodes × c6i.2xlarge
●Storage
○MinIO
○4 nodes × m6i.xlarge
●Message Storage
○Pulsar
○3 nodes × r6i.xlarge
●Query Nodes
○21 nodes × r6i.2xlarge

Challenges
●Filtering
○Collections, Partitioning & Post Filtering
●Pagination
○Choose a large value for the top K parameter, filter out the results not required for the current page, and cache the full result set in a different layer.
○Offset is supported, but internally Milvus fetches all N + offset results and then filters them.
●Higher QPS is expensive
○We have observed that, in cluster mode, the configuration provided based on inputs is usually good for 25 QPS at a p99 of 100 ms. Increasing QPS requires scaling up the query nodes or increasing replicas as well.
●Parameter tuning
○nlist, nprobe
●Deletions are lazy
●Higher cost of maintenance
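
The pagination workaround listed above (over-fetch with a large top K, then serve pages from a cache) can be sketched as follows. The `search_fn` callable, the cache keyed by a query identifier, and the default sizes are all hypothetical; a production cache would also need expiry and a size bound.

```python
class PagedSearch:
    """Over-fetch once per query, then serve pages from a cache."""

    def __init__(self, search_fn, top_k=200, page_size=20):
        self.search_fn = search_fn   # hypothetical: (query, limit) -> ranked hits
        self.top_k = top_k
        self.page_size = page_size
        self._cache = {}             # query key -> full ranked hit list

    def page(self, query_key, query, page_number):
        if page_number < 1:
            raise ValueError("page_number starts at 1")
        if query_key not in self._cache:
            # Only the first page request reaches the vector database.
            self._cache[query_key] = self.search_fn(query, self.top_k)
        hits = self._cache[query_key]
        start = (page_number - 1) * self.page_size
        return hits[start:start + self.page_size]

# Fake search backend that records how often it is called.
calls = []

def fake_search(query, limit):
    calls.append(limit)
    return [f"listing_{i}" for i in range(limit)]

pager = PagedSearch(fake_search, top_k=50, page_size=20)
first = pager.page("q1", [0.1, 0.2], 1)
third = pager.page("q1", [0.1, 0.2], 3)
print(len(first), len(third), len(calls))
```

The trade-off is that results beyond top K are unreachable, so top K bounds how deep a shopper can scroll.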

Future
●Pagination
○This can be done via an iterator plus caching.
●Different indices
○We ran experiments with FLAT, IVF_FLAT, IVF_SQ8 and HNSW.
○IVF_SQ8 provides a good balance between system requirements and search performance. IVF_SQ8 achieved a recall of 90+% compared to FLAT.
●GPU-based index
○Supports higher QPS at lower cost
●Upgrades
○Better support and improved performance
●Broaden use cases
○Complete the look, Trends, Similar Items
●Multimodal
○Explore text + image embeddings for richer results
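
The "recall compared to FLAT" figure mentioned above is typically computed by treating the exhaustive FLAT results as ground truth and checking how many of them the approximate index returns. The sketch below shows one common definition, recall@k; the deck does not say exactly how recall was measured, and the ID lists are made up.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the exact top-k that the approximate top-k recovered."""
    if k <= 0:
        raise ValueError("k must be positive")
    exact_top = set(exact_ids[:k])
    if not exact_top:
        raise ValueError("exact_ids must not be empty")
    found = exact_top.intersection(approx_ids[:k])
    return len(found) / len(exact_top)

# Made-up result lists: exact stands in for FLAT, approx for IVF_SQ8.
exact = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"]
approx = ["a", "c", "b", "d", "e", "f", "g", "h", "i", "z"]

recall = recall_at_k(approx, exact, k=10)
print(recall)
```

Order within the top k does not affect this metric; only membership does, so it is usually averaged over many sample queries.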

Q & A

Thank You