“Real-time Retail Product Classification on Android Devices Inside the Caper AI Cart,” a Presentation from Instacart

embeddedvision 79 views 22 slides Sep 27, 2024

About This Presentation

For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/09/real-time-retail-product-classification-on-android-devices-inside-the-caper-ai-cart-a-presentation-from-instacart/

David Scott, Senior Machine Learning Engineer at Instacart, presents the “Real-time Reta...


Slide Content

Real-Time Retail Product
Classification on Android Devices
inside the Caper AI Cart
David Scott
Senior Machine Learning Engineer
Instacart (Caper)

Table of Contents
2© 2024 Instacart
Instacart and Caper
Current challenges for such products
The novel proposed approach
Approach to reduce resource usage
Approach to integrate with Android system
Performance results

Caper @ Instacart

A Seamless Shopping Experience
Caper Cart
1. Embedded Location Systems: Streamlining the shopper experience
2. Sensor Fusion & AI Integrated: Providing customers with a more convenient way to shop
3. Weights & Measures Enabled: Robust and certified

Computer Vision and AI @ Caper
In order to unlock a magical shopping experience, we enable seamless AI that detects and recognizes products in less than 0.5 seconds

Current Challenges for Such Products
1. High-throughput model with high accuracy: Ensure smooth and efficient detection; avoid delays or disruptions in user experience
2. Limited by system constraints: Limited CPU resources shared by Android services; a slow model can degrade the overall app
3. User experience: How can we utilize the hardware + software resources we have to improve our user experience without having to re-educate our customers?

Proposed Approach
1. Generate image stream and process directly on Android board: Utilize 30 FPS red-green-blue (RGB) camera to stream images for processing
2. Skip processing images on Jetson GPU: Utilize the processing power of our Android board to stream and process images directly on Android
3. Deploy using TensorFlow Lite (TFLite): Train and convert our PyTorch models to TFLite for model inference

Proposed Approach
1. User adds product under our camera: Get a stream of images that are background removed
2. Perform inference on each image: Android board takes in images as an input and returns embeddings
3. Utilize a voting classifier with embeddings:
   - Generate one set of embeddings per frame from a single network
   - Utilize ground truth embeddings to calculate highest similarity in class
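The per-frame similarity and voting steps above can be sketched as follows. This is a minimal illustration and not the presenter's implementation: cosine similarity is assumed as the similarity measure, and the names `REFERENCES`, `classify_frame`, `vote`, and the `min_similarity` threshold are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify_frame(embedding, reference_embeddings):
    """Return (label, similarity) of the closest ground-truth embedding."""
    best_label, best_score = None, -1.0
    for label, refs in reference_embeddings.items():
        for ref in refs:
            score = cosine_similarity(embedding, ref)
            if score > best_score:
                best_label, best_score = label, score
    return best_label, best_score


def vote(frame_embeddings, reference_embeddings, min_similarity=0.5):
    """Majority vote over per-frame predictions; None if no frame is confident."""
    votes = Counter()
    for embedding in frame_embeddings:
        label, score = classify_frame(embedding, reference_embeddings)
        if label is not None and score >= min_similarity:
            votes[label] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Hypothetical 3-dimensional embeddings, purely for illustration.
REFERENCES = {
    "apple": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    "banana": [[0.0, 1.0, 0.0]],
}
FRAMES = [[0.95, 0.05, 0.0], [0.8, 0.3, 0.1], [0.1, 0.9, 0.0]]
RESULT = vote(FRAMES, REFERENCES)  # two frames vote "apple", one votes "banana"
```

Voting across several frames means a single blurry or occluded frame does not decide the final label on its own.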

Approach to Reduce Resource Usage

Approach to Reduce Resource Usage: Camera
Frame skipping: Do we have an accuracy reduction or lose important information by skipping frames?
Re-using circular buffers: Other services utilize the same camera; we can re-use buffers
Reducing the resolution of incoming image stream: Since our model utilizes 224x224 (width x height) images, is it possible to modify the camera firmware to give us smaller images?
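Frame skipping and a fixed-size circular buffer can be illustrated with a short sketch. The class `FrameRingBuffer`, the function `frames_to_process`, and the `keep_every` parameter are hypothetical names, and integers stand in for camera frames; the actual buffers on the cart are shared with other Android services and are not shown in the slides.

```python
from collections import deque


class FrameRingBuffer:
    """Fixed-capacity buffer that overwrites the oldest frame when full."""

    def __init__(self, capacity):
        self._frames = deque(maxlen=capacity)

    def push(self, frame):
        # deque(maxlen=...) silently discards the oldest item when full.
        self._frames.append(frame)

    def latest(self):
        return self._frames[-1] if self._frames else None

    def snapshot(self):
        return list(self._frames)


def frames_to_process(frame_indices, keep_every):
    """Keep one frame out of every `keep_every`, skipping the rest."""
    return [i for i in frame_indices if i % keep_every == 0]


# One second of a 30 FPS stream, with frame indices standing in for images.
ring = FrameRingBuffer(capacity=4)
for index in range(30):
    ring.push(index)

selected = frames_to_process(range(30), keep_every=3)  # 10 of 30 frames
```

With `keep_every=3`, the model sees 10 frames per second instead of 30, which is the kind of trade-off the accuracy question on this slide is asking about.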

Approach to Reduce Resource Usage: System
Threading: How many threads are necessary to maximize the performance of the system?
Perfetto traces: Utilized to help pinpoint heavy system resource usage on CPU and RAM
Reducing embedding dimensions: By decreasing the dimensions and number of our output embeddings, we reduced system usage
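A rough back-of-the-envelope calculation shows why shrinking the embeddings reduces memory and compute. The counts and dimensions below are hypothetical and are not figures from the presentation; the helper names are likewise illustrative.

```python
def embedding_table_bytes(num_embeddings, dim, bytes_per_value=4):
    """Memory needed to hold the reference embeddings (float32 by default)."""
    return num_embeddings * dim * bytes_per_value


def similarity_multiplies(num_embeddings, dim):
    """Multiplications needed to compare one frame against every reference."""
    return num_embeddings * dim


# Hypothetical: 10,000 references at 512 dims vs 5,000 references at 128 dims.
before_bytes = embedding_table_bytes(10_000, 512)
after_bytes = embedding_table_bytes(5_000, 128)
before_ops = similarity_multiplies(10_000, 512)
after_ops = similarity_multiplies(5_000, 128)
reduction_factor = before_bytes // after_bytes
```

Under these assumed numbers, both the memory footprint and the per-frame comparison cost drop by a factor of 8.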

Approach to Integrate with Android System

Approach to Integrate with Android System
1. Utilizing co-routines and state machine learning model: Break thread if issue arises, avoiding impact on other apps
2. Incoming signals via Redis: Combine information across services to give best result
3. TFLite integration:
   - Feed model and perform embedding similarity
   - Utilize TFLite built-in optimizations for on-device inference
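The idea of a coroutine-driven worker with explicit states that stops itself when something goes wrong can be sketched as below. The production system runs on Android; this sketch uses Python's `asyncio` only to illustrate the pattern. `State`, `InferenceWorker`, `fake_infer`, and the timeout value are all hypothetical.

```python
import asyncio
from enum import Enum


class State(Enum):
    IDLE = "idle"
    RUNNING = "running"
    STOPPED = "stopped"
    FAILED = "failed"


class InferenceWorker:
    """Runs inference per frame and bails out if a call stalls or errors."""

    def __init__(self, infer, timeout_s=0.05):
        self.state = State.IDLE
        self.results = []
        self._infer = infer
        self._timeout_s = timeout_s

    async def run(self, frames):
        self.state = State.RUNNING
        try:
            for frame in frames:
                # wait_for cancels the inner coroutine when the timeout expires.
                result = await asyncio.wait_for(self._infer(frame), self._timeout_s)
                self.results.append(result)
            self.state = State.STOPPED
        except (asyncio.TimeoutError, RuntimeError):
            # Stop early so a stuck model call cannot starve other work.
            self.state = State.FAILED
        return self.state


async def fake_infer(frame):
    """Stand-in for a model call; the frame named "bad" simulates a stall."""
    if frame == "bad":
        await asyncio.sleep(1.0)
    await asyncio.sleep(0)
    return f"embedding({frame})"


ok_worker = InferenceWorker(fake_infer)
ok_state = asyncio.run(ok_worker.run(["a", "b"]))

slow_worker = InferenceWorker(fake_infer)
slow_state = asyncio.run(slow_worker.run(["a", "bad", "c"]))
```

Keeping the state explicit lets other services check whether the classifier is running, finished, or failed before combining its output with their own signals.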

Results and Next Steps

Android Performance Metrics
CPU usage: Get the best performance while minimizing CPU usage for other apps
Model throughput: How can we optimize model throughput while maintaining precision and recall?
Camera speed: Higher FPS gets us better recognition. More FPS = more CPU; how to balance?

Android Performance Metrics: Model Throughput
Increase in speed of ~96% from 400 ms!
Model throughput
We optimized our throughput by the following:
● Using INT8 instead of FLOAT32
● Utilizing 112x112 to train model instead of 224x224
● Decreasing output embedding size

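To make the INT8 and input-size changes concrete, here is a small sketch of generic affine quantization, where a real value is approximated as scale * (q - zero_point), together with the input-tensor size arithmetic. It is not TFLite's converter code, and the helper names and sample values are hypothetical.

```python
def quantize_params(min_val, max_val, qmin=-128, qmax=127):
    """Scale and zero point mapping [min_val, max_val] onto the int8 range."""
    min_val = min(min_val, 0.0)  # make sure 0.0 is representable
    max_val = max(max_val, 0.0)
    scale = (max_val - min_val) / (qmax - qmin)
    if scale == 0.0:
        scale = 1.0
    zero_point = int(round(qmin - min_val / scale))
    return scale, max(qmin, min(qmax, zero_point))


def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    """Round each value to its nearest int8 code, clamping to the valid range."""
    return [max(qmin, min(qmax, int(round(v / scale)) + zero_point)) for v in values]


def dequantize(codes, scale, zero_point):
    return [scale * (q - zero_point) for q in codes]


def input_tensor_bytes(width, height, channels, bytes_per_value):
    return width * height * channels * bytes_per_value


values = [-1.0, -0.25, 0.0, 0.5, 1.0]
scale, zero_point = quantize_params(min(values), max(values))
codes = quantize(values, scale, zero_point)
restored = dequantize(codes, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(values, restored))

float_input = input_tensor_bytes(224, 224, 3, 4)  # FLOAT32 RGB at 224x224
int8_input = input_tensor_bytes(112, 112, 3, 1)   # INT8 RGB at 112x112
```

Halving each side of the image gives 4x fewer pixels, and one byte per value instead of four gives another 4x, so the input tensor in this sketch is 16x smaller.
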
Android Performance Metrics: Camera FPS
Throughput + resolution investigation
Camera FPS
We optimized our image resolution + camera FPS by:
● Using 640P images @ 30 FPS
● Re-using existing circular buffers from another service
● Working with camera vendor to get custom firmware

Android Performance Metrics: CPU Usage
Feature Testing from QA
CPU usage
We optimized CPU usage by optimizing our model & camera:
● Skip intermediate steps, like image resizing
● Feed system with a queue of images
● Stress test across our system to ensure performance
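Feeding the model from a queue of images can be sketched with a bounded queue that drops the oldest frame when the consumer falls behind. The drop-oldest policy, the single-producer assumption, and the name `offer_latest` are assumptions made for illustration; the slides do not describe the queueing policy.

```python
import queue


def offer_latest(frame_queue, frame):
    """Enqueue `frame`; if the queue is full, drop the oldest frame first.

    Returns the dropped frame, or None if nothing was dropped.
    Assumes a single producer thread.
    """
    dropped = None
    while True:
        try:
            frame_queue.put_nowait(frame)
            return dropped
        except queue.Full:
            try:
                dropped = frame_queue.get_nowait()
            except queue.Empty:
                pass  # a consumer emptied the queue in between; just retry


frame_queue = queue.Queue(maxsize=2)
dropped_frames = [offer_latest(frame_queue, index) for index in range(5)]

remaining = []
while not frame_queue.empty():
    remaining.append(frame_queue.get_nowait())
```

Bounding the queue keeps memory flat and means the model always works on recent frames instead of a growing backlog.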

Next Steps
1. Further optimizations: Continuous improvement of model accuracy and speed
2. Improving TFLite ecosystem and documentation: Contribute our learnings and findings to the sparse documentation for TFLite
3. Utilizing research to enhance user experience: Provide a seamless and delightful experience for users

Conclusions
1. Solve your problems first with hardware: Optimizing hardware selection is a great place to start before software
2. Optimizing throughput leads to better overall performance: Gains in throughput speed helped decrease CPU usage (through frame skipping)
3. Deeply investigating pre-processing and model steps to maximize throughput: Spending time investigating pre-processing and model optimization was fruitful for the team

Resources
TensorFlow Lite Documentation:
• TensorFlow Lite Model Optimization Documentation
• TensorFlow Lite Quantization Documentation
• TensorFlow Lite Model Analyzer Documentation
Android Development for TensorFlow Lite:
• Quickstart for Android Development

Thank you for your time!
Look forward to answering your
questions
David Scott
Senior Machine Learning Engineer
Instacart (Caper)
https://www.linkedin.com/in/davejscott/