"I see eyes in my soup": How Delivery Hero implemented the safety system for AI-generated images
chloewilliams62
37 slides
May 08, 2024
About This Presentation
Learn more: https://zilliz.com/blog/how-delivery-hero-implemented-safety-system-for-ai-generated-images
In this talk, we are going to cover the use-case of food image generation at Delivery Hero, its impact and the challenges.
In particular, we will present our image scoring solution for filtering out inappropriate images and elaborate on the models we are using.
Size: 11.62 MB
Language: en
Added: May 08, 2024
Slides: 37 pages
Slide Content
1Delivery Hero Proprietary & Confidential. 2023 All Rights Reserved
“I see eyes in my soup”
How we implemented the safety system for AI-generated images at Delivery Hero
Iaroslav Amerkhanov
Nikolay Ulyanov
A brief recap of non-equilibrium thermodynamics
Forward process, applying Gaussian noise:
Reverse diffusion process:
A simple reparameterization trick:
It is also straightforward that:
Obviously, we can use the variational lower bound to optimize the
negative log-likelihood:
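For reference, the steps named on this slide can be written out in the standard denoising diffusion (DDPM) formulation, which the slide presumably follows. This is a sketch under that assumption: the notation (the noise schedule \beta_t, \alpha_t = 1 - \beta_t, and the cumulative product \bar\alpha_t) is the usual textbook one and is not taken from the slide.

```latex
% Forward process: add Gaussian noise according to a variance schedule \beta_t
q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t I\right)

% Reverse diffusion process: a learned Gaussian denoising step
p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\!\left(x_{t-1};\ \mu_\theta(x_t, t),\ \Sigma_\theta(x_t, t)\right)

% Reparameterization trick, with \alpha_t = 1-\beta_t and \bar\alpha_t = \prod_{s=1}^{t}\alpha_s
x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon,
\qquad \epsilon \sim \mathcal{N}(0, I)

% Variational lower bound on the negative log-likelihood
-\log p_\theta(x_0) \le
\mathbb{E}_{q}\!\left[-\log \frac{p_\theta(x_{0:T})}{q(x_{1:T} \mid x_0)}\right]
= L_{\mathrm{VLB}}
```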
Delivery Hero
“a mission to deliver anything, anywhere”
Some of the brands that belong to the Delivery Hero family
prepared meals, groceries, flowers, coffee, medicine
70+ countries, 4 continents
Agenda
01 Food Image Generation
Main points:
●Who are we?
●What are we working on?
●Why is this important for Delivery Hero?
●How did we build the MVP?
●How did we build the production system?
02 Safety System
Main points:
●Safety System Components overview
●Deep Dive into Image Tagging
●Deep Dive into Image Centering
Part 1
Food Image Generation
Intro: AI Menu Content Team
●Rahul - Product Manager
●Aleksei - Product Analyst
●Nikolay - Data Scientist
●Artem - MLE
●Paul - DS Manager
●Sonia - SWE
●Iaroslav - Data Scientist
●Vladimir - Tech Lead
●Sankalp - SWE
●Maria - Product Manager
Why do we need image generation, business perspective
The hypothesis:
Higher-quality menu content increases the conversion rate (CVR), and therefore revenue.
Analysis:
●86% of the products ordered have a dish image
●65% of orders have dishes with a description
●~40% of the dishes are missing either an image or a description
A/B test:
Experiments show a 6-8% increase in the menu-to-cart conversion rate when we add an image to a product.
MVP: Do not overcomplicate
Image Generation: Dynamic Background
Prompt template:
a professional photo of {dish} and {dish attributes} on a nice plate, {background} background
Background options: pastel blue, pastel pink, wood, white marble, concrete, no background
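A minimal sketch of how the prompt template above could be filled in. The function name `build_prompt`, the constant names, and the placeholder name `dish_attributes` (the slide writes `{dish attributes}`) are illustrative assumptions, not the team's code. The "no background" option is left out because the slide does not say how the prompt changes in that case.

```python
# Prompt template from the slide; the placeholder is renamed to
# dish_attributes so it is a valid Python format field (assumption).
PROMPT_TEMPLATE = (
    "a professional photo of {dish} and {dish_attributes} on a nice plate, "
    "{background} background"
)

# Background options listed on the slide ("no background" is not modeled here).
BACKGROUNDS = ["pastel blue", "pastel pink", "wood", "white marble", "concrete"]


def build_prompt(dish: str, dish_attributes: str, background: str) -> str:
    """Fill the template for one dish and one of the known backgrounds."""
    if background not in BACKGROUNDS:
        raise ValueError(f"unknown background: {background!r}")
    return PROMPT_TEMPLATE.format(
        dish=dish, dish_attributes=dish_attributes, background=background
    )


if __name__ == "__main__":
    print(build_prompt("ramen", "soft-boiled egg", "wood"))
```

Keeping the backgrounds in a fixed list makes it easy to rotate through them per dish while rejecting free-form values that were never reviewed.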
Image Generation: Tailored Background
01 Image selection: identify a suitable image from existing images
02 Object Detection: get a bounding box with the food content
03 Masking: remove the content
04 Inpainting: replace with the relevant dish
Tech stack:
●Kubeflow & Vertex AI pipelines - orchestration
●OpenAI DALL-E 2 - image generation
●GroundingDINO - object detection
●Google Cloud Pub/Sub - integration with the backend
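The masking step of the tailored-background flow can be illustrated with a small sketch: given the bounding box returned by object detection, build a binary mask marking the region that inpainting should regenerate. The function name `make_mask` and the box format `(x_min, y_min, x_max, y_max)` in pixel coordinates are assumptions for illustration. A real inpainting API expects its own mask format (for example an image with a transparent region), which is not shown here.

```python
def make_mask(width: int, height: int, box: tuple) -> list:
    """Return a height x width grid with 1 inside the box and 0 elsewhere.

    box is (x_min, y_min, x_max, y_max) in pixels, with the max edges
    exclusive. Cells set to 1 mark the area to be replaced by inpainting.
    """
    x0, y0, x1, y1 = box
    # Clamp the box to the image so a detection that overshoots the
    # border does not produce out-of-range coordinates.
    x0, y0 = max(0, x0), max(0, y0)
    x1, y1 = min(width, x1), min(height, y1)
    return [
        [1 if (x0 <= x < x1 and y0 <= y < y1) else 0 for x in range(width)]
        for y in range(height)
    ]


if __name__ == "__main__":
    for row in make_mask(6, 4, (1, 1, 4, 3)):
        print(row)
```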
Production: MVP
Deploying to prod on a Friday evening be like …
Tech stack:
●Kubeflow & Vertex AI pipelines - orchestration
●SDXL-based model - image generation
●GroundingDINO - object detection
●Cloud Run & Postgres - backend
●Google Cloud Pub/Sub - integration with the backend
●EKS and Kubernetes - GPU cluster for model inference
●ECR & GCP Artifact Registry
●GitHub Actions - CI/CD
●SNS & SQS - messaging and cross-cloud communication
●GCP & S3 - model artefacts and metadata
●BigQuery and Tableau - analytics
Technical details
Image Generation:
cross-cloud inference
Production: Launch
Part 2
Safety System
Safety System Components
Centered / Cropped
Sharp / Blurry
Image tagging (ok / not ok)
Text detection
Image Tagging component
Recognize-Anything model (RAM++)
Scoring Logic:
●4500 tags in total
●Identified 10 “food” tags and 50 “negative” tags
●0 - bad image (contains bad tags, e.g. ‘animal’, ‘eye’, ‘dinning table’, etc.)
●1 - good image (contains at least 1 “food” tag and no bad tags)
Example tag outputs:
['butter', 'condiment', 'dip', 'plate', 'fish', 'food', 'fry', 'mustard', 'peak', 'potato', 'white']
['ant', 'beetle', 'bug', 'catch', 'roach', 'cricket', 'plate', 'food', 'grasshopper', 'hornet', 'person', 'insect', 'mantid', 'miniature', 'nest', 'wasp', 'white']
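The scoring logic above can be sketched in a few lines, using the two example tag lists from the slide as input. The contents of `FOOD_TAGS` and `NEGATIVE_TAGS` are placeholders: the slide only names 'animal', 'eye' and 'dinning table' as bad tags and does not list the 10 food tags, so the rest is assumed for illustration. Returning 0 when an image has neither food nor negative tags is also an assumption, following from the slide's condition that a good image needs at least one food tag.

```python
# Hypothetical tag sets. Only 'animal', 'eye' and 'dinning table' are named
# on the slide; the remaining entries are illustrative assumptions.
FOOD_TAGS = {"food", "dish", "meal"}
NEGATIVE_TAGS = {"animal", "eye", "dinning table", "insect", "bug", "person"}


def score_image(tags: list) -> int:
    """Return 1 for a good image and 0 for a bad one, given its tags."""
    tag_set = {tag.lower() for tag in tags}
    if tag_set & NEGATIVE_TAGS:
        return 0  # any negative tag rejects the image
    if tag_set & FOOD_TAGS:
        return 1  # at least one food tag and no negative tags
    return 0  # assumption: no food tag at all also counts as bad


good_example = ['butter', 'condiment', 'dip', 'plate', 'fish', 'food', 'fry',
                'mustard', 'peak', 'potato', 'white']
bad_example = ['ant', 'beetle', 'bug', 'catch', 'roach', 'cricket', 'plate',
               'food', 'grasshopper', 'hornet', 'person', 'insect', 'mantid',
               'miniature', 'nest', 'wasp', 'white']

if __name__ == "__main__":
    print(score_image(good_example), score_image(bad_example))
```

Checking negative tags first means a single bad tag overrides any number of food tags, which matches the second example: it contains 'food' but is still rejected.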
Image Tagging: Tag2Text/RAM/RAM++ architecture
Image Tagging: Tag2Text/RAM/RAM++ architecture
●2 tasks:
○Image tagging
○Image-Tag-Text alignment
●Modules:
○Image encoder (Swin Transformer)
○Text encoder (CLIP)
○Image-Tag-Text alignment decoder (2-layer transformer)
●Pretrained on large-scale datasets at an image resolution of 224
●Fine-tuned at an image resolution of 384 on small, high-quality datasets