"I see eyes in my soup": How Delivery Hero implemented the safety system for AI-generated images

chloewilliams62 194 views 37 slides May 08, 2024

About This Presentation

Learn more: https://zilliz.com/blog/how-delivery-hero-implemented-safety-system-for-ai-generated-images
In this talk, we cover the food image generation use case at Delivery Hero, its impact, and its challenges.
In particular, we will present our image scoring solution for filtering o...


Slide Content

1Delivery Hero Proprietary & Confidential. 2023 All Rights Reserved
“I see eyes in my soup”
How we implemented the safety system for AI-generated images at Delivery Hero
Iaroslav Amerkhanov, Nikolay Ulyanov

A brief recap of non-equilibrium thermodynamics
Forward process, applying Gaussian noise:
Reverse diffusion process:
A simple reparameterization trick:
It is also straightforward that:
Obviously, we can use the variational lower bound to optimize the
negative log-likelihood:
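
For reference, the headings above correspond to well-known results from the denoising diffusion literature. The block below is a sketch of their standard textbook forms, not a transcription of the slide: the symbols ($\beta_t$ for the noise schedule, $\alpha_t = 1-\beta_t$, $\bar\alpha_t$ for the cumulative product, $\mu_\theta$ and $\Sigma_\theta$ for the learned mean and covariance) are the conventional ones and are assumed here.

```latex
% Forward process: add Gaussian noise with variance schedule \beta_t
q(x_t \mid x_{t-1}) = \mathcal{N}\big(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t I\big)

% Reverse diffusion process: learned Gaussian transition
p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\big(x_{t-1};\ \mu_\theta(x_t, t),\ \Sigma_\theta(x_t, t)\big)

% Reparameterization, with \alpha_t = 1-\beta_t and \bar\alpha_t = \prod_{s=1}^{t} \alpha_s
x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon, \qquad \epsilon \sim \mathcal{N}(0, I)

% Variational lower bound on the negative log-likelihood
-\log p_\theta(x_0) \le \mathbb{E}_{q(x_{1:T} \mid x_0)}\left[\log \frac{q(x_{1:T} \mid x_0)}{p_\theta(x_{0:T})}\right]
```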

Delivery Hero
“a mission to deliver anything, anywhere”
Some of the brands that belong to the Delivery Hero family
prepared meals, groceries, flowers, coffee, medicine
70+ countries
4 continents

Agenda
01 Food Image Generation
Main points:
●Who are we?
●What are we working on?
●Why is this important for Delivery Hero?
●How did we build the MVP?
●How did we build the production system?
02 Safety System
Main points:
●Safety System Components overview
●Deep Dive into Image Tagging
●Deep Dive into Image Centering

Part 1
Food Image Generation

Intro
AI Menu Content Team
●Rahul - Product Manager
●Aleksei - Product Analyst
●Nikolay - Data Scientist
●Artem - MLE
●Paul - DS Manager
●Sonia - SWE
●Iaroslav - Data Scientist
●Vladimir - Tech Lead
●Sankalp - SWE
●Maria - Product Manager

Why do we need image generation? The business perspective

The hypothesis:
Better menu content increases the conversion rate (CVR) and therefore revenue.

Analysis:
●86% of the products ordered have a dish image
●65% of orders have dishes with a description
●~40% of the dishes are missing either an image or a description

A/B test:
Experiments show a 6-8% increase in the menu-to-cart conversion rate when we add an image to a product.

MVP: Do not overcomplicate

Image Generation: Dynamic Background
Prompt template:
a professional photo of {dish} and {dish attributes} on a nice plate, {background} background
Backgrounds:
●pastel blue
●pastel pink
●wood
●white marble
●concrete
●no background
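
A minimal sketch of how the prompt template above could be filled in. The function name `build_prompt`, the placeholder name `dish_attributes`, and the example dish are illustrative assumptions. How the "no background" variant changes the prompt is not shown on the slide, so it is left out here.

```python
# Prompt template as shown on the slide; the placeholder is renamed to
# dish_attributes so it is a valid Python format field (an assumption).
PROMPT_TEMPLATE = (
    "a professional photo of {dish} and {dish_attributes} on a nice plate, "
    "{background} background"
)

# Background options listed on the slide (the no-background case is omitted).
BACKGROUNDS = ["pastel blue", "pastel pink", "wood", "white marble", "concrete"]


def build_prompt(dish, dish_attributes, background):
    """Fill the template for one dish and one of the known backgrounds."""
    if background not in BACKGROUNDS:
        raise ValueError(f"unknown background: {background!r}")
    return PROMPT_TEMPLATE.format(
        dish=dish, dish_attributes=dish_attributes, background=background
    )


prompt = build_prompt("margherita pizza", "fresh basil", "wood")
print(prompt)
```

Keeping the background as a separate argument lets the same dish be rendered against several backgrounds by looping over `BACKGROUNDS`.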

Image Generation: Tailored Background
01 Image selection: identify a suitable image from existing images
02 Object Detection: get a bounding box with the food content
03 Masking: remove the content
04 Inpainting: replace with the relevant dish
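
The masking step (03) can be sketched as turning the detected bounding box into a binary mask that marks the region to repaint. This is only an illustration in plain Python: the box format `(x0, y0, x1, y1)` in pixels with exclusive right and bottom edges, and the function name `bbox_to_mask`, are assumptions. A real pipeline would convert the mask into whatever image format its inpainting model expects.

```python
def bbox_to_mask(width, height, box):
    """Return a height x width grid with 1 inside the box and 0 elsewhere.

    Cells marked 1 are the food content that inpainting should replace.
    """
    x0, y0, x1, y1 = box
    # Clamp the box to the image so a detection that spills over is still valid.
    x0, y0 = max(0, x0), max(0, y0)
    x1, y1 = min(width, x1), min(height, y1)
    return [
        [1 if x0 <= x < x1 and y0 <= y < y1 else 0 for x in range(width)]
        for y in range(height)
    ]


# Tiny 6x4 "image" with a detection covering columns 1-3 and rows 1-2.
mask = bbox_to_mask(6, 4, (1, 1, 4, 3))
for row in mask:
    print("".join(str(cell) for cell in row))
```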

Technical details

Tech stack:
●Kubeflow & Vertex AI pipelines - orchestration
●OpenAI DALL-E 2 - image generation
●Grounding DINO - object detection
●Google Cloud Pub/Sub - integration with the backend

Components:
●Backend
●Data Extraction
●Dynamic background (text-to-image):
○Image Generation
●Tailored background (inpainting):
○Image Selection
○Background Extraction
○Image Generation

Image Generation pipeline - Architecture

Production: MVP

Deploying to prod on a Friday evening be like …

Technical details

Tech stack:
●Kubeflow & Vertex AI pipelines - orchestration
●SDXL-based model - image generation
●Grounding DINO - object detection
●Cloud Run & Postgres - backend
●Google Cloud Pub/Sub - integration with the backend
●EKS and Kubernetes - GPU cluster for model inference
●ECR & GCP Artifact Registry
●GitHub Actions - CI/CD
●SNS & SQS - messaging and cross-cloud communication
●GCP & S3 - model artefacts and metadata
●BigQuery and Tableau - analytics

Image Generation: cross-cloud inference

Production: Launch

Part 2
Safety System

Safety System Components
Centered / Cropped
Sharp / Blurry
Image tagging (ok / not ok)
Text detection
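
The slides do not say how the Sharp / Blurry component is computed. One common approach, shown here purely as an assumption, is the variance of the Laplacian: a blurry image has weak edges, so the Laplacian response varies little. The sketch works on a grayscale image given as a list of rows and uses only the standard library; the function name and the toy images are illustrative.

```python
def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over the interior pixels.

    Higher values mean stronger edges, i.e. a sharper image.
    """
    height, width = len(gray), len(gray[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            responses.append(
                gray[y - 1][x] + gray[y + 1][x]
                + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


# A flat image has no edges; a checkerboard is as "sharp" as it gets.
flat = [[128] * 4 for _ in range(4)]
checkerboard = [[255 * ((x + y) % 2) for x in range(4)] for y in range(4)]

print(laplacian_variance(flat))
print(laplacian_variance(checkerboard))
```

Turning this value into a sharp/blurry decision needs a threshold tuned on real images, which is not something the talk provides.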

Image Tagging component
Recognize Anything Model (RAM++)

Scoring Logic:
●4500 tags in total
●Identified 10 “food” tags and 50 “negative” tags
●0 - bad image (contains bad tags, e.g. ‘animal’, ‘eye’, ‘dinning table’, etc.)
●1 - good image (contains at least 1 “food” tag and no bad tags)

Example tag outputs:
['butter', 'condiment', 'dip', 'plate', 'fish', 'food', 'fry', 'mustard', 'peak', 'potato', 'white']
['ant', 'beetle', 'bug', 'catch', 'roach', 'cricket', 'plate', 'food', 'grasshopper', 'hornet', 'person', 'insect', 'mantid', 'miniature', 'nest', 'wasp', 'white']
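
The scoring logic above can be sketched as a set lookup over the tags the model returns. The tag sets below are illustrative only: 'animal', 'eye' and 'dinning table' are named on the slide, while the remaining entries and the full 10 "food" / 50 "negative" lists are assumptions. The slide also does not say what happens when an image has neither food nor negative tags; this sketch assumes such an image is scored 0.

```python
# Illustrative tag sets. 'animal', 'eye' and 'dinning table' (spelled as on
# the slide) come from the talk; every other entry is a guess.
NEGATIVE_TAGS = {"animal", "eye", "dinning table", "insect", "person"}
FOOD_TAGS = {"food", "dish", "meal"}


def tagging_score(tags):
    """Return 0 for a bad image and 1 for a good one."""
    found = set(tags)
    if found & NEGATIVE_TAGS:
        return 0  # any negative tag rejects the image
    if found & FOOD_TAGS:
        return 1  # at least one food tag and no negative tags
    return 0  # assumption: no food tag at all is also treated as bad


# The two tag lists shown on the slide.
first_example = ["butter", "condiment", "dip", "plate", "fish", "food", "fry",
                 "mustard", "peak", "potato", "white"]
second_example = ["ant", "beetle", "bug", "catch", "roach", "cricket", "plate",
                  "food", "grasshopper", "hornet", "person", "insect", "mantid",
                  "miniature", "nest", "wasp", "white"]

print(tagging_score(first_example))   # 1
print(tagging_score(second_example))  # 0
```

Checking negative tags first means a single bad tag overrides any number of food tags, which matches the "no bad tags" condition for a score of 1.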

Image Tagging: Tag2Text/RAM/RAM++ architecture
●2 tasks:
○Image tagging
○Image-Tag-Text alignment
●Modules:
○Image encoder (Swin Transformer)
○Text encoder (CLIP)
○Image-Tag-Text alignment decoder (2-layer transformer)
●Pretrained on large-scale datasets at a resolution of 224
●Fine-tuned at a resolution of 384 on small, high-quality datasets

Metrics and False Negative examples
             | Predicted Monsters | Predicted Food
Monsters (+) | TP = 1%            | FN = 0.07%
Food (-)     | FP = 8%            | TN = 91%
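
To make the confusion matrix easier to read, the usual derived metrics can be computed from the four percentages. This is a small worked example, not something shown in the talk. The inputs are the rounded figures from the slide (they sum to slightly more than 100%), so the results are approximate.

```python
# Shares of all images, in percent, as given on the slide.
tp, fn = 1.0, 0.07   # monsters caught / monsters missed
fp, tn = 8.0, 91.0   # good food wrongly flagged / good food passed

precision = tp / (tp + fp)            # flagged images that really are monsters
recall = tp / (tp + fn)               # monsters that get caught
false_positive_rate = fp / (fp + tn)  # good food images that get discarded

print(f"precision: {precision:.1%}")                      # precision: 11.1%
print(f"recall: {recall:.1%}")                            # recall: 93.5%
print(f"false positive rate: {false_positive_rate:.1%}")  # false positive rate: 8.1%
```

The numbers show the trade-off the system makes: most flagged images are actually fine, but very few monsters slip through.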

False Positive examples
['animal'], ['worm'], ['animal', 'eye'], ['animal'], ['face', 'eye'], ['beverage']

Image Centering component
Grounding DINO

Scoring Logic:
●0 - no plate detected
●0.5 - bounding box touches image edge
●1 - bounding box in the center
food, plate, bowl, glass

Example scores: 1, 0.5
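
A minimal sketch of the centering score, assuming the detector returns at most one box per image as `(x0, y0, x1, y1)` in pixels. The function name and the `margin` parameter (how close to the border counts as "touching") are hypothetical. The slide does not say how several detections are handled.

```python
from typing import Optional, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in pixels


def centering_score(box: Optional[Box], width: int, height: int,
                    margin: float = 0.0) -> float:
    """Score 0 for no detection, 0.5 for a box on the edge, 1 otherwise."""
    if box is None:
        return 0.0  # no plate detected
    x0, y0, x1, y1 = box
    touches_edge = (
        x0 <= margin or y0 <= margin
        or x1 >= width - margin or y1 >= height - margin
    )
    return 0.5 if touches_edge else 1.0


print(centering_score(None, 512, 512))                  # 0.0
print(centering_score((0, 100, 300, 400), 512, 512))    # 0.5
print(centering_score((100, 100, 400, 400), 512, 512))  # 1.0
```

A small positive `margin` would also catch dishes that are cropped by only a few pixels.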

Grounding DINO architecture


Safety System via Image Scoring
Image scores: 1, 0.75, 0.73, 0, 0.81, 0.83, 0.78, 0.58, 0.49

Safety System via Image Scoring
Image scores: 1, 0.75, 0.73, 0.81, 0.83, 0.78
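
The first scoring slide lists nine image scores and the second lists six; the three that disappear (0, 0.58 and 0.49) are the lowest. A minimal sketch of that filtering step, assuming a simple cut-off: the value 0.7 is hypothetical and was picked only because it falls between 0.58 and 0.73. The slides do not state the real threshold or how the component scores are combined into one number.

```python
# Scores from the first "Safety System via Image Scoring" slide.
scores = [1, 0.75, 0.73, 0, 0.81, 0.83, 0.78, 0.58, 0.49]

# Hypothetical cut-off; any value above 0.58 and up to 0.73 gives the same result.
THRESHOLD = 0.7

kept = [score for score in scores if score >= THRESHOLD]
rejected = [score for score in scores if score < THRESHOLD]

print(kept)      # [1, 0.75, 0.73, 0.81, 0.83, 0.78]
print(rejected)  # [0, 0.58, 0.49]
```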

Thank you!
Questions?

Quiz: Which ones are AI generated?
[Three rounds of food photos, each followed by a slide marking the AI-generated ones]
