A tensorflow recommending system for news — Fabrício Vargas Matos (Hearst tv) @PAPIs Connect — São Paulo 2017

papisdotio 947 views 60 slides Jun 26, 2017


About This Presentation

News recommendations are particularly challenging given the high volume of new content produced every day and the fast deterioration of its value for users. This demands models and infrastructure able to deal with those nuances and to serve a newly trained model about 100 times per day. Attending thi...

Added: Jun 26, 2017

Slides: 60 pages

Slide Content

A TensorFlow Recommending System for News
Fabricio Vargas Matos

Manhattan, NY
TV Stations
Local and National News

Article’s page: recommendations for continuous scroll section
Recommended articles

Agenda
1. Recency and cold-start problem
2. Data acquisition
3. Matrix factorization
4. TensorFlow implementation
5. Hybrid Model: NLP and feature engineering
6. Hybrid Model: Hybrid matrix factorization
7. Conclusions

Cold-start problem
Items: Existent | New
Users: Existent | New

Cold-start solution
Items: Existent | New
Users: Existent | New
Not personalized! Curated by Editors + Highly viewed

Cold-start solution
Items: Existent | New
Users: Existent | New
Not personalized! Curated by Editors + Highly viewed
Hybrid Matrix Factorization

Data Acquisition
Google Analytics, Google BigQuery: page views with user’s time on page
CMS: content corpus (title, body, timestamp, metadata such as sections, tags, etc.)
Contents
TFRecord/CSV files

"Users x Items" Sparsity
Dataset                Sparsity
MovieLens (movies)     98.61%
Netflix (movies)       98.82%
TV Stations (news)     99.94%
Yahoo! KDD (music)     99.96%
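
Sparsity here is the usual share of empty cells in the "Users x Items" matrix. A minimal sketch of that calculation, assuming interactions arrive as (user, item) pairs; the toy log and the function name `sparsity` are illustrative and not taken from the talk:

```python
def sparsity(interactions, n_users, n_items):
    """Fraction of (user, item) cells with no recorded interaction."""
    observed = len(set(interactions))  # distinct (user, item) pairs only
    return 1.0 - observed / (n_users * n_items)

# Hypothetical toy log: 4 users, 5 items; the repeated pair counts once.
views = [(0, 1), (2, 4), (3, 0), (0, 1)]
print(f"{sparsity(views, 4, 5):.2%}")
```

With 3 distinct pairs out of 20 possible cells, the toy matrix is 85% sparse; the news dataset above leaves only about 0.06% of cells filled.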

Matrix Factorization

Latent Factors Model
R (Users x Items) ≈ U (Users x Latent factors) x V (Latent factors x Items)
user bias, item bias
R[i,j] ≈ U[i] x V[j]
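
The latent factors model can be illustrated without TensorFlow. The sketch below is not the speaker's code: it fits R[i,j] ≈ U[i] x V[j] plus user and item biases with plain stochastic gradient descent in standard-library Python. The toy ratings (time on page in minutes), the hyperparameters, and the names `train_mf` and `predict` are all assumptions made for illustration.

```python
import random

def predict(model, i, j):
    """Predicted value for user i and item j: U[i]·V[j] + biases."""
    U, V, user_bias, item_bias = model
    return sum(u * v for u, v in zip(U[i], V[j])) + user_bias[i] + item_bias[j]

def train_mf(ratings, n_users, n_items, k=2, lr=0.05, reg=0.01, epochs=500, seed=0):
    """Fit latent factors and biases by SGD over the observed cells only."""
    rng = random.Random(seed)
    U = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    V = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    user_bias = [0.0] * n_users
    item_bias = [0.0] * n_items
    model = (U, V, user_bias, item_bias)
    for _ in range(epochs):
        for i, j, r in ratings:
            err = r - predict(model, i, j)
            # Gradient step on the squared error with L2 regularization.
            user_bias[i] += lr * (err - reg * user_bias[i])
            item_bias[j] += lr * (err - reg * item_bias[j])
            for f in range(k):
                u, v = U[i][f], V[j][f]
                U[i][f] += lr * (err * v - reg * u)
                V[j][f] += lr * (err * u - reg * v)
    return model

# Hypothetical observations: (user, item, time on page in minutes).
ratings = [(0, 0, 3.0), (0, 1, 2.5), (1, 0, 2.8), (1, 2, 0.5), (2, 1, 0.4),
           (2, 2, 2.9), (3, 0, 0.6), (3, 3, 2.7), (0, 3, 1.0), (1, 3, 0.8)]
model = train_mf(ratings, n_users=4, n_items=4)
print(round(predict(model, 0, 0), 2))  # fitted value for an observed cell
print(round(predict(model, 2, 0), 2))  # estimate for an unobserved cell
```

Only observed cells contribute to the updates, which is what makes the approach workable on a matrix that is more than 99.9% empty.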

TF code: factorization op
(…)

TF code: train op

Initial Results
• Training time ≈ 15min (Kubernetes cluster)
• TimeOnPage Prediction Error (RMSE) ≈ 125 sec
• Qualitative recommendation tests with chosen ‘personas’ revealed poor personalization
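
For reference, RMSE is the square root of the mean squared difference between observed and predicted values, so it is expressed in the same unit as time on page (seconds). A minimal sketch with made-up numbers, not the talk's data:

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical time-on-page values in seconds.
observed = [30.0, 200.0, 90.0, 45.0]
estimated = [60.0, 160.0, 90.0, 45.0]
print(rmse(observed, estimated))  # errors of 30, 40, 0, 0 give 25.0
```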

Hybrid Matrix Factorization Model

Natural Language Processing
CMS articles → doc2vec contents
• Concatenate content data (title, body, sections, tags, …)
• Remove stop words, symbols and HTML tags
• Train word2vec Neural Network
• Combine all word-vectors of each article into one (doc2vec)
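
The cleaning and combining steps can be sketched in a few lines. This is an illustration under stated assumptions, not the pipeline from the talk: word2vec training is skipped and replaced by a tiny hand-written `word_vectors` table, the stop-word list is a toy one, and averaging is assumed as the way word vectors are combined into one document vector (the slide does not say which method was used).

```python
import re

STOP_WORDS = {"the", "a", "of", "and", "in", "to", "is"}  # toy list

def tokenize(*fields):
    """Concatenate content fields, strip HTML tags, symbols and stop words."""
    text = " ".join(fields).lower()
    text = re.sub(r"<[^>]+>", " ", text)      # drop HTML tags
    text = re.sub(r"[^a-z0-9\s]", " ", text)  # drop symbols and punctuation
    return [w for w in text.split() if w not in STOP_WORDS]

def doc_vector(tokens, word_vectors):
    """Average the vectors of known words; None if no word is known."""
    known = [word_vectors[w] for w in tokens if w in word_vectors]
    if not known:
        return None
    return [sum(column) / len(known) for column in zip(*known)]

# Hypothetical 2-dimensional word vectors standing in for a trained model.
word_vectors = {"flu": [1.0, 0.0], "vaccine": [0.8, 0.2], "game": [0.0, 1.0]}
tokens = tokenize("Flu season", "<p>The vaccine is in stock!</p>", "health")
print(tokens)
print(doc_vector(tokens, word_vectors))
```

Words missing from the vocabulary ("season", "stock", "health" here) are simply ignored, so the document vector reflects only the words the model knows.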

Contents Data Visualization
[Figure: content vectors grouped by section: Entertainment, National News, Health, Sports, Local News]

Features Engineering
CMS articles → k-dimension items/users embeddings (Google Cloud Storage)
• NLP (doc2vec)
• items clustering (k-means)
• embed items: similarity to each cluster centroid
• embed users: viewed contents combined
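
The two embedding steps can be sketched as follows, assuming k-means has already produced the centroids. Two choices here are assumptions, since the slide does not specify them: cosine similarity as the similarity measure, and a plain mean as the way a user's viewed contents are combined. The names `embed_item` and `embed_user` and all values are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def embed_item(doc_vec, centroids):
    """One feature per cluster: similarity of the article to that centroid."""
    return [cosine(doc_vec, c) for c in centroids]

def embed_user(viewed_item_embeddings):
    """Combine the embeddings of the items a user viewed (mean, by assumption)."""
    n = len(viewed_item_embeddings)
    return [sum(column) / n for column in zip(*viewed_item_embeddings)]

# Hypothetical centroids (k = 2) and document vectors.
centroids = [[1.0, 0.0], [0.0, 1.0]]
docs = {"a": [1.0, 0.0], "b": [0.0, 2.0], "c": [1.0, 1.0]}
item_emb = {name: embed_item(vec, centroids) for name, vec in docs.items()}
user_emb = embed_user([item_emb["a"], item_emb["b"]])
print(item_emb["c"], user_emb)
```

Items and users end up in the same k-dimensional space, so a new article gets an embedding as soon as its text is available, before anyone has viewed it.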

Items Parallel coordinates: 40 features/clusters
[Figure: parallel-coordinates plot, with axes labeled from "Feature #1: Similarity to cluster #1" to "Feature #39"]

Who are they?
Magenta contents (health) with high values for feature #1 (economy)?

Content/User Embeddings + Matrix Factorization

Matrix Factorization
R (Users x Items) ≈ U (Users x Latent factors) x V (Latent factors x Items)
user bias, item bias
R[i,j] ≈ U[i] x V[j]

Hybrid Matrix Factorization
• R ≈ U* x V*
where:
• U* = U (Users x KClusters) x A (KClusters x Latent_factors)
• V* = B (Latent_factors x KClusters) x V (KClusters x Items)
Only A and B are variables to be trained. U and V are constants.
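
To make the shapes concrete, here is a small sketch of the hybrid factorization in standard-library Python rather than TensorFlow. It is not the speaker's implementation: the embeddings U and V, the observed values, the learning rate, and the use of full-batch gradient descent on mean squared error are all assumptions for illustration. U and V stay fixed; only A and B are updated.

```python
import random

def matmul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def mse_and_grads(obs, U, V, A, B):
    """Mean squared error over observed cells, with gradients for A and B."""
    U_star = matmul(U, A)  # Users x Latent_factors
    V_star = matmul(B, V)  # Latent_factors x Items
    K, L = len(A), len(A[0])
    grad_A = [[0.0] * L for _ in range(K)]
    grad_B = [[0.0] * K for _ in range(L)]
    loss = 0.0
    for i, j, r in obs:
        err = sum(U_star[i][f] * V_star[f][j] for f in range(L)) - r
        loss += err * err / len(obs)
        for f in range(L):
            for c in range(K):
                grad_A[c][f] += 2 * err * U[i][c] * V_star[f][j] / len(obs)
                grad_B[f][c] += 2 * err * U_star[i][f] * V[c][j] / len(obs)
    return loss, grad_A, grad_B

def train(obs, U, V, latent=2, lr=0.01, steps=2000, seed=0):
    """Gradient descent on A and B only; U and V are never modified."""
    rng = random.Random(seed)
    K = len(V)
    A = [[rng.uniform(-0.5, 0.5) for _ in range(latent)] for _ in range(K)]
    B = [[rng.uniform(-0.5, 0.5) for _ in range(K)] for _ in range(latent)]
    losses = []
    for _ in range(steps):
        loss, grad_A, grad_B = mse_and_grads(obs, U, V, A, B)
        losses.append(loss)
        A = [[a - lr * g for a, g in zip(ra, rg)] for ra, rg in zip(A, grad_A)]
        B = [[b - lr * g for b, g in zip(rb, rg)] for rb, rg in zip(B, grad_B)]
    return A, B, losses

# Hypothetical fixed embeddings: 3 users and 3 items over K = 2 clusters.
U = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]  # Users x KClusters
V = [[1.0, 0.0, 0.6], [0.1, 0.9, 0.4]]    # KClusters x Items
obs = [(0, 0, 3.0), (0, 1, 0.5), (1, 0, 0.6), (1, 1, 2.8), (2, 2, 1.5)]
A, B, losses = train(obs, U, V)
print(losses[0], losses[-1])
```

Because a prediction depends only on cluster-based embeddings and on A and B, a new article or user with an embedding can be scored without retraining a per-item or per-user row.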

TF code: factorization
Now:

Results
• Training time ≈ 20min (Kubernetes cluster)
• TimeOnPage Prediction Error (RMSE) ≈ 100 sec (20% better)
• Qualitative recommendation tests with chosen ‘personas’ revealed very good personalization
• R&D Project - Not yet publicly available
