About me
  www.7segments.com, (+421) 918 666 238
  Banks: BCR ERSTE BANK (RO), TATRABANKA, PSS
  Telco & retail: T-MOBILE, VODAFONE (CZ), SOS ELECTRONIC, EXISPORT
  Utilities & other: SPP, PIXEL FEDERATION
ABOUT predictive analytics INTRODUCTION
CRISP-DM Methodology
Business Understanding
  Set a business goal and ask a question
    We need to grow sales in the SME segment
    Which SMEs are interested in our offer?
  Transform it into a data-mining task
    Divide the SME portfolio into 2 groups
      Will accept offer
      Will not accept offer
  Data mining starts by asking the right question
Data understanding
  Goal: target fewer customers and achieve the same results
  [Diagram: ALL SME (all customers) with the subset ACCEPTED OFFER (success in last 30 days)]
Data understanding
  Imagine a predictive model splitting customers into segments: green, orange, red
  Target just 20% of customers to get 80% of max profit.
  ALL SME: 50 000 customers, 5 000 accepted offer (10% success in last 30 days)

  Segment             Customers       Accepted offer
  Best customers      10 000 (20%)    4 000 (80%)
  Average customers   20 000 (40%)      750 (15%)
  Worst customers     20 000 (40%)      250 (5%)

  [Diagram: tree splitting all customers (100%) via Rule1 and Rule2; node labels 20%, 40%, 40%, 60%]
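The arithmetic behind the three segments can be checked with a short sketch. It uses only the counts shown on the slide; the function name `segment_stats` and the derived "lift" column (segment response rate divided by the overall 10% rate) are additions for illustration, not part of the original deck.

```python
# Segment sizes and acceptances as given on the slide: (customers, accepted offer).
SEGMENTS = {
    "best": (10_000, 4_000),
    "average": (20_000, 750),
    "worst": (20_000, 250),
}


def segment_stats(segments):
    """Return share of customers, share of buyers, response rate and lift per segment."""
    total_customers = sum(customers for customers, _ in segments.values())
    total_buyers = sum(buyers for _, buyers in segments.values())
    base_rate = total_buyers / total_customers  # overall response rate
    stats = {}
    for name, (customers, buyers) in segments.items():
        rate = buyers / customers
        stats[name] = {
            "share_of_customers": customers / total_customers,
            "share_of_buyers": buyers / total_buyers,
            "response_rate": rate,
            "lift": rate / base_rate,  # how much better than random targeting
        }
    return stats


for name, row in segment_stats(SEGMENTS).items():
    print(
        f"{name:8s} customers={row['share_of_customers']:.0%} "
        f"buyers={row['share_of_buyers']:.0%} "
        f"rate={row['response_rate']:.2%} lift={row['lift']:.2f}"
    )
```

With these numbers the best segment holds 20% of customers and 80% of acceptances, which is a response rate of 40% against the overall 10%.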
Data preparation
  To build a good model you need relevant data

  Customer  Age  Location  Purchases  Target
  1         10   East      120        Yes
  2         5    East      42         No
  3         24   West      23         Yes
  4         2    West      50         Yes
  5         1    West      19         No

  More attributes available = better chance of a good prediction
Modeling: Split buyers and non-buyers
  Logistic Regression
  Linear Regression
  Decision Tree
  Neural Networks
Model Evaluation
  Train the model on historical data
  Test on unseen data (50%:50%, different history)
  Model vs. random selection on history
  [Chart axes: % of total SME, % buyers in group]
  A good model moves buyers to the first groups
  A weak model is like random selection: buyers are everywhere
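The comparison of a model against random selection can be sketched as a gains table: rank customers by model score, cut the ranking into equal groups, and count buyers per group. This is a minimal illustration in plain Python; the function name `gains_table` and the toy scores and targets are assumptions, not data from the deck.

```python
def gains_table(scores, targets, n_groups=10):
    """Rank customers by score (highest first) and count buyers per group.

    scores:  model scores, one per customer
    targets: 1 if the customer bought, 0 otherwise (same order as scores)
    """
    ranked = sorted(zip(scores, targets), key=lambda pair: pair[0], reverse=True)
    n = len(ranked)
    total_buyers = sum(target for _, target in ranked)
    table = []
    cumulative = 0
    for g in range(n_groups):
        start = g * n // n_groups
        end = (g + 1) * n // n_groups
        buyers = sum(target for _, target in ranked[start:end])
        cumulative += buyers
        table.append({
            "group": g + 1,
            "customers": end - start,
            "buyers": buyers,
            # Share of all buyers captured by targeting groups 1..g+1.
            "cum_share_of_buyers": cumulative / total_buyers if total_buyers else 0.0,
        })
    return table


# Toy example (hypothetical values): ten customers, three buyers.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
targets = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
for row in gains_table(scores, targets, n_groups=5):
    print(row)
```

Under random selection each group would capture roughly the same share of buyers; a useful model concentrates them in the first groups, so the cumulative share rises quickly.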
Deployment
  Target the campaign at the best customers
  Usual process:
    Import predictions to CRM
    Select top customers
    Evaluate the campaign after XY days
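The "select top customers" step can be sketched as ranking by predicted probability and keeping a fixed fraction. The function name `select_top_customers`, the 20% default and the sample predictions are hypothetical; a real CRM import would replace the dictionary.

```python
def select_top_customers(predictions, fraction=0.2):
    """Return the customer ids with the highest predicted probability.

    predictions: dict mapping customer id -> predicted probability of accepting
    fraction:    share of customers to target (0 < fraction <= 1)
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ranked = sorted(predictions, key=predictions.get, reverse=True)
    if not ranked:
        return []
    n_top = max(1, round(len(ranked) * fraction))  # always target at least one
    return ranked[:n_top]


# Hypothetical predictions for five customers.
predictions = {"a": 0.9, "b": 0.1, "c": 0.5, "d": 0.7, "e": 0.3}
print(select_top_customers(predictions, fraction=0.4))
```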
HOW predictions WORK IN 7SEGMENTS DEMO
Usually it looks like this :-) IBM SPSS Modeler 16
Smart software can simplify the process

  Step of methodology      Level of automation in our solution
  Business understanding   Requires user input
  Data understanding       Automated & supervised
  Data preparation         Automated & supervised
  Modeling                 Automated & supervised
  Evaluation               Automated & supervised
  Deployment               Semi-automated

  Use a wizard for task definition or select a pre-defined task
  Let the user check results and decide about actions
Who is eligible for prediction? All customers who bought “Shoes” in the last 3 years
What would you like to predict? Identify those likely to: make another purchase above 5€ in the next 300 days
Data understanding Check default response rate & predictors
Training decision tree
Model Evaluation and Profitability Analysis
Deployment Use these rules in your next campaign:
Summary
  Non-coders can understand and manage it
  Works well on any data
  Useful: instant deployment into campaigns delivers immediate value
UNDER COVER INSIGHTS
Under Cover: Data
  We use available customer data for prediction
  We generate predictors from customer events

  Customer  Time      Event     Amount  Product
  1         Monday    Purchase  5       Shirt
  1         Tuesday   Purchase  10      Shoes
  1         Thursday  Purchase  20      Ball
  1         Saturday  Purchase  5       Pen
  1         Sunday    Purchase  2       Pin

  Count of (event.purchase) = 5
  Count of (event.purchase) in (last 3 days) = 2
  Sum (event.purchase.amount) = 42
  Sum (event.purchase.amount) in (last 3 days) = 7
  Last / First / Most frequent event.purchase.product = Pin / Shirt / -
  …
  For every event and all its properties
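The predictor generation described on this slide can be sketched in a few lines. This is an illustration only, not the product's implementation: the function name `build_predictors`, the dictionary keys, and the simplification to weekday names within a single week (with "today" assumed to be Sunday) are all assumptions. The five amounts listed in the event table add up to 42.

```python
from collections import Counter

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# The event table from the slide, for customer 1.
EVENTS = [
    {"customer": 1, "day": "Monday", "event": "purchase", "amount": 5, "product": "Shirt"},
    {"customer": 1, "day": "Tuesday", "event": "purchase", "amount": 10, "product": "Shoes"},
    {"customer": 1, "day": "Thursday", "event": "purchase", "amount": 20, "product": "Ball"},
    {"customer": 1, "day": "Saturday", "event": "purchase", "amount": 5, "product": "Pen"},
    {"customer": 1, "day": "Sunday", "event": "purchase", "amount": 2, "product": "Pin"},
]


def build_predictors(events, today="Sunday", window_days=3):
    """Aggregate one customer's purchase events into predictors.

    Assumes all events fall in one week; "last N days" includes today.
    """
    today_idx = DAYS.index(today)
    purchases = sorted(
        (e for e in events if e["event"] == "purchase"),
        key=lambda e: DAYS.index(e["day"]),
    )
    recent = [
        e for e in purchases
        if 0 <= today_idx - DAYS.index(e["day"]) < window_days
    ]
    # Most frequent product is undefined (None) when the top two are tied.
    ranked = Counter(e["product"] for e in purchases).most_common(2)
    if not ranked:
        most_frequent = None
    elif len(ranked) == 1 or ranked[0][1] > ranked[1][1]:
        most_frequent = ranked[0][0]
    else:
        most_frequent = None
    return {
        "purchase_count": len(purchases),
        "purchase_count_recent": len(recent),
        "purchase_amount_sum": sum(e["amount"] for e in purchases),
        "purchase_amount_sum_recent": sum(e["amount"] for e in recent),
        "first_product": purchases[0]["product"] if purchases else None,
        "last_product": purchases[-1]["product"] if purchases else None,
        "most_frequent_product": most_frequent,
    }


print(build_predictors(EVENTS))
```

The same aggregations (count, sum, first, last, most frequent) would be repeated for every event type and every property, which is how a handful of events turns into a wide predictor table.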
Under Cover: Training & validation
  Automatic building of train, test and actual datasets
    train is from a different time period than test
    actual is the most recent data
  Rules for attribute transformations
    Discretization, coarse classing, noise removal, missing-value replacement
  Variation of CHAID decision tree
    It’s almost non-parametric
    ChiSq test works fine on unbalanced data compared to Gini in CART
  Generate heuristics for stopping criteria
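To make the chi-square idea concrete, here is a minimal sketch of the plain Pearson chi-square statistic on a contingency table of attribute category versus target. It is not the CHAID variation the slide refers to: full CHAID also merges categories and compares adjusted p-values. The function names and all counts below are hypothetical.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table.

    table: list of rows, one per attribute category; each row holds the
           observed counts per target class, e.g. [buyers, non_buyers].
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    if total == 0:
        return 0.0
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if attribute and target were independent.
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:
                statistic += (observed - expected) ** 2 / expected
    return statistic


def strongest_attribute(tables):
    """Pick the attribute with the largest statistic.

    Only meaningful when all tables have the same shape (same degrees of
    freedom); otherwise p-values should be compared instead.
    """
    return max(tables, key=lambda name: chi_square_statistic(tables[name]))


# Hypothetical counts: rows are categories, columns are [buyers, non_buyers].
tables = {
    "location": [[10, 20], [20, 10]],       # East, West
    "bought_shoes": [[15, 15], [15, 15]],   # yes, no (no association)
}
for name, table in tables.items():
    print(name, round(chi_square_statistic(table), 3))
print("split on:", strongest_attribute(tables))
```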
Under Cover: Deployment
  Real-time scoring
    Calculate only the final predictors, which is an easy task for a single customer
    Apply the rule set to get a prediction
    Keep scores for post-evaluation
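Scoring a single customer against a rule set might look like the following sketch. The rule format, predictor names, thresholds and segment labels are all hypothetical and chosen only to illustrate the three steps on the slide: compute predictors, apply rules, keep the score.

```python
import operator

# Comparison operators allowed in rule conditions.
OPS = {
    ">=": operator.ge,
    ">": operator.gt,
    "<=": operator.le,
    "<": operator.lt,
    "==": operator.eq,
}

# Hypothetical rule set, checked top to bottom; the first match wins.
# A rule with no conditions acts as the default.
RULES = [
    {"segment": "green", "conditions": [("purchase_count_recent", ">=", 2)]},
    {"segment": "orange", "conditions": [("purchase_count", ">=", 3)]},
    {"segment": "red", "conditions": []},
]


def apply_rules(predictors, rules=RULES):
    """Return the segment of the first rule whose conditions all hold."""
    for rule in rules:
        if all(
            name in predictors and OPS[op](predictors[name], threshold)
            for name, op, threshold in rule["conditions"]
        ):
            return rule["segment"]
    return None


def score_customer(customer_id, predictors, log, rules=RULES):
    """Score one customer and append the result to a log for post-evaluation."""
    segment = apply_rules(predictors, rules)
    log.append({"customer": customer_id, "segment": segment, "predictors": dict(predictors)})
    return segment


score_log = []
print(score_customer(1, {"purchase_count": 5, "purchase_count_recent": 2}, score_log))
print(score_customer(2, {"purchase_count": 3, "purchase_count_recent": 0}, score_log))
print(len(score_log), "scores kept for post-evaluation")
```

Keeping the predictors alongside each score makes it possible to compare predictions with actual outcomes once the campaign has run.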
Under Cover: Near Future
  Apply various algorithms
    For better visualization & insights
      Log. Regression produces a visual scorecard with points
    To reduce the risk of overfitting or low AUC
  Generate fewer and better predictors
    [Average] [profit] in [category=value] in the [last/first] N [periods]
  Identify the right time to retrain models
(Almost) Full List of Features API and SDK for JavaScript, PHP, Python, Flash, Unity (mobile), Linux shell, …
Thank you for your attention! Call Jozo Kovac (+421) 918 666 238 and schedule a live demo! www.7segments.com