AIRLINE_SATISFACTION_Data Science Solution on Azure

SanelaNikodinoska1 82 views 34 slides Jul 04, 2024


About This Presentation

Airline Satisfaction Project using Azure

This presentation serves as a basis for understanding and comparing data science and machine learning solutions built in Python notebooks locally and on the Azure cloud. It was created as part of course DP-100, Designing and Implementing a Data Science Solution on Azure.


Slide Content

AIRLINE SATISFACTION DATA SCIENCE SOLUTION ON AZURE
Sanela Nikodinoska, July 2024

Agenda: Introduction, Automated ML, Designer, Notebooks – Python SDK, Closing

Introduction: For the last but most significant course, DP-100 (Designing and Implementing a Data Science Solution on Azure), part of the Data Science Institute held by Semos Education, the Airline Satisfaction dataset was provided for designing and implementing a data science solution on Azure. This presentation is an overview of the solutions implemented using Azure Machine Learning Studio. Since the Azure subscription was taken out for learning purposes only and has since been cancelled, the presentation is based on screenshots of the most important steps in developing, training and deploying ML models.

Automated ML: screenshots from Azure. Let’s dive in.

First steps: started with an Azure free trial, created a resource group from the UI, and created an Azure Machine Learning service.

Data – created a data asset by uploading the local Airline Satisfaction dataset to Azure (no screenshot for that)
Automated ML – created two experiments with different primary metrics and featurization parameters

Automated ML – the best model in both experiments was MaxMinScaler, LightGBM; the experiments stopped due to the early stopping policy based on the level of the primary metric
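The stopping behaviour described on this slide can be illustrated with a small sketch. It assumes the policy ends the experiment once the primary metric reaches a target level; the function name `run_until_threshold`, the threshold and the scores are all hypothetical, and this is not Azure's own implementation.

```python
def run_until_threshold(trial_scores, threshold):
    """Return (best_score, trials_run), stopping once the best score reaches the threshold."""
    best = float("-inf")
    trials_run = 0
    for score in trial_scores:
        trials_run += 1
        best = max(best, score)
        # Assumed policy: stop as soon as the primary metric is good enough.
        if best >= threshold:
            break
    return best, trials_run


# Hypothetical primary-metric values, one per trial.
scores = [0.91, 0.94, 0.97, 0.99, 0.98]
best, trials = run_until_threshold(scores, threshold=0.97)
print(best, trials)
```

With these made-up scores the loop stops at the third trial, so the later trials are never run.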

Automated ML – metrics

Designer: screenshots from Azure. Let’s dive in.

DESIGNER
Authoring: Using components from the Authoring -> Designer tab, created two pipelines with two estimators each and one pipeline with a single estimator for the feature importance component: Two-Class Logistic Regression and Two-Class Decision Forest; Two-Class Support Vector Machine and Two-Class Neural Network (deep-learning model); Two-Class Boosted Decision Tree with the Cross Validate Model component.
Jobs / Metrics: After the pipelines and the environment images were configured and submitted, a job was created. The overview of the job, as well as its outputs, logs, child jobs and metrics, is presented in the following snapshots.
Registered Models: The best-performing models from Automated ML and Designer (Neural Network and LightGBM) were registered as custom and MLflow models.
Real-time Endpoint: A blue/green deployment of the two best models was made, and the blue deployment was tested for inference (the endpoint was invoked).
Compute targets: A compute instance for profiling the data asset and compute clusters for training models and pipeline sweep jobs were created.
Environments: Custom and curated environments were used for deploying and testing.

DESIGNER – Two-Class Logistic Regression and Two-Class Decision Forest pipeline

DESIGNER – Two-Class Support Vector Machine and Two-Class Neural Network pipeline

DESIGNER – Two-Class Boosted Decision Tree pipeline with the Feature Importance and Cross Validate Model components

DESIGNER – Feature importance
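As a rough illustration of what a cross-validation component does, the sketch below splits sample indices into k folds so that each fold is used once for validation and the rest for training. The helper `k_fold_indices` and the sizes are hypothetical; the Designer component's actual partitioning may differ.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, validation
        start = stop


# Hypothetical example: 10 rows split into 5 folds.
folds = list(k_fold_indices(10, 5))
for train, validation in folds:
    print("train:", train, "validation:", validation)
```

The model would be trained and scored once per fold, and the per-fold metrics averaged to estimate how well it generalises.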

JOBS – list of all experiments

JOBS – overview of Designer pipeline metrics – Random Forest Classifier, best model. Note: no snapshots of pipeline success.

JOBS – overview of Designer pipeline metrics – Two-Class Neural Network model. Note: no snapshots of pipeline success.

JOBS – overview of Designer pipeline metrics – Two-Class Logistic Regression, least-performing model. Note: no snapshots of pipeline success.

Registered models

Deployment of models

Real-time Endpoint
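A blue/green deployment places two deployments behind one endpoint and splits incoming requests between them by percentage. The sketch below simulates that routing; the traffic percentages and the helper `pick_deployment` are hypothetical, since the slides do not state the split that was used.

```python
import random


def pick_deployment(traffic, rng):
    """Choose a deployment name in proportion to its traffic percentage."""
    names = list(traffic)
    weights = [traffic[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


# Assumed split: all live traffic goes to blue while green is still being validated.
traffic = {"blue": 100, "green": 0}
rng = random.Random(0)
routed = [pick_deployment(traffic, rng) for _ in range(1000)]
print(set(routed))

# Shifting a small share of traffic to green is the usual next step.
canary = {"blue": 90, "green": 10}
canary_routed = [pick_deployment(canary, rng) for _ in range(1000)]
print(canary_routed.count("green"))
```

Once the green deployment behaves well under its share of traffic, the percentages can be moved fully to green and blue retired.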

Custom environment for model deployed to the endpoint

Predicting / Invoking the Endpoint
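Invoking a real-time endpoint means sending it a JSON body with the rows to score. The sketch below builds such a body in the `input_data` layout commonly used for MLflow models; the exact schema depends on the deployed model's scoring script, and the column names and values here are hypothetical rather than taken from the project.

```python
import json


def build_request(columns, rows):
    """Build a JSON request body with one value per column in every row."""
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("each row needs one value per column")
    body = {
        "input_data": {
            "columns": columns,
            "index": list(range(len(rows))),
            "data": rows,
        }
    }
    return json.dumps(body)


# Hypothetical feature columns and a single passenger record.
columns = ["Age", "Flight Distance", "Inflight wifi service"]
payload = build_request(columns, [[35, 1200, 4]])
print(payload)
```

The resulting string would be sent as the request body, for example from the Test tab of the endpoint or from a client holding the endpoint key.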

Environments – custom and curated environments for deploying and testing

Compute targets

Notebooks: screenshots from Azure. Let’s dive in.

Notebooks – created a pipeline for training and scoring a RandomForestClassifier model and tuning hyperparameters with a sweep job

Notebooks – running hyperparameter tuning with a sweep job
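Conceptually, a sweep job runs one trial per hyperparameter combination and keeps the combination with the best primary metric. The sketch below shows that idea as a plain grid search; the search space, `grid_sweep` and `toy_objective` are hypothetical stand-ins, and the scoring function replaces real model training.

```python
import itertools


def grid_sweep(search_space, objective):
    """Evaluate every combination in the search space and return (best_params, best_score)."""
    names = list(search_space)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(search_space[name] for name in names)):
        params = dict(zip(names, values))
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_objective(params):
    # Stand-in for "train a RandomForestClassifier and return its validation accuracy".
    return 1.0 - abs(params["n_estimators"] - 100) / 1000 - abs(params["max_depth"] - 8) / 100


# Hypothetical search space for two random-forest hyperparameters.
space = {"n_estimators": [50, 100, 200], "max_depth": [4, 8, 16]}
best_params, best_score = grid_sweep(space, toy_objective)
print(best_params, best_score)
```

In a real sweep each trial runs as a child job on the compute cluster, and sampling can be random rather than exhaustive to limit cost.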

Notebooks – results from pipeline – child jobs

Notebooks – results from pipeline – best model

Notebooks – results from pipeline – model metrics

Notebooks – results from pipeline – predicting

Summary
No-code or programmatic: whatever suits you best
Great business solution: having all resources in one place
New subscription: for future projects
Getting work done: finished my course project
So many services yet to discover: Azure Databricks, Azure Synapse Analytics (for data ingestion), Azure AI Services, Azure Data Factory, etc.
Recommend: definitely!

Closing: Thanks for your time. I hope to get your feedback so I can improve. Sanela Nikodinoska [email protected]
