Unit I - Business Analytics for MBA .pptx

Business Analytics

Business Analytics is the practice of using data, statistical methods, and technology to analyze past business performance, uncover trends, and gain insights to drive future business planning and decision-making. It's like giving a company a superpower to see patterns, forecast outcomes, and optimize strategies.

Business analytics is a continuous process that involves:
- Data Collection: Gathering data from various sources (e.g., sales, customer interactions, operations, marketing).
- Data Preparation: Cleaning, organizing, and transforming raw data into a usable format.
- Data Analysis: Applying statistical methods, machine learning, and other techniques to identify patterns, correlations, and anomalies.
- Visualization and Interpretation: Presenting insights through dashboards, reports, and other visualizations to make them understandable and actionable for decision-makers.

Data and analytics capabilities have made a leap forward:
- Growing availability of vast amounts of data
- Improved computational power
- Development of sophisticated algorithms
- Colleges/universities now have curricula emphasizing business analytics

Data and analytics capabilities have changed the way businesses make decisions. Companies need data-savvy professionals who can turn data into insights and action.

Business analytics (data analytics) involves extracting information and knowledge from data to:
- Enhance the customer experience
- Develop better marketing strategies
- Deepen customer engagement
- Enhance efficiency and reduce expenses
- Identify emerging markets
- Mitigate risk and fraud

Business analytics is widely applied in:
- Marketing
- Human resource management
- Economics
- Finance
- Health, sports, and politics

Business analytics combines qualitative reasoning with quantitative tools to:
- Identify key business problems
- Translate data analysis into decisions
- Improve business performance

By leveraging business analytics, organizations can make more informed decisions, optimize operations, identify growth opportunities, and improve overall business performance.

Components of Business Analytics

Descriptive analytics: What happened?
- Uses historical data to understand past and present business performance.
- It is the most basic type of data analysis; it answers the fundamental question, "What happened?"
- It doesn't explain why something happened or predict what will happen next.
- It provides a summary of a dataset to identify patterns and trends, giving a clear snapshot of a business's health.

Examples: actual spending vs. the budgeted amount; percentage of employees who left the organization; how many customers visited the website; a report showing total sales, sales by region, and top-selling products (see the sketch below).
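
As a hedged illustration (not from the slides), here is a minimal Python/pandas sketch of such a descriptive report: total sales, sales by region, and top-selling products. The DataFrame contents and column names are invented for the example.

    import pandas as pd

    # Hypothetical sales records (made-up figures)
    sales = pd.DataFrame({
        "region":  ["North", "South", "North", "East"],
        "product": ["A", "B", "A", "C"],
        "amount":  [1200, 800, 950, 400],
    })

    print(sales["amount"].sum())                    # total sales: what happened overall
    print(sales.groupby("region")["amount"].sum())  # sales by region
    print(sales.groupby("product")["amount"].sum()
               .sort_values(ascending=False))       # top-selling products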

Diagnostic analytics: Why did this happen?
- The process of examining historical data to identify the root causes of trends, anomalies, and specific outcomes.
- It goes beyond a simple summary of what occurred (descriptive analytics). Think of it as the detective work of data analysis.

Examples: a new marketing campaign didn't generate the expected number of leads; a high employee turnover rate in a specific department; a hospital's patient readmission rates are higher than the national average.

Predictive Analytics: What could happen in the future?
- Uses historical data to make predictions.
- Analytical models help identify associations; these associations are used to estimate the likelihood of a favorable outcome.
- Commonly considered a form of advanced analytics: builds models that help an organization understand what might happen in the future, using statistics and data mining.

Examples: identifying customers who are most likely to respond to specific marketing campaigns; flagging transactions that are likely to be fraudulent. A hedged modeling sketch follows below.
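
A minimal sketch of how such a prediction could be built, assuming scikit-learn is available; the features (age, past purchases) and the tiny dataset are hypothetical, not from the slides.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    # Hypothetical history: [age, past_purchases]; 1 = responded to the campaign
    X = np.array([[25, 1], [40, 5], [35, 3], [50, 8], [23, 0], [45, 6], [30, 2], [55, 9]])
    y = np.array([0, 1, 0, 1, 0, 1, 0, 1])

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
    model = LogisticRegression().fit(X_train, y_train)

    # Estimated likelihood that a new customer responds (the "favorable outcome")
    print(model.predict_proba(np.array([[42, 7]]))[0, 1])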

Prescriptive Analytics: What should we do?
- Uses optimization and simulation algorithms to provide advice.
- Explores several possible actions and suggests a course of action.
- Commonly considered the most advanced form of analytics: builds models that help an organization decide the best course of action, using statistics, data mining, optimization, and simulation.

Examples: scheduling employees' work hours; selecting a mix of products to manufacture (see the optimization sketch below); choosing an investment portfolio.
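
Prescriptive analytics often boils down to an optimization model. The sketch below, assuming SciPy's linprog is available, chooses a two-product mix; the profit figures and resource limits are made-up numbers for illustration only.

    from scipy.optimize import linprog

    # Maximize profit 40*x1 + 30*x2; linprog minimizes, so negate the objective
    c = [-40, -30]
    A_ub = [[2, 1],   # machine hours per unit of product 1 and product 2 (assumed)
            [1, 3]]   # labour hours per unit (assumed)
    b_ub = [100, 90]  # available machine and labour hours (assumed capacities)

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
    print(res.x)      # suggested production quantities
    print(-res.fun)   # maximum profit under these assumed numbers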

Tools & Techniques
- Data Mining: Discover patterns in large datasets.
- Statistical Analysis: Use statistical methods to interpret data.
- Machine Learning: Build models that learn from data to make predictions.
- Visualization Tools: Use dashboards and charts (like Tableau or Power BI) to present insights clearly.

Advantages of BA

1. Enhanced Decision-Making: Business analytics provides data-driven insights, moving organizations beyond relying on intuition or "gut feelings." By analyzing historical data, identifying trends, and predicting outcomes, businesses can make more informed, accurate, and strategic decisions regarding everything from product development to market entry.

2. Improved Operational Efficiency: Analytics helps identify bottlenecks, inefficiencies, and areas of waste within business processes. By understanding what's hindering performance, companies can streamline workflows, optimize resource allocation, automate repetitive tasks, and ultimately reduce costs while increasing productivity.

3. Better Customer Understanding and Experience: By analyzing customer data (like purchasing habits, browsing behavior, feedback, and demographics), businesses can gain deep insights into customer preferences, needs, and pain points. This enables them to:
- Personalize offerings: tailor products, services, and marketing campaigns to individual customer segments.
- Improve customer service: address pain points and enhance overall satisfaction.
- Increase customer loyalty and retention by meeting customer expectations more effectively.

Advantages of BA (contd.)

4. Risk Management and Mitigation: Business analytics helps identify potential risks – whether financial, operational, or market-related – before they escalate. Through predictive modeling and anomaly detection, businesses can:
- Anticipate equipment failures or financial shortfalls.
- Detect fraudulent activities.
- Assess the impact of different scenarios and plan mitigation strategies.

5. Competitive Advantage: Organizations that effectively leverage business analytics can gain a significant edge over competitors. By understanding market trends, customer behavior, and operational strengths and weaknesses better than others, they can:
- Identify new market opportunities.
- Innovate faster with new products and services.
- Respond more quickly to market changes.
- Optimize pricing strategies.

Advantages of BA (contd.)

6. Increased Revenue and Profitability: By optimizing operations, improving customer experiences, and making better strategic decisions, business analytics directly contributes to increased revenue and profitability. It helps identify high-performing products, target profitable customer segments, and improve conversion rates.

7. Faster Problem Solving: When issues arise, business analytics provides the tools to quickly diagnose the root causes, allowing for more rapid and effective solutions.

In essence, business analytics transforms raw data into actionable intelligence, empowering organizations to be more agile, efficient, and competitive.

Data Visualization

Data visualization is the graphical representation of information and data using elements like charts, graphs, maps, and other visual tools. It transforms raw numbers into visual formats so that patterns, trends, and insights become easier to understand and act upon. The primary goal is to make complex, high-volume, or numerical data easier to understand, interpret, and extract insights from.

Why DV?
- Enhanced Comprehension: Our brains process visual information much faster than raw text or numbers. Visualizations allow us to quickly grasp patterns, trends, and outliers that might be hidden in spreadsheets.
- Faster Decision-Making: By presenting data in an easily digestible format, decision-makers can identify key insights and make informed choices more quickly. This is vital in fast-paced business environments.
- Improved Communication: Data visualization acts as a universal language, enabling both technical and non-technical stakeholders to understand complex data insights. It facilitates better collaboration and alignment across teams.
- Identification of Trends and Patterns: Visual representations make it easier to spot emerging trends, correlations, and anomalies that might go unnoticed in traditional data reports. This can lead to new opportunities or help in identifying potential risks.
- Storytelling with Data: Effective data visualization helps tell a compelling story with data, providing context and making the insights more memorable and actionable.

Common chart types: bar chart, line graph, pie chart, scatter plot, heat map, tree map, geospatial map, dashboards.

Popular Tools
- Tableau: Interactive dashboards and advanced visuals
- Power BI: Microsoft's business intelligence tool
- Excel: Basic charts and graphs
- Google Data Studio: Free and web-based
- Python & R: For custom, code-driven visualizations (e.g., using Matplotlib, Seaborn, ggplot2); a minimal Matplotlib sketch follows below
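
For the Python route, a minimal Matplotlib sketch looks like this; the regions and sales figures are invented for illustration.

    import matplotlib.pyplot as plt

    regions = ["North", "South", "East", "West"]
    sales = [1200, 800, 950, 400]   # hypothetical values

    plt.bar(regions, sales)         # simple bar chart of sales by region
    plt.title("Sales by Region")
    plt.xlabel("Region")
    plt.ylabel("Sales")
    plt.show()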

Big Data: Massive, complex, and rapidly growing datasets that are too large to manage, process, and analyze using traditional data processing tools. It's not just about the size of the data; it's also about its complexity and the speed at which it's generated. Big data is a massive volume of structured and unstructured data, and it presents great opportunities to gain knowledge and game-changing intelligence.

Characteristics of big data
- Volume: Refers to the enormous amount of data being generated every second, from sources like social media, sensors, and financial transactions. This data is measured in terabytes, petabytes, and even zettabytes.
- Velocity: The speed at which data is created, collected, and needs to be processed. In many cases, data is streaming in real time, such as stock market data, sensor readings from IoT devices, or live social media feeds.
- Variety: Big data comes in many different formats. It can be structured (like a traditional spreadsheet or database), semi-structured (like a JSON file), or unstructured (like images, videos, audio files, and free-form text from social media posts).
- Veracity: The quality and trustworthiness of the data. Because big data comes from so many different sources, it can be messy, noisy, and prone to errors. Ensuring data veracity is a major challenge.
- Value: Big data is only useful if it can be transformed into valuable insights. The true goal is to analyze these massive datasets to find patterns, make predictions, and drive better business decisions. Having a large volume of data does not guarantee that useful insights or measurable improvements can be generated.

Real-World Examples of Big Data
- Retail and E-commerce: Companies like Amazon and Netflix analyze massive amounts of customer data—including browsing history, purchase behavior, and product reviews—to provide personalized recommendations, optimize pricing, and manage inventory.
- Finance: Banks and financial institutions use big data to detect and prevent fraud in real time by analyzing millions of transactions for unusual patterns. They also use it for risk management and algorithmic trading.
- Healthcare: Hospitals and research institutions use big data to analyze patient records, medical images, and genetic data to improve diagnoses, personalize treatment plans, and predict disease outbreaks.
- Transportation: Ride-sharing companies and logistics providers use real-time GPS and traffic data to optimize routes, predict traffic congestion, and manage their fleets more efficiently.
- Manufacturing: Factories use sensor data from machines to perform predictive maintenance, identifying potential equipment failures before they occur and minimizing downtime.

Types of data - Data can be classified in several ways depending on context.

1. By Nature of Values

Qualitative Data (Categorical Data): Describes characteristics or qualities; not numerical. Mathematical operations like addition or subtraction are not possible on this type of data. It helps answer "why" or "how" questions.
- Nominal Data: Used for labeling variables without any quantitative value or order. The categories are distinct and don't have a natural ranking. Examples: Gender (Male, Female, Other), Eye Color (Blue, Brown, Green), Marital Status (Single, Married, Divorced).
- Ordinal Data: Has categories that can be ranked or ordered, but the differences between the categories are not necessarily equal or meaningful. Examples: customer satisfaction ratings (Very Unsatisfied, Unsatisfied, Neutral, Satisfied, Very Satisfied; or Poor, Fair, Good, Excellent), Education Level (High School, Bachelor's, Master's, PhD), Economic Status (Low, Medium, High).

By Nature of Values (contd.)

Quantitative Data (Numerical Data): Represents measurable quantities; consists of numerical values that can be measured or counted. Mathematical operations can be performed on this data, and it helps answer "how many" or "how much" questions.
- Discrete Data: Can only take specific, distinct values, often whole numbers, and usually results from counting. There are clear gaps between possible values. Examples: number of students in a class, number of cars in a parking lot, number of complaints received.
- Continuous Data: Can take any value within a given range, including fractions and decimals, and usually results from measurement. Examples: height of a person, temperature, time taken to complete a task, weight.

Quantitative Data (Numerical Data) (contd.)
- Interval: Interval data has all the properties of ordinal data, but with equal, defined intervals between values. The data is ordered and the intervals between data points are equal, but there is no true zero. Example: temperature in Celsius or Fahrenheit.
- Ratio: Ratio data has all the properties of the other three scales, plus a true zero point, meaning a value of zero signifies the complete absence of the measured attribute. Examples: height, weight, age, income.

2. By Structure
- Structured Data: Highly organized and follows a predefined format or schema. It's typically stored in relational databases, spreadsheets, or other tabular forms where data elements are easily identifiable and searchable. Organized in rows and columns, it fits into predefined fields and is easy to search and analyze with traditional tools. Examples: spreadsheets, SQL databases, CRM systems.
- Unstructured Data: Has no predefined format or organization. It's typically stored in its native format and can be text-heavy or multimedia, which makes it much harder to search and analyze using traditional methods. It lacks a predefined schema, comes in a variety of formats, is often qualitative, and requires advanced tools (like AI/ML) for analysis. Examples: emails, social media posts, videos, images.
- Semi-Structured Data: Falls between structured and unstructured data. It doesn't reside in a fixed relational database but contains tags or markers to separate and identify data elements, providing some organizational structure (see the JSON sketch below). Examples: JSON files, XML documents, NoSQL databases.
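
A small Python sketch of semi-structured data: a JSON record (with tags but no fixed schema) is parsed and flattened into rows that could go into a structured table. The record itself is invented for illustration.

    import json

    record = '{"customer": "A101", "orders": [{"item": "laptop", "qty": 1}, {"item": "mouse", "qty": 2}]}'
    data = json.loads(record)   # the tags give the data some structure without a fixed schema

    # Flatten into rows suitable for a relational (structured) table
    rows = [(data["customer"], o["item"], o["qty"]) for o in data["orders"]]
    print(rows)   # [('A101', 'laptop', 1), ('A101', 'mouse', 2)]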

3. By Source
- Primary Data: Collected directly from the source for a specific purpose. Examples: surveys, interviews, experiments.
- Secondary Data: Previously collected for another purpose and reused. Examples: government reports, academic studies, company records.

4. By Time Sensitivity
- Real-Time Data: Generated and processed instantly. Examples: stock market feeds, IoT sensor data.
- Historical Data: Past data used for analysis and forecasting. Examples: sales records, weather data.

Data Mining

Data mining is the process of discovering patterns, relationships, insights, and anomalies from large datasets. It involves using various computational techniques to extract useful knowledge from data that is too vast or complex for manual analysis.

Data Mining Process

The process of data mining typically follows a series of structured steps:
- Business Understanding: The first step is to clearly define the problem and the business objective. This ensures that the data mining effort is focused on finding relevant and valuable insights.
- Data Understanding: This involves exploring the data to get a feel for its quality, structure, and content. Analysts check for missing values, identify outliers, and get an initial sense of the data's characteristics.
- Data Preparation: This is often the most time-consuming phase, where the data is cleaned, transformed, and organized into a format suitable for analysis. This may involve handling missing data, standardizing values, and integrating data from multiple sources.
- Modeling: This is where the core data mining algorithms are applied to the prepared data. Techniques used can range from simple statistical models to complex machine learning algorithms. The goal is to build a model that can identify patterns and make predictions.

Data Mining Process (contd.)
- Evaluation: The results of the model are assessed to determine their accuracy and reliability. The model is tested against new data to ensure it generalizes well and isn't just a good fit for the training data.
- Deployment: The final, validated model is put into practice. The insights gained from the model are used to inform business decisions, automate processes, or improve a product or service.
A hedged modeling-and-evaluation sketch follows below.
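
A minimal sketch of the Modeling and Evaluation steps, assuming scikit-learn is available; the built-in iris dataset and the decision-tree algorithm are illustrative choices, not part of the slides.

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import accuracy_score

    X, y = load_iris(return_X_y=True)

    # Hold out data so evaluation tests generalization, not just fit to the training data
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

    model = DecisionTreeClassifier(max_depth=3).fit(X_train, y_train)   # Modeling
    print(accuracy_score(y_test, model.predict(X_test)))                # Evaluation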

Advantages of Data Mining

1. Enhanced Decision-Making: By uncovering hidden patterns and relationships in data, data mining provides valuable insights that allow businesses to make more informed, accurate, and proactive decisions. Decisions are based on evidence rather than intuition. For example, a marketing team can identify the most effective channels for advertising by analyzing customer engagement data, leading to a higher return on investment (ROI) on marketing campaigns.

2. Deeper Customer Understanding: Data mining helps businesses understand their customers on a deeper level by analyzing customer behavior, purchase history, and demographic data. This enables highly personalized and targeted marketing efforts, which increases customer satisfaction and loyalty. For example, e-commerce platforms recommend products to customers based on their past purchases and browsing history, increasing sales and user engagement.

3. Cost Reduction and Efficiency: Data mining can identify inefficiencies and redundancies in business processes, leading to cost savings and improved operational efficiency. It helps in optimizing resource allocation, managing inventory, and predicting equipment failure. For example, sensor data from machinery can be analyzed to predict when a component is likely to fail, allowing for scheduled maintenance that reduces unplanned downtime and saves significant repair costs.

Advantages of Data Mining (contd.)

4. Fraud Detection and Risk Management: Data mining is an essential tool for identifying fraudulent activities and managing risk. By analyzing transaction patterns, it can quickly detect anomalies that might indicate fraud, allowing organizations to take swift action. For example, financial institutions analyze credit card transactions; an unusual purchase location or a sudden large transaction can be flagged as potentially fraudulent, and the cardholder can be notified immediately.

5. Gaining a Competitive Advantage: In a data-driven world, the ability to analyze and act on insights faster than competitors is a significant advantage. Data mining allows businesses to identify emerging market trends, anticipate customer needs, and develop new products or services to stay ahead of the competition.

6. Predictive Analytics: Data mining enables organizations to predict future trends and outcomes.

Statistical Description of Data

The statistical description of data involves using descriptive statistics to summarize, organize, and present the essential features of a dataset. These methods provide a concise overview of the main features of the data, making it easier to understand its characteristics without having to look at every single data point. It helps in understanding the nature, spread, and patterns in the data before doing deeper analysis.

The primary categories of descriptive statistics are:

1. Measures of Central Tendency: These measures describe the "center" or "typical" value of a dataset. They provide a single value that represents the whole.
- Mean: The arithmetic average, calculated by summing all the values and dividing by the number of values. It is sensitive to outliers.
- Median: The middle value of a dataset when it's sorted from least to greatest. It is not affected by extreme outliers.
- Mode: The value that appears most frequently in the dataset. It's the only measure of central tendency that can be used for nominal (categorical) data.

Example: Dataset [10, 15, 20, 20, 25, 30, 35] -> Mean = 22.14 (average), Median = 20 (middle value), Mode = 20 (it appears twice). A quick Python check follows below.
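
The worked example can be checked with Python's standard statistics module (a quick sketch, no external libraries assumed):

    import statistics

    data = [10, 15, 20, 20, 25, 30, 35]
    print(round(statistics.mean(data), 2))  # 22.14
    print(statistics.median(data))          # 20
    print(statistics.mode(data))            # 20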

Statistical Description of Data (contd.)

2. Measures of Dispersion (spread of data): These measures describe how spread out or dispersed the data points are. They give a sense of the distribution's shape and consistency.
- Range: The difference between the highest and lowest values in the dataset. It's a simple but often unreliable measure because it only considers two values.
- Variance: The average of the squared differences from the mean. It gives more weight to data points that are farther from the average.
- Standard Deviation: The square root of the variance. It's the most common measure of spread and is easier to interpret than variance because it is in the same units as the original data. A small standard deviation indicates that the data points are clustered closely around the mean, while a large standard deviation means they are more spread out.

Statistical Description of Data (contd.)

Example: Dataset [5, 10, 15, 20, 25]

Range: Maximum – Minimum = 25 – 5 = 20

Mean = (5 + 10 + 15 + 20 + 25) / 5 = 75 / 5 = 15

Value (x) | Deviation (x – Mean) | Square of deviation
5         | 5 – 15 = -10         | 100
10        | 10 – 15 = -5         | 25
15        | 15 – 15 = 0          | 0
20        | 20 – 15 = 5          | 25
25        | 25 – 15 = 10         | 100

Sum of squares = 100 + 25 + 0 + 25 + 100 = 250
Variance = Sum of squares / Number of values = 250 / 5 = 50
Standard Deviation = square root of variance = square root of 50 = 7.07
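
The same calculation in plain Python (a sketch using the population formulas, i.e. dividing by n, to match the slide's arithmetic):

    data = [5, 10, 15, 20, 25]

    mean = sum(data) / len(data)                      # 15
    squared_devs = [(x - mean) ** 2 for x in data]    # [100, 25, 0, 25, 100]
    variance = sum(squared_devs) / len(data)          # 250 / 5 = 50
    std_dev = variance ** 0.5                         # about 7.07

    print(max(data) - min(data), variance, round(std_dev, 2))   # 20 50.0 7.07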

Statistical Description of Data (contd.)

3. Frequency Distribution: Describes how often each value or range of values occurs within a dataset. It's often visualized using tables or graphs like histograms and bar charts. Example: a frequency table could show that in a class of 30 students, 5 students scored in the 80-85 range, 10 scored in the 86-90 range, and so on.

4. Measures of Shape (Distribution of Data): These describe the pattern or symmetry of the data distribution.
- Skewness indicates asymmetry: positive skew means a long right tail; negative skew means a long left tail.
- Kurtosis indicates peakedness or flatness: high kurtosis means a sharp peak; low kurtosis means a flat distribution.
A short SciPy sketch follows below.
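
A hedged SciPy sketch for skewness and kurtosis; the sample values are made up, and SciPy's kurtosis uses the Fisher definition (0 for a normal distribution) by default.

    from scipy.stats import skew, kurtosis

    data = [2, 3, 3, 4, 4, 4, 5, 5, 6, 12]   # one large value creates a long right tail

    print(skew(data))       # positive value -> right (positive) skew
    print(kurtosis(data))   # Fisher kurtosis: 0 corresponds to a normal distribution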

Statistical Description of Data (contd.)

5. Graphical Representation: Helps visualize data for better understanding.
- Histogram (frequency distribution)
- Box plot (spread & outliers)
- Scatter plot (relationship between two variables)
- Pie chart (proportions)

Data Pre-processing

Data preprocessing is a data mining and machine learning step in which raw data is cleaned, transformed, and made suitable for analysis or model building. Since raw data is often noisy, incomplete, or inconsistent, preprocessing ensures accuracy, quality, and efficiency.

Data preprocessing steps

1. Data Cleaning: This step handles messy data by filling in missing values, smoothing out noise, and correcting inconsistencies (a small pandas sketch follows below).
- Handling Missing Values: You can fill in missing data by replacing it with a constant value, the mean or median of the feature, or by using a predictive model.
- Smoothing Noisy Data: Noise is random error or variance in data. Techniques like binning (grouping data into smaller intervals) or regression can be used to smooth out this noise.
- Correcting Inconsistent Data: This involves fixing contradictions in the data, such as a customer's age being listed as 150 or a date that is out of range.
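
A minimal pandas sketch of these cleaning steps, using an invented four-row dataset; the choice of median/mean fills and the age-validity rule are assumptions for illustration.

    import pandas as pd

    df = pd.DataFrame({"age": [25, None, 34, 150], "income": [50000, 60000, None, 45000]})

    df["age"] = df["age"].where(df["age"] <= 100)            # treat age 150 as invalid (missing)
    df["age"] = df["age"].fillna(df["age"].median())         # fill missing ages with the median
    df["income"] = df["income"].fillna(df["income"].mean())  # fill missing income with the mean
    print(df)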

Data Pre-processing (contd.)

2. Data Integration: This step combines data from multiple sources into a single, cohesive dataset. This can be complex due to schema conflicts (naming differences for the same entity) and data redundancy. Example: merging customer information from a sales database and a marketing database.

3. Data Transformation: The process of converting data into a suitable format for the model (a scaling and encoding sketch follows below).
- Normalization/Scaling: Ensures all features contribute equally to the model by scaling values to a specific range (e.g., 0 to 1). This is especially important for algorithms sensitive to the magnitude of values, such as support vector machines (SVMs) or k-nearest neighbors (KNN).
- Feature Encoding: Categorical data (like colors or countries) must be converted into a numerical format. One-hot encoding and label encoding are common techniques for this.
- Aggregation: Summarizing data by performing operations like summing or averaging to create a new, more meaningful feature.
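
A short sketch of min-max scaling and one-hot encoding, assuming pandas and scikit-learn are available; the tiny salary/department table is invented.

    import pandas as pd
    from sklearn.preprocessing import MinMaxScaler

    df = pd.DataFrame({"salary": [50000, 60000, 70000], "dept": ["Sales", "IT", "Sales"]})

    # Min-max scale salary into [0, 1]
    df["salary_scaled"] = MinMaxScaler().fit_transform(df[["salary"]]).ravel()

    # One-hot encode the department column into 0/1 indicator columns
    df = pd.get_dummies(df, columns=["dept"], dtype=int)
    print(df)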

Data Pre-processing (contd.)

4. Data Reduction: This step aims to reduce the size of the dataset without sacrificing key information, which can improve model performance and reduce training time.
- Feature Selection: Choosing a subset of the most relevant features to use in the analysis, which helps to eliminate redundancy.
- Dimensionality Reduction: Techniques like Principal Component Analysis (PCA) reduce the number of features (dimensions) by creating new, uncorrelated variables that capture the most important information from the original dataset.

5. Data Discretization: Converting continuous data into categories/bins. Example: Age → {Child: 0–12, Teen: 13–19, Adult: 20–59, Senior: 60+} (see the pandas sketch below).
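
A pandas sketch of discretization using the age bins above; the upper bound of 120 for "Senior" is an assumption needed to close the last bin, and the sample ages are made up.

    import pandas as pd

    ages = pd.Series([8, 15, 35, 64])
    bins = [0, 12, 19, 59, 120]                      # 120 is an assumed upper limit
    labels = ["Child", "Teen", "Adult", "Senior"]

    print(pd.cut(ages, bins=bins, labels=labels))    # Child, Teen, Adult, Senior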

Data Pre-processing (contd.)

Example 1 of data pre-processing: raw employee data

Employee ID | Age | Salary | Department
1           | 25  | 50000  | Sales
2           | 30  | 60000  | Marketing
3           |     | 70000  | IT
4           | 40  |        | HR

(Employee 3 is missing Age; employee 4 is missing Salary.)

Data Pre-processing (contd.)

Example 1 (contd.) Preprocessing steps:
- Handling Missing Values: Fill missing Age values with the mean or median age; fill missing Salary values with the mean or median salary.
- Data Normalization: Scale Age and Salary values to a common range (e.g., 0 to 1) using min-max scaling.
- Categorical Encoding: Convert Department categories into numerical values using one-hot encoding or label encoding.
A pandas sketch of these steps follows below.
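
A hedged pandas sketch of Example 1's three steps (mean fill, min-max scaling, one-hot encoding). The scaled numbers it produces depend on the fill values chosen, so they need not match the illustrative encoded table that follows.

    import pandas as pd

    df = pd.DataFrame({
        "employee_id": [1, 2, 3, 4],
        "age":    [25, 30, None, 40],
        "salary": [50000, 60000, 70000, None],
        "department": ["Sales", "Marketing", "IT", "HR"],
    })

    # 1. Handle missing values with column means
    df["age"] = df["age"].fillna(df["age"].mean())
    df["salary"] = df["salary"].fillna(df["salary"].mean())

    # 2. Min-max scale Age and Salary into [0, 1]
    for col in ["age", "salary"]:
        df[col] = (df[col] - df[col].min()) / (df[col].max() - df[col].min())

    # 3. One-hot encode Department
    df = pd.get_dummies(df, columns=["department"], dtype=int)
    print(df.round(2))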

Data Pre-processing (contd.)

Example 1 (contd.) Encoded table (numeric, ready for ML):

Employee ID | Age  | Salary | Department_Sales | Department_Marketing | Department_IT | Department_HR
1           | 0.2  | 0.4    | 1                | 0                    | 0             | 0
2           | 0.4  | 0.6    | 0                | 1                    | 0             | 0
3           | 0.35 | 0.8    | 0                | 0                    | 1             | 0
4           | 0.8  | 0.5    | 0                | 0                    | 0             | 1

Data Pre-processing (contd.)

Example 2: raw dataset (student records)

Name  | Age | Marks | Gender | City
John  | 18  | 85    | M      | Mumbai
Sara  | 20  | NaN   | Female | Delhi
Rohan | 19  | 72    | M      | Pune
Anita | 21  | 90    | F      | Mumbai
Ravi  | 22  | -5    | Male   | Delhi

Data Pre-processing (contd.)

Example 2 (contd.) Step 1: Data Cleaning
- Missing value: Sara's marks = NaN → fill with the mean of the valid marks. Mean (excluding NaN and -5) = (85 + 72 + 90) / 3 = 82.33 → replace Sara's marks.
- Incorrect value: Ravi's marks = -5 → likely an error → replace with the mean = 82.33.
- Inconsistent gender: "M", "F", "Male", "Female" → standardize to {Male, Female}.

Cleaned table:

Name  | Age | Marks | Gender | City
John  | 18  | 85    | Male   | Mumbai
Sara  | 20  | 82.3  | Female | Delhi
Rohan | 19  | 72    | Male   | Pune
Anita | 21  | 90    | Female | Mumbai
Ravi  | 22  | 82.3  | Male   | Delhi

Data Pre-processing (contd.)

Example 2 (contd.) Step 2: Data Transformation

Normalization (min-max scaling to 0–1) for Marks: (x − min) / (max − min), with Min = 72 and Max = 90.
- John: (85 − 72) / (90 − 72) = 13/18 = 0.72
- Sara: (82.3 − 72) / 18 = 10.3/18 = 0.57
- Rohan: (72 − 72) / 18 = 0
- Anita: (90 − 72) / 18 = 1
- Ravi: (82.3 − 72) / 18 = 10.3/18 = 0.57

Transformed table:

Name  | Age | Marks | Marks_Normalized | Gender | City
John  | 18  | 85    | 0.72             | Male   | Mumbai
Sara  | 20  | 82.3  | 0.57             | Female | Delhi
Rohan | 19  | 72    | 0.00             | Male   | Pune
Anita | 21  | 90    | 1.00             | Female | Mumbai
Ravi  | 22  | 82.3  | 0.57             | Male   | Delhi

Data Pre-processing (contd.)

Example 2 (contd.) Step 3: Encoding
- Convert Gender → Male = 0, Female = 1
- Convert City (one-hot encoding): Mumbai = [1,0,0], Delhi = [0,1,0], Pune = [0,0,1]

Encoded table (numeric, ready for ML):

Age | Marks_Normalized | Gender | Mumbai | Delhi | Pune
18  | 0.72             | 0      | 1      | 0     | 0
20  | 0.57             | 1      | 0      | 1     | 0
19  | 0.00             | 0      | 0      | 0     | 1
21  | 1.00             | 1      | 1      | 0     | 0
22  | 0.57             | 0      | 0      | 1     | 0
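
For completeness, a hedged end-to-end pandas sketch of Example 2 (cleaning, normalization, encoding). Column names are lower-cased here as an implementation choice; the logic mirrors the three steps above.

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({
        "name":   ["John", "Sara", "Rohan", "Anita", "Ravi"],
        "age":    [18, 20, 19, 21, 22],
        "marks":  [85, np.nan, 72, 90, -5],
        "gender": ["M", "Female", "M", "F", "Male"],
        "city":   ["Mumbai", "Delhi", "Pune", "Mumbai", "Delhi"],
    })

    # Step 1: cleaning - treat negative marks as invalid, fill missing marks with the mean
    df.loc[df["marks"] < 0, "marks"] = np.nan
    df["marks"] = df["marks"].fillna(round(df["marks"].mean(), 2))     # 82.33
    df["gender"] = df["gender"].replace({"M": "Male", "F": "Female"})

    # Step 2: transformation - min-max normalize marks to [0, 1]
    df["marks_norm"] = (df["marks"] - df["marks"].min()) / (df["marks"].max() - df["marks"].min())
    df["marks_norm"] = df["marks_norm"].round(2)

    # Step 3: encoding - label-encode gender, one-hot encode city
    df["gender"] = df["gender"].map({"Male": 0, "Female": 1})
    df = pd.get_dummies(df, columns=["city"], dtype=int)
    print(df)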