Hair_PPT_Ch02.pptx - Marketing Analytics


About This Presentation

marketing analytics


Slide Content

Chapter 02: Data Management
Part 01: Overview of Marketing Analytics and Data Management
Copyright © 2025 by McGraw Hill Education (India) Private Limited.

The Era of Big Data Is Here
- The pace of data collection is accelerating, and large datasets are a core organizational asset.
- A data infrastructure enabling real-time decision making requires a strong data strategy aligned with the overall business strategy.
- Big data is a relative term describing massive amounts of data with the following characteristics:
  - Volume: the sheer amount of data collected.
  - Variety: structured or unstructured.
  - Veracity: accuracy, including missing data.
  - Velocity: the speed at which data arrives.
  - Value: the usefulness of the data in making accurate decisions.
- Many are moving away from the term “big data” and adopting “smart data.”

Database Management Systems (DBMS): Relational Databases
- A database contains current data from company operations.
- A relational database is a DBMS that stores data in rows and columns.
- Data is accessible using Structured Query Language (SQL); a minimal sketch follows.
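
To make the idea concrete, here is a minimal sketch of storing rows and columns and querying them with SQL, using Python's built-in sqlite3 module. The orders table and its columns are hypothetical, invented for illustration.

```python
# Minimal relational-database sketch using Python's built-in sqlite3 module.
# The "orders" table and its columns are hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database for the example
cur = conn.cursor()

# Data lives in rows and columns with a fixed schema.
cur.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL)")
cur.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "Asha", 120.0), (2, "Ben", 75.5), (3, "Asha", 42.0)],
)

# SQL makes the data accessible through declarative queries.
cur.execute("SELECT customer, SUM(amount) FROM orders GROUP BY customer")
print(cur.fetchall())  # [('Asha', 162.0), ('Ben', 75.5)]
```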

Database Management Systems (DBMS): Non-Relational Databases
- Non-relational databases, or NoSQL databases, can store large volumes of structured or unstructured data.
- They allow greater flexibility for storing ever-changing data.
- Most companies use both relational and non-relational databases.
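
The flexibility point can be illustrated with a toy document store: a plain Python dict standing in for a real NoSQL system such as MongoDB. All names and values below are made up.

```python
# Toy illustration of the document-store idea behind many NoSQL databases:
# records are schemaless documents, so each one can carry different fields.
customer_docs = {
    "cust_001": {"name": "Asha", "email": "asha@example.com"},
    "cust_002": {"name": "Ben", "social_handles": {"x": "@ben"},
                 "clickstream": ["home", "pricing", "checkout"]},
}

# New, differently shaped data can be added without migrating a schema.
customer_docs["cust_003"] = {"name": "Chen", "support_tickets": [4512]}

for doc_id, doc in customer_docs.items():
    print(doc_id, sorted(doc.keys()))
```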

Enterprise Data Architecture
- Data storage architecture allows a company to organize, understand, and use its data to make both small and large decisions.
- Data analytics can analyze any kind of data repository, including:
  - Customer Relationship Management (CRM).
  - Enterprise Resource Planning (ERP).
  - Other Online Transaction Processing (OLTP) software.
- A CRM database may store recent customer transactions, allowing marketers to monitor developments in real time.
- Internal data can be combined with other sources such as social media.
- Streaming data is the continuous transfer of data from numerous sources in different formats; a small sketch follows.
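
As a minimal sketch of the streaming idea, a Python generator below stands in for a continuous, mixed-format feed; the event shapes (CRM transactions and social posts) are hypothetical.

```python
# Sketch of consuming streaming data: a generator simulates a continuous
# feed arriving from many sources in different formats.
from typing import Iterator

def event_stream() -> Iterator[dict]:
    """Simulates a mixed-format feed (CRM events, social posts)."""
    yield {"source": "crm", "customer": "cust_001", "amount": 120.0}
    yield {"source": "social", "handle": "@ben", "text": "love this product"}
    yield {"source": "crm", "customer": "cust_002", "amount": 75.5}

# Process each event as it arrives rather than waiting for a batch.
for event in event_stream():
    if event["source"] == "crm":
        print("transaction:", event["customer"], event["amount"])
    else:
        print("social mention:", event["text"])
```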

Exhibit 2-7: Simple Architecture of a Data Repository (figure)

Traditional ETL
- Extract, Transform, and Load (ETL) is an integration process designed to consolidate data from a variety of sources into a single location.
- The process begins with extracting key data from each source.
- Transformation converts the extracted data into the appropriate format for where it will be stored.
- The third ETL step is load, in which the data is loaded into a storage system such as a data warehouse, a data mart, or a data lake. (A sketch of the three steps follows.)
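
Here is a minimal ETL sketch in Python. The two source record formats, the field names, and the in-memory SQLite table standing in for a data warehouse are all hypothetical stand-ins.

```python
# Minimal ETL sketch: two differently shaped sources are extracted,
# transformed to one common schema, and loaded into SQLite.
import sqlite3

def extract() -> list[tuple[str, dict]]:
    # Extract: pull key data from multiple (here, hard-coded) sources.
    web_orders = [{"id": "W1", "amount_usd": "19.99"}]
    store_orders = [{"order_no": "S7", "total": "42.50"}]
    return [("web", r) for r in web_orders] + [("store", r) for r in store_orders]

def transform(raw: list[tuple[str, dict]]) -> list[tuple]:
    # Transform: conform both shapes to the warehouse schema (id, amount).
    rows = []
    for source, rec in raw:
        if source == "web":
            rows.append((rec["id"], float(rec["amount_usd"])))
        else:
            rows.append((rec["order_no"], float(rec["total"])))
    return rows

def load(rows: list[tuple]) -> None:
    # Load: write the conformed rows into the storage system.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (order_id TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    print(conn.execute("SELECT * FROM sales").fetchall())

load(transform(extract()))
```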

ETL Using Hadoop
- Hadoop is open-source software that divides big data processing across multiple computers, so large amounts of data can be handled simultaneously at lower cost.
- Hadoop facilitates analysis using MapReduce programming, which manages two steps:
  - First, map the data by dividing it into subsets and distributing them to a group of computers for storing and processing.
  - Second, reduce: combine the answers from the computer nodes into one answer. (The sketch below mimics both steps.)
- The ETL process on Hadoop can handle structured, semi-structured, and unstructured data.
- After ETL, data can be stored in a warehouse, a mart, or a lake.
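
The two steps can be illustrated on a single machine. This sketch mimics the map and reduce phases in plain Python on made-up review text; real Hadoop would distribute the map tasks across many nodes.

```python
# Single-machine sketch of the two MapReduce steps. Counting word
# frequencies across text snippets is a stand-in workload.
from collections import Counter
from functools import reduce

reviews = ["avocado toast great", "toast soggy", "avocado fresh"]

# Map step: divide the data into subsets and process each independently
# (on Hadoop, each subset would go to a different node).
subsets = [reviews[:2], reviews[2:]]
partial_counts = [Counter(word for line in s for word in line.split())
                  for s in subsets]

# Reduce step: combine the per-node answers into one answer.
total = reduce(lambda a, b: a + b, partial_counts, Counter())
print(total.most_common(3))
```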

A Closer Look at Data Storage
- A data warehouse holds historical data from various company databases and provides a structure for high-speed querying.
- A data mart provides a specific value to a group of users and can be dependent or independent.
- A data lake is a storage repository that holds a large amount of data in its native format.
- Data management as a process is the lifecycle management of data from acquisition to disposal.

Data Quality
- Numerous success stories show the use of data in decision making, but not all companies have the same success.
- "Garbage in, garbage out" refers to deficient data quality: poor inputs can be traced through to poor outputs.
- When data are of poor quality, insights from marketing analytics will be unreliable. Data are important, but high-quality data are critical.
- Common ways of measuring data quality include: timeliness, completeness, accuracy, consistency, and format. (A sketch of simple checks follows.)
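
As an illustration, here is a sketch of simple checks covering completeness, accuracy, and format from the list above; the records and validation rules are hypothetical.

```python
# Sketch of basic data-quality checks on made-up customer records.
import re

records = [
    {"customer": "Asha", "email": "asha@example.com",
     "order_date": "2024-03-02", "amount": 120.0},
    {"customer": "", "email": "not-an-email",
     "order_date": "2024-13-40", "amount": -5.0},
]

DATE_RE = re.compile(r"^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$")
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

for i, rec in enumerate(records):
    problems = []
    if not rec["customer"]:
        problems.append("completeness: missing customer name")
    if not EMAIL_RE.match(rec["email"]):
        problems.append("format: invalid email")
    if not DATE_RE.match(rec["order_date"]):
        problems.append("format: invalid date")
    if rec["amount"] < 0:
        problems.append("accuracy: negative amount")
    print(f"record {i}: {problems or 'OK'}")
```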

Data Understanding, Preparation, and Transformation
- Data inspires curiosity in marketers, who are eager to use insights to plan their next move.
- There is no tangible benefit to data in raw form: data is messy and must be tidied up before it is used.

Data Understanding
- Understanding the data is critical to correctly addressing the business problem and reducing the chance of inaccurately reporting results.
- Individual data fields are easily overlooked when dealing with large datasets. For example:
  - Suppose your supervisor requests an analysis of sales data, and you notice the unit of measure for reporting sales is monthly.
  - You create an annual total column for each year, but do not realize the third year's data contains only six months.
  - Your report wrongly shows sales plummeting in the third year. (The sketch below reproduces the mistake and the check that catches it.)
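
Here is a small Python sketch of that pitfall; the sales figures are made up.

```python
# Naively summing months makes an incomplete final year look like a collapse.
monthly_sales = {
    2022: [100] * 12,   # full year
    2023: [110] * 12,   # full year
    2024: [115] * 6,    # only six months recorded so far!
}

# Naive annual totals: 2024 appears to plummet.
for year, months in monthly_sales.items():
    print(year, "total:", sum(months))

# The check that catches it: verify completeness before aggregating.
for year, months in monthly_sales.items():
    if len(months) < 12:
        print(f"warning: {year} has only {len(months)} months of data")
```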

Data Preparation
- Feature selection: pay attention to the variables, or features, and avoid overfitting.
- Sample size: may be determined using a power calculation.
- Unit of analysis: the what, when, and who of the analysis.
- Missing values: common in datasets; resolve with imputation, omission, or exclusion.
- Outliers: one method of identifying outliers is cluster analysis. (A simple imputation-and-outlier sketch follows.)
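
The sketch below illustrates two of these steps, mean imputation for missing values and outlier flagging. For brevity it uses a simple z-score rule rather than the cluster analysis mentioned above, and all values are made up.

```python
# Sketch of two data-preparation steps on a made-up sales column.
from statistics import mean, stdev

amounts = [120.0, 75.5, None, 98.0, 101.5, 2500.0, None, 88.0]

# Imputation: replace missing values with the mean of the observed ones.
observed = [x for x in amounts if x is not None]
fill = mean(observed)
imputed = [x if x is not None else fill for x in amounts]

# Outlier flagging: values more than 2 standard deviations from the mean.
mu, sigma = mean(imputed), stdev(imputed)
outliers = [x for x in imputed if abs(x - mu) > 2 * sigma]
print("imputed with:", round(fill, 2), "| flagged outliers:", outliers)
```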

Data Transformation
- Aggregation is a key process in preparing data for valuable insights; for example, aggregate weekly sales to calculate sales by month, quarter, or year.
- Normalization brings all variables onto the same scale; for example, standardize a variable by subtracting the mean from each value and dividing by the standard deviation.
- New column (feature) construction: using the sales data, construct new columns for the day of the week, month, quarter, and year.
- Dummy coding is useful when considering nominal categorical variables; geographic location is nonmetric and needs dummy coding. (A sketch of all four steps follows.)
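
Here is a sketch of the four transformation steps on a tiny, made-up sales table, assuming the pandas library is available.

```python
# Aggregation, normalization, feature construction, and dummy coding
# on a hypothetical sales table.
import pandas as pd

df = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-03", "2024-01-10",
                            "2024-04-02", "2024-04-09"]),
    "region": ["North", "South", "North", "South"],
    "sales": [100.0, 80.0, 120.0, 90.0],
})

# New column (feature) construction: derive calendar features from the date.
df["month"] = df["date"].dt.month
df["quarter"] = df["date"].dt.quarter
df["weekday"] = df["date"].dt.day_name()

# Aggregation: roll weekly rows up to quarterly sales.
quarterly = df.groupby("quarter")["sales"].sum()

# Normalization: z-score, (value - mean) / standard deviation.
df["sales_z"] = (df["sales"] - df["sales"].mean()) / df["sales"].std()

# Dummy coding: expand the nominal region variable into indicator columns.
df = pd.get_dummies(df, columns=["region"])

print(quarterly, df.filter(like="region_"), sep="\n")
```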

Case Study: Avocado Toast: A Recipe to Learn SQL
- Using SQLite and data from the Hass Avocado Board, this case study explores the data in 20 steps and several exhibits.
- You will import data into SQLite and answer several questions through guided queries, using SQL aggregate functions.
- You will build a supplier table, then merge it with another table.
- You will update data to change a supplier's country of origin.
- Finally, the exercise describes the important act of deleting unneeded data to prevent it from being included in future analysis. (A sketch of these operations appears below.)
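
As a preview of the kinds of operations the case study walks through (aggregate, join, update, delete), here is a hedged sketch using Python's sqlite3 module; the tables and values are invented stand-ins, not the Hass Avocado Board data or the case study's actual steps.

```python
# Sketch of SQL aggregate, join, update, and delete operations in SQLite.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (supplier_id INTEGER, units INTEGER);
CREATE TABLE suppliers (supplier_id INTEGER, name TEXT, country TEXT);
INSERT INTO sales VALUES (1, 500), (1, 300), (2, 450);
INSERT INTO suppliers VALUES (1, 'GreenFarm', 'Mexico'), (2, 'SunGrove', 'Peru');
""")

# Aggregate function with a merge (join): total units per supplier.
for row in conn.execute("""
    SELECT s.name, SUM(sa.units)
    FROM sales sa JOIN suppliers s USING (supplier_id)
    GROUP BY s.name"""):
    print(row)

# Update a supplier's country of origin.
conn.execute("UPDATE suppliers SET country = 'Chile' WHERE name = 'SunGrove'")

# Delete unneeded data so it is excluded from future analysis.
conn.execute("DELETE FROM sales WHERE units < 400")
print(conn.execute("SELECT COUNT(*) FROM sales").fetchone())
```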