HPIL MANPOWER TRAINING MATERIAL (DATA ANALYTICS)-1.pptx

emmanueladesanya934 59 views 79 slides Oct 03, 2024

About This Presentation

This is a detailed coursework presentation on data analytics.


Slide Content

HPIL MANPOWER

Introduction to Data Analytics

Introduction to Data Analytics (Video)

The objectives:
- To get introduced to the basic concepts of data analytics.
- To understand the basic processing, storage, and programming models for data analytics.
- To understand the different phases of a data analytics project.

Today's technologies are moving towards 'on demand' services:
Smart technologies: smart homes, smart buildings, smart environments, smart cities, smart industries, smart automobiles, smart countries, a smart world, a smart earth... built using techniques and technologies conceptualized as the 'Internet of Things' (IoT), 'Artificial Intelligence' (AI), and 'Analytics'.
Analytics technologies: for making sound, performance-driven decisions in every sector (business, medicine, social networks, bioinformatics, geoinformatics, sports, media, politics, vision framing, the environment, etc.), where the information buried in the associated massive data is extracted using techniques and technologies conceptualized as 'Data Analytics'.

On Data Analytics: Hiding within those mounds of data is knowledge that could change the life of a patient, transform a business, or change the world. With its big data expertise, Google was once able to raise early warnings for Covid-19 in the U.S. by analyzing queries with a 'Covid-19 theme', well before the conventional public health services.

Data Analytics: Definition


Data collection: The process of gathering and evaluating information from multiple sources to find answers to research problems, answer questions, evaluate outcomes, and forecast trends and probabilities. Accurate data collection is necessary to make informed business decisions, ensure quality assurance, and maintain research integrity. There are two methods of data collection: primary and secondary ( Source 2 ). Data can be collected from various sources, including databases, sensors, surveys, and social media platforms.
Data cleaning: The process of fixing or removing incorrect, corrupted, incorrectly formatted, duplicate, or incomplete data within a dataset. When combining multiple data sources, there are many opportunities for data to be duplicated or mislabeled. This component ensures that the data fed into analysis tools is accurate, leading to more reliable results.
Data analysis: The heart of the process. Using statistical methods, mathematical formulas, or machine learning algorithms, raw data is processed to identify patterns, correlations, and trends. For instance, a retailer might analyze purchase data to determine which products are most popular during a particular season.
Visualization: Raw data, and even statistical results, can be challenging to understand. Visualization tools transform this data into visual formats like graphs, charts, heat maps, or dashboards, making it more insightful.
Interpretation: This final component is about translating the insights gained from the data into actionable intelligence. It's one thing to know a trend exists and another to understand its implications. In this phase, analysts, business leaders, or decision-makers review the analyzed data (often in its visual form) and determine the next steps.
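The data-cleaning component above can be sketched in a few lines of Python. This is a minimal illustration only; the records and field names are hypothetical, and real pipelines would typically use a dedicated tool (Power Query, pandas, etc.).

```python
# Minimal data-cleaning sketch: remove duplicates, drop incomplete records,
# and fix badly formatted values (all records below are hypothetical).

records = [
    {"id": 1, "age": "34", "country": "NG"},
    {"id": 1, "age": "34", "country": "NG"},   # duplicate row
    {"id": 2, "age": None, "country": "NG"},   # incomplete row
    {"id": 3, "age": "29 ", "country": "ng"},  # badly formatted row
]

def clean(rows):
    seen, out = set(), []
    for row in rows:
        if row["id"] in seen:
            continue                  # remove duplicates
        if row["age"] is None:
            continue                  # drop incomplete records
        seen.add(row["id"])
        out.append({
            "id": row["id"],
            "age": int(str(row["age"]).strip()),  # fix formatting
            "country": row["country"].upper(),    # normalize labels
        })
    return out

cleaned = clean(records)
print(cleaned)  # only the valid, normalized records remain
```

Each rule here (dedupe on `id`, drop rows with missing age, normalize country codes) is a stand-in for the domain-specific cleaning rules a real project would define.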


1st Refresh Session


2.1 Descriptive Data Analytics
Descriptive analytics looks at what has happened in the past. This type of analytics is by far the most commonly used by customers, providing reporting and analysis centered on past events. The purpose of descriptive analytics is simply to describe what has happened; it doesn't try to explain why it happened or to establish cause-and-effect relationships. For example:
a) How many buyers bought A.C. units in December in previous years?
b) How many new users visited the website?
There are two main techniques used in descriptive analytics: data aggregation and data mining.
i) Data aggregation: the process of gathering data and presenting it in a summarized format (sum, average, minimum, maximum, etc.).
ii) Data mining: exploring the data in order to uncover patterns or trends, often presented visually (a bar graph or a pie chart, for example).
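The data-aggregation technique can be shown with the December A.C. example above. The sales figures below are invented purely for illustration.

```python
# Descriptive analytics sketch: summarizing (hypothetical) December A.C.
# sales with the basic aggregates named above.

december_sales = [120, 95, 130, 110, 145]  # units sold per store (assumed)

summary = {
    "total": sum(december_sales),
    "average": sum(december_sales) / len(december_sales),
    "minimum": min(december_sales),
    "maximum": max(december_sales),
}
print(summary)  # one summarized view of what happened last December
```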

2.2 Diagnostic Data Analytics
Diagnostic analytics digs deeper in order to understand why something happened: "Why did it happen?". Its main purpose is to identify and respond to anomalies within your data, i.e. it is used to identify the causes of problems and opportunities. Diagnostic analytics is applied after descriptive analytics has been used to identify trends and patterns. For example:
a) Why was there an increase or decrease in the sales of A.C. units in December?
b) Why are we not meeting our goals?
This technique can also be used to identify opportunities. Probability theory, regression analysis, filtering, and time-series analysis can all be used for diagnostic analysis.

2.3 Predictive Data Analytics
Predictive analytics seeks to predict what is likely to happen in the future: "What might happen?". Based on past patterns and trends, we can devise predictive models that estimate the likelihood of a future event or outcome. Predictive models use the relationships between a set of variables to make predictions. For example:
a) What will the sales improvement be next year (generating insights about the future)?
b) When will a customer make a purchase?
Predictive modeling techniques use statistical models trained on historical data to identify trends and patterns. Predictive analytics is also used for classification; commonly used algorithms include logistic regression, linear regression, Gaussian mixture models (GMMs), etc.
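A minimal predictive model is a least-squares line fitted to past data and extrapolated forward. The yearly sales figures below are assumed, and a real model would need far more data and validation; this sketch only shows the mechanics.

```python
# Predictive analytics sketch: fit a straight line (least squares) to
# assumed past yearly sales, then extrapolate one year ahead.

years = [2019, 2020, 2021, 2022, 2023]
sales = [100, 110, 125, 135, 150]   # hypothetical yearly sales figures

n = len(years)
mean_x = sum(years) / n
mean_y = sum(sales) / n

# Ordinary least-squares slope and intercept
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(years, sales))
         / sum((x - mean_x) ** 2 for x in years))
intercept = mean_y - slope * mean_x

forecast_2024 = slope * 2024 + intercept
print(round(forecast_2024, 1))  # the model's estimate for next year
```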

2.4 Prescriptive Data Analytics
Prescriptive analytics looks at what has happened, why it happened, and what might happen in order to determine what should be done next. This type of analytics is especially useful for making data-driven decisions, and it is also used to optimize business decisions and processes. Prescriptive analytics is the most advanced type of analysis and is applied after the other types have been used to understand the past, identify the causes of problems, and predict future events. For example:
a) How much material should be procured to increase production?
b) How can we improve customer satisfaction?
Prescriptive analytics can recommend actions by using mathematical models and algorithms, and such models can be used for decision optimization.

3.1 Defining the Research Objective
Research objectives describe what your research is trying to achieve and explain why you are pursuing it. They summarize the approach and purpose of your project and help to focus your research. Some of the benefits of having defined research objectives are:
Establish the scope and depth of your project: This helps you avoid unnecessary research. It also means that your research methods and conclusions can easily be evaluated.
Contribute to your research design: When you know what your objectives are, you have a clearer idea of which methods are most appropriate for your research.
Indicate how your project will contribute to extant research: Objectives allow you to display your knowledge of up-to-date research and to employ or build on current research methods.

3.1 How to Define Research Aims and Objectives
Once you have identified the research or business problem you want to solve, you need to set out how you want to solve it. This is where your research aim and objectives come in.
Decide on a general aim: Your research aim should reflect your research problem and should be relatively broad. Example research aim: "To assess the safety features and response times of self-driving cars."
Decide on specific objectives: Break down your aim into a limited number of steps that will help you resolve your research problem. What specific aspects of the problem do you want to examine or understand? Example research objective: "To measure the response time of self-driving cars to pedestrians."
Formulate your aims and objectives: Once you've established your research aim and objectives, you need to explain them clearly to stakeholders and readers.
SMART research objectives:
Specific: Make sure your objectives aren't vague or unclear.
Measurable: Know how you'll measure whether your objectives have been achieved.
Achievable: Your objectives may be challenging, but they should be feasible.
Relevant: Make sure that they directly address the research problem you want to work on.
Time-based: Set clear deadlines for objectives to ensure that the project stays on track.

3.2 Identifying the Data Needed
Identifying the right data for research is essential. Here are some steps that will guide you in identifying relevant research data:
Define your research aim: Clearly state the purpose of your research. What specific questions are you trying to answer? Understanding your research goals will guide your data search.
Choose your data collection method: Primary or secondary. Consider whether you need existing data or whether you'll collect new data. Existing data might be available from public databases, surveys, or other sources. If collecting new data, decide on the appropriate method: qualitative (interviews, observations) or quantitative (surveys, databases, experiments).
Plan your data collection procedures: Outline the steps needed to collect data, including details such as sampling methods, data collection tools, and ethical considerations. Think about the logistics: How will you access the data? What permissions or approvals are required?
Search existing sources and databases: Explore specialized sources related to your objectives. Use search engines, academic databases, and data portals to find relevant datasets. Refine your search using filters such as data format, analysis type, and availability.

3.3 Linking Research Questions to Business Goals
1. Understand business goals: Begin by comprehending the primary business goals. What are the key priorities? What outcomes does the organization aim to achieve? Consider both short-term and long-term objectives.
2. Identify research objectives: Break down the business goals into specific research objectives. These objectives should be measurable, achievable, and relevant to the business targets. For example, if a business goal is to increase customer satisfaction, a research objective could be to understand customers' pain points and preferences.
3. Formulate research questions: Based on the research objectives, create focused and relevant research questions. Ensure that each research question directly contributes to addressing a specific aspect of the business goal. For instance: Business goal: improve product adoption. Research question: "What factors influence users' adoption of our new mobile app?"
4. Prioritize research questions: Not all research questions are equally important. Prioritize them based on their impact on business outcomes, considering the urgency, potential benefits, and feasibility of answering each question.

3.3 Linking Research Questions to Business Goals (Cont'd)
5. Map research questions to business goals: Clearly link each research question to the corresponding business goal, using clear language to convey how answering the question will contribute to achieving the goal. Example: Business goal: enhance employee productivity. Research question: "How can we streamline internal communication tools to improve collaboration among remote teams?"
6. Quantify success metrics: Define success metrics for both research questions and business goals, and specify how you will measure progress or achievement. For instance, if the business goal is to increase revenue, the success metric could be a specific percentage of growth.
7. Iterate and improve: Continuously revisit the relationship between research questions and business goals, and be open to refining or updating research questions based on new insights.

Case Study: Udemy Courses
- Data source
- Objectives: business objectives
- Data extraction
- Data cleaning and wrangling
- Data consolidation
- Power Query

Data in its raw form is not useful to any organization. Data processing is the method of collecting raw data and translating it into usable information. The raw data is collected, filtered, sorted, processed, analyzed, stored, and then presented in a clear format. Data processing is essential for organizations to create better business strategies and increase their competitive edge. Generally, there are six main steps in the data processing cycle:
Step 1: Collection. The collection of raw data is the first step of the data processing cycle. The type of raw data collected has a huge impact on the output produced; hence, raw data should be gathered from defined and accurate sources so that the subsequent findings are valid and usable. Raw data can include monetary figures, website cookies, profit-and-loss statements of a company, user behavior, etc.
Step 2: Preparation. Data preparation, or data cleaning, is the process of sorting and filtering the raw data to remove unnecessary and inaccurate data. Raw data is checked for errors, duplication, miscalculations, or missing data, and transformed into a form suitable for further analysis and processing. The purpose of this step is to remove bad data (redundant, incomplete, or incorrect data) before analysis.
Step 3: Input. In this step, the raw data is converted into machine-readable form and fed into the processing unit. This can take the form of data entry through a keyboard, a scanner, or any other input source.

Step 4: Data processing. In this step, the raw data is subjected to various data processing methods, using machine learning and artificial intelligence algorithms, to generate the desired output. This step may vary slightly from process to process depending on the source of the data being processed (data lakes, online databases, connected devices, etc.) and the intended use of the output.
Step 5: Output. The data is finally transmitted and displayed to the user in a readable form such as graphs, tables, vector files, audio, video, documents, etc. This output can be stored and further processed in the next data processing cycle.
Step 6: Storage. The last step of the data processing cycle is storage, where data and metadata are stored for further use. This allows quick access and retrieval of information whenever needed, and also allows the data to be used directly as input in the next data processing cycle.
Types of data processing:
Batch processing: Data is collected and processed in batches. Used for large amounts of data, e.g. a payroll system.
Real-time processing: Data is processed within seconds of the input being given. Used for small amounts of data, e.g. withdrawing money from an ATM.

Online processing: Data is automatically fed into the CPU as soon as it becomes available. Used for continuous processing of data, e.g. barcode scanning.
Multiprocessing: Data is broken down into frames and processed using two or more CPUs within a single computer system. Also known as parallel processing, e.g. weather forecasting.
Time-sharing: Allocates computer resources and data in time slots to several users simultaneously.
Data processing methods: There are three main data processing methods: manual, mechanical, and electronic.
Manual data processing: Data is handled manually. The entire process of data collection, filtering, sorting, calculation, and other logical operations is done with human intervention, without the use of any electronic device or automation software. It is a low-cost method and requires few or no tools, but it produces a high error rate and high labor costs.
Mechanical data processing: Data is processed mechanically through the use of devices and machines, which can include simple devices such as calculators, typewriters, printing presses, etc. Simple data processing operations can be achieved with this method.
Electronic data processing: Data is processed with modern technologies using data processing software and programs. A set of instructions is given to the software to process the data and yield the output.
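The batch-processing idea (and steps 3 to 5 of the cycle) can be sketched with the payroll example. The employee records, field names, and pay formula below are all hypothetical.

```python
# Batch-processing sketch: records accumulate in a queue (input), then one
# batch run processes them all together and emits the output.
from collections import deque

batch_queue = deque()

def collect(record):
    """Steps 1 and 3: collect a record and feed it into the input queue."""
    batch_queue.append(record)

def run_batch():
    """Step 4: process every queued record in a single batch run."""
    processed = []
    while batch_queue:
        emp = batch_queue.popleft()
        processed.append({
            "name": emp["name"],
            "net_pay": emp["hours"] * emp["rate"],  # assumed pay rule
        })
    return processed  # Step 5: output, ready for display or storage

collect({"name": "Ada", "hours": 160, "rate": 12.0})
collect({"name": "Ben", "hours": 150, "rate": 10.0})
payroll = run_batch()
print(payroll)
```

In a real-time system, by contrast, each record would be processed the moment `collect` is called rather than waiting for a batch run.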

Data exploration involves the use of data visualization tools and statistical techniques to uncover a data set's characteristics and initial patterns. During exploration, raw data is typically reviewed with a combination of manual workflows and automated data exploration techniques to visually explore data sets; look for similarities, patterns, and outliers; and identify the relationships between different variables. Data exploration is the initial step in analyzing any dataset, aimed at gaining familiarity with the information it holds. It involves:
Familiarization with data characteristics: This phase entails getting familiar with the raw data: its format, structure, and the relationships within it. Understanding these aspects is fundamental to navigating and interpreting the data effectively.
Identification of initial patterns: Data exploration aims to uncover the primary patterns, trends, or anomalies present in the dataset. This includes basic statistical analyses to identify central tendencies, distributions, and outliers, which could significantly impact subsequent analyses and interpretations.
Contextualizing the story within the data: This is about uncovering the narrative embedded within the data. By exploring the data initially, analysts gain insight into what stories or insights might be hidden, awaiting discovery through deeper analysis.
Exploratory data analysis (EDA): The EDA phase involves applying various statistical tools such as box plots, scatter plots, histograms, and distribution plots. Additionally, correlation matrices and descriptive statistics are used to uncover links, patterns, and trends within the data.
Model building and validation: During this stage, preliminary models are developed to test hypotheses or predictions. Regression, classification, or clustering techniques are employed depending on the problem at hand, and cross-validation methods are used to assess model performance and generalizability.
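The basic statistical checks of the EDA phase (central tendency, spread, outliers) can be sketched with the standard library. The sample values are invented, and the 2-standard-deviation rule is just one simple, assumed outlier criterion.

```python
# EDA sketch: central tendency, spread, and a simple outlier flag
# on a small hypothetical sample.
import statistics

values = [12, 14, 13, 15, 14, 13, 48]   # one suspicious value

mean = statistics.mean(values)
median = statistics.median(values)
stdev = statistics.stdev(values)

# Flag points more than 2 standard deviations from the mean as outliers
outliers = [v for v in values if abs(v - mean) > 2 * stdev]
print(median, outliers)
```

Note how the median (14) is barely affected by the extreme value while the mean is pulled upward, which is exactly why exploration looks at both.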

Importance of data exploration:
Trend identification and anomaly detection: Data exploration helps uncover underlying trends and patterns within datasets that might otherwise remain unnoticed, and it facilitates the identification of anomalies or outliers that could significantly impact decision-making processes.
Ensuring data quality and integrity: Data exploration is essential for spotting and fixing data quality problems early on. By resolving missing values, outliers, and inconsistencies, it guarantees that the data and models used are accurate and reliable.
Foundation for advanced analysis and modeling: Data exploration sets the foundation for more advanced analyses and modeling techniques. It helps in selecting relevant features, understanding their importance, and refining them for optimal model performance.
Revealing hidden insights: Valuable insights are often hidden within the data and not immediately apparent. Through visualization and statistical analysis, data exploration uncovers these latent insights, providing a deeper understanding of the relationships between variables, correlations, and the factors influencing certain outcomes.
Supporting informed decision-making: By surfacing patterns and insights, data exploration gives decision-makers a clearer understanding of the data. This enables informed, evidence-based decision-making across domains such as marketing strategy, risk assessment, resource allocation, and operational efficiency improvement.

Example of data exploration in finance: detecting fraudulent activities through irregular transaction patterns. In the financial domain, data exploration plays a key role in safeguarding institutions against fraudulent practices by accurately analyzing transactional data.
Anomaly detection techniques: Data exploration employs advanced anomaly detection algorithms to sift through vast volumes of transactional data, identifying deviations from established patterns such as irregular transaction amounts, unusual frequency, or unexpected transaction locations.
Behavioral analysis: By analyzing historical transactional behavior, data exploration distinguishes normal patterns from suspicious activities. This includes recognizing deviations from regular spending habits, unusual timeframes for transactions, or atypical transaction sequences.
Pattern recognition: Through advanced data exploration methods, financial institutions can uncover intricate patterns that might indicate fraudulent behavior. This could involve recognizing specific sequences of transactions, correlations between seemingly unrelated accounts, or unusual clusters of transactions occurring concurrently.
Machine learning models: Leveraging machine learning models as part of data exploration enables the creation of predictive fraud detection systems. These models, trained on historical data, can continuously learn and adapt to evolving fraudulent tactics, enhancing their accuracy in identifying suspicious transactions.
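The behavioral-analysis idea above, scoring a new transaction against an account's historical spending, can be sketched with a simple z-score rule. The transaction amounts and the 3-sigma threshold are assumptions for illustration; production fraud systems use far richer models.

```python
# Anomaly-detection sketch: flag a transaction as suspicious when it
# deviates too far from the account's (hypothetical) spending history.
import statistics

history = [45.0, 52.0, 48.0, 50.0, 47.0, 55.0]   # past transaction amounts
mu = statistics.mean(history)
sigma = statistics.stdev(history)

def is_suspicious(amount, threshold=3.0):
    """Flag amounts more than `threshold` std devs from normal spending."""
    return abs(amount - mu) / sigma > threshold

print(is_suspicious(51.0))    # an amount in line with past behavior
print(is_suspicious(900.0))   # a grossly irregular amount
```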

We will explore the main steps in the data analysis process, covering how to define your goal, collect data, and carry out an analysis.
Step one: Defining the question. The first step in any data analysis process is to define your objective; in data analytics, this is called the 'problem statement'. Defining your objective means coming up with a hypothesis and figuring out how to test it. Start by asking: What business problem am I trying to solve? For instance, your organization's senior management might pose an issue, such as: "Why are we losing customers?" It's possible, though, that this doesn't get to the core of the problem. Keeping track of business metrics, key performance indicators (KPIs), and monthly reports can help you pinpoint problem areas in the business.
Step two: Collecting the data. Once you've established your objective, you'll need to create a strategy for collecting and aggregating the appropriate data. A key part of this is determining which data you need. This might be quantitative (numeric) data, e.g. sales figures, or qualitative (descriptive) data, such as customer reviews. All data fit into one of three categories: first-party, second-party, and third-party data. Let's explore each one.

What is first-party data? First-party data is data that you, or your company, have collected directly from customers. It might come in the form of transactional tracking data or information from your company's customer relationship management (CRM) system. Whatever its source, first-party data is usually structured and organized in a clear, defined way. Other sources of first-party data might include customer satisfaction surveys, focus groups, interviews, or direct observation.
What is second-party data? To enrich your analysis, you might want to secure a secondary data source. Second-party data is the first-party data of other organizations. It might be available directly from the company or through a private marketplace. The main benefit of second-party data is that it is usually structured, and although it will be less relevant than first-party data, it also tends to be quite reliable. Examples of second-party data include website, app, or social media activity, such as online purchase histories or shipping data.
What is third-party data? Third-party data is data that has been collected and aggregated from numerous sources by a third-party organization. Often (though not always) third-party data contains a vast number of unstructured data points (big data). Many organizations collect big data to create industry reports or to conduct market research.

Step three: Cleaning the data. Once you've collected your data, the next step is to get it ready for analysis. This means cleaning, or 'scrubbing', it, and is crucial to making sure that you're working with high-quality data. Key data cleaning tasks include:
Removing major errors, duplicates, and outliers: all of these are inevitable problems when aggregating data from numerous sources.
Removing unwanted data points: extracting irrelevant observations that have no bearing on your intended analysis.
Bringing structure to your data: general 'housekeeping', i.e. fixing typos or layout issues, which will help you map and manipulate your data more easily.
Filling in major gaps: as you're tidying up, you might notice that important data are missing. Once you've identified gaps, you can go about filling them.
Step four: Analyzing the data. Finally, you've cleaned your data. Now comes the fun bit: analyzing it! The type of data analysis you carry out largely depends on what your goal is, but there are many techniques available: univariate or bivariate analysis, time-series analysis, variance analysis, KPIs, financial analysis, and regression analysis.
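The "filling in major gaps" task can be sketched with mean imputation, one common (though not always appropriate) way to fill missing values. The sensor readings below are hypothetical, with `None` marking a gap.

```python
# Gap-filling sketch: impute missing readings with the mean of the
# known values (hypothetical data; None marks a missing reading).

readings = [20.5, 21.0, None, 19.5, None, 20.0]

known = [v for v in readings if v is not None]
fill = sum(known) / len(known)               # mean of the known values
filled = [fill if v is None else v for v in readings]
print(filled)
```

Mean imputation is only safe when values are missing at random; other strategies (median, interpolation, or dropping the rows) may fit a given dataset better.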

a. Regression analysis: Regression analysis is used to estimate the relationship between a set of variables. When conducting any type of regression analysis, you're looking to see whether there's a correlation between a dependent variable (the variable or outcome you want to measure or predict) and any number of independent variables (factors that may have an impact on the dependent variable). The aim of regression analysis is to estimate how one or more variables might impact the dependent variable, in order to identify trends and patterns. (Case Study 2)
b. Monte Carlo simulation: Monte Carlo simulation, otherwise known as the Monte Carlo method, is a computerized technique used to generate models of possible outcomes and their probability distributions. It essentially considers a range of possible outcomes and then calculates how likely it is that each particular outcome will be realized. The Monte Carlo method is used by data analysts to conduct advanced risk analysis, allowing them to better forecast what might happen in the future and make decisions accordingly.
c. Factor analysis: Factor analysis is a technique used to reduce a large number of variables to a smaller number of factors. It works on the basis that multiple separate, observable variables correlate with each other because they are all associated with an underlying construct. This is useful not only because it condenses large datasets into smaller, more manageable samples, but also because it helps to uncover hidden patterns. It allows you to explore concepts that cannot be easily measured or observed, such as wealth, happiness, fitness, or, for a more business-relevant example, customer loyalty and satisfaction.
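The Monte Carlo method can be sketched with a toy risk question: how likely is total demand to exceed the stock on hand? Everything here (the stock level, the three product lines, the uniform demand model) is an assumption chosen purely to illustrate the mechanics.

```python
# Monte Carlo sketch: estimate the probability that total monthly demand
# exceeds stock, under an assumed demand distribution.
import random

random.seed(42)                  # fixed seed so runs are reproducible
STOCK = 330
TRIALS = 100_000

def monthly_demand():
    # Assumed model: three product lines, each with uniform demand 80-120
    return sum(random.randint(80, 120) for _ in range(3))

shortfalls = sum(monthly_demand() > STOCK for _ in range(TRIALS))
probability = shortfalls / TRIALS
print(f"P(demand > stock) is roughly {probability:.3f}")
```

The estimate sharpens as the number of trials grows; the method's appeal is that the same loop works for distributions far too complex to analyze in closed form.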

[Figures: Regression Analysis, Monte Carlo Simulation, Factor Analysis]

d. Cohort analysis: Cohort analysis is a data analytics technique that groups users based on a shared characteristic, such as the date they signed up for a service or the product they purchased. Once users are grouped into cohorts, analysts can track their behavior over time to identify trends and patterns. A cohort is a group of people who share a common characteristic (or action) during a given time period. Students who enrolled at university in 2020 may be referred to as the 2020 cohort; customers who purchased something from your online store via the app in the month of December may also be considered a cohort. (Case Study 3)
e. Cluster analysis: Cluster analysis is an exploratory technique that seeks to identify structures within a dataset. The goal of cluster analysis is to sort data points into groups (or clusters) that are internally homogeneous and externally heterogeneous: data points within a cluster are similar to each other and dissimilar to data points in other clusters. There are many real-world applications of cluster analysis.
f. Time series analysis: Time series analysis is a statistical technique used to identify trends and cycles over time. Time series data is a sequence of data points measuring the same variable at different points in time (for example, weekly sales figures or monthly email sign-ups). By looking at time-related trends, analysts can forecast how the variable of interest may fluctuate in the future. When conducting time series analysis, the main patterns to look out for in your data are:
Trends: stable, linear increases or decreases over an extended time period.
Seasonality: predictable fluctuations in the data due to seasonal factors over a short period of time.
Cyclic patterns: unpredictable cycles in which the data fluctuates. Cyclical trends are not due to seasonality, but rather may occur as a result of economic or industry-related conditions.
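Cohort analysis can be sketched by grouping customers on their signup month and computing a retention rate per cohort. The customer records and the "active next month" retention metric are hypothetical.

```python
# Cohort-analysis sketch: group (hypothetical) customers by signup month,
# then compute how many in each cohort stayed active the following month.
from collections import defaultdict

customers = [
    {"id": 1, "signup": "2024-01", "active_next_month": True},
    {"id": 2, "signup": "2024-01", "active_next_month": False},
    {"id": 3, "signup": "2024-02", "active_next_month": True},
    {"id": 4, "signup": "2024-02", "active_next_month": True},
]

cohorts = defaultdict(lambda: {"size": 0, "retained": 0})
for c in customers:
    cohort = cohorts[c["signup"]]
    cohort["size"] += 1
    cohort["retained"] += c["active_next_month"]   # bool counts as 0/1

retention = {k: v["retained"] / v["size"] for k, v in sorted(cohorts.items())}
print(retention)   # retention rate per signup cohort
```

Comparing the rates across cohorts is what reveals trends, e.g. whether newer signups retain better than older ones.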

[Figures: Cohort Analysis, Cluster Analysis, Time Series Analysis]

A data report is an analytical tool used to extract past, present, and future performance insights to accelerate a company's growth. It combines various sources of information and is usually used for operational and strategic decision-making.
The importance of data reporting: In our current data-driven world, the importance of data reporting cannot be overstated. Organizations rely heavily on data to gain insights into their operations, understand market trends, and make informed decisions.
Better decision-making: Effective reporting provides insights into trends, patterns, and performance metrics, enabling stakeholders to make informed decisions based on evidence.
Objective performance evaluation: By analyzing data reports, organizations can evaluate the performance of individuals, teams, departments, or entire systems. This helps leaders find areas for improvement and recognize strengths, weaknesses, opportunities, and threats.
Better resource allocation: Data analysis reports can help decision-makers proactively decide where resources are most needed and where they can be optimized for better outcomes. They can also identify potential areas of waste, allowing them to curb losses and reduce budget strain.
Improved risk management: Every organization faces risks, and reporting can help call attention to risks that might otherwise go unnoticed.

Customer insights: Analyzing customer data helps organizations understand customer behavior, preferences, and needs. These insights can lead to a better marketing approach and customer experience.
Accountability and transparency: Data (and access to data) helps foster a culture of transparency and allows users to be more accountable for their performance and decisions. Empowering employees with data gives them clear metrics against which they can measure their performance and activities.
New trends and opportunities: Data can reveal many things about your company and services that might otherwise go unnoticed, such as emerging trends and patterns. Spotting these trends early gives you the best chance of getting ahead of competitors.
Data reporting basics:
Purpose: Data analytics is the art of curating and analyzing raw data to transform metrics into actionable insights. Data analytics reports present metrics, analyses, conclusions, and recommendations in an accessible, digestible visual format so everyone can make informed, data-driven decisions.
Data types: Business data reports cover a variety of topics and functions; as such, they vary greatly in length, content, and format. Reporting data can be presented as an annual overview, monthly sales figures, an accounting report, data requested by management exploring a specific issue, information requested by stakeholders showing a company's compliance with regulations, progress reports, feasibility studies, and more.
Accessibility: Historically, creating data-driven reports was time- and resource-intensive. Data pull requests were the exclusive duty of the IT department, with significant effort spent analyzing, formatting, and then presenting the data. Modern business dashboard tools allow a wider audience to comprehend and disseminate the findings, and users can easily export these dashboards and data visualizations.

- Flexibility: In addition to offering a wealth of visually accessible, KPI-driven insight, business intelligence dashboards are completely customizable to suit individual goals or needs.

Types of Data Reports
- Informational vs. analytical
- Recommendation/justification report
- Investigative report
- Compliance report
- Feasibility report
- Research studies report
- Periodic report
- KPI report

Data Report Examples (Dashboard Templates)
Financial KPI Dashboard, Marketing KPI Dashboard, Retail KPI Dashboard, Customer Support Dashboard, Talent Management Dashboard, Sales Opportunity Dashboard, IT Issue Management Dashboard, Procurement Quality Dashboard, etc.

[Slide images: Customer Support Dashboard and Finance KPI Dashboard examples]

[Slide images: Marketing KPI Dashboard and Talent Management Dashboard examples]

- Establish trust and rapport: One of the most important factors for validating stakeholder needs is building trust and rapport. Trust is the foundation of a productive stakeholder relationship. You can use techniques such as mirroring, paraphrasing, and summarizing to demonstrate that you understand stakeholders' perspectives and concerns.
- Use multiple elicitation techniques: To engage stakeholders unfamiliar with analysis processes, a business analyst can utilize various elicitation techniques. Start with interviews to grasp their needs and expectations. You can combine interviews, surveys, workshops, observations, and document analysis to collect data from various stakeholders and contexts, and use techniques such as brainstorming, prototyping, storyboarding, and scenarios to generate and test ideas with your stakeholders.
- Apply the MoSCoW method: A useful tool for validating stakeholder needs is the MoSCoW method, which stands for Must have, Should have, Could have, and Won't have. This method helps you prioritize and categorize stakeholder needs based on their importance and urgency, and align them with the project scope and objectives.
- Validate early and often: A final way to validate stakeholder needs is to do so early and often throughout the analysis process. This means involving your stakeholders in every stage of the analysis, from defining the problem and the solution vision to specifying the requirements and the acceptance criteria. You can also use techniques such as feedback sessions, walkthroughs, demonstrations, and testing to validate your deliverables.
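The MoSCoW categorization described above can be sketched in a few lines of Python. The requirement names and their priorities below are hypothetical, purely for illustration:

```python
# Hypothetical requirements gathered from stakeholder interviews,
# each tagged with a MoSCoW priority: Must, Should, Could, Won't have.
requirements = [
    ("User login", "Must"),
    ("Export to PDF", "Should"),
    ("Dark mode", "Could"),
    ("Offline sync", "Won't"),
    ("Audit trail", "Must"),
]

def moscow_buckets(reqs):
    """Group (name, priority) pairs into the four MoSCoW categories."""
    buckets = {"Must": [], "Should": [], "Could": [], "Won't": []}
    for name, priority in reqs:
        buckets[priority].append(name)
    return buckets

buckets = moscow_buckets(requirements)
print(buckets["Must"])  # the highest-priority items, delivered first
```

Grouping requirements this way makes the priority conversation with stakeholders concrete: everything outside the "Must" bucket is explicitly negotiable.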


Data interpretation refers to the process of taking raw data and transforming it into useful information. This involves analyzing the data to identify patterns, trends, and relationships, and then presenting the results in a meaningful way.

Methods of Data Interpretation
- Descriptive statistics: Descriptive statistics involve summarizing and presenting data in a way that makes it easy to understand. This can include calculating measures such as the mean, median, mode, and standard deviation.
- Inferential statistics: Inferential statistics involve making inferences and predictions about a population based on a sample of data. This type of data interpretation uses statistical models and algorithms to identify patterns and relationships in the data.
- Visualization techniques: Visualization techniques involve creating visual representations of data, such as graphs, charts, and maps. These techniques are particularly useful for communicating complex data in an easy-to-understand manner and for identifying patterns and trends.

Data Interpretation Tools
- Google Sheets: Google Sheets is a free, web-based spreadsheet application that allows users to create, edit, and format spreadsheets. It provides a range of features for data interpretation, including functions, charts, and pivot tables.
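The descriptive measures named above (mean, median, mode, standard deviation) are all available in Python's standard library. A minimal sketch, using hypothetical monthly sales figures:

```python
import statistics

# Hypothetical monthly sales figures, for illustration only.
sales = [120, 135, 150, 135, 160, 175, 135, 190]

mean = statistics.mean(sales)      # central tendency: the average
median = statistics.median(sales)  # the middle value, robust to outliers
mode = statistics.mode(sales)      # the most frequent value
std_dev = statistics.stdev(sales)  # spread: sample standard deviation

print(f"mean={mean:.1f} median={median} mode={mode} stdev={std_dev:.1f}")
```

Reporting the median alongside the mean is a quick sanity check: a large gap between the two usually signals skewed data or outliers.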

[Slide images: Descriptive Statistics vs. Inferential Statistics]

- Microsoft Excel: Microsoft Excel is spreadsheet software widely used for data interpretation. It provides various functions and features to help you analyze and interpret data, including sorting, filtering, pivot tables, and charts.
- Tableau: Tableau is a data visualization tool that helps you see and understand your data. It allows you to connect to various data sources and create interactive dashboards and visualizations to communicate insights.
- Power BI: Power BI is a business analytics service that provides interactive visualizations and business intelligence capabilities with an interface simple enough for end users to create their own reports and dashboards.
- R: R is a programming language and software environment for statistical computing and graphics. It is widely used by statisticians, data scientists, and researchers to analyze and interpret data.
- Python libraries: Matplotlib, Seaborn, and Plotly are popular Python libraries for creating static and interactive visualizations.

- Understand your data; get help with analysis if you need it.
- Start with simple analysis first. Look for patterns.
- Identify the data that tells you the most about your outcome and is most helpful in making improvements.
- Summarize. What are the top points for each outcome, variable, or concept you assessed?
- Use charts if they make data more understandable.
- Determine which audiences need to know about what information in order to make improvements. What are their needs, perspectives, or priorities?

TELL A STORY
- Tell a story of what you've learned and what you decide or recommend.
  - What was relevant for making decisions and taking action?
  - What was particularly interesting?
  - What (if anything) was unexpected?
  - Which differences found are meaningful?
- Ensure data visualizations communicate clearly and accurately.
- Acknowledge limitations and flaws.


Data visualization is the graphical representation of information and data. By using visual elements like charts, graphs, and maps, data visualization tools provide an accessible way to see and understand trends, outliers, and patterns in data. Let's now examine the most popular data visualization techniques!

1. Line Chart
One of the most used visualizations, line charts are excellent at tracking the evolution of a variable over time. They are normally created by putting a time variable on the x-axis and the variable you want to analyze on the y-axis. For example, the line plot below shows the evolution of the DJIA stock price during 2022.
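A line chart like the one described above can be sketched with Matplotlib, one of the Python libraries the deck recommends. This assumes Matplotlib is installed; the monthly prices are made up for illustration, not real DJIA data:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display window needed
import matplotlib.pyplot as plt

# Hypothetical monthly closing prices (not real DJIA data).
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
prices = [33000, 33900, 34700, 33800, 32900, 31000]

fig, ax = plt.subplots()
ax.plot(months, prices, marker="o")  # time on the x-axis, value on the y-axis
ax.set_xlabel("Month")
ax.set_ylabel("Closing price")
ax.set_title("Stock price over time (illustrative)")
fig.savefig("line_chart.png")
```

The same `ax.plot` call with different data covers most time-series charting needs; only labels and formatting change.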

2. Bar Chart
A bar chart ranks data according to the value of multiple categories. It consists of rectangles whose lengths are proportional to the value of each category. Bar charts are prevalent because they are easy to read. Businesses commonly use bar charts to make comparisons, such as comparing the market share of different brands or the revenue of different regions. There are multiple types of bar charts, each suited to a different purpose, including vertical bar plots, horizontal bar plots, and clustered bar plots.

3. Histogram
Histograms are one of the most popular visualizations for analyzing the distribution of data. They show a numerical variable's distribution with bars. To build a histogram, the numerical data is first divided into several ranges or bins, and the frequency of occurrence within each range is counted. The horizontal axis shows the ranges, while the vertical axis represents the frequency or percentage of occurrences in each range.
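The binning step described above is the whole computation behind a histogram. A minimal sketch with made-up ages and bin edges:

```python
def histogram_counts(values, bin_edges):
    """Count values falling into each [edge[i], edge[i+1]) range;
    the final bin is closed on the right so the maximum is included."""
    counts = [0] * (len(bin_edges) - 1)
    for v in values:
        for i in range(len(counts)):
            last = i == len(counts) - 1
            if bin_edges[i] <= v < bin_edges[i + 1] or (last and v == bin_edges[-1]):
                counts[i] += 1
                break
    return counts

# Hypothetical customer ages, binned by decade.
ages = [21, 25, 28, 31, 34, 35, 42, 47, 51, 55, 58, 60]
edges = [20, 30, 40, 50, 60]
print(histogram_counts(ages, edges))  # frequencies per bin: [3, 3, 2, 4]
```

Charting libraries perform exactly this count internally before drawing the bars; choosing the bin edges is the analyst's main decision.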

4. Scatter Plot
Scatter plots are used to visualize the relationship between two continuous variables. Each point on the plot represents a single data point, and the position of the point on the x- and y-axes represents the values of the two variables. Scatter plots are often used in data exploration to understand the data and quickly surface potential correlations.

5. Bubble Plot
Scatter plots can be easily augmented by adding elements that represent new variables, for example by changing the size of the points according to a third variable. This is what characterizes the so-called bubble plot. For example, this well-known graph shows the relationship between a country's life expectancy and GDP, adding color to represent the country's region and size to represent the country's population. Source: Gapminder

6. Tree Maps
Tree maps are suitable for showing part-to-whole relationships in data. They display hierarchical data as a set of rectangles. Each rectangle is a category within a given variable, and the area of the rectangle is proportional to the size of that category. Compared to similar visualizations, like pie charts, tree maps are considered more intuitive and are often preferred.
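The "area proportional to category size" rule above is simple arithmetic. A sketch with hypothetical sales by product line, allocating a fixed chart area among the categories:

```python
# Hypothetical sales by product line.
sales = {"Laptops": 500, "Phones": 300, "Tablets": 150, "Accessories": 50}

total = sum(sales.values())
total_area = 100.0  # say the whole treemap covers 100 area units
# Each category's rectangle gets area proportional to its share of the total.
areas = {k: total_area * v / total for k, v in sales.items()}

print(areas["Laptops"])  # 500/1000 of the total area
```

Treemap layout algorithms (such as squarified treemaps) then pack rectangles with exactly these areas into the chart, trying to keep each rectangle close to square.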

7. Heatmaps
A heatmap is a common matrix plot that can be used to graphically summarize the relationship between two variables, with the value of each cell represented by a color code. For example, the heatmap below analyzes the occupations of the guests of the Daily Show during the 1999-2012 period.
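The matrix behind a heatmap like the Daily Show example is just a cross-tabulation of two variables. A sketch with a handful of made-up guest records:

```python
from collections import Counter

# Hypothetical guest records as (year, occupation) pairs. A heatmap of this
# data would color each (year, occupation) cell according to its count.
guests = [
    (1999, "Actor"), (1999, "Actor"), (1999, "Politician"),
    (2000, "Actor"), (2000, "Musician"), (2000, "Politician"),
    (2000, "Politician"),
]

matrix = Counter(guests)  # the cell values of the heatmap
print(matrix[(2000, "Politician")])  # guests in that cell
```

Plotting tools then map these counts onto a color scale, so denser cells stand out at a glance even in large matrices.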

8. Word Clouds
Word clouds are useful for visualizing common words in a text or data set. They're similar to bar plots but are often more visually appealing, though at times they can be harder to interpret. Word clouds are useful in the following scenarios:
- Quickly identifying the most important themes or topics in a large body of text.
- Understanding the overall sentiment or tone of a piece of writing.
- Exploring patterns or trends in data that contain textual information.
- Communicating the key ideas or concepts in a visually engaging way.
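A word cloud is driven by word frequencies: the more often a word appears, the larger it is drawn. The counting step can be sketched with the standard library; the sample text and stop-word list below are illustrative:

```python
import re
from collections import Counter

# Word frequencies are the raw input of a word cloud.
text = ("Data analytics turns raw data into insights. "
        "Good analytics needs good data.")

words = re.findall(r"[a-z]+", text.lower())      # normalize and tokenize
stopwords = {"the", "a", "into", "needs"}        # illustrative stop-word list
freq = Counter(w for w in words if w not in stopwords)

print(freq.most_common(2))  # the words that would be drawn largest
```

Filtering stop words first matters: without it, function words like "the" dominate the cloud and drown out the actual themes.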

9. Maps
A considerable proportion of the data generated every day is inherently spatial. Spatial data, also known as geospatial data or geographic information, is data for which a specific location is associated with each record. Every spatial data point can be located on a map using a coordinate reference system. For example, the image below shows the different districts of Barcelona.

10. Network Diagrams
Most data is stored in tables, but this is not the only format available. So-called graphs are better suited to analyzing data that is organized in networks, ranging from online social networks, like Facebook and Twitter, to transportation networks, like metro lines. Network analytics is the subdomain of data science that uses graphs to study networks. Network graphs consist of two main components: nodes and edges, also known as relationships. Below is an example of a simple network graph.
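The nodes-and-edges structure described above can be represented as an adjacency list. A minimal sketch with a hypothetical four-person social network:

```python
from collections import defaultdict

# Edges of a small, hypothetical social network (node names are made up).
edges = [("Ana", "Ben"), ("Ben", "Cruz"), ("Ana", "Cruz"), ("Cruz", "Dee")]

# Adjacency list: each node maps to the set of nodes it connects to.
graph = defaultdict(set)
for a, b in edges:  # undirected network: record each edge in both directions
    graph[a].add(b)
    graph[b].add(a)

# A node's "degree" is its number of connections: a basic network metric.
degree = {node: len(neigh) for node, neigh in graph.items()}
print(degree["Cruz"])  # Cruz is the best-connected node here
```

Libraries such as NetworkX build on exactly this structure, adding metrics like centrality and shortest paths on top of it.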


Data visualization tools provide designers with an easier way to create visual representations of large data sets.

1. Plotly
Plotly at a glance:
- Availability: Open-source software, with enterprise versions available.
- Commonly used by: Data analysts and data scientists.
- Pros: Highly customizable visuals, with many different tools available under the Plotly banner.
- Cons: Requires coding knowledge.

2. D3.js
D3.js at a glance:
- Availability: Free, open-source software.
- Commonly used by: Data analysts and data scientists.
- Pros: Rich visualizations, highly customizable, large support community.
- Cons: Steep learning curve; not suited to other data analytics tasks, e.g. data cleaning and analysis; difficult installation process.

3. Qlik Sense
Qlik Sense at a glance:
- Availability: Commercial as Qlik Sense Business, but a free trial is available for download.
- Commonly used by: Business analysts.
- Pros: Speed; it uses in-memory databases and can pull data from many sources without slowdown. Generative AI-powered features.
- Cons: Requires programming expertise.

4. Tableau
Tableau at a glance:
- Availability: Tableau Desktop is the enterprise version. Tableau Public is free but has limited functionality.
- Commonly used by: Data analysts and business intelligence analysts.
- Pros: Excellent visualizations, usability, speed, and interactivity.
- Cons: No data pre-processing, poor customization of functions.

5. Grafana
Grafana at a glance:
- Availability: Open-source, but there's also an enterprise version with extra functionality.
- Commonly used by: Development and IT operations, as well as AdTech.
- Pros: Lovely visuals and a user-friendly output.
- Cons: Hard to set up.

6. Datawrapper
Datawrapper at a glance:
- Availability: Commercial but affordable, with a free version available.
- Commonly used by: Journalists and those working in news media.
- Pros: Quick to use and simple for non-technical users.
- Cons: Limited customization; only good for basic visualizations, not designed for analytics.

7. Microsoft Power BI
Microsoft Power BI at a glance:
- Availability: Commercial software (with a free version available).
- Commonly used by: Business analysts.
- Pros: Great visualizations, good data connectivity; suitable for a wide range of analytics tasks.
- Cons: Interface is not that intuitive, and it has rigid formulas.

8. Microsoft Excel
Microsoft Excel at a glance:
- Availability: Commercial software (with a free web version available).
- Commonly used by: Data analysts and business analysts.
- Pros: Familiar, widely available interface; quick charts, pivot tables, and ad hoc analysis.
- Cons: Limited for very large data sets and for advanced, interactive visualizations.

9. Google Data Studio
Google Data Studio at a glance:
- Availability: Free (now part of Google's Looker Studio).
- Commonly used by: Marketers and business analysts.
- Pros: Easy to use, web-based, and integrates tightly with Google products such as Analytics, Ads, Sheets, and BigQuery.
- Cons: Limited connectors and features outside the Google ecosystem.

- Consider your audience. As a golden rule, you should always empathize with the audience your visualization is addressing. This means having a good understanding of your audience's area of expertise, level of technical knowledge, and interests.
- Clear the clutter. To avoid making unreadable, cluttered visualizations, ask yourself whether what you're including is relevant to the audience, and remove unnecessary elements as much as you can.
- Keep an eye on the fonts. Even though it can be tempting to use different fonts and sizes, as a general rule of thumb, stick to one font with no more than three different sizes. Follow the font hierarchy and keep headings larger than the body, and use a bold typeface to highlight key elements and headings.
- Use colors creatively. Color is one of the most eye-catching aspects of any data visualization, so put a lot of thought into choosing the color scheme of your data visualization.


Case Study 2
ASA Sales Data: Source
Objectives:
- Business Objectives
- Data Extraction
- Data Importing
- Data Cleaning and Wrangling
- Data Modeling
- Conditional Formatting
- Excel Pivot Tables and Chart
- Excel Formula


8.0 FINAL ASSESSMENT
- Build a final project
- Create Portfolio
- Optimization of LinkedIn Profile
- Revamping CV
Data Analytics Africa