Data Science and Analytics Lesson 1.pptx

XanGwaps 109 views 24 slides Oct 07, 2024

About This Presentation

Data Science


Slide Content

Introduction to Data Science and Analytics LESSON 1 Inst. Zandro Apostol

Objectives: What is Data Science; Data Lifecycle; What is Analytics; Difference between Data Science and Analytics; Data; Data Types; Excel as Analysis Tool

What is Data Science? Data science is an interdisciplinary field that involves using statistical techniques, programming, and domain knowledge to analyze and interpret complex data sets, with the goal of extracting meaningful insights and making data-driven decisions. It combines elements of mathematics, computer science, and domain expertise to build predictive models, identify patterns, and solve real-world problems across various industries.

DATA LIFECYCLE - the foundational framework that guides the systematic process of turning raw data into actionable insights. 1. Data Collection 2. Data Cleaning 3. Data Processing 4. Data Analysis 5. Data Visualization

1. DATA COLLECTION Data science begins with the collection of raw data, which serves as the foundational input for all subsequent stages. In data science, this step might involve gathering large volumes of data from various sources such as databases, web scraping, IoT sensors, or public datasets. The goal is to obtain data that is both relevant and sufficient for the problem being addressed, whether it’s for predictive modeling, pattern recognition, or trend analysis.

2. DATA CLEANING Once data is collected, it often contains errors, inconsistencies, or missing values. In data science, data cleaning is essential because algorithms and models require high-quality, consistent data to function accurately. Data scientists spend a significant amount of time cleaning data to ensure that their models are trained on reliable and accurate datasets. This step includes tasks like removing duplicates, handling missing values, correcting errors, and standardizing data formats.
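
The cleaning tasks named above (removing duplicates, handling missing values, standardizing formats) can be sketched in a few lines of Python. This is only a minimal illustration using the standard library; the records, field names, and the clean_records helper are invented for the example and are not part of the lesson.

```python
# Hypothetical raw records; the field names and values are invented for illustration.
raw_records = [
    {"name": " Alice ", "city": "manila", "age": "29"},
    {"name": "Bob", "city": "Cebu", "age": ""},          # missing age
    {"name": " Alice ", "city": "manila", "age": "29"},  # exact duplicate
    {"name": "Carla", "city": "DAVAO", "age": "41"},
]


def clean_records(records):
    """Standardize text, convert types, mark missing values, drop duplicates."""
    cleaned = []
    seen = set()
    for record in records:
        name = record["name"].strip()              # remove stray spaces
        city = record["city"].strip().title()      # standardize capitalization
        age_text = record["age"].strip()
        age = int(age_text) if age_text else None  # keep missing values explicit
        key = (name, city, age)
        if key in seen:                            # skip rows that were already kept
            continue
        seen.add(key)
        cleaned.append({"name": name, "city": city, "age": age})
    return cleaned


cleaned_records = clean_records(raw_records)
for row in cleaned_records:
    print(row)
```

Here a missing age is stored as None rather than guessed; whether to drop, keep, or fill such values depends on the problem being addressed.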

3. DATA PROCESSING After cleaning, data needs to be processed to prepare it for analysis. In data science, this involves transforming the data into a format suitable for machine learning algorithms or statistical models. Data processing might include normalizing data, encoding categorical variables, aggregating data, or integrating multiple datasets. This step is crucial because the way data is processed directly impacts the performance and accuracy of the models built during the analysis phase.
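
Two of the processing steps mentioned above, normalizing data and encoding categorical variables, might look like the following sketch. It assumes min-max scaling and one-hot encoding as the specific techniques (the slide does not say which ones are meant), and the sample values are hypothetical.

```python
def min_max_normalize(values):
    """Rescale numbers to the range 0..1 (min-max normalization)."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        return [0.0 for _ in values]  # all values equal: nothing to rescale
    return [(v - lowest) / (highest - lowest) for v in values]


def one_hot_encode(labels):
    """Turn each category label into a list of 0/1 indicator values."""
    categories = sorted(set(labels))
    encoded = [[1 if label == c else 0 for c in categories] for label in labels]
    return categories, encoded


# Hypothetical sample data for illustration only.
incomes = [20000, 35000, 50000]
regions = ["north", "south", "north"]

scaled_incomes = min_max_normalize(incomes)
categories, encoded_regions = one_hot_encode(regions)

print(scaled_incomes)
print(categories, encoded_regions)
```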

4. DATA ANALYSIS Data analysis is the core of data science, where data scientists apply various statistical methods, algorithms, and machine learning techniques to extract insights from the processed data. This stage can involve exploratory data analysis (EDA) to understand the data's underlying patterns, followed by more complex tasks like building predictive models or identifying trends. The insights gained here are what drive decision-making and strategic planning.

5. DATA VISUALIZATION Finally, data visualization is the stage where the results of the data analysis are presented in a clear and accessible way. In data science, effective visualization is critical for communicating findings to stakeholders, who may not have a technical background. Visualization tools and techniques allow data scientists to highlight key patterns, trends, and anomalies in the data, making it easier for decision-makers to understand and act on the insights.

What is Data Analytics? Data analytics refers to the focused practice of examining and interpreting datasets to extract meaningful insights that inform decision-making. It involves techniques such as exploratory data analysis (EDA), where analysts investigate the data’s structure, patterns, and relationships to identify trends and anomalies. Data analytics also encompasses the use of statistical methods, data querying, and visualization tools to summarize and present data in a way that is easily understandable and actionable. The primary goal of data analytics in this context is to provide clear, data-driven answers to specific questions, helping organizations make informed decisions based on the evidence derived from their data.

Types of Data Analytics Data analytics can be categorized into several types, each serving a different purpose in the process of extracting insights from data. Here are the four main types of data analytics: 01 Descriptive Analytics 02 Diagnostic Analytics 03 Predictive Analytics 04 Prescriptive Analytics

01 Descriptive Analytics Purpose: Descriptive analytics answers the question, "What happened?" It focuses on summarizing historical data to understand trends, patterns, and relationships. Methods: Common techniques include data aggregation, data mining, and simple statistical analysis (e.g., mean, median, mode). Examples: Monthly sales reports, website traffic analysis, and customer demographics summaries.

02 Diagnostic Analytics Purpose: Diagnostic analytics addresses the question, "Why did it happen?" It goes a step further than descriptive analytics by investigating the causes of trends or anomalies in the data. Methods: Techniques include drill-down, data discovery, correlations, and data querying to identify relationships and possible causes. Examples: Root cause analysis of a drop in sales, exploring correlations between marketing campaigns and customer engagement, and understanding factors contributing to high employee turnover.
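
As a rough illustration of these two types, the sketch below computes the simple statistics named under descriptive analytics (mean, median, mode) and a Pearson correlation of the kind a diagnostic analysis might start from. The sales and advertising figures are hypothetical, and pearson_correlation is a helper written for this example.

```python
import math
import statistics

# Hypothetical monthly figures, invented for illustration.
monthly_sales = [120, 135, 150, 150, 170, 195]  # units sold
ad_spend = [10, 12, 15, 14, 18, 22]             # advertising budget

# Descriptive analytics: "What happened?"
summary = {
    "mean": statistics.mean(monthly_sales),
    "median": statistics.median(monthly_sales),
    "mode": statistics.mode(monthly_sales),
}


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Diagnostic analytics: "Why did it happen?" starts with relationships like this one.
sales_vs_ads = pearson_correlation(ad_spend, monthly_sales)

print(summary)
print(round(sales_vs_ads, 3))
```

A strong correlation only suggests where to look; it does not by itself show that advertising caused the change in sales.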

03 Predictive Analytics Purpose: Predictive analytics answers the question, "What is likely to happen?" It uses historical data and statistical models to forecast future outcomes. Methods: Techniques include regression analysis, time series analysis, machine learning, and forecasting models. Examples: Sales forecasting, predicting customer churn, and risk assessment in finance.

04 Prescriptive Analytics Purpose: Prescriptive analytics answers the question, "What should we do?" It provides recommendations for actions based on the analysis of data and the prediction of outcomes. Methods: Techniques include optimization algorithms, simulation, decision analysis, and machine learning. Examples: Recommending optimal pricing strategies, suggesting the best course of action in supply chain management, and personalizing marketing campaigns.
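
To make the predictive case concrete, here is a minimal sketch of sales forecasting with simple linear regression (ordinary least squares on one variable). The monthly figures are hypothetical and fit_line is a helper defined for this example; real forecasting would normally use more data and a dedicated library.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    slope = numerator / denominator
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical history: sales for months 1 to 5.
months = [1, 2, 3, 4, 5]
sales = [100, 110, 120, 130, 140]

slope, intercept = fit_line(months, sales)
forecast_month_6 = slope * 6 + intercept  # extrapolate one month ahead

print(slope, intercept, forecast_month_6)
```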

Difference Between Data Science and Data Analytics
Objectives
Data Science - To develop new methods, models, and algorithms that can predict future outcomes or automate processes. It often involves creating data-driven products or solutions.
Data Analytics - To analyze historical and current data to identify trends, patterns, and insights that can guide decision-making and improve business operations.
Outcome
Data Science - Innovative solutions, predictive models, and advanced algorithms.
Data Analytics - Reports, dashboards, and actionable insights that address specific business questions.

Techniques and Tools
Techniques
Data Science - Includes machine learning, deep learning, and advanced statistical modeling. It often involves feature engineering, model training, and algorithm development.
Data Analytics - To analyze historical and current data to identify trends, patterns, and insights that can guide decision-making and improve business operations.
Tools
Data Science - Python, R, TensorFlow, Keras, Hadoop, Spark.
Data Analytics - SQL, Excel, Tableau, Power BI, Google Analytics.

Example Scenario
Data Science - A company wants to develop a recommendation system to suggest products to users based on their browsing history. Data scientists would build and train machine learning models to make accurate recommendations.
Data Analytics - A company needs to understand why sales dropped in the last quarter. Data analysts would analyze sales data, create visualizations, and identify factors such as changes in pricing or marketing effectiveness.

What is Data? Data - refers to raw facts, figures, or information that is collected and used for analysis. It can be quantitative (numerical) or qualitative (descriptive) and is the foundational element for generating insights, making decisions, and building knowledge in various fields.

Types of Data? 1. Quantitative Data - Numerical data that can be measured and quantified. Examples: Sales figures, temperatures, heights, and weights. Subcategories: Discrete Data: Countable values (e.g., number of employees, number of sales). Continuous Data: Measurable values that can take any value within a range (e.g., height, weight).

Types of Data? 2. Qualitative Data - Descriptive data that can be observed but not measured numerically. Examples: Customer feedback, interview responses, and product reviews. Subcategories: Categorical Data: Data that can be grouped into categories (e.g., colors, types of products). Ordinal Data: Data with a meaningful order but not a quantifiable difference (e.g., survey ratings).
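
The practical difference between categorical and ordinal data shows up in what you can do with the values. In the sketch below, colors can only be counted, while survey ratings can also be ranked. The rating scale and the sample responses are assumptions made for this example.

```python
from collections import Counter

# Hypothetical data; the rating scale below is an assumption for illustration.
RATING_ORDER = ["poor", "fair", "good", "excellent"]
responses = ["good", "poor", "excellent", "good", "fair"]
product_colors = ["red", "blue", "red", "green"]

# Categorical data: counting per category is meaningful, ordering is not.
color_counts = Counter(product_colors)

# Ordinal data: categories can be ranked by their position in the scale.
rank = {label: position for position, label in enumerate(RATING_ORDER)}
sorted_responses = sorted(responses, key=rank.get)
median_response = sorted_responses[len(sorted_responses) // 2]

print(color_counts)
print(sorted_responses, median_response)
```

The median rating is meaningful here because the order is known, but an average would not be, since the distance between "fair" and "good" is not quantified.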

Microsoft Excel as Analysis Tool MS Excel - a powerful and versatile tool widely used for data analysis due to its broad range of functionalities and ease of use. 1. Data Organization and Cleaning: Data Entry: Excel allows users to input and organize data in spreadsheets, making it easy to manage and structure information. Data Cleaning: Features like Find and Replace, Text-to-Columns, and Remove Duplicates help clean and preprocess data. Functions such as TRIM, CLEAN, and SUBSTITUTE assist in correcting and formatting text data.
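
For readers who later move from Excel to code, the three text functions above have close Python counterparts. The sketch below approximates them; the excel_like_* helper names are made up for this example, the behavior is only an approximation of the Excel functions, and the sample text is hypothetical.

```python
import re


def excel_like_trim(text):
    """Approximate TRIM: strip outer spaces and collapse repeated spaces to one."""
    return re.sub(r" +", " ", text).strip(" ")


def excel_like_clean(text):
    """Approximate CLEAN: drop non-printing control characters (ASCII 0-31)."""
    return "".join(ch for ch in text if ord(ch) >= 32)


def excel_like_substitute(text, old, new):
    """Approximate SUBSTITUTE, without its optional instance argument."""
    return text.replace(old, new)


# Hypothetical messy cell value containing a tab, a newline, and extra spaces.
raw_value = "  ACME\t  Corp.\n "

tidy_value = excel_like_trim(excel_like_clean(raw_value))
full_name = excel_like_substitute(tidy_value, "Corp.", "Corporation")

print(repr(tidy_value))
print(repr(full_name))
```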

Microsoft Excel as Analysis Tool 2. Data Analysis: Basic Statistical Functions: Excel provides a range of built-in functions for statistical analysis, such as AVERAGE, MEDIAN, MODE, STDEV, and VAR. PivotTables: PivotTables are a powerful feature for summarizing, analyzing, and visualizing data. They allow users to quickly aggregate data, create custom views, and perform complex calculations without altering the original data set. Data Filtering and Sorting: Excel’s filter and sort functionalities help users to organize and analyze subsets of data efficiently, making it easier to focus on specific criteria.
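
The same calculations can be cross-checked outside Excel. The sketch below uses Python's statistics module for the five functions listed above (Excel's STDEV and VAR are sample-based, as are statistics.stdev and statistics.variance) and a simple group-and-sum loop as a stand-in for what a PivotTable does. The scores and sales rows are hypothetical.

```python
import statistics
from collections import defaultdict

# Hypothetical exam scores, invented for illustration.
scores = [70, 80, 80, 90, 100]

excel_like = {
    "AVERAGE": statistics.mean(scores),
    "MEDIAN": statistics.median(scores),
    "MODE": statistics.mode(scores),
    "STDEV": statistics.stdev(scores),   # sample standard deviation
    "VAR": statistics.variance(scores),  # sample variance
}

# PivotTable-style summary: total sales per region (hypothetical rows).
sales_rows = [("North", 100), ("South", 150), ("North", 50), ("East", 75)]
totals = defaultdict(int)
for region, amount in sales_rows:
    totals[region] += amount  # aggregate without changing the original rows

print(excel_like)
print(dict(totals))
```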

Microsoft Excel as Analysis Tool 3. Data Visualization: Charts and Graphs: Excel offers a variety of chart types, including bar charts, line graphs, pie charts, scatter plots, and histograms, which help in visualizing data patterns and trends.

Let's now practice!

Thank you!! & God bless!