Unit 1 Introduction to DATA SCIENCE .pptx

priyankajadhav6600 144 views 16 slides Jul 23, 2024

About This Presentation

Introduction to Data Science
1.1 What is Data Science, importance of data science,
1.2 Big data and data Science, the current Scenario,
1.3 Industry Perspective Types of Data: Structured vs. Unstructured Data,
1.4 Quantitative vs. Categorical Data,
1.5 Big Data vs. Little Data, Data science proc...


Slide Content

DATA SCIENCE
Introduction to Data Science
Presented by Prof. Priyanka Jadhav

1.1 What is Data Science, importance of data science
1.2 Big data and data science, the current scenario
1.3 Industry perspective; types of data: structured vs. unstructured data
1.4 Quantitative vs. categorical data
1.5 Big data vs. little data, data science process
1.6 Role of data scientist

What is Data Science
The study of data to extract meaningful insights for business.
Data Science is a multidisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge and insights from structured and unstructured data. It combines aspects of mathematics, statistics, computer science, and domain knowledge to interpret data for decision-making and problem-solving.

Importance of Data Science
Data science is important because it combines tools, methods, and technology to generate meaning from data.

Comparison
Structured data: Structured data is data whose elements are addressable for effective analysis. It is organized into a formatted repository, typically a database, and covers any data that can be stored in an SQL database as tables with rows and columns. Such data has relational keys and maps easily onto pre-designed fields. It is the most commonly processed kind of data and the simplest to manage. Example: relational data.
Semi-structured data: Semi-structured data is information that does not reside in a relational database but has some organizational properties that make it easier to analyze. With some processing it can be stored in a relational database (though this can be very hard for some kinds of semi-structured data), but the semi-structured form exists to save space. Example: XML data.
Unstructured data: Unstructured data is data that is not organized in a predefined manner or has no predefined data model, so it is not a good fit for a mainstream relational database. Alternative platforms exist for storing and managing it. It is increasingly prevalent in IT systems and is used by organizations in a variety of business intelligence and analytics applications. Example: Word, PDF, text, media logs.
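
To make the comparison concrete, here is a minimal Python sketch that holds the same kind of information in the three forms. The table name, the student records, and the free-text note are all invented for illustration and do not come from the slides.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Structured: rows and columns in a relational table (in-memory SQLite).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, marks INTEGER)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [(1, "Asha", 82), (2, "Ravi", 74)],  # hypothetical records
)
structured_names = [
    row[0] for row in conn.execute("SELECT name FROM students WHERE marks > 80")
]
conn.close()

# Semi-structured: XML has tags, but records need not share the same fields.
xml_doc = (
    "<students>"
    "<student id='1'><name>Asha</name></student>"
    "<student id='2'><name>Ravi</name><city>Pune</city></student>"
    "</students>"
)
root = ET.fromstring(xml_doc)
semi_structured_names = [s.findtext("name") for s in root.findall("student")]
cities = [s.findtext("city") for s in root.findall("student")]  # missing field -> None

# Unstructured: free text with no data model; we can only search it.
note = "Asha scored well this term. Ravi needs help with statistics."
unstructured_hits = [
    word.strip(".,") for word in note.split() if word.strip(".,") in {"Asha", "Ravi"}
]
```

The structured table can be queried by column, the XML can be navigated by tag even though the second record has an extra field, and the free text offers nothing beyond word matching.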


Big Data
The definition of big data is data that contains greater variety, arriving in increasing volumes and with more velocity. Big data is a term that describes large, hard-to-manage volumes of data, both structured and unstructured, that inundate businesses on a day-to-day basis. But it is not just the type or amount of data that is important; what matters is what organisations do with the data.
Big Data Examples to Know
Transportation: assisting GPS navigation and providing traffic and weather alerts.
Tracking consumer behavior and shopping habits to deliver hyper-personalized retail product recommendations tailored to individual customers.
Monitoring payment patterns and analyzing them against historical customer activity to detect fraud in real time.
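
As a rough illustration of the fraud-detection example, the sketch below compares a new payment with a customer's past payments. It assumes a simple standard-deviation rule; the function name `is_unusual`, the threshold of 3.0, and the payment amounts are hypothetical, and real systems use far richer signals than this.

```python
from statistics import mean, stdev


def is_unusual(history, amount, threshold=3.0):
    """Flag `amount` if it lies more than `threshold` sample standard
    deviations from the mean of the customer's past payments."""
    if len(history) < 2:
        return False  # too little history to judge
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return amount != mu  # all past payments identical
    return abs(amount - mu) / sigma > threshold


# Hypothetical past payments for one customer.
past_payments = [42.0, 38.5, 45.0, 40.0, 41.5]

typical_flag = is_unusual(past_payments, 43.0)   # close to past behaviour
outlier_flag = is_unusual(past_payments, 400.0)  # far outside past behaviour
```

A payment close to the historical average is left alone, while one far outside the customer's usual range is flagged for review.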


Why is big data needed in the current scenario?
Large data sets are meant to be comprehensive and to encompass as much information as the organization needs to make better decisions.
Big data insights let business leaders quickly make data-driven decisions that impact their organizations.
Big data also provides better customer and market insights.

Quantitative variables are any variables where the data represent amounts (e.g. height, weight, or age).
Categorical variables are any variables where the data represent groups. This includes rankings (e.g. finishing places in a race), classifications (e.g. brands of cereal), and binary outcomes (e.g. coin flips).
You need to know what type of variables you are working with to choose the right statistical test for your data and interpret your results.
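
The point about choosing the right analysis for each variable type can be sketched in Python as follows. The helper `summarize` and both sample lists are hypothetical, and the type check is a simplification: it treats any list of numbers as quantitative.

```python
from collections import Counter
from statistics import mean, median

heights_cm = [150, 160, 170, 180]          # quantitative: amounts
cereal_brands = ["A", "B", "A", "C", "A"]  # categorical: groups


def summarize(values):
    """Pick a summary that suits the variable type.

    Assumption: numeric values are quantitative, everything else is
    categorical.
    """
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    if numeric:
        return {"type": "quantitative", "mean": mean(values), "median": median(values)}
    counts = Counter(values)
    return {
        "type": "categorical",
        "counts": dict(counts),
        "mode": counts.most_common(1)[0][0],
    }


height_summary = summarize(heights_cm)
brand_summary = summarize(cereal_brands)
```

Means and medians make sense for amounts, while groups are summarized by counts and the most frequent value. Rankings stored as integers (such as finishing places) would be misclassified by this check, so in practice the variable type should be declared rather than inferred.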




Role of Data Scientist
A data scientist is a tech professional who collects, analyzes, and interprets vast amounts of data using analytical, statistical, and programming skills. Data scientists are responsible for mining valuable information from various sources and transforming it into actionable insights that can drive business growth.

Thank You