Data_analytics_life_cycle in big data.pptx

calvinctttafirei 18 views 39 slides Sep 09, 2024

About This Presentation

Big data analytics presentation


Slide Content

Hit 2203: Big Data & Data Analytics
Facilitators: L. Amos, S. Chaputsira, T. Butsa

Data analytics life cycle

Apache Spark: processing framework for big data

Spark supports various programming languages

Spark features

Components of Apache Spark

Data analytics life cycle
The Data Analytics Lifecycle is a cyclic process that explains, in six stages, how information is made, collected, processed, implemented, and analyzed for different objectives.

Discovery phase
This phase is about understanding the problem statement and making a thorough study of the business model. It involves understanding the business problem, asking questions, meeting with all the stakeholders, understanding what kind of data is available, and checking whether a similar problem has been solved before.

Data preparation
Also known as data munging or data manipulation, this is the most important task in the data life cycle for any valuable insights to emerge. Raw data on its own is meaningless, so the data scientist explores it by taking a few sample records to check whether there are gaps in the data, whether its structure is appropriate to feed into the system, and whether any columns add no value. Such columns may not be required for analysis; for example, a customer name column may add no value from an analysis perspective. Where there are gaps in the data, they need to be filled with something meaningful.
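The preparation steps described above (sampling a few records, finding gaps, dropping columns that add no value, and filling gaps) can be sketched in plain Python. This is only an illustration under stated assumptions: the records, the column names, and the choice of the median as the "something meaningful" fill value are all hypothetical and do not come from the slides.

```python
from statistics import median

# Hypothetical raw records; None marks a gap in the data.
raw_records = [
    {"name": "Alice", "age": 34, "monthly_spend": 120.0},
    {"name": "Bob", "age": None, "monthly_spend": 80.0},
    {"name": "Carol", "age": 29, "monthly_spend": None},
    {"name": "Dan", "age": 41, "monthly_spend": 100.0},
]


def count_gaps(records):
    """Count missing (None) values per column."""
    gaps = {}
    for row in records:
        for column, value in row.items():
            gaps[column] = gaps.get(column, 0) + (value is None)
    return gaps


def drop_columns(records, columns):
    """Return copies of the records without the given columns."""
    return [{k: v for k, v in row.items() if k not in columns} for row in records]


def fill_gaps_with_median(records, column):
    """Replace None in one numeric column with the median of the known values.

    Using the median is an assumption made for this sketch; the right fill
    value depends on the data and the business problem.
    """
    known = [row[column] for row in records if row[column] is not None]
    fill_value = median(known)
    return [
        {**row, column: fill_value if row[column] is None else row[column]}
        for row in records
    ]


sample = raw_records[:2]                         # look at a few records first
gaps = count_gaps(raw_records)                   # find where the gaps are
prepared = drop_columns(raw_records, {"name"})   # the name column adds no value here
for column in ("age", "monthly_spend"):
    prepared = fill_gaps_with_median(prepared, column)
```

On real big-data workloads the same steps would normally run in a distributed framework rather than on in-memory lists; the logic of inspecting, dropping, and filling stays the same.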

Model planning
This step involves exploratory data analysis (EDA) to understand the relationships between variables and to see what the data can tell us. Key variables are selected.
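One simple way to explore the relationship between variables during EDA is to compute the Pearson correlation of each candidate variable with the outcome of interest and keep the strongly related ones. The sketch below assumes hypothetical data, a hypothetical target (`monthly_spend`), and an arbitrary cut-off of 0.5; none of these come from the slides, and correlation is only one of several possible selection criteria.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical candidate variables (one list per column).
candidates = {
    "age": [25, 32, 41, 50, 58],
    "visits_per_month": [2, 4, 6, 8, 10],
    "shoe_size": [42, 38, 44, 39, 41],
}
# Hypothetical target variable: monthly spend per customer.
monthly_spend = [100, 200, 300, 400, 500]

# Correlation of every candidate with the target.
correlations = {
    name: pearson(values, monthly_spend) for name, values in candidates.items()
}

# Keep variables whose absolute correlation reaches the assumed threshold.
THRESHOLD = 0.5
key_variables = [name for name, r in correlations.items() if abs(r) >= THRESHOLD]
```

Here `shoe_size` is only weakly related to the target and is dropped, while `age` and `visits_per_month` are kept as key variables.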

Various techniques can be used for model planning, including:

Model building
Using various analytical tools and techniques, the data is transformed with the goal of discovering useful information and building the right model.
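As a minimal illustration of building a model, the sketch below fits a straight line by ordinary least squares and checks it on a held-out record. The slides do not name a specific technique, so linear regression, the variable names, and the numbers are all assumptions chosen for the example.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predict the target for a single input value."""
    return intercept + slope * x


# Hypothetical training data: visits per month vs. monthly spend.
train_x = [2, 4, 6, 8]
train_y = [110, 190, 310, 390]

# Hypothetical held-out data used only to check the model.
test_x = [10]
test_y = [500]

slope, intercept = fit_line(train_x, train_y)

# Mean absolute error on the held-out data.
errors = [abs(predict(x, slope, intercept) - y) for x, y in zip(test_x, test_y)]
mean_absolute_error = sum(errors) / len(errors)
```

Keeping some data aside for checking is a common design choice: a model that fits the training data well but predicts held-out data poorly is not yet the right model.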

Communicating the results
Key findings are identified and communicated to the stakeholders.

Operationalize
The team delivers the final reports, code, and technical documents.