UNIT-3: Processing, Data Visualization

TOPICS: Processing Big Data; Integrating disparate data stores; Mapping data to the programming framework; Connecting and extracting data from storage; Transforming data for processing; Subdividing data in preparation for Hadoop MapReduce.
Processing Big Data
Data processing means converting data from one form into another, more useful form. Raw data becomes informative and useful only when it is processed and well presented. A data processing system is also referred to as an information system.
Types of Data Processing
- Manual Data Processing
- Mechanical Data Processing
- Electronic Data Processing
- Batch Data Processing
- Real-time Data Processing
- Online Data Processing
- Automatic Data Processing
Integrating Disparate Data Stores
Integrating disparate data stores involves combining and connecting data from different sources or storage systems to provide a unified and coherent view of the information. This process is crucial for organizations that have data scattered across various databases, file systems, or platforms.
The key aspects of integrating disparate data stores are:
- Data Discovery
- Data Mapping
- Data Extraction
- Data Transformation
- Data Loading
- Data Synchronization
A minimal extract-and-merge sketch is shown after this list.
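As a rough illustration (not a prescribed implementation), the sketch below unifies customer records from two hypothetical stores, a CSV export and a legacy system stubbed as an in-memory map, into a single view keyed by customer ID. The file name and field layout are assumptions made for the example.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.HashMap;
import java.util.Map;

// Sketch: unify customer e-mail addresses from two disparate stores.
// "customers.csv" and its id,email layout are hypothetical.
public class DisparateStoreMerge {
    public static void main(String[] args) throws Exception {
        Map<String, String> unified = new HashMap<>();

        // Store 1: a CSV export (one "id,email" record per line).
        try (BufferedReader csv = new BufferedReader(new FileReader("customers.csv"))) {
            String line;
            while ((line = csv.readLine()) != null) {
                String[] parts = line.split(",");
                unified.put(parts[0].trim(), parts[1].trim());
            }
        }

        // Store 2: a legacy system, stubbed here as an in-memory map.
        Map<String, String> legacyStore = Map.of("1001", "old@example.com", "2002", "two@example.com");
        // Synchronization rule: legacy values fill gaps but never overwrite newer CSV data.
        legacyStore.forEach(unified::putIfAbsent);

        unified.forEach((id, email) -> System.out.println(id + " -> " + email));
    }
}
```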
Mapping Data to the Programming Framework: Hadoop MapReduce
Map and Reduce
Map: As the name suggests, the Map function's main job is to map the input data into key-value pairs. The input to the map is itself a key-value pair, where the key may be an identifier such as a record address or ID and the value is the actual data it holds. The Map() function is executed on each of these input key-value pairs and generates intermediate key-value pairs, which serve as input for the Reducer (the Reduce() function).
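For concreteness, here is a minimal word-count style Mapper written against the standard Hadoop MapReduce API; the input is assumed to be plain text read line by line.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Map(): turns each input line into intermediate (word, 1) pairs.
public class TokenizerMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        StringTokenizer tokens = new StringTokenizer(line.toString());
        while (tokens.hasMoreTokens()) {
            word.set(tokens.nextToken());
            context.write(word, ONE);   // intermediate key-value pair for the Reducer
        }
    }
}
```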
Reduce: The intermediate key-value pairs produced by the mappers are shuffled, sorted, and sent to the Reduce() function. The Reducer then aggregates or groups the data by key, according to the reduce logic written by the developer.
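The matching Reducer for the word-count sketch aggregates the shuffled and sorted values for each key:

```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Reduce(): receives (word, [1, 1, ...]) after shuffle and sort, emits (word, total).
public class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable total = new IntWritable();

    @Override
    protected void reduce(Text word, Iterable<IntWritable> counts, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable count : counts) {
            sum += count.get();
        }
        total.set(sum);
        context.write(word, total);
    }
}
```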
What is data mapping?
Data mapping is the process of matching fields from one data source to the corresponding fields in a target source or schema, so that data can be migrated, integrated, or transformed correctly.
The data mapping process in 5 steps:
1. Identify all data fields that must be mapped.
2. Standardize naming conventions across sources.
3. Create data transformation rules and schema logic.
4. Test your logic.
5. Complete the migration, integration, or transformation.
A small field-mapping sketch is shown after this list.
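To illustrate steps 2 and 3, the sketch below maps hypothetical source field names to standardized target names and applies a simple transformation rule; all field names and values are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a field-mapping table plus simple transformation rules.
// Source and target field names are hypothetical.
public class DataMappingExample {
    public static void main(String[] args) {
        // Step 2: standardize naming conventions across sources.
        Map<String, String> fieldMap = new HashMap<>();
        fieldMap.put("cust_nm", "customerName");
        fieldMap.put("cust_phone", "phoneNumber");

        // A source record as it arrives from the legacy system.
        Map<String, String> sourceRecord = Map.of("cust_nm", "asha rao", "cust_phone", "98450 12345");

        // Step 3: apply mapping and transformation rules (rename fields, normalize values).
        Map<String, String> targetRecord = new HashMap<>();
        sourceRecord.forEach((srcField, value) -> {
            String targetField = fieldMap.getOrDefault(srcField, srcField);
            String transformed = targetField.equals("phoneNumber")
                    ? value.replaceAll("\\s+", "")   // rule: strip spaces from phone numbers
                    : value.toUpperCase();           // rule: upper-case name fields
            targetRecord.put(targetField, transformed);
        });

        // Step 4: inspect the output while testing the logic.
        System.out.println(targetRecord);
    }
}
```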
Connecting and extracting data from storage
Common categories of data extraction software include:
- Web Scraping Tools
- OCR (Optical Character Recognition)
- PDF Extraction
- Screen Scraping
- Text Analytics
A short sketch of extracting rows from a relational store is shown after this list.
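As a minimal sketch of connecting to storage and extracting data, the example below reads rows from a relational store over plain JDBC; the connection URL, credentials, table, and columns are placeholders, not a specific required setup.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Sketch: connect to a relational store and extract rows with plain JDBC.
// URL, credentials, and table/column names are placeholders.
public class JdbcExtractExample {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:mysql://localhost:3306/salesdb";   // hypothetical database
        try (Connection conn = DriverManager.getConnection(url, "user", "password");
             Statement stmt = conn.createStatement();
             ResultSet rows = stmt.executeQuery("SELECT id, amount FROM orders")) {
            while (rows.next()) {
                // Emit each extracted record as a simple CSV line.
                System.out.println(rows.getInt("id") + "," + rows.getDouble("amount"));
            }
        }
    }
}
```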
Data Transformation in Data Mining
The goal of data transformation is to prepare the data for data mining so that useful insights and knowledge can be extracted from it. Common techniques include normalization, aggregation, discretization, and attribute construction.
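As a small example of one such transformation, the sketch below applies min-max normalization to scale numeric attribute values into the range [0, 1]; the sample values are made up.

```java
// Sketch: min-max normalization, a common data transformation before mining.
public class MinMaxNormalization {
    public static void main(String[] args) {
        double[] values = {200, 300, 400, 600, 1000};   // sample attribute values
        double min = Double.MAX_VALUE, max = -Double.MAX_VALUE;
        for (double v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        // v' = (v - min) / (max - min) maps every value into [0, 1].
        for (double v : values) {
            System.out.printf("%.1f -> %.3f%n", v, (v - min) / (max - min));
        }
    }
}
```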
Phases of MapReduce – How Hadoop MapReduce Works
A MapReduce job execution passes through the following phases: Input Files, InputFormat, Input Splits, RecordReader, Mapper, Combiner, Partitioner, Shuffling and Sorting, Reducer, RecordWriter, and OutputFormat.
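To show how these phases are wired together in practice, here is a minimal driver sketch that configures a word-count job using the Mapper and Reducer shown earlier; input and output paths are taken from the command line.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Driver: wires InputFormat -> Mapper -> Combiner -> Reducer -> OutputFormat for one job.
public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCountDriver.class);

        job.setMapperClass(TokenizerMapper.class);   // map phase
        job.setCombinerClass(IntSumReducer.class);   // optional local aggregation before shuffle
        job.setReducerClass(IntSumReducer.class);    // reduce phase (after shuffle and sort)

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));    // input files / splits
        FileOutputFormat.setOutputPath(job, new Path(args[1]));  // RecordWriter target

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```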