BIG DATA INTRO , bigdata_intro , Hadoop PPT

johnnyripper 21 views 25 slides Sep 02, 2024


Big Data

Big Data is a term used to refer to the growth in the volume of data that is difficult to store, process, and analyze with traditional database technology.

Other Definitions
Big Data refers to extremely large datasets that are too complex and voluminous for traditional data processing software to handle efficiently.
Big Data is a combination of structured, semi-structured, and unstructured data that organizations collect, analyze, and mine for information and insights.
Big Data refers to large, diverse sets of information that grow at ever-increasing rates.
Big Data is a voluminous set of structured, unstructured, and semi-structured datasets that is challenging to manage using traditional data processing tools.

Characteristics of Big Data

The "V"s of Big Data: Key Characteristics
Volume: The amount of data generated is vast and continuously growing. Big Data deals with large quantities of data from various sources.
Velocity: The speed at which data is generated, collected, and processed. Big Data requires the ability to handle this rapid arrival of data in real time or near real time.
Variety: The different types of data, both structured (like databases) and unstructured (like text, images, and videos), that need to be analyzed.
Veracity: The quality and accuracy of the data. It addresses the trustworthiness and reliability of the data being used.
Value: The potential insights and benefits that can be derived from analyzing the data. This V emphasizes the importance of turning data into actionable insights.
Variability: The inconsistency of the data, which can vary greatly in quality and format. This characteristic highlights the challenges in processing and managing data that may change over time.
Visualization: The ability to represent data in a way that is understandable and actionable, often through charts, graphs, and other visual tools.

Examples of Each V
1. Volume. Example: Social media platforms like Facebook generate massive amounts of data every day. Each user's posts, comments, likes, and shares contribute to an enormous volume of data that needs to be stored and analyzed.
2. Velocity. Example: Stock trading systems process millions of transactions per second. The speed at which this data is generated, and at which it must be processed to make real-time trading decisions, exemplifies velocity.
3. Variety. Example: An e-commerce company collects data from various sources such as transaction records, customer reviews (text), social media interactions (images, videos), and customer service calls (audio). This mix of data types demonstrates the variety in Big Data.

Examples of Each V (continued)
4. Veracity. Example: In healthcare, patient data can come from various sources like medical records, wearable devices, and patient surveys. Ensuring that this data is accurate and reliable is crucial for making correct diagnoses and treatment plans.
5. Value. Example: A retailer analyzes customer purchasing patterns and discovers that sales of certain products increase during specific times of the year. This insight allows the retailer to optimize inventory and marketing strategies to boost sales.

Examples of Each V (continued)
6. Variability. Example: Weather data is collected from sensors, satellites, and weather stations. The data can vary greatly because of changes in weather patterns, sensor malfunctions, and differing data formats.
7. Visualization. Example: A city government uses data from traffic sensors and public transportation systems to create interactive maps and dashboards. These visualizations help city planners and residents understand traffic flow and make informed decisions about commuting.

Types of Big Data
Structured data
Unstructured data
Semi-structured data
Transactional data
Machine data
Spatial data
Time-series data
Open data

Types of Big Data: Examples
Structured Data: Data stored in databases, spreadsheets, and tables. This type of data is organized and easily searchable by simple algorithms.
  Sources: Relational databases, Excel files, transaction records.
Unstructured Data: Data that does not have a predefined data model or structure. This includes text, images, videos, and social media posts.
  Sources: Social media platforms, multimedia files, email messages.

Types of Big Data: Examples (continued)
Semi-Structured Data: Data that does not conform to a rigid structure but contains tags or markers to separate semantic elements.
  Sources: XML files, JSON documents, NoSQL databases.
Transactional Data: Data generated from daily operations and transactions, such as sales, purchases, and financial operations.
  Sources: Point-of-sale systems, online transaction processing systems.

Types of Big Data: Examples (continued)
Machine Data: Data generated by machines, sensors, and other devices, often used in IoT applications.
  Sources: Log files, industrial equipment sensors, network devices.
Spatial Data: Data that represents the physical location and shape of objects, often used in geographic information systems (GIS).
  Sources: GPS data, satellite imagery, maps.

Types of Big Data: Examples (continued)
Time-Series Data: Data points indexed or organized by time, capturing how values change over time.
  Sources: Financial market data, weather monitoring systems, IoT sensors.
Open Data: Data that is freely available for anyone to use and share, often provided by governments or public organizations.
  Sources: Government databases, public research datasets, open-source projects.
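
To make the time-series type described above concrete, here is a minimal Python sketch that groups timestamped readings by hour and averages them. The readings, the function name hourly_mean, and the hourly bucket size are all assumptions chosen for illustration; they do not come from the slides.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sensor readings: (ISO 8601 timestamp, temperature in degrees C).
readings = [
    ("2024-09-02T10:05:00", 21.0),
    ("2024-09-02T10:45:00", 23.0),
    ("2024-09-02T11:15:00", 26.0),
]

def hourly_mean(points):
    """Group readings by the hour they fall in and average each group."""
    buckets = defaultdict(list)
    for stamp, value in points:
        # Truncate the timestamp to the start of its hour.
        hour = datetime.fromisoformat(stamp).replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(value)
    # Return the averages in chronological order.
    return {hour: sum(vals) / len(vals) for hour, vals in sorted(buckets.items())}

means = hourly_mean(readings)
for hour, mean in means.items():
    print(hour.isoformat(), mean)
```

The point of the sketch is only that time is the index: every operation (grouping, ordering, averaging) is keyed on the timestamp rather than on the value itself.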

Classification of Big Data
Big Data can be classified by several criteria:
Structure
Source (human-generated or machine-generated)
Processing method
Data source (internal or external)
Content
Usage

Classification of Big Data Based on Structure
Structured Data: Organized in fixed formats or schemas.
  Examples: Relational databases, spreadsheets.
Unstructured Data: Lacks a predefined format or organization.
  Examples: Text documents, social media posts, videos, images.
Semi-Structured Data: Does not conform to a strict structure but contains tags or markers to separate data elements.
  Examples: XML files, JSON documents, emails.
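
The difference between semi-structured and structured data can be sketched in Python. The sample records, field names, and helper functions below are hypothetical. The sketch shows JSON documents whose keys act as the "tags or markers" mentioned above, and one possible way to flatten them into fixed-schema rows.

```python
import json

# Hypothetical semi-structured records: each is valid JSON, but fields differ.
raw_records = [
    '{"id": 1, "name": "Asha", "email": "asha@example.com"}',
    '{"id": 2, "name": "Ben", "tags": ["vip", "newsletter"]}',
    '{"id": 3, "address": {"city": "Pune"}}',
]

def field_names(records):
    """Return the set of top-level keys seen across all records."""
    keys = set()
    for line in records:
        keys.update(json.loads(line).keys())
    return keys

def to_rows(records, columns):
    """Flatten records into fixed-width rows, filling missing fields with None."""
    return [tuple(json.loads(line).get(col) for col in columns) for line in records]

# A structured view needs one agreed column list for every row.
columns = sorted(field_names(raw_records))
rows = to_rows(raw_records, columns)

print(columns)
for row in rows:
    print(row)
```

Each record carries its own field names, so no schema has to be declared in advance. Forcing the same data into a table means choosing a column list and accepting empty cells or nested values.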

Classification of Big Data Based on Source
Human-Generated Data: Created by people through their interactions with various systems.
  Examples: Social media content, emails, online reviews.
Machine-Generated Data: Produced by machines or systems without human intervention.
  Examples: Sensor data, log files, transactional data from automated systems.

Classification of Big Data Based on Processing Method
Batch Processing Data: Collected over a period and processed together in large volumes at a scheduled time.
  Examples: Payroll systems, large-scale data analysis tasks.
Real-Time Processing Data: Processed as it is generated, with minimal delay.
  Examples: Financial transactions, real-time monitoring systems.
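
The contrast between the two processing methods above can be illustrated with a small Python sketch that computes the same average in two ways. The function names and sample readings are assumptions for illustration. A production system would typically use a dedicated batch or stream-processing framework rather than plain Python.

```python
from typing import Iterable, Iterator, Sequence

def batch_average(readings: Sequence[float]) -> float:
    """Batch style: all data is available up front, so compute once."""
    return sum(readings) / len(readings)

def streaming_average(readings: Iterable[float]) -> Iterator[float]:
    """Stream style: update the result incrementally as each value arrives."""
    count = 0
    total = 0.0
    for value in readings:
        count += 1
        total += value
        # An up-to-date answer is available after every single value.
        yield total / count

readings = [10.0, 20.0, 30.0, 40.0]  # hypothetical sensor values

batch_result = batch_average(readings)
stream_results = list(streaming_average(readings))

print(batch_result)
print(stream_results)
```

The batch version returns one answer after seeing everything. The streaming version keeps only a running count and total, so it never needs the full dataset in memory and always has a current result.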

Classification of Big Data Based on Data Sources
Internal Data: Collected from within an organization.
  Examples: Sales records, employee information, internal surveys.
External Data: Sourced from outside the organization.
  Examples: Market research data, social media feeds, public datasets.

Classification of Big Data Based on Usage
Operational Data: Used for daily operations and transactions.
  Examples: Online transaction records, CRM data.
Analytical Data: Used for analysis and decision-making.
  Examples: Data warehouses, business intelligence reports.

Classification of Big Data Based on Content
Text Data: Comprises textual information.
  Examples: Documents, emails, web pages.
Multimedia Data: Includes various forms of media.
  Examples: Images, audio files, videos.
Time-Series Data: Sequential data points indexed in time order.
  Examples: Stock prices, sensor readings.
Geospatial Data: Related to geographic or spatial aspects.
  Examples: GPS data, maps.

Sources of Big Data
1. Social Media Platforms
  Examples: Facebook, Twitter, Instagram, LinkedIn.
  Data Types: Posts, comments, likes, shares, user profiles, multimedia content.
2. Sensor Data
  Examples: IoT devices, weather stations, industrial equipment, smart meters.
  Data Types: Temperature readings, humidity levels, machinery status, energy consumption.
3. Transactional Data
  Examples: Online shopping platforms, banking transactions, point-of-sale (POS) systems.
  Data Types: Purchase records, payment histories, order details.

Sources of Big Data (continued)
4. Machine-Generated Data
  Examples: Server logs, application logs, network traffic data.
  Data Types: Error logs, access logs, event logs.
5. Public Data
  Examples: Government databases, open data initiatives, public research datasets.
  Data Types: Census data, economic indicators, public health records.
6. Enterprise Data
  Examples: Customer relationship management (CRM) systems, enterprise resource planning (ERP) systems, internal surveys.
  Data Types: Customer profiles, sales data, inventory levels.
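
As a small illustration of working with machine-generated data such as the logs in item 4 above, the Python sketch below counts log entries by severity level. The log format ("timestamp LEVEL message"), the sample lines, and the function name are assumptions. Real server and application logs vary widely in layout.

```python
import re
from collections import Counter

# Hypothetical log format (an assumption): "<timestamp> <LEVEL> <message>".
LOG_LINE = re.compile(r"^(?P<ts>\S+) (?P<level>[A-Z]+) (?P<msg>.*)$")

sample_log = """\
2024-09-02T10:00:01 INFO service started
2024-09-02T10:00:05 ERROR disk quota exceeded
2024-09-02T10:00:09 INFO request handled
not a log line
2024-09-02T10:00:12 ERROR connection reset
"""

def count_levels(text):
    """Count entries per severity level, skipping lines that do not parse."""
    counts = Counter()
    for line in text.splitlines():
        match = LOG_LINE.match(line)
        if match:
            counts[match.group("level")] += 1
    return counts

level_counts = count_levels(sample_log)
print(dict(level_counts))
```

Skipping unparseable lines instead of failing is a deliberate choice here, since machine-generated data often contains malformed entries. This ties back to the veracity and variability characteristics.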

Sources of Big Data (continued)
7. Multimedia Data
  Examples: Video streaming services, online photo galleries, audio libraries.
  Data Types: Videos, images, audio files.
8. Web Data
  Examples: Websites, blogs, forums, e-commerce sites.
  Data Types: Web pages, blog posts, product reviews, user ratings.
9. Geospatial Data
  Examples: GPS devices, satellite imagery, geographic information systems (GIS).
  Data Types: Location coordinates, maps, satellite images.

Sources of Big Data (continued)
10. Health Data
  Examples: Electronic health records (EHRs), medical imaging, wearable health devices.
  Data Types: Patient records, MRI scans, fitness tracker data.
11. Communication Data
  Examples: Emails, text messages, call records.
  Data Types: Email content, SMS logs, call duration.
12. Scientific Data
  Examples: Research experiments, laboratory results, astronomical observations.
  Data Types: Experimental data, research findings, telescope images.