Data analytics

viji76760 47 views 10 slides Sep 01, 2024

About This Presentation

Introduction to big data analytics


Slide Content

Introduction to Big Data and Tools
This presentation introduces the world of Big Data, its tools, and its impact on modern business.

What is Big Data?
Big Data refers to massive datasets that are too large and complex for traditional data processing tools. It is characterized by its volume, velocity, variety, and veracity.
Volume: The sheer size of the data collected, requiring specialized storage and processing.
Velocity: The speed at which data is generated and processed, demanding real-time analytics.
Variety: Data comes in different formats, including structured, semi-structured, and unstructured data.
Veracity: The trustworthiness and accuracy of the data, requiring data quality assurance and validation.

MongoDB: NoSQL Database for Big Data
MongoDB is a document-oriented NoSQL database designed to handle large volumes of data. It offers flexibility, scalability, and high performance.
Flexibility: MongoDB allows flexible document structures, adapting to evolving data needs.
Scalability: MongoDB can be scaled horizontally to accommodate growing data volumes.
Performance: MongoDB's query language and indexing capabilities optimize data retrieval.
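A minimal sketch of the flexibility and indexing points above, using plain Python dicts as stand-ins for MongoDB documents. It does not call MongoDB or any driver; the `products` data and the `find` and `build_index` helpers are hypothetical and only illustrate the idea.

```python
# Plain dicts stand in for MongoDB documents; no database is contacted.
# Documents in one collection do not have to share the same fields.
products = [
    {"_id": 1, "name": "laptop", "price": 900, "specs": {"ram_gb": 16}},
    {"_id": 2, "name": "ebook", "price": 12, "formats": ["pdf", "epub"]},
    {"_id": 3, "name": "desk", "price": 250},
]


def find(collection, query):
    """Return documents whose fields equal every key/value pair in query."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in query.items())
    ]


def build_index(collection, field):
    """Map each value of `field` to the _ids of the documents holding it."""
    index = {}
    for doc in collection:
        if field in doc:
            index.setdefault(doc[field], []).append(doc["_id"])
    return index


cheap = find(products, {"price": 12})        # equality match on one field
name_index = build_index(products, "name")   # lookup table instead of a full scan
```

The index trades extra memory for faster lookups, which is the same trade-off a database index makes.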

Apache HBase: Distributed Database for Big Data
Apache HBase is a NoSQL database built on top of Hadoop's distributed file system, HDFS. It provides high performance and scalability for large datasets.
Features: Provides a column-oriented (column-family) storage model, offering low-latency random reads and writes on very large tables. Columns can be added dynamically within a column family, adapting to evolving data needs.
Use Cases: Suitable for applications requiring real-time data access, such as social media analytics, IoT data management, and clickstream analysis.
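The column-family model described above can be sketched as nested maps: row key, then "family:qualifier", then timestamped values. This is an in-memory toy, not the HBase client API; the `ToyTable` class and the sample rows are made up for illustration.

```python
from collections import defaultdict


class ToyTable:
    """In-memory stand-in for an HBase-style table (hypothetical, not the HBase API)."""

    def __init__(self, families):
        # Column families are fixed up front; qualifiers inside them are not.
        self.families = set(families)
        # row key -> "family:qualifier" -> list of (timestamp, value), newest first
        self.rows = defaultdict(lambda: defaultdict(list))

    def put(self, row_key, column, value, timestamp):
        family = column.split(":", 1)[0]
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        cells = self.rows[row_key][column]
        cells.append((timestamp, value))
        cells.sort(key=lambda cell: cell[0], reverse=True)

    def get(self, row_key, column):
        """Return the newest value for a cell, or None if it does not exist."""
        cells = self.rows.get(row_key, {}).get(column)
        return cells[0][1] if cells else None

    def scan(self, start, stop):
        """Yield row keys in sorted order within [start, stop)."""
        for key in sorted(self.rows):
            if start <= key < stop:
                yield key


table = ToyTable(["info", "clicks"])
table.put("user#001", "info:name", "Asha", timestamp=1)
table.put("user#001", "info:name", "Asha K", timestamp=2)   # newer version of the cell
table.put("user#002", "clicks:home", "3", timestamp=1)      # new qualifier, no schema change
table.put("page#001", "info:title", "Home", timestamp=1)

latest_name = table.get("user#001", "info:name")
user_rows = list(table.scan("user#", "user#~"))
```

Because rows are kept sorted by key, a range scan over a key prefix touches only the matching rows, which is why row-key design matters for this kind of store.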

Sqoop: Integrating Hadoop with Relational Databases
Sqoop is a tool used to transfer data between Hadoop and relational databases, enabling data integration and analysis across different systems.
Data Transfer: Sqoop transfers data from relational databases to Hadoop and vice versa.
Data Transformation: Sqoop offers limited transformation during transfer, such as selecting columns, filtering rows, or importing the result of a SQL query. Heavier data cleaning and enrichment is usually done afterwards in tools such as Hive or Pig.
Data Integration: Sqoop facilitates data integration between Hadoop and relational databases, enabling seamless data analysis.
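As a rough sketch of the two transfer directions, the helpers below build argument lists for `sqoop import` and `sqoop export`. They only assemble the command and do not run it. The helper names, JDBC URL, user, table names, and paths are placeholders assumed for illustration.

```python
def sqoop_import_args(jdbc_url, username, password_file, table, target_dir,
                      where=None, mappers=4):
    """Build the argument list for a relational-database-to-HDFS import."""
    args = [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", str(mappers),
    ]
    if where:
        # Row filter applied by the source database during the import.
        args += ["--where", where]
    return args


def sqoop_export_args(jdbc_url, username, password_file, table, export_dir):
    """Build the argument list for an HDFS-to-relational-database export."""
    return [
        "sqoop", "export",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,
        "--table", table,
        "--export-dir", export_dir,
    ]


# Placeholder connection details, not taken from the presentation.
import_cmd = sqoop_import_args(
    "jdbc:mysql://db.example.com/sales", "etl_user", "/user/etl/.password",
    table="orders", target_dir="/data/raw/orders",
    where="order_date >= '2024-01-01'",
)
export_cmd = sqoop_export_args(
    "jdbc:mysql://db.example.com/sales", "etl_user", "/user/etl/.password",
    table="order_summary", export_dir="/data/out/order_summary",
)
```

On a machine with Sqoop installed, either list could be passed to `subprocess.run`. Keeping the password in a file rather than on the command line avoids exposing it in process listings.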

Hive and Pig: Big Data Processing and Analytics
Hive and Pig are frameworks that provide a high-level interface for querying and analyzing data stored in Hadoop. They simplify data processing and analytics.

Hive                          | Pig
SQL-like query language       | Dataflow language
Schema-on-read                | Optional schema, applied when data is loaded
Suitable for complex queries  | Suitable for ad-hoc data processing
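To make the contrast between a SQL-like query and a dataflow concrete, the sketch below computes the same aggregate in both styles. It runs neither Hive nor Pig: `sqlite3` is only a stand-in SQL engine, the dataflow half is ordinary Python, and the click data is invented.

```python
import sqlite3
from collections import defaultdict

# Made-up sample data: (page, hits)
clicks = [("home", 3), ("cart", 1), ("home", 2), ("search", 5), ("cart", 2)]

# Declarative, SQL-like style (the Hive approach): say what result is wanted.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clicks (page TEXT, hits INTEGER)")
conn.executemany("INSERT INTO clicks VALUES (?, ?)", clicks)
sql_result = conn.execute(
    "SELECT page, SUM(hits) AS total FROM clicks "
    "GROUP BY page HAVING SUM(hits) > 4 ORDER BY page"
).fetchall()
conn.close()

# Dataflow style (the Pig approach): a sequence of named transformation steps.
grouped = defaultdict(list)                                  # group records by page
for page, hits in clicks:
    grouped[page].append(hits)
totals = {page: sum(hits) for page, hits in grouped.items()}  # sum hits per group
filtered = {page: total for page, total in totals.items() if total > 4}  # keep busy pages
flow_result = sorted(filtered.items())                       # order by page

print(sql_result)
print(flow_result)
```

Both halves produce the same rows. The SQL form leaves the execution plan to the engine, while the dataflow form exposes each intermediate step, which makes it easier to inspect or reuse partial results.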

Microsoft Excel: Spreadsheet Tool for Data Analysis
Microsoft Excel is a widely used spreadsheet application that offers a versatile platform for data analysis. It provides tools for data manipulation, visualization, and reporting.
Data Visualization: Create charts, graphs, and pivot tables to visualize data patterns and insights.
Data Manipulation: Use formulas and functions to perform calculations, clean data, and transform datasets.
Data Filtering: Filter data based on specific criteria to analyze subsets and identify trends.
Data Reporting: Generate reports and dashboards to summarize data findings and communicate insights.

Tableau: Data Visualization for Big Data
Tableau is data visualization and business intelligence software that lets users explore and understand data visually. It offers intuitive drag-and-drop functionality.
Data Connection: Connect to various data sources, including databases, spreadsheets, and cloud services.
Data Exploration: Explore data visually using interactive dashboards, charts, and maps.
Data Analysis: Analyze data trends, patterns, and outliers to gain insights and make informed decisions.
Data Sharing: Share dashboards and insights with stakeholders, enabling collaboration and informed decision-making.

Power BI: Intelligence for Big Data
Power BI is a suite of business intelligence tools designed to analyze data, create interactive dashboards, and share insights. It offers a comprehensive approach to data management and analysis.
Data Preparation: Clean, transform, and prepare data for analysis using Power Query.
Data Visualization: Create interactive dashboards and reports using Power BI Desktop.
Data Sharing: Share dashboards and insights with colleagues using Power BI Service.

THANK YOU !!