Getting to Know Apache HBase

faferreira 96 views 32 slides Sep 30, 2020

About This Presentation

Do you need to handle massive data, where hundreds of gigabytes growing to terabytes or even petabytes are part of your day-to-day? Do you need to perform thousands of operations per second on multiple terabytes of data? Come and get to know Apache HBase, a NoSQL database that run...

Slide Content

Felipe Ferreira

Getting to Know Apache HBase

Natural Partner for Innovation

[email protected]

What Is HBase?
•NoSQL datastore built on top of HDFS (Hadoop)
•An Apache Top Level Project
•The goal is the hosting of very large tables (billions of rows x millions of columns)
•Based on Google’s Bigtable paper

Why Use HBase?
•Storing large amounts of data (TB/PB)
•High throughput for a large number of requests
•Storing unstructured or variable column data
•Big Data with random reads and writes
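The "variable column data" point can be pictured with a small in-memory model. The sketch below is plain Python and is not the HBase client API; the table layout, the `info` column family, and every name in it are hypothetical. It only illustrates the idea that each row holds just the columns it was actually given.

```python
# Toy model only: a dict of rows, each row a dict of "family:qualifier" -> bytes.
# This is NOT the HBase client API; every name here is hypothetical.
table = {}

def put(row_key, column, value):
    """Store one cell; a row is created on demand and holds only the columns it is given."""
    table.setdefault(row_key, {})[column] = value

def columns_of(row_key):
    """Return the sorted column names present in a row (empty list if the row is absent)."""
    return sorted(table.get(row_key, {}))

# Two rows in the same table with different sets of columns.
put(b"user1", "info:name", b"Ana")
put(b"user1", "info:email", b"ana@example.org")
put(b"user2", "info:name", b"Bruno")
put(b"user2", "info:phone", b"555-0100")

for key in sorted(table):
    print(key, columns_of(key))
```

No column is declared ahead of time in this model, and a row that lacks a column simply stores nothing for it.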

When to Consider Not Using HBase?
•The problem is not a Big Data problem (only use HBase with Big Data problems)
•The workload reads straight through files
•The workload writes all at once or appends new files
–That is, it does no random reads or writes
•The access patterns of the data are ill-defined

HBase in production
•More complete list at http://wiki.apache.org/hadoop/Hbase/PoweredBy

HBase Architecture – How It Works

Meet the Daemons
•HBase Master
•RegionServer
•ZooKeeper
•HDFS
–NameNode/Standby NameNode
–DataNode

Daemon Locations

Tables and Column Families

Rows and Columns

Regions

Write Path

Read Path

HBase API – How to access the data

No SQL Means No SQL
•Data is not accessed through SQL
•You must:
–Create your own connections
–Keep track of the type of data in a column
–Give each row a key
–Access a row by its key
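"Keep track of the type of data in a column" follows from HBase storing row keys and cell values as uninterpreted byte arrays, so the application has to encode and decode them. The sketch below shows that bookkeeping using only Python's standard `struct` module; the `COLUMN_TYPES` mapping and the column names are hypothetical, and nothing here talks to a real cluster.

```python
import struct

# Hypothetical application-side schema: the application, not the datastore,
# remembers how the bytes in each column should be interpreted.
COLUMN_TYPES = {"info:name": "str", "info:age": "int", "info:score": "float"}

def encode(column, value):
    """Turn a Python value into bytes according to the column's declared type."""
    kind = COLUMN_TYPES[column]
    if kind == "str":
        return value.encode("utf-8")
    if kind == "int":
        return struct.pack(">q", value)   # 8-byte big-endian signed integer
    if kind == "float":
        return struct.pack(">d", value)   # 8-byte big-endian double
    raise ValueError(f"unknown type {kind!r}")

def decode(column, raw):
    """Turn stored bytes back into a Python value using the same schema."""
    kind = COLUMN_TYPES[column]
    if kind == "str":
        return raw.decode("utf-8")
    if kind == "int":
        return struct.unpack(">q", raw)[0]
    if kind == "float":
        return struct.unpack(">d", raw)[0]
    raise ValueError(f"unknown type {kind!r}")

# One row, stored as bytes only.
row = {col: encode(col, val)
       for col, val in [("info:name", "Ana"), ("info:age", 34), ("info:score", 9.5)]}
print({col: decode(col, raw) for col, raw in row.items()})
```

If the writer and the reader disagree about a column's type, the bytes still decode to something, just not the intended value, which is why this bookkeeping falls on the application.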

Types of Access
•Gets
–Get a row’s data based on the row key
•Puts
–Update or insert a row with data based on the row key
•Scans
–Find all matching rows based on the row key
–Scan logic can be extended by using filters
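The three access types can be mimicked with a toy table that keeps its row keys sorted. This is a minimal sketch in plain Python, not the HBase client API: `ToyTable`, its method signatures, and the sample keys are all hypothetical, and the scan range is assumed to include the start key and exclude the stop key.

```python
import bisect

class ToyTable:
    """In-memory stand-in for a table whose rows are kept sorted by row key."""

    def __init__(self):
        self._keys = []   # sorted list of row keys (bytes)
        self._rows = {}   # row key -> {column: value}

    def put(self, row_key, column, value):
        """Insert the row if it is new, otherwise update it in place."""
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)
            self._rows[row_key] = {}
        self._rows[row_key][column] = value

    def get(self, row_key):
        """Return one row's data by exact row key, or None if the row is absent."""
        row = self._rows.get(row_key)
        return dict(row) if row is not None else None

    def scan(self, start=b"", stop=None, row_filter=None):
        """Yield (key, row) for start <= key < stop in key order.

        row_filter, if given, is a callable (key, row) -> bool that narrows the result.
        """
        i = bisect.bisect_left(self._keys, start)
        for key in self._keys[i:]:
            if stop is not None and key >= stop:
                break
            row = self._rows[key]
            if row_filter is None or row_filter(key, row):
                yield key, dict(row)

t = ToyTable()
t.put(b"user#001", "info:name", b"Ana")
t.put(b"user#002", "info:name", b"Bruno")
t.put(b"order#001", "info:total", b"10")
t.put(b"user#001", "info:name", b"Ana Maria")   # same key: an update, not a new row

# Range scan over every key that starts with b"user#" (b"$" is the byte after b"#").
users = [key for key, _ in t.scan(start=b"user#", stop=b"user$")]

# Full scan narrowed by a filter on the row contents.
filtered = [key for key, _ in
            t.scan(row_filter=lambda key, row: row.get("info:name", b"").startswith(b"B"))]
print(users, filtered)
```

The range scan only touches the slice of sorted keys it needs, whereas the filtered scan visits every row and discards the ones that do not match. That difference is one reason the row key carries so much weight in the design.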

Gets

Puts

HBase Schema Design – How to design

No SQL Means No SQL
•Designing schemas for HBase requires in-depth knowledge of HBase
•Schema design is ‘data-centric’, not ‘relationship-centric’
•You design around how data is accessed
•Row keys are engineered

Row Keys
•A row key is more than the glue between two tables
•Engineering time is spent just on constructing a row key
–Contents of a row key vary by access pattern
–Often made up of several pieces of data
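A row key "made up of several pieces of data" might look like the sketch below. It assumes a hypothetical access pattern, "latest events for one user first", and builds the key from a user id plus a reversed timestamp so that byte-wise ordering puts newer events ahead of older ones. The key layout, `make_row_key`, and `parse_row_key` are illustrative choices, not something prescribed by the slides or by HBase.

```python
import struct

MAX_TS = 2**63 - 1  # largest signed 64-bit integer

def make_row_key(user_id: str, timestamp_ms: int) -> bytes:
    """Build <user_id> + 0x00 + <reversed timestamp as 8 big-endian bytes>.

    Subtracting from MAX_TS reverses the order, so a newer (larger) timestamp
    yields a smaller byte string and sorts first within the same user.
    """
    return user_id.encode("utf-8") + b"\x00" + struct.pack(">q", MAX_TS - timestamp_ms)

def parse_row_key(key: bytes):
    """Split a key produced by make_row_key back into (user_id, timestamp_ms)."""
    user_part, ts_part = key[:-9], key[-8:]   # 1 separator byte + 8 timestamp bytes
    return user_part.decode("utf-8"), MAX_TS - struct.unpack(">q", ts_part)[0]

# Keys sort by user first, then newest event first.
keys = sorted(make_row_key("ana", ts) for ts in (1000, 3000, 2000))
print([parse_row_key(k) for k in keys])
```

A different access pattern, such as "all users active in a given hour", would call for a different key, which is the sense in which the contents of a row key vary by access pattern.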

Schema Design
•Schema design does not start with an ERD
•Access patterns must be known up front
•Denormalize to improve performance
–Fewer, bigger tables
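The "fewer, bigger tables" idea can be sketched as folding two relational-style tables into one wide row per key. The example assumes a hypothetical access pattern, "fetch a customer together with all of their orders"; the sample data, the `info` and `orders` column families, and `denormalize` are illustrative only.

```python
# Hypothetical normalized input, as it might sit in two relational tables.
customers = [{"id": "c1", "name": "Ana"}, {"id": "c2", "name": "Bruno"}]
orders = [
    {"id": "o1", "customer_id": "c1", "total": "10.00"},
    {"id": "o2", "customer_id": "c1", "total": "25.50"},
    {"id": "o3", "customer_id": "c2", "total": "7.00"},
]

def denormalize(customers, orders):
    """Fold both tables into one wide row per customer, keyed by customer id.

    Each order becomes its own column in the customer's row, so a single
    lookup by row key returns the customer and their orders without a join.
    """
    rows = {c["id"]: {"info:name": c["name"]} for c in customers}
    for o in orders:
        rows[o["customer_id"]][f"orders:{o['id']}"] = o["total"]
    return rows

wide = denormalize(customers, orders)
print(wide["c1"])
```

The trade-off is that a question the layout was not designed for, such as "all orders above a given total", now requires scanning every customer row. This is why the access pattern has to be settled before the schema.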

HBase in production - examples

•Use of HBase to integrate SMS, chat, email and Facebook Messages into one inbox

•HydraBase – The evolution of HBase@Facebook

•HBase provides a distributed, read/write backup of all MySQL tables in Twitter's production
•A number of applications, including people search, rely on HBase internally for data generation
•Additionally, the operations team uses HBase as a time-series database for cluster-wide monitoring/performance data

•Uses HBase as a foundation for cloud scale storage for a variety of applications
•Uses HBase to build a graph service for global web threat entities evaluation and reputation

Internal Use Only
Non-profit R&D Center founded by Nokia in 2001 in Brazil

Focused on projects delivering solutions and products in the mobile technology area

Technical team of 200+
Located in Brazil: Manaus | Brasilia | Recife | São Paulo
50+ invention reports accepted by Nokia/Microsoft to file patent application
500+ items of scientific production
300+ completed projects

OUR CERTIFICATIONS

OUR AWARDS
Eco System Saving Tips (app): Mobile World Congress 2012
Facelock: 1st prize, London Hackathon | Nokia World 2010
Audio Aid: 1st prize | Forum Nokia Calling All Innovators 2009
Microsoft Data Gathering: Tele.Síntese 2012 & 2013 award

INFOS + CONTACT
•About training in Big Data (Developer, Analyst, Admin):
http://www.indt.org/servicos/treinamentos/hadoop-developer
http://www.indt.org/servicos/treinamentos/hadoop-analyst
http://www.indt.org/servicos/treinamentos/hadoop-admin
•About HBase:
http://hbase.apache.org/
•About INDT:
http://www.indt.org
[email protected]
•About Hortonworks:
http://www.hortonworks.com
[email protected]
