Introduction to asdfghjkln b vfgh n v

What is covered in this presentation? A brief history of databases; NoSQL: why, what and when; characteristics of NoSQL databases; aggregate data models; the CAP theorem.

Introduction. Database: an organized collection of data. DBMS: a software package of computer programs that controls the creation, maintenance and use of a database. Databases are created to handle large quantities of information by inputting, storing, retrieving, and managing that information.

A brief history

Relational databases. Benefits of relational databases: designed for all purposes; ACID guarantees; strong consistency, concurrency and recovery; a mathematical background; a standard query language (SQL); lots of tools to use with them, e.g. reporting services, entity frameworks, ...

SQL databases

NoSQL: why, what and when? But relational databases were not built for distributed applications, because: joins are expensive; they are hard to scale horizontally; impedance mismatch occurs; they are expensive (product cost, hardware, maintenance).

NoSQL: why, what and when? And relational databases are weak in: speed (performance); high availability; partition tolerance.

Why NoSQL now? Answer: driving trends.

RDBMS performance

Data. "Data is a new class of economic asset, like currency and gold" (source: World Economic Forum, 2012). Data is the new raw material.

Data size growth: 150 exabytes in 2005 (an exabyte is a billion gigabytes); 1,200 exabytes in 2010; 35,000 exabytes in 2020 (expected by IBM).

Volume of data/information created, captured, copied, and consumed worldwide from 2010 to 2025

Data size growth, examples: ISRO launches the advanced earth observation and mapping satellite CARTOSAT-3 along with 13 other commercial nano-satellites; information and images coming from the satellite. Maharashtra election: 20,000 tweets/second. Around 30 billion RFID tags produced per year. Automatic toll collection using RFID. Oil drilling platforms have 20k to 40k sensors. 95% of data produced is unstructured.

Challenge. Big data's characteristics are challenging conventional information management architectures: massive and growing amounts of information residing internal and external to the organization; unconventional semi-structured or unstructured (diverse) data, including web pages, log files, social media, click-streams, instant messages, text messages, emails, sensor data from active and passive systems, etc.; changing information. Multi-channel analytics; sentiment analytics; transaction analytics; call detail records analytics; warranty claim analytics; surveillance analytics; claim fraud analytics.

What is big data? “A massive volume of both structured and unstructured data that is so large that it's difficult to store, analyse, process, share, visualise and manage with traditional database and software techniques.” (Roger Magoulas of O’Reilly, 2005). Big data technologies describe a new generation of technologies and architectures, designed to economically extract value from very large volumes of a wide variety of data, by enabling high-velocity capture, discovery, and/or analysis (IBM / MS). Volume (terabytes -> zettabytes); variety (structured -> semi-structured -> unstructured); velocity (batch -> streaming data).

What makes it big data? (V^3) Volume, velocity, variety, value. Data sources shown in the figure: social, blog, smart meter. Volume: gigabyte (10^9 bytes), terabyte (10^12), petabyte (10^15), exabyte (10^18), zettabyte (10^21). Variety: structured, semi-structured, unstructured; text, image, audio, video, record. Velocity: dynamic, sometimes time-varying.

Variability: variability is not the same as variety. Six different coffee blends are variety; the same blend tasting different every day is variability. The same is true of data: if the meaning is constantly changing, it can have a huge impact on your data homogenization. Visualization: using charts and graphs to visualize large amounts of complex data.

But what is NoSQL? A NoSQL database provides a mechanism for storage and retrieval of data that employs less constrained consistency models than traditional relational databases. NoSQL systems are also referred to as "Not only SQL" to emphasize that they do in fact allow SQL-like query languages to be used.

Characteristics of NoSQL databases. NoSQL avoids: the overhead of ACID transactions; the complexity of SQL queries; the burden of up-front schema design; the need for a DBA; transactions (they should be handled at the application layer). It provides: easy and frequent changes to the DB; fast development; large data volumes (e.g. Google); schema-less data.

NoSQL is getting more & more popular

What is a schema-less data model? In relational databases: you can't add a record which does not fit the schema; you need to add NULLs to unused items in a row; you must consider the data types, i.e. you can't add a string to an integer field; you can't add multiple items in a field (you should create another table: primary key, foreign key, joins, normalization, ...).

What is a schema-less data model? In NoSQL databases: there is no schema to consider; there is no unused cell; there is no explicit data type (types are implicit); most considerations are handled in the application layer; we gather all items in an aggregate (document).
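The contrast between a fixed schema and a schema-less aggregate can be sketched in a few lines. This is only an illustration, not how any particular NoSQL product works: the relational side uses Python's built-in sqlite3 module, the aggregate side uses plain dictionaries, and the table and field names (person, name, age, phones) are hypothetical.

```python
import sqlite3

# Relational side: the schema is fixed up front (hypothetical table "person").
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")

# A row that leaves a column unused stores NULL in it.
conn.execute("INSERT INTO person (id, name) VALUES (1, 'Asha')")
row = conn.execute("SELECT id, name, age FROM person WHERE id = 1").fetchone()

# A record with a field the schema does not know about is rejected.
try:
    conn.execute("INSERT INTO person (id, name, phones) VALUES (2, 'Ravi', '555-0100')")
    rejected = False
except sqlite3.OperationalError:
    rejected = True

# Aggregate side: each document carries only the fields it needs,
# and one field may hold several items without a second table.
documents = [
    {"_id": 1, "name": "Asha"},
    {"_id": 2, "name": "Ravi", "phones": ["555-0100", "555-0101"]},
]
fields_per_document = [sorted(doc.keys()) for doc in documents]
```

Here `row` carries a `None` for the unused `age` column, while the first document simply has no `age` key at all. Any validation of the documents would have to live in application code.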

Categories of NoSQL databases. NoSQL databases are classified into four major data models: key-value, document, column family, and graph. Each DB has its own query language.

Key-value data model. The simplest NoSQL databases. The main idea is the use of a hash table: data (values) is accessed by strings called keys. Data has no required format; it may have any format. Data model: (key, value) pairs. Basic operations: Insert(key, value), Fetch(key), Update(key), Delete(key).
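The four basic operations map directly onto a hash table. Below is a minimal in-memory sketch using a Python dict. The class name `KeyValueStore`, the key naming style, and the choice to pass the new value to `update` are assumptions made for illustration; a real key-value database adds persistence, distribution and much more.

```python
class KeyValueStore:
    """Toy in-memory key-value store backed by a hash table (a Python dict)."""

    def __init__(self):
        self._table = {}

    def insert(self, key, value):
        # Refuse to overwrite silently; use update() for existing keys.
        if key in self._table:
            raise KeyError(f"key already exists: {key!r}")
        self._table[key] = value

    def fetch(self, key):
        # Raises KeyError if the key is absent.
        return self._table[key]

    def update(self, key, value):
        if key not in self._table:
            raise KeyError(f"no such key: {key!r}")
        self._table[key] = value

    def delete(self, key):
        del self._table[key]


store = KeyValueStore()
# Values have no required format: a nested structure and raw bytes coexist.
store.insert("user:42", {"name": "Asha", "langs": ["en", "hi"]})
store.insert("session:9f", b"\x00\x01opaque-bytes")
store.update("user:42", {"name": "Asha", "langs": ["en", "hi", "mr"]})
store.delete("session:9f")
```

The store never inspects the values, which is why they can take any format; the trade-off is that lookups are only possible by key.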

Column family data model. A row-oriented DB stores data row by row and is suitable for OLTP; a column-oriented DB stores data column by column and is suitable for OLAP. Companies such as Facebook, Twitter, Yahoo, and Adobe use HBase internally (large data and random read/write). The column is the smallest unit of data: a tuple that contains a name, a value and a timestamp.

Example

Column family data model. Some statistics about Facebook Search (using Cassandra).
MySQL, > 50 GB of data: average write ~300 ms, average read ~350 ms.
Rewritten with Cassandra, > 50 GB of data: average write 0.12 ms, average read 15 ms.
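The column as a (name, value, timestamp) tuple can be sketched as follows. The nesting of row key, column family and column name, the helper names `put` and `get`, and the rule that the newest timestamp wins are all assumptions of this sketch, not the API of HBase or Cassandra.

```python
from typing import Any, NamedTuple


class Column(NamedTuple):
    """The smallest unit of data: a name, a value and a timestamp."""
    name: str
    value: Any
    timestamp: int


# row key -> column family -> column name -> Column
table = {}


def put(row_key, family, name, value, timestamp):
    """Store a column; an older timestamp never overwrites a newer one."""
    columns = table.setdefault(row_key, {}).setdefault(family, {})
    current = columns.get(name)
    if current is None or timestamp >= current.timestamp:
        columns[name] = Column(name, value, timestamp)


def get(row_key, family, name):
    return table[row_key][family][name].value


put("user1", "profile", "city", "Pune", timestamp=100)
put("user1", "profile", "city", "Mumbai", timestamp=200)
put("user1", "profile", "city", "Nagpur", timestamp=150)  # stale write, ignored
put("user2", "profile", "email", "u2@example.com", timestamp=120)
```

Rows do not need the same columns: `user2` has no `city` column at all, and nothing is stored for it.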

Graph data model. Based on graph theory. Scales vertically, with no clustering. You can use graph algorithms easily. Transactions are ACID.
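As an illustration of the kind of graph algorithm that becomes easy once data is modelled as nodes and edges, here is a breadth-first search for the shortest path between two nodes. The adjacency-list dictionary and the names in it are hypothetical sample data; a graph database would expose this through its own query language rather than Python.

```python
from collections import deque

# Hypothetical "knows" relationships, stored as an adjacency list.
graph = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave"],
    "carol": ["alice", "dave"],
    "dave": ["bob", "carol", "erin"],
    "erin": ["dave"],
}


def shortest_path(graph, start, goal):
    """Breadth-first search; returns a path with the fewest edges, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in graph.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None


path = shortest_path(graph, "alice", "erin")
```

The same question in a relational model would need one self-join per hop, which is where the graph model tends to pay off.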

Document-based data model. Pairs each key with a complex data structure known as a document. Indexes are implemented with B-trees. Documents can contain many different key-value pairs, key-array pairs, or even nested documents.
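A rough sketch of documents holding key-value pairs, key-array pairs and nested documents, with a lookup on a nested field. The collection contents, the dotted-path convention and the helper names `get_path` and `find` are assumptions for illustration. The lookup is a linear scan, not a B-tree index.

```python
def get_path(document, dotted_path):
    """Follow a dotted path such as 'address.city' through nested dicts."""
    value = document
    for part in dotted_path.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value


# Each key is paired with a document; documents need not share a shape.
collection = {
    "u1": {"name": "Asha", "tags": ["admin", "dev"],
           "address": {"city": "Pune", "zip": "411001"}},
    "u2": {"name": "Ravi", "address": {"city": "Mumbai"}},
    "u3": {"name": "Meera", "tags": ["dev"]},
}


def find(collection, dotted_path, expected):
    """Return the keys of all documents whose field equals `expected`."""
    return sorted(key for key, doc in collection.items()
                  if get_path(doc, dotted_path) == expected)


in_pune = find(collection, "address.city", "Pune")
```

Unlike a pure key-value store, the database can see inside the value, so queries on fields such as `address.city` become possible.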

SQL vs NOSQL

NoSQL may complement RDBMS: an RDBMS may hold smaller amounts of high-value structured data, while NoSQL may hold vast amounts of less valued and less structured data.
Relational implementations provide ACID guarantees. Atomicity: a transaction is treated as an all-or-nothing operation. Consistency: database values are correct before and after the transaction. Isolation: each transaction runs as if it were the only transaction. Durability: once a transaction completes, its effects are not reversed.
NoSQL often provides BASE. Basically available: parts of the system are allowed to fail (sharding / partitioning). Soft state: an object may have multiple simultaneous values (at different times). Eventually consistent: consistency is achieved over time (not on every commit).
CAP theorem: it is impossible to have consistency, availability, and partition tolerance all at once in a distributed system.
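Atomicity, the "all or nothing" part of ACID, can be shown with Python's built-in sqlite3 module. The account table, the CHECK constraint and the `transfer` helper are hypothetical names made up for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("a", 100), ("b", 50)])
conn.commit()


def transfer(conn, source, target, amount):
    """Move money between accounts as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "a", "b", 30)
# This one would overdraw "a"; the credit already applied to "b" is undone.
failed = transfer(conn, "a", "b", 500)
balances = dict(conn.execute("SELECT name, balance FROM account"))
```

After the failed transfer the balances are exactly what the first transfer left behind. A store that drops ACID transactions leaves this kind of bookkeeping to the application layer.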

What do we need? We need a distributed database system with these features: fault tolerance; high availability; consistency; scalability. According to the CAP theorem, this is impossible.

The CAP theorem. In distributed database systems we cannot achieve all three items at once (the center of the diagram).
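The trade-off can be made concrete with a toy simulation of two replicas whose link can be cut. It is a simplified sketch under stated assumptions, not a real replication protocol: the `Cluster` and `Replica` classes, the "CP"/"AP" mode names and the single shared version counter are all invented for illustration (a real system has no global clock).

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}  # key -> (version, value)


class Cluster:
    """Two replicas with a link that can be cut to simulate a partition."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.left, self.right = Replica("left"), Replica("right")
        self.partitioned = False
        self.clock = 0  # shared version counter (an assumption of this sketch)

    def write(self, replica, key, value):
        if self.partitioned and self.mode == "CP":
            return False  # give up availability to stay consistent
        self.clock += 1
        replica.data[key] = (self.clock, value)
        if not self.partitioned:
            other = self.right if replica is self.left else self.left
            other.data[key] = (self.clock, value)
        return True

    def read(self, replica, key):
        return replica.data[key][1]

    def heal(self):
        """Reconnect and reconcile: the highest version of each key wins."""
        self.partitioned = False
        for key in set(self.left.data) | set(self.right.data):
            newest = max(r.data[key] for r in (self.left, self.right)
                         if key in r.data)
            self.left.data[key] = self.right.data[key] = newest


# AP: stays available during the partition, but replicas disagree for a while.
ap = Cluster("AP")
ap.write(ap.left, "x", "v1")
ap.partitioned = True
ap_accepted = ap.write(ap.left, "x", "v2")
stale = ap.read(ap.right, "x")
ap.heal()
converged = ap.read(ap.right, "x")

# CP: replicas never disagree, but writes are refused during the partition.
cp = Cluster("CP")
cp.write(cp.left, "x", "v1")
cp.partitioned = True
cp_accepted = cp.write(cp.left, "x", "v2")
```

In AP mode the right replica serves a stale value until the partition heals, which is eventual consistency. In CP mode both replicas always agree, at the cost of rejecting the write.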

CAP theorem

Conclusion….
