Red Hat JBoss Data Virtualization

1,686 views 29 slides Aug 10, 2016


www.dlt.com
Red Hat JBoss Data Virtualization
July, 2016
Rick Stewart, Middleware SA
Herndon, VA

7/19/16 DLT Solutions LLC –Proprietary & Confidential 2
“Kiss” “Whitesnake” “Poison”
“Bad Company”
Data
Warehouse

“Bad Company”
“Kiss” “Whitesnake” “Poison”
Data
Warehouse
Data Virtualization Server

What does Data Virtualization software do?
DATA CONSUMERS: BI Reports, Applications
Virtual Consolidated Data Source
Data Virtualization Software
•Consume
•Compose
•Connect
DATA SOURCES: SAP, Salesforce.com, Oracle DW, XML, CSV & Excel files
Siloed & Complex
Virtualize, Abstract, Federate
Easy, Real-time Information Access


Data Challenges Getting Bigger
Data consumers: BI Reports, Operational Reports, Enterprise Applications, Cloud Native Applications, Mobile Applications
Data sources: Hadoop, NoSQL, Cloud Apps, Data Warehouse & Databases, Mainframe, XML, CSV & Excel Files, Enterprise Apps
•Integration Complexity
•Consumption & Creation
•Siloed
How to Integrate?

Improve Access to Your Data
Data consumers: BI Reports, Operational Reports, Enterprise Applications, Cloud Native Applications, Mobile Applications
Data Virtualization Server
•Broad & Streamlined
•Adaptable & Secure
•Federated & Meaningful
Data sources: Hadoop, NoSQL, Cloud Apps, Data Warehouse & Databases, Mainframe, XML, CSV & Excel Files, Enterprise Apps

Simplify Access to Your Data
streaming
databases
social
media data
production
application
big data
stores
website
ESB
analytics
& reporting
unstructured
data
mobile
App
data
warehouse
& data marts
internal
portal dashboard
external
data
private
data
ODBC/SQL JDBC/SQL XML/SOAP REST/JSON OData SQL
JMS SQL JDBC OData Hive RSS Excel JSON REST SOAP
JMS message SQL statement SOAP message
Data Virtualization Server
production
databases
applications

Turn Siloed Data into Actionable Information
Data Consumers: BI Reports & Analytics, Mobile Applications, Applications & Portals, ESB, ETL
JBoss Data Virtualization
•Consume: Standard based Data Provisioning (JDBC, ODBC, SOAP, REST, OData)
•Compose: Unified Virtual Database / Common Data Model, Data Transformations
•Connect: Native Data Connectivity
Design Tools, Dashboard, Optimization, Caching, Security, Metadata
Data Sources: Hadoop, NoSQL, Cloud Apps, Data Warehouse & Databases, Mainframe, XML, CSV & Excel Files, Enterprise Apps
Siloed & Complex
Virtualize, Transform, Federate
Easy, Real-time Information Access
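
The Connect / Compose / Consume flow on this slide can be illustrated with a toy sketch. It is not the JBoss Data Virtualization API: two in-memory SQLite databases stand in for siloed sources, and the names `crm`, `erp`, `make_source`, and `customer_totals` are hypothetical, chosen only for this example.

```python
import sqlite3


def make_source(ddl, insert_sql, rows):
    """Create an in-memory SQLite database standing in for one silo."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    conn.commit()
    return conn


# Hypothetical "CRM" source holding customer master data.
crm = make_source(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)",
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)

# Hypothetical "ERP" source holding order transactions.
erp = make_source(
    "CREATE TABLE orders (customer_id INTEGER, total REAL)",
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 100.0), (1, 50.0), (2, 75.0)],
)


def customer_totals(crm_conn, erp_conn):
    """A 'virtual view': customer name plus total order value."""
    # Connect: query each source where the data lives.
    totals = dict(
        erp_conn.execute(
            "SELECT customer_id, SUM(total) FROM orders GROUP BY customer_id"
        )
    )
    customers = crm_conn.execute("SELECT id, name FROM customers ORDER BY id")
    # Compose: join the two result sets into one consolidated shape.
    return [(name, totals.get(cid, 0.0)) for cid, name in customers]


# Consume: the caller sees one result set and never touches the sources.
for name, total in customer_totals(crm, erp):
    print(name, total)
```

In a real deployment the join logic would live in a view model inside the server and consumers would reach it over JDBC, ODBC, SOAP, REST, or OData rather than calling a Python function.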

Supported Data Sources
Enterprise RDBMS:
•Oracle
•IBM DB2
•Microsoft SQL Server
•Sybase ASE
•MySQL
•MariaDB
•PostgreSQL
•Ingres
Enterprise EDW:
•Teradata
•Netezza
•Greenplum
Search:
•Apache SOLR
Hadoop:
•Apache
•HortonWorks
•Cloudera
•More coming…
Office Productivity:
•Microsoft Excel
•Microsoft Access
•Google Spreadsheets
Specialty Data Sources:
•ModeShape Repository
•Mondrian
•MetaMatrix
•LDAP
•Apache POI for Excel
NoSQL:
•JBoss Data Grid
•MongoDB
•Cassandra
•More coming…
Enterprise & Cloud Applications:
•Salesforce.com
•SAP
Technology Connectors:
•Flat Files, XML Files, XML over HTTP
•SOAP Web Services
•REST Web Services
•OData Services

Data As A Service
•Contextual view of disparate source data
•Single point of access
•Standard based interfaces
•Shareable integration and transformation logic
•Reusable data services
But you cannot achieve this by writing more application code…
Data sources: Hadoop, NoSQL, Cloud Apps, Data Warehouse & Databases, Mainframe, XML, CSV & Excel Files, Enterprise Apps
JBoss Data Virtualization
Data consumers: BI Dashboard & Reports, Analytical Applications, ESB/SOA Integration, BPM Applications, Mobile Applications
SQL Statement, SOAP Message, REST Message
REST Request, JSON Result
SQL Request, SQL Result

Logical Architecture
Data Consumers
Data Sources

Teiid Data Virtualization Designer

Tooling | Virtual DB | Engine | Server

Users create data models based on metadata:
•Imported from data sources
•Supplied via DDL
•Provided by Engine
•Specified by user
Models are packaged in a Virtual Database (VDB)
Physical Models representing actual data sources
Logical Models

Build XML Document models from XML Schemas
Map XML Document models to other data models
Enable data access via XML

Virtual Databases (VDBs) are deployment archives, similar to a .war file.
VDBs contain:
•Source metadata and models
•View metadata and models
•System metadata
•Connection information, which is bound to sources at deployment time
VDBs are deployed to the query engine.
VDB Internals: Source Models, Connector Binding Properties, View Models, Manifest Info

JBoss Data Virtualization can offer finer-grained security control:
•Authentication: Kerberos, LDAP, WS-UsernameToken, HTTP Basic, SAML
•Authorization: Virtual data views, role-based access control
•Administration: Centralized management of Virtual DB privileges
•Audit: Centralized audit logging and dashboard
•Protection: Row and column masking, SSL encryption (ODBC and JDBC)
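
The row and column masking listed under Protection can be pictured with a small sketch. This is an illustration of the idea only, not how JBoss Data Virtualization defines its policies; the function `apply_policy`, the role names, and the sample rows are all hypothetical.

```python
def apply_policy(rows, role, allowed_region):
    """Return the rows a given role may see, with sensitive columns masked."""
    visible = []
    for row in rows:
        # Row-level rule (assumed): non-admins only see their own region.
        if role != "admin" and row["region"] != allowed_region:
            continue
        out = dict(row)  # copy so the source rows are never modified
        # Column masking (assumed): non-admins see only the last four digits.
        if role != "admin":
            out["ssn"] = "***-**-" + row["ssn"][-4:]
        visible.append(out)
    return visible


# Hypothetical sample data.
ROWS = [
    {"name": "Ann", "region": "US", "ssn": "123-45-6789"},
    {"name": "Bob", "region": "EU", "ssn": "987-65-4321"},
]

analyst_view = apply_policy(ROWS, "analyst", "US")
admin_view = apply_policy(ROWS, "admin", "US")
```

Applying the rules in one central layer, rather than in each consuming application, is the point the slide makes about centralized management of privileges.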

Query Engine
Diagram labels: Data Consumer Apps, JDBC API, Query Engine, VDB, Connector Binding (1), Connector Binding (2), connectors C1 and C2, Oracle DB, SQL Server DB
The Query Engine is the core data virtualization functionality: a federating relational query engine with a rule- and cost-based optimizer, advanced query planner, caching, and hint processing.
The Query Engine hosts VDBs, binds to data sources, and performs query execution and results processing.

The Teiid query engine is hosted in JBoss EAP and uses key container-provided services:
•Transaction manager
•JAAS security framework
•Container-managed data sources
•EAP management infrastructure
•EAP deployment
The server exposes views/services to consumers and manages connections and connection pools for data sources.
Diagram labels: JBoss EAP; Applications; Security (JAAS); Transaction Manager; JDV Runtime Engine (BufferMgr, Threading, Local Caches, etc.); VDBs; ODBC, JDBC, and Admin Socket Transports; Profile Service; ODBC, JDBC, Admin / AdminShell, JON; data sources (DS); JCA; Translators; Embedded DS; xxx-ds.xml, yyy-ds.xml, zzz-ds.xml

CACHING & MATERIALIZATION
Multiple levels of caching to meet performance requirements and manage load on source systems:
Materialized Views
–External or internal materialized views
–Ability to override use of materialized views
Result Set Caching
–Applied to results returned from user queries and virtual procedure calls
–Configurable time to live and maximum number of entries
Code Table Caching
–Suited for integrating reference data with transaction/operational data, e.g. country code, state code, etc.
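
Result set caching with a configurable time to live and a maximum number of entries might look roughly like the sketch below. It is a generic illustration, not Teiid's implementation; the class `ResultSetCache`, its parameters, and the eviction order (least recently used first) are assumptions made for this example.

```python
import time
from collections import OrderedDict


class ResultSetCache:
    """Cache query results, keyed by SQL text, with a TTL and a size cap."""

    def __init__(self, ttl_seconds, max_entries, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._max = max_entries
        self._clock = clock  # injectable so expiry can be tested
        self._entries = OrderedDict()  # sql -> (stored_at, rows)

    def get(self, sql, run_query):
        """Return cached rows for sql, or run the query and cache the result."""
        now = self._clock()
        entry = self._entries.get(sql)
        if entry is not None and now - entry[0] < self._ttl:
            self._entries.move_to_end(sql)  # mark as recently used
            return entry[1]
        # Miss or expired: go to the source system once and store the rows.
        rows = run_query(sql)
        self._entries[sql] = (now, rows)
        self._entries.move_to_end(sql)
        # Enforce the size cap by evicting the least recently used entries.
        while len(self._entries) > self._max:
            self._entries.popitem(last=False)
        return rows
```

A caller would pass the function that actually reaches the source, for example `cache.get(sql, run_on_source)`, so repeated identical queries within the time to live add no load on the source system.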
QUERY PERFORMANCE OPTIMIZATION
Access Patterns – criteria requirements on pushdown queries
Pushdown – decompose user query into source queries
–Projection minimization to remove unused select items
–Decompose aggregates over joins/unions
–Generating SQL matching Teiid system functions
Dependent Joins (can use hints) – feed equi-join values from one side of the join to the other
Partition-aware aggregation and joins
Copy Criteria – uses criteria transitivity to minimize join tuples
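
The dependent join idea, feeding equi-join values from one side of the join to the other, can be sketched as follows. This is a hand-written illustration using two in-memory SQLite databases as stand-ins for separate sources, not the Teiid planner; the tables, the `region` column, and the function `dependent_join` are hypothetical.

```python
import sqlite3

# Hypothetical small source: customer master data.
small = sqlite3.connect(":memory:")
small.execute("CREATE TABLE customers (id INTEGER, name TEXT, region TEXT)")
small.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Acme", "EMEA"), (2, "Globex", "APAC"), (3, "Initech", "EMEA")],
)

# Hypothetical large source: order transactions.
large = sqlite3.connect(":memory:")
large.execute("CREATE TABLE orders (customer_id INTEGER, total REAL)")
large.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 100.0), (2, 75.0), (3, 20.0), (3, 5.0)],
)


def dependent_join(small_conn, large_conn, region):
    """Join customers to orders, pushing the join keys to the large side."""
    customers = small_conn.execute(
        "SELECT id, name FROM customers WHERE region = ?", (region,)
    ).fetchall()
    if not customers:
        return []  # nothing to join, so the large source is never queried
    names = dict(customers)
    # Feed the equi-join values to the other side as an IN list, so the
    # large source returns only matching rows instead of the whole table.
    placeholders = ", ".join("?" for _ in names)
    orders = large_conn.execute(
        "SELECT customer_id, total FROM orders "
        f"WHERE customer_id IN ({placeholders})",
        list(names),
    ).fetchall()
    return sorted((names[cid], total) for cid, total in orders)
```

The benefit grows with the size gap between the two sides: the fewer keys the small side produces, the less data the large source has to ship back for the join.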

Business Dashboard

Bring It All Together
Capture & Process: Structured Data, Streaming Data, Semi-Structured Data
Hadoop
Red Hat Storage
Messaging and Event Processing: JBoss A-MQ and JBoss BRMS
Integrate & Analyze
Data Integration: JBoss Data Virtualization
In-memory Cache: JBoss Data Grid
BI Analytics (historical, operational, predictive)
Composite Applications

Questions?

Bring It All Together

Thank You!

JBoss Data Virtualization –Use Cases
Self-Service Business Intelligence
The virtual, reusable data model provides a business-friendly representation of data. Users can interact with their data without having to know the complexities of their database or where the data is stored, and multiple BI tools can acquire data from a centralized data layer. Gain better insights from Big Data by using JBoss Data Virtualization to integrate it with existing information sources.
360° Unified View
Deliver a complete view of master and transactional data in real time. The virtual data layer serves as a unified, enterprise-wide view of business information that improves users’ ability to understand and leverage enterprise data.
Agile SOA Data Services
A data virtualization layer delivers the missing data services layer to SOA applications. JBoss Data Virtualization increases agility and loose coupling by providing virtual data stores without the need to touch underlying sources, and by creating data services that encapsulate the data access logic and allow multiple business services to acquire data from a centralized data layer.
Regulatory Compliance
A data virtualization layer delivers data firewall functionality. JBoss Data Virtualization improves data quality via centralized access control, a robust security infrastructure, and a reduction in physical copies of data, thus reducing risk. Furthermore, the metadata repository catalogs enterprise data locations and the relationships between the data in various data stores, enabling transparency and visibility.

Databases A, B, C, D
JBoss Data Virtualization
Leveraged a TPC-H-like schema, data, and queries
Used 4 different commercial enterprise RDBMSs
Each database holds 1 TB of data, representing:
•150 million customers, with over
•600 million order records, and
•6 billion order line items
•Total: 4 TB of data
Findings:
•No measurable JDV query overhead vs. direct queries
•Queries to federated data from four data sources ran 61.7 percent faster vs. baseline
•Scaling the query workload by 2x resulted in <10% impact on response time
Download Benchmark Study @ http://www.redhat.com/en/resources/jboss-data-virtualization-query-performance-benchmark-study