Big Data: Big SQL web tooling (Data Server Manager) self-study lab

CynthiaSaracco 3,884 views 37 slides Oct 01, 2015
Slide 1
Slide 1 of 37
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12
Slide 13
13
Slide 14
14
Slide 15
15
Slide 16
16
Slide 17
17
Slide 18
18
Slide 19
19
Slide 20
20
Slide 21
21
Slide 22
22
Slide 23
23
Slide 24
24
Slide 25
25
Slide 26
26
Slide 27
27
Slide 28
28
Slide 29
29
Slide 30
30
Slide 31
31
Slide 32
32
Slide 33
33
Slide 34
34
Slide 35
35
Slide 36
36
Slide 37
37

About This Presentation

This hands-on lab introduces you to Data Server Manager, a Web tool for querying and monitoring your Big SQL database. Data Server Manager (DSM) and Big SQL support select Apache Hadoop platforms.


Slide Content

IBM

Getting started with Big SQL's Web
tooling (Data Server Manager 4.1)

Cynthia M. Saracco
IBM Solution Architect

Oct. 1, 2015

Page 2 of 37

Contents
LAB 1 OVERVIEW ......................................................................................................................................................... 4
1.1. WHAT YOU'LL LEARN ................................................................................................................................ 4
1.2. ABOUT YOUR ENVIRONMEN T ..................................................................................................................... 4
1.3. GETTING STARTED ................................................................................................................................... 5
LAB 2 EXPLORING THE BASICS ................................................................................................................................ 6
2.1. LAUNCHING BIGINSIGHTS HOME AND BIG SQL WEB TOOLING ...................................................................... 6
2.2. CONNECTING TO YOUR DA TABASE ............................................................................................................. 9
2.3. USING HELP FACILITIES .......................................................................................................................... 12
LAB 3 EXPLORING AND QUERYI NG YOUR DATABASE ........................................................................................ 15
3.1. EXPLORING THE CONTENT S OF YOUR DATABASE........................................................................................ 15
3.2. ISSUING QUERIES AND INSPECTING RESULTS ............................................................................................. 17
3.3. OPTIONAL: USING SYNTAX ASSISTAN CE .................................................................................................... 20
LAB 4 MONITORING YOUR ENVI RONMENT ............................................................................................................ 23
4.1. SETTING UP YOUR ENVIRONMENT ............................................................................................................ 23
4.2. EXAMINING DATABASE METRICS ............................................................................................................... 24
4.3. EXAMINING ALERTS ................................................................................................................................ 28
LAB 5 COLLECTING STATISTICS AND VIEWING DATA ACCESS PLANS ............................................................ 30
5.1. COLLECTING STATISTICS......................................................................................................................... 30
5.2. USING EXPLAIN .................................................................................................................................. 33
LAB 6 SUMMARY ....................................................................................................................................................... 36

Page 4 of 37

Lab 1 Overview
This hands-on lab introduces you to Data Server Manager, a Web tool for querying and monitoring your
Big SQL database. Data Server Manager (DSM) and Big SQL support select Apache Hadoop platforms
and are included with several IBM BigInsights offerings.
1.1. What you'll learn
After completing all exercises in this lab guide, you'll know how to
• Launch the web tooling.
• Connect to your Big SQL server.
• Execute Big SQL queries and inspect results.
• Explore Big SQL database metrics and alerts.
• Collect statistics.
• Generate and inspect data access plans.
Allow 1.5 - 2 hours to complete all sections of this lab. For additional information about DSM capabilities,
please visit IBM’s DSM site at http://www-03.ibm.com/software/products/en/ibm-data-server-manager. If
you have questions or comments about this lab, please post them to the forum on Hadoop Dev
at https://developer.ibm.com/answers?community=hadoop.
Separate labs are available on getting started with Big SQL, using Big SQL with HBase, and using Spark
to access Big SQL data.
1.2. About your environment
This lab requires a BigInsights 4.1 environment in which Big SQL, BigInsights Home, and DSM are
installed and running. These components are part of BigInsights Quick Start Edition, BigInsights Data
Analyst, and BigInsights Data Scientist.
Examples in this lab use are based on a sample multi-node cluster with the configuration shown in the
tables below. If your environment is different, modify the sample code and instructions as needed
to match your configuration.
User Password
Root account root password
Big SQL Administrator bigsql bigsql
Ambari Administrator admin admin
Knox Gateway account guest guest-password

Page 5 of 37


Property Value
Host name myhost.ibm.com
Ambari port number 8080
Big SQL database name bigsql
Big SQL port number 32051
Big SQL installation
directory
/usr/ibmpacks/bigsql
Big SQL samples
directory
/usr/ibmpacks/bigsql/4.1/bigsql/samples/data
BigInsights Home
https://myhost.ibm.com:8443/gateway/default/BigInsightsWeb/index.html#/
welcome


About the screen captures, sample code, and environment configuration
Screen captures in this lab depict examples and results that may vary from what you
see when you complete the exercises. In addition, some code examples may need to
be customized to match your environment.
1.3. Getting started
To get started with the lab exercises, you need access to a working Big SQL environment and a secure
shell (command window). A free Quick Start Edition is available for download from IBM at http://www-
01.ibm.com/software/data/infosphere/hadoop/trials.html.
This lab was tested against a native BigInsights 4.1 installation on a multi-node cluster. For information
about how to install and configure BigInsights on your own cluster, consult the product's Knowledge
Center (http:/www-
01.ibm.com/support/knowledgecenter/SSPT3X_4.1.0/com.ibm.swg.im.infosphere.biginsights.welcome.d
oc/doc/welcome.html).
Before continuing with this lab, verify that Big SQL, DSM, and all its pre-requisite services are running.

If have any questions or need help getting your environment up and running, visit Hadoop Dev
(https://developer.ibm.com/hadoop/) and review the product documentation or post a
message to the forum. You cannot proceed with subsequent lab exercises without access to a
working environment.

Page 6 of 37

Lab 2 Exploring the basics
IBM provides Web tools that enable you to inspect Big SQL database metrics, issue Big SQL statements,
and perform other functions. These tools are part of IBM Data Server Manager (DSM), a component you
can install and configure on your Hadoop cluster.
After completing this lab, you will know how to:
 Launch the BigInsights Home page and the Big SQL Web tooling (DSM).
 Establish a connection to your Big SQL database.
 Access online help.
Prior to beginning this lab, you will need access to a BigInsights cluster in which BigInsights Home, Big
SQL, DSM, Ambari, and all pre-requisite services are running.
Allow 30 minutes to complete this lab.
2.1. Launching BigInsights Home and Big SQL Web tooling
Big SQL Web tooling is accessed through a link in the BigInsights Home page. BigInsights Home is a
component provided with several IBM BigInsights offerings, including the Quick Start Edition, BigInsights
Analyst, and BigInsights Data Scientist. In this exercise, you’ll verify that BigInsights Home and Big SQL
Web tooling (DSM) are installed and running on your cluster. Then you’ll launch the Home page and the
Web tooling. Instructions in this exercise presume that you're using Knox and the demo LDAP for
security.
__1. Launch Ambari and sign into its console. If necessary, consult a separate lab on Getting Started
with Big SQL for details on how to do this.

Page 7 of 37


__2. From the Ambari dashboard, inspect the list of services in the left pane. Verify BigInsights
Home, BigInsights – Big SQL, and BigInsights Data Server Manager are running, as well as all
pre-requisite services (e.g., HDFS, MapReduce2, YARN, Hive, and Knox).
__3. Click on the Knox service.
__4. Start the LDAP demo.

__5. Click the Ambari Hosts tab and expand the information about the nodes in your cluster to locate
where Knox is running.

Page 8 of 37


__6. Launch BigInsights Home, providing the appropriate URL based on your installation’s
configuration. Assuming you installed BigInsights with Knox and accepted default installation
values, the BigInsights Home URL is similar to the link shown below. Substitute the location of
the Knox gateway on your cluster for the italicized text in this example.
https://yourKnoxGatewayNode:8443/gateway/default/BigInsightsWeb/index.html#/welcome

If you're unable to launch BigInsights Home, you may have an installation or configuration
problem. Consult the product documentation or post a message to the forum on Hadoop Dev
at https://developer.ibm.com/answers?community=hadoop. You cannot proceed with
subsequent lab exercises without access to a working environment.

__7. When prompted, enter a valid user ID and password for the Knox gateway. (Defaults are guest
/ guest-password).
__8. Verify that BigInsights Home displays an item for Big SQL. Depending on the size of your
browser window and other BigInsights components installed on your cluster, you may need to
scroll through the BigInsights Home page to locate the Big SQL section.

Page 9 of 37


__9. Click the Launch button in the Big SQL box. (Your screen may appear somewhat different than
the image below when you launch the tool for the first time.)

2.2. Connecting to your database
In this exercise, you will establish a connection to your Big SQL database.
__1. With the Big SQL web tooling launched, click Home > Welcome > Database Connections.

Page 10 of 37


__2. If prompted, open the database connection task to create a new connection.
__3. Click the plus sign to add a new connection.

__4. Inspect the pop-up menu that appears.

Page 11 of 37

__5. Complete the required information for your connection, scrolling down if needed to expose all
menu items. Sample data is shown below. Adjust the host name, port number and any other
information as needed to match your environment. Future examples in this lab presume that
your connection name is bigsql.



__6. Click Test Connection and verify that the operation succeeds.

__7. Click OK to clear the message.
__8. Click OK again to save the connection.

Page 12 of 37

__9. Verify that your new connection appears in the Database Connections window.

2.3. Using Help facilities
Online tutorials and reference information are part of DSM. This short exercise introduces you to
available Help facilities.
__1. With the Big SQL web tooling launched, click Help > Open Help.

__2. Inspect the information in the new pane that appears on your dashboard.

__3. Click on the Help Links tab and explore the various information available to you, including video
demos, a forum, and a Help search facility.

Page 13 of 37


__4. Optionally, click the arrow key beside the Help pane to close it.

__5. Inspect details about your DSM installation. Click Help > About.

__6. Review the version and build information that appears in a new browser tab.

Page 14 of 37


__7. Optionally, click the Plug-ins and System Properties tabs to reveal further details. A subset of
the information available in each is shown below. Close the tabs when done.

Page 15 of 37

Lab 3 Exploring and querying your database
Now that you’ve launched DSM and established a database connection, you’re ready to write some
queries and explore the contents of your database. After completing this lab, you will know how to:
 Explore table and views in your database.
 Execute queries and inspect results.
 Use the SQL syntax assistant.
Prior to starting this lab, it will be helpful if you have created and populated the GO_REGION_DIM
sample table presented in the Querying Structured Data module of the Getting Started with Big SQL lab
(http://www.slideshare.net/CynthiaSaracco/big-sql40-hol).
Allow 30 minutes to complete this lab.
3.1. Exploring the contents of your database
In this exercise, you will work with the DSM administrative facility to explore the tables and views in your
database.
__1. If necessary, connect to your Big SQL database following instructions provided earlier.
__2. Click Administer > Explore Databases in the menu at left.

__3. When the database explorer appears, click on your database name (bigsql).

Page 16 of 37

__4. Click on the Hadoop tables section and review the list of tables in your database. The following
screen capture was taken from a system in which all sample tables discussed in the separate
Getting Started with Big SQL lab were created and populated.

__5. Click on a table to expose details about it, using the scroll bar as needed. The example below
shows information about the BIGSQL.GO_REGION_DIM table. Note that this is a Hive-
managed table stored in the Hive warehouse and that the default SerDe (LazySimpleSerDe) is
used to process its contents.

__6. Close the tab for the table and return to the database explorer.
__7. Click on the Tables line in the menu at left. A list of tables stored on the Big SQL Head Node
appears. (These are “local” or single-node tables, not tables distributed across your Hadoop
cluster.) If you created EXPLAIN tables as part of a separate exercise in the Getting Started with
Big SQL lab, you’ll see a list similar to those shown below.

Page 17 of 37


__8. Finally, explore the views in your database. Click on Views in the pane at left. Those in the
SYSHADOOP schema include views of system catalog tables.

3.2. Issuing queries and inspecting results
In this exercise, you will work with the SQL Editor to execute a query the system catalog and inspect the
results.
__1. Click Develop > SQL Editor in the menu at left.

__2. Enter a valid user ID and password for your database connection if prompted. Check the box to
save this data and click OK. Skip the next step.

Page 18 of 37


__3. If you were not prompted for an ID and password, click Options at far right to expose a
Database Connection menu. Select the database connection by name from the drop-down
menu. Accept all other defaults and click Connect.

When prompted, enter a valid user ID and password for your database connection. Check the
box to save this data and click OK.

__4. Query the system catalog for information about tables in your database. Paste the following
query into the SQL editor pane at left and click Run.
select tabschema, tabname from syscat.tables fetch first 15 rows only ;

Page 19 of 37


__5. Inspect the Results pane at bottom and verify that the query completed successfully. Use the
vertical scroll bar at right as needed to view the results.

__6. Enlarge the result set pane by clicking on the icon in the upper right corner.

__7. Observe that you can see the full result set more easily and that you can page through its
contents using buttons at the bottom of the page. (Note: Depending on the objects in your
database, your results may be different than those shown below.)

__8. Reset your display. Click the Develop > SQL Editor in the menu at left.

Page 20 of 37


3.3. Optional: Using syntax assistance
DSM includes a syntax assistant to help you visually construct simple Big SQL queries. This exercise
introduces you to this feature of the SQL Editor. Prior to completing this exercise, you must have created
and populated the GO_REGION_DIM sample table outlined in the Querying Structured Data module of
the Getting Started with Big SQL lab (http://www.slideshare.net/CynthiaSaracco/big-sql40-hol)
__1. With the SQL Editor launched, click Syntax Assist.

__2. When a list of tables appears, click on the GO_REGION_DIM table in the BIGSQL schema.

Page 21 of 37


__3. Use Control-Click to select the following columns: COUNTRY_KEY, COUNTRY_CODE, and
ISO_THREE_LETTER_CODE. Click OK.

__4. Inspect the query that appears in the SQL Editor. You can manually modify the query if desired
(perhaps to add a WHERE clause). However, in this case, simply click Run to execute the
query.

Page 22 of 37


__5. Inspect the results, a portion of which are shown below.

Page 23 of 37

Lab 4 Monitoring your environment
You can monitor your database and view event alerts through DSM. This lab introduces you to some
key capabilities available through the monitoring facilities. However, a full discussion of DSM's
monitoring facilities is beyond the scope of this introductory lab.
After completing this lab, you will know how to:
 Examine metrics associated with your Big SQL database.
 Explore system-generated alerts about database events.
Allow 15 - 30 minutes to complete this lab.
Prior to starting this lab, create and populate the sample tables outlined in the Querying Structured Data
module of the Getting Started with Big SQL lab (http://www.slideshare.net/CynthiaSaracco/big-sql40-hol).
4.1. Setting up your environment
Before inspecting metrics about your database activities, you will run some queries to generate a very
small workload.
__1. If you haven't already done so, create and populate the sample tables outlined in the Querying
Structured Data module of the Getting Started with Big SQL lab
(http://www.slideshare.net/CynthiaSaracco/big-sql40-hol). You can use the DSM SQL Editor to
run the required Big SQL commands, if desired.
__2. From the DSM Big SQL Editor, run the following query to retrieve data about goods sold by a
fictional retailer:
SELECT pnumb.product_name, sales.quantity,
meth.order_method_en
FROM
sls_sales_fact sales,
sls_product_dim prod,
sls_product_lookup pnumb,
sls_order_method_dim meth
WHERE
pnumb.product_language='EN'
AND sales.product_key=prod.product_key
AND prod.product_number=pnumb.product_number
AND meth.order_method_key=sales.order_method_key ;
__3. Next, run the following query:
SELECT pll.product_line_en AS Product,
md.order_method_en AS Order_method,
sum(sf.QUANTITY) AS total
FROM
sls_order_method_dim AS md,

Page 24 of 37

sls_product_dim AS pd,
sls_product_line_lookup AS pll,
sls_product_brand_lookup AS pbl,
sls_sales_fact AS sf
WHERE
pd.product_key = sf.product_key
AND md.order_method_key = sf.order_method_key
AND pll.product_line_code = pd.product_line_code
AND pbl.product_brand_code = pd.product_brand_code
GROUP BY pll.product_line_en, md.order_method_en
4.2. Examining database metrics
DSM provides metrics about your Big SQL database and its overall health. This exercise shows you how
to launch the database monitoring tools and begin exploring some metrics. For further details, consult
the product Knowledge Center or online help information.
__1. Launch the database monitoring facility. Click Monitor > Database.

__2. If prompted for a database connection, use the drop-down menu to select the Big SQL database
connection that you created in a previous exercise.

__3. Inspect the Overview information presented. (If necessary, click on the Overview tab.) Verify
that a pie chart with your Database Time Breakdown and a set of Key Metrics are displayed.
Note that specific data you see on your screen will vary from what’s shown below.

Page 25 of 37


__4. Examine details about SQL execution and other aspects in the Database Time Breakdown
section to determine which activities are consuming database resources. (Click the plus sign '+'
next to desired categories.)

__5. Investigate Key Metrics of your choice, such as CPU usage and average response time. Again,
click the plus sign (+) to reveal details about an item.

Page 26 of 37


__6. Optionally, click the Learn More link in the upper right corner to expose online help information.

__7. Explore the impact of recently executed SQL statements. Click the Statements tab (next to the
Overview tab) and then click the Executed Statements tab beneath it.

Page 27 of 37

__8. Scroll through statements executed on your database. Note that information is available
regarding average CPU time, rows read, rows returned, etc.

__9. Click on a given statement and then click the View Details button at top.

__10. Inspect further details about your statement, clicking on the plus sign (+) beside each menu item
to expose information. A portion of such data is shown below

Page 28 of 37


4.3. Examining alerts
DSM includes an alert facility that reports information about potential problems related to database
operations. This exercise shows you how to launch this facility and begin exploring some of its
capabilities. For further details, consult the product Knowledge Center or online help information.
__1. Launch the database monitoring facility. Click Monitor > Alerts.

__2. Inspect the output that appears in the primary pane. Significant historical events will appear
here. For example, the screen capture below was taken from a system in which 2 transactions
were rolled back.

Page 29 of 37

__3. If any system-issued alerts appear on your screen, click on an alert and then click the View
details button. Further information related to the alerting event will appear, beginning with a
summary section exposed through the What's Wrong? page.

If desired, click on Details and Recommended Actions at left to investigate further.
__4. In the upper right corner, click on the Alerts button to expose a list of recent alerts opened in the
last 5 minutes.

Page 30 of 37

Lab 5 Collecting statistics and viewing data access plans
This lab show you how to use DSM to collect statistics about your tables to help the Big SQL query
optimizer make well-informed decisions about data access plans for your queries. In addition, this lab
show you how to view the data access plan selected for a given query using EXPLAIN. The tasks in this
lab are similar to those covered in a separate module on data access plans included in the Getting
Started with Big SQL lab (http://www.slideshare.net/CynthiaSaracco/big-sql40-hol). However, in this lab,
you will use Web tooling rather than a command-line interface to collect statistics and examine data
access plans.
After completing this lab, you will know how to:
 Collect meta data (statistics) about your data.
 Collect and review data access plans for your queries.
Prior to beginning this lab, you need to create and populate the BIGSQL.SLS_PRODUCT_DIM table as
outlined in a separate lab on Querying Structured Data in the Getting Started with Big SQL lab
(http://www.slideshare.net/CynthiaSaracco/big-sql40-hol).

Allow 30 minutes to complete this lab.
5.1. Collecting statistics
DSM enables you to "analyze" -- or collect statistics about -- tables in your Big SQL database. . These
statistics influence query optimization, enabling the Big SQL query engine to select an efficient data
access path to satisfy your query. Collecting and maintaining accurate statistics is highly recommended
when dealing with large volumes of data.
__1. Open the database explorer. (Click Administer > Explore Databases).
__2. Locate the BIGSQL.SLS_PRODUCT_DIM table and click on the box beside it. This activates
additional menu options (buttons) at top.

Page 31 of 37


__3. Click Analyze. Inspect the new menu that appears, including the command pane at bottom.

__4. Customize your work as shown below. In the Columns box, select Full. Then select the
columns for which you want to collect statistics. To do so, drag the following columns from the
Available Columns pane to the pane beside it: product_key, product_number,
product_line_code, product_brand_code.

Page 32 of 37


__5. Click Next.
__6. Accept the defaults on the next page and click Finish.

__7. Examine the commands in the SQL Editor.

Page 33 of 37


__8. Delete the final command (CONNECT RESET;).
__9. Click Run.
__10. Inspect the results at bottom and verify that the commands completed successfully.

5.2. Using EXPLAIN
The EXPLAIN feature enables you to inspect the data access plan selected by the Big SQL optimizer for
your query. Such information is highly useful for performance tuning. This exercise introduces you to
the EXPLAIN feature in DSM.
If you already created EXPLAIN tables as part of the separate lab exercise on data access plans
included in the Getting Started with Big SQL lab (http://www.slideshare.net/CynthiaSaracco/big-sql40-
hol), skip the first 2 steps below.
__1. From the DSM SQL Editor, create the necessary EXPLAIN tables to hold information about your
query plans by calling the SYSINSTALLOBJECTS procedure. The example below shows one
way in which this procedure can be invoked. Casting a NULL in the last parameter causes a
single set of EXPLAIN tables to be created in schema SYSTOOLS, which can be made
accessible to all users.
CALL SYSPROC.SYSINSTALLOBJECTS('EXPLAIN', 'C', CAST (NULL AS VARCHAR(128)), CAST (NULL AS
VARCHAR(128)));

Page 34 of 37

__2. Authorize all Big SQL users to read data from the SYSTOOLS.EXPLAIN_INSTANCE table
created by the stored procedure you just executed.
grant select on systools.explain_instance to public;
__3. Paste or type the following query into the SQL Editor but do not run it:
select distinct product_key, introduction_date
from sls_product_dim;
__4. Click Explain.

__5. Examine the output. If desired, hover over each box in the data access plan to review further
details. Icons in the Query tab enable you to zoom in and out, scroll through the plan, and
perform other functions.

Page 35 of 37

Note that your plan for this query will be different from the sample shown here if you did not
define a primary key constraint on the PRODUCT_KEY column. (With such a constraint, Big
SQL will automatically determine that it does not need to sort the data to eliminate duplicates.
Click on the SQL Statement menu option at top to compare the original query to the optimized
version.)

Page 36 of 37

Lab 6 Summary
Congratulations! You’ve just learned several features of Data Server Manager, IBM’s Web tooling for Big
SQL. To expand your skills and learn more, consult the product's online documentation. In addition, visit
the HadoopDev web site (https://developer.ibm.com/hadoop/) for links to tutorials, blogs, and other
technical resources related to BigInsights and Hadoop.

Page 37 of 37




© Copyright IBM Corporation 2015. Written by C. M. Saracco.
The information contained in these materials is provided for
informational purposes only, and is provided AS IS without warranty
of any kind, express or implied. IBM shall not be responsible for any
damages arising out of the use of, or otherwise related to, these
materials. Nothing contained in these materials is intended to, nor
shall have the effect of, creating any warranties or representations
from IBM or its suppliers or licensors, or altering the terms and
conditions of the applicable license agreement governing the use of
IBM software. References in these materials to IBM products,
programs, or services do not imply that they will be available in all
countries in which IBM operates. This information is based on
current IBM product plans and strategy, which are subject to change
by IBM without notice. Product release dates and/or capabilities
referenced in these materials may change at any time at IBM’s sole
discretion based on market opportunities or other factors, and are not
intended to be a commitment to future product or feature availability
in any way.
IBM, the IBM logo and ibm.com are trademarks of International
Business Machines Corp., registered in many jurisdictions
worldwide. Other product and service names might be trademarks of
IBM or other companies. A current list of IBM trademarks is
available on the Web at “Copyright and trademark information” at
www.ibm.com/legal/copytrade.shtml.