AIOUG-GroundBreakers-2018 - Using Oracle Autonomous Health Framework to Preserve Performance and Rapidly Recover

SandeshRao4 219 views 129 slides Mar 05, 2019

About This Presentation

This session will focus on the best practice use of the Oracle Autonomous Health Framework (AHF) with an emphasis on consolidated or private cloud database deployments. It will utilize a workload test driver and schemas that can be used to validate the prognostic and performance management functiona...


Slide Content

Copyright © 2018, Oracle and/or its affiliates. All rights reserved.
Using Oracle Autonomous Health
Framework to Preserve
Performance and Rapidly Recover
TFA & ORAchk/EXAchk
Sandesh Rao
VP Autonomous Health and Machine Learning

Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for
information purposes only, and may not be incorporated into any contract. It is not a
commitment to deliver any material, code, or functionality, and should not be relied upon
in making purchasing decisions. The development, release, timing, and pricing of any
features or functionality described for Oracle’s products may change and remains at the
sole discretion of Oracle Corporation.

Program Agenda
1.Overview & History
2.Installation and Configuration
3.Reactive Usage
4.Proactive Usage
5.Centralized Usage

Introducing… Autonomous Health Framework
A collection of tools as components, which work together autonomously 24x7 to
keep database systems healthy and running while minimizing human reaction time.

Help Us Help You!
•Avoid the Pitfalls of Inefficient and Incomplete Diagnostics Collection
•Become Proactive and Avoid Encountering Known Issues

What is TFA?
TFA makes it quicker & easier to detect & diagnose Database problems
•Real-time fault detection, diagnostic collection & diagnosis via a single interface
•Secure log collection
•Continuous log lifecycle management
•Top problem detection & diagnostics

What is ORAchk/EXAchk?
Health checks for most impactful reoccurring problems
•Automatic proactive warning of problems before they impact you
•Get scheduled health reports sent to you in email
•Runs in your environment with no need to send anything to Oracle
•Findings can be integrated into other tools of choice
Common Framework:
–EXAchk: Engineered Systems
–ORAchk: Non Engineered Systems

Customer Experience Before TFA
Lots of Pings
Participants: Oracle Grid Infrastructure & Databases, Oracle Support
1.Open new Service Request
2.Collect data from all nodes without regard to relevance
3.Upload data
4.Collect more missing data (ping)
5.Upload more missing data
6.Download tools/scripts (ping)
7.Run tools/scripts
8.Upload results of tools/scripts
Multiple iterations & pings during SR resolution

Support Experience Before TFA
Lots of Pings
Participant: Oracle Support
1.Download all files to laptop or view separately in ISDE
2.Navigate 100s of log files to figure out problem areas
3.Search SR’s, bugs, MOS notes to find solutions
4.End of shift handover write-ups / discussion, after which the new analysts frequently asked
for some of the same data already uploaded
Multiple iterations & pings during SR resolution

Customer Experience Before ORAchk/EXAchk
Oracle Databases
“Hi Oracle Support, my database just fell over”
“ahh yer… we published a note on that a while ago… didn’t you see it?”
“hmmmmm”

Experience Today With TFA & ORAchk/EXAchk
@ Customer, TFA / ORAchk/EXAchk:
1.Auto Proactive Health checks
2.Integrate & Display Health Checks Results (Collection Manager, Others)
3.Detect Issue
4.Integrate with AHF
5.Notification of Issues
6.Trim, Capture, Package & Optionally Upload Diagnostics
@ Oracle, TFA UI (TFA Web):
1.Add advice to SR
2.Diagnose SR & Recommend Solution

Program Agenda
1.Overview & History
2.Installation and Configuration
3.Reactive Usage
4.Proactive Usage
5.Centralized Usage

Getting TFA (Which Includes ORAchk/EXAchk)
•Quarterly release cycle
–Follow similar release number formatting to DB
–Current release 18.3.0
•Installed by default for Grid Infrastructure
•Available for install from database homes
•Updated via Release Updates
•Also available on My Oracle Support (MOS): Doc 1513912.1
–Supported to patch GI Homes

Installation / Upgrade Using My Oracle Support Download
Continuous Service Mode (Preferred)
1.Transfer zip to required machine
2.Unzip
3.Execute self-extracting install script as root user
./installTFA-<platform>
•Will install/upgrade on all cluster nodes
•Will auto discover relevant Oracle Software & Exadata Storage Servers
•Will start monitoring all discovered items for significant events & collect
diagnostics when necessary

Installation / Upgrade Using My Oracle Support Download
Standalone Mode Installation
1.Transfer zip to required machine
2.Unzip
3.Execute self-extracting install script as the Oracle Software Owner
./installTFA-<platform> -extractto <path>

Configuration
•No configuration required; however, you can:
–Change the location where collections are generated
–Change the amount of space allocated to the collection location
tfactl set reposizeMB=20480
–Set up email notifications
–Configure log retention and purging
–Many more options are documented in the User Guide
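The repository size on the slide is given in megabytes (20480 MB is 20 GB). A minimal Python sketch of that conversion follows; the helper name `reposize_command` is hypothetical, and it only builds the command string shown on the slide rather than running it.

```python
def reposize_command(size_gb: int) -> str:
    """Build the `tfactl set reposizeMB=...` command for a size given in GB."""
    if size_gb <= 0:
        raise ValueError("repository size must be positive")
    # reposizeMB takes megabytes, so convert GB to MB (1 GB = 1024 MB).
    return f"tfactl set reposizeMB={size_gb * 1024}"


print(reposize_command(20))
```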

Program Agenda
1.Overview & History
2.Installation and Configuration
3.Reactive Usage
4.Proactive Usage
5.Centralized Usage

Reactive Usage
Participants: Oracle Grid Infrastructure & Databases, TFA, Oracle Support
1.Find events
2.Diagnose with DB tools
3.Perform diagnostic collection
4.Upload diagnostic collection to Oracle Support

TFA Command Interfaces
Command line
•Specify all command options at the command line
tfactl <command>
Shell
1.Set and change context
2.Run commands from within the shell
tfactl
tfactl > database MyDB
MyDB tfactl > oratop
Menu
•Select menu navigation options then choose the command you want to run
tfactl menu
REST
•Invoke commands over HTTPS
tfactl rest -start
https://host:port/ords/{api}

TFA Utilities To Detect and Analyze Issues
Usage: tfactl <tool>
Tool              Description
ORAchk or EXAchk  Provides health checks for the Oracle stack.
                  Oracle Trace File Analyzer will install:
                  •EXAchk for Engineered Systems, see document 1070954.1 for more details
                  •ORAchk for all non-Engineered Systems, see document 1268927.2 for more details
OSWatcher         Collects and archives OS metrics. These are useful for instance or node evictions
                  & performance issues. See document 301137.1 for more details
oratop            Provides near real-time database monitoring. See document 1500864.1 for more details
alertsummary      Provides summary of events for one or more database or ASM alert files from all nodes
ls                Lists all files TFA knows about for a given file name pattern across all nodes
pstack            Generates process stack for specified processes across all nodes
grep              Searches alert or trace files with a given database and file name pattern, for a search string
summary           Provides high level summary of the configuration

TFA Utilities To Detect and Analyze Issues
Tool              Description
vi                Opens alert or trace files for viewing a given database and file name pattern in the vi editor
tail              Runs a tail on alert or trace files for a given database and file name pattern
param             Shows all database and OS parameters that match a specified pattern
dbglevel          Sets and unsets multiple CRS trace levels with one command
history           Shows the shell history for the tfactl shell
changes           Reports changes in the system setup over a given time period. This includes database
                  parameters, OS parameters and patches applied
calog             Reports major events from the Cluster Event log
events            Reports warnings and errors seen in the logs
managelogs        Shows disk space usage and purges ADR log and trace files
ps                Finds processes
triage            Summarizes OSWatcher/ExaWatcher data

Collecting Diagnostics with TFA
Standard Diag Collection
1.Run
tfactl diagcollect
OR Run
tfactl diagcollect -last <n><d>|<h>
OR Run
tfactl diagcollect -from <date> -to <time>
2.Upload resulting zip file to SR
Targeted Diag Collection via SRDC
1.Run
tfactl diagcollect -srdc <srdc>
2.Upload resulting zip file to SR
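The collection forms on this slide differ only in their flags. A hedged Python sketch that assembles the argument list for each form is shown below; the function name `diagcollect_args` and the sample time strings are illustrative assumptions, and only the flags that appear on the slide (`-last`, `-from`, `-to`, `-srdc`) are used.

```python
import shlex


def diagcollect_args(last=None, start=None, end=None, srdc=None):
    """Return the argv list for one of the diagcollect forms shown on the slide."""
    args = ["tfactl", "diagcollect"]
    if srdc:                      # targeted collection via an SRDC
        return args + ["-srdc", srdc]
    if last:                      # e.g. "4h" or "1d"
        return args + ["-last", last]
    if start and end:             # explicit time window
        return args + ["-from", start, "-to", end]
    return args                   # default collection


# shlex.join quotes arguments that contain spaces, so the line is safe to paste.
print(shlex.join(diagcollect_args(last="4h")))
print(shlex.join(diagcollect_args(start="Oct/18/2018 00:00:00",
                                  end="Oct/18/2018 04:00:00")))
print(shlex.join(diagcollect_args(srdc="dbperf")))
```

The time format passed to `-from` and `-to` here simply mirrors the timestamps in the tool output; check the User Guide for the accepted formats.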

Targeted Diagnostics – Service Request Data Collections (SRDCs)
Manual Method
1.Generate ADDM reviewing Document 1680075.1 (multiple steps)
2.Identify “good” and “problem” periods and gather AWR reviewing
Document 1903158.1 (multiple steps)
3.Generate AWR compare report (awrddrpt.sql) using “good” and
“problem” periods
4.Generate ASH report for “good” and “problem” periods reviewing
Document 1903145.1 (multiple steps)
5.Collect OSWatcher data reviewing Document 301137.1 (multiple steps)
6.Collect Hang Analyze output at Level 4
7.Generate SQL Healthcheck for problem SQL id using Document
1366133.1 (multiple steps)
8.Run support provided SQL scripts – Log File sync diagnostic output using
Document 1064487.1 (multiple steps)
9.Check alert.log if there are any errors during the “problem” period
10.Find any trace files generated during the “problem” period
11.Collate and upload all the above files/outputs to SR
Automated One Command TFA SRDC
1.Run
tfactl diagcollect -srdc dbperf [-sr <sr_number>]

Detect and Collect

Use ‘tfactl’ to check for recent Errors
bash-4.1# tfactl events
Output from host : myserver69
INFO :2
ERROR :2
WARNING :0
Event Timeline:
[Oct/18/2018 02:38:25.000]: [db.ogg11204.ogg112041]: Incident details in: /scratch/app/oradb/diag/rdbms/ogg11204/ogg112041/incident/incdir_102702/ogg112041_ora_5001_i102702.trc
[Oct/18/2018 02:38:25.000]: [db.ogg11204.ogg112041]: ORA-00600: internal error code, arguments: [ksprcvsp2], [1596993584], [], [], [], [], [], [], [], [], [], []
[Oct/18/2018 02:38:37.000]: [db.ogg11204.ogg112041]: Incident details in: /scratch/app/oradb/diag/rdbms/ogg11204/ogg112041/incident/incdir_102703/ogg112041_ora_5001_i102703.trc
[Oct/18/2018 02:38:37.000]: [db.ogg11204.ogg112041]: ORA-00600: internal error code, arguments: [ktfbtgex-7], [1015817], [1024], [1015816], [], [], [], [], [], [], [], []
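The event timeline has a regular `[timestamp]: [target]: message` shape, so it can be post-processed with a small script. The Python sketch below assumes each event occupies a single line and that month names are in English; `parse_events` is a hypothetical helper, not part of TFA.

```python
import re
from datetime import datetime

# One timeline entry: [timestamp]: [target]: message
EVENT_RE = re.compile(r"^\[(?P<ts>[^\]]+)\]: \[(?P<target>[^\]]+)\]: (?P<msg>.*)$")


def parse_events(text):
    """Return (timestamp, target, message) tuples for each timeline line."""
    events = []
    for line in text.splitlines():
        m = EVENT_RE.match(line.strip())
        if m:
            ts = datetime.strptime(m["ts"], "%b/%d/%Y %H:%M:%S.%f")
            events.append((ts, m["target"], m["msg"]))
    return events


SAMPLE = """\
INFO :2
ERROR :2
Event Timeline:
[Oct/18/2018 02:38:25.000]: [db.ogg11204.ogg112041]: ORA-00600: internal error code, arguments: [ksprcvsp2], [1596993584], [], [], [], [], [], [], [], [], [], []
[Oct/18/2018 02:38:37.000]: [db.ogg11204.ogg112041]: ORA-00600: internal error code, arguments: [ktfbtgex-7], [1015817], [1024], [1015816], [], [], [], [], [], [], [], []
"""

for ts, target, msg in parse_events(SAMPLE):
    if msg.startswith("ORA-"):
        print(ts.isoformat(), target, msg.split(",")[0])
```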

Check to see if a change may have caused the issue
-bash-4.1# tfactl changes
Output from host : myserver69
------------------------------
[Oct/17/2018 04:54:15.397]: [RDBMS.myDB1]: Parameter: parallel_max_servers: Value: 8 => 16
[Oct/17/2018 05:12:13.344]: [RDBMS.myDB1]: Parameter: log_archive_dest_1: Value: /var => /opt
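To line changes up against the time an error first appeared, the parameter-change lines can be pulled into structured records. This is a sketch under the assumption that the lines keep the `Parameter: <name>: Value: <old> => <new>` shape shown above; `parse_changes` is an illustrative name.

```python
import re

CHANGE_RE = re.compile(
    r"\[(?P<ts>[^\]]+)\]: \[(?P<target>[^\]]+)\]: "
    r"Parameter: (?P<name>\S+): Value: (?P<old>.*?)\s*=>\s*(?P<new>.*)$"
)


def parse_changes(text):
    """Return one dict per parameter-change line (ts, target, name, old, new)."""
    return [m.groupdict() for m in map(CHANGE_RE.search, text.splitlines()) if m]


SAMPLE = """\
Output from host : myserver69
[Oct/17/2018 04:54:15.397]: [RDBMS.myDB1]: Parameter: parallel_max_servers: Value: 8 => 16
[Oct/17/2018 05:12:13.344]: [RDBMS.myDB1]: Parameter: log_archive_dest_1: Value: /var => /opt
"""

for change in parse_changes(SAMPLE):
    print(change["ts"], change["name"], change["old"], "->", change["new"])
```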

Metadata search capability
•All metadata stored in the TFA index is searchable:
tfactl search -showdatatypes|-json [json_details]
•Searching for all events for a database between certain dates:
tfactl search -json
'{
"data_type":"event",
"content":"oracle",
"database":"rac11g",
"from":"10/01/2018 00:00:00",
"to":"10/21/2018 00:00:00"
}'
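Hand-writing the JSON argument invites quoting mistakes, such as typographic quotes pasted from a slide. One way to avoid that, sketched here in Python with a hypothetical `search_command` helper, is to serialise the filter and shell-quote it; the field names are the ones shown on the slide.

```python
import json
import shlex


def search_command(filters: dict) -> str:
    """Build a `tfactl search -json '<payload>'` line with safe shell quoting."""
    return "tfactl search -json " + shlex.quote(json.dumps(filters))


cmd = search_command({
    "data_type": "event",
    "content": "oracle",
    "database": "rac11g",
    "from": "10/01/2018 00:00:00",
    "to": "10/21/2018 00:00:00",
})
print(cmd)
```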

Metadata search capability
•Listing all index events:
tfactl search -json '{"data_type":"event"}'
•Listing all available datatypes:
tfactl search -showdatatypes

Collect ORA-00600 SRDC
bash-4.1$ ./tfactl diagcollect -srdc ORA-00600
Enter the time of the ORA-00600 [YYYY-MM-DD HH24:MI:SS,=ALL] :
Enter the Database Name [=ALL] :
1. Oct/18/2018 02:38:37 : [ogg11204] ORA-00600: internal error code, arguments: [ktfbtgex-7], [1015817], [1024], [1015816], [], [], [], [], [], [], [], []
2. Oct/18/2018 02:38:25 : [ogg11204] ORA-00600: internal error code, arguments: [ksprcvsp2], [1596993584], [], [], [], [], [], [], [], [], [], []
Please choose the event : 1-2 [1]

Collect ORA-00600 SRDC
Selected value is : 1 ( Oct/18/2018 02:38:37 )
Scripts to be run by this srdc: ipspack rdahcve1210 rdahcve1120 rdahcve1110
Components included in this srdc: OS CRS DATABASE
Collecting data for local node(s)
Scanning files from Oct/17/2018 20:38:37 to Oct/18/2018 08:38:37
WARNING: End time entered is after the current system time.
Collection Id : 20181018032231myserver69
Detailed Logging at :
/scratch/app/oragrid/tfa/repository/srdc_ora600_collection_Thu_Oct_18_03_22_31_PDT_2018_node_local/diagcollect_20181018032231_myserver69.log
2018/10/18 03:22:36 PDT : NOTE : Any file or directory name containing the string .com will be renamed to replace .com with dotcom

Collect ORA-00600 SRDC
.--------------------------------------------.
|             Collection Summary             |
+------------+-----------+--------+----------+
| Host       | Status    | Size   | Time     |
+------------+-----------+--------+----------+
| myserver69 | Completed | 2MB    | 97s      |
'------------+-----------+--------+----------'
Logs are being collected to:
/scratch/app/oragrid/tfa/repository/srdc_ora600_collection_Thu_Oct_18_03_22_31_PDT_2018_node_local
/scratch/app/oragrid/tfa/repository/srdc_ora600_collection_Thu_Oct_18_03_22_31_PDT_2018_node_local/myserver69.tfa_srdc_ora600_Thu_Oct_18_03_22_31_PDT_2018.zip

Analyze
•Each tool can be run using tfactl in shell mode
•Start tfactl shell with
tfactl
•Run a tool with the tool name
tfactl > orachk
1.Where necessary set context with database <dbname>
tfactl > database MyDB
2.Then run tool
MyDB tfactl > oratop
3.Clear context with database
MyDB tfactl > database
Copyright © 2018,Oracle and/or its affiliates. All rights reserved. |
Manage logs
40Confidential –Oracle Internal

Automatic Database Log Purge
•TFA can automatically purge database logs
–OFF by default
–Except on a Domain Service Cluster (DSC), where it is ON by default
•Turn auto purging on or off:
tfactl set manageLogsAutoPurge=<ON|OFF>
•Will remove logs older than 30 days
–configurable with:
tfactl set manageLogsAutoPurgePolicyAge=<n><d|h>
•Purging runs every 60 minutes
–configurable with:
tfactl set manageLogsAutoPurgeInterval=<minutes>

Manual Database Log Purge
•TFA can manage ADR log and trace files
–Show disk space usage of individual diagnostic destinations
–Purge these file types based on diagnostic location and/or age:
•"ALERT", "INCIDENT", "TRACE", "CDUMP", "HM", "UTSCDMP", "LOG"
tfactl managelogs <options>
Option                             Description
-show usage                        Shows disk space usage per diagnostic directory for both GI and database logs
-show variation -older <n><m|h|d>  Use to determine per directory disk space growth.
                                   Shows the disk usage variation for the specified period per directory.
-purge -older <n><m|h|d>           Remove all ADR files under the GI_BASE directory, which are older than the time specified
-gi                                Restrict command to only diagnostic files under the GI_BASE
-database [all | dbname]           Restrict command to only diagnostic files under the database directory. Defaults to all,
                                   alternatively specify a database name
-dryrun                            Use with -purge to estimate how many files will be affected and how much disk space will be
                                   freed by a potential purge command.
Notes:
•Runs as the ADR home owner, so it will only be able to purge files this owner has permission to delete
•May take a while for a large number of files

Manual Database Log Purge
tfactl managelogs -show usage
tfactl managelogs -show variation -older <n><m|h|d>
Use -gi to only show grid infrastructure
Use -database to only show database

Manual Database Log Purge
tfactl managelogs -purge -older <n><m|h|d> -dryrun
tfactl managelogs -purge -older <n><m|h|d>
Use -dryrun for a “what if”
Copyright © 2018,Oracle and/or its affiliates. All rights reserved. |
Summary
45Confidential –Oracle Internal

Summary
-bash-4.1# tfactl summary
Executing Summary in Parallel on Following Nodes:
Node : myserver69
Node : myserver70
Node : myserver71
LOGFILE LOCATION :
/scratch/app/oragrid/tfa/repository/suptools/myserver69/summary/root/20181204025828/log/summary_command_20181204025828_myserver69_8963.log
Component Specific Summary collection :
-Collecting CRS details ... Done.
-Collecting ASM details ... Done.
-Collecting ACFS details ... Done.
-Collecting DATABASE details ... Done.
-Collecting PATCH details ... Done.
-Collecting LISTENER details ... Done.
-Collecting NETWORK details ... Done.
-Collecting OS details ... Done.
-Collecting TFA details ... Done.
-Collecting SUMMARY details ... Done.

Summary
Remote Summary Data Collection : In-Progress -Please wait ...
-Data Collection From Node -myserver70 .. Done.
-Data Collection From Node -myserver71 .. Done.
Prepare Clusterwide Summary Overview ... Done
cluster_status_summary
COMPONENT STATUS DETAILS
+-----------+---------+---------------------------------------------------------------------------------------------------+
CRS OK .-----------------------------------------------------------------------.
| CRS_SERVER_STATUS : ONLINE |
| CRS_STATE : ONLINE |
| CRS_INTEGRITY_CHECK : PASS |
| CRS_RESOURCE_STATUS : OFFLINE Resources Found|
'-----------------------------------------------------------------------'

Summary
ASM PROBLEM .-----------------------------------------------------------------------------------------.
| ASM_DISK_SIZE_STATUS : WARNING -Available Size < 20% |
| ASM_BLOCK_STATUS : PASS |
| ASM_CHAIN_STATUS : PASS |
| ASM_INCIDENTS : PASS |
| ASM_PROBLEMS : FAIL |
'-----------------------------------------------------------------------------------------'
ACFS OFFLINE .---------------------------------.
| ACFS_STATUS : OFFLINE |
'---------------------------------'

Summary
DATABASE PROBLEM
.--------------------------------------------------------------------------------------------------------------------------------------------------.
| ORACLE_HOME_DETAILS |ORACLE_HOME_NAME |
+----------------------------------------------------------------------------------------------------------------+-------------------------------+
| .--------------------------------------------------------------------------------------------------------------.| OraDb11g_home1 |
| | PROBLEMS | INCIDENTS | DB_BLOCKS | DATABASE_NAME | STATUS | DB_CHAINS | |
| +-----------+-----------+----------------+------------------+---------------+----------------+ | |
| | PASS | PASS | PROBLEM | apxcmupg| PROBLEM | PROBLEM | | |
| '------------+-----------+----------------+------------------+---------------+----------------’ | |
'-----------------------------------------------------------------------------------------------------------------+--------------------------------'
PATCH OK
.----------------------------------------------------------------------------------.
| CRS_PATCH_CONSISTENCY_ACROSS_NODES : OK |
| DATABASE_PATCH_CONSISTENCY_ACROSS_NODES : OK |
'----------------------------------------------------------------------------------'
LISTENER OK
.--------------------------------.
| LISTNER_STATUS : OK |
'--------------------------------'

Summary
NETWORK PROBLEM
.-------------------------------------------------------------.
| NODE_APPLICATION_CHECK : FAIL |
| NODE_CONNECTIVITY : FAIL |
| NTP_DAEMON_SLEW_OPTION_CHECK : FAIL |
'-------------------------------------------------------------'
OS OK
.--------------------------------------.
| MEM_USAGE_STATUS : OK |
'--------------------------------------'
TFA OK
.----------------------------------.
| TFA_STATUS : RUNNING |
'----------------------------------'
SUMMARY OK
.-----------------------------------------------------------.
| SUMMARY_EXECUTION_TIME : 0H:1M:48S |
'-----------------------------------------------------------'
+-----------+---------+---------------------------------------------------------------------------------------------------+
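The summary boxes all use `| KEY : VALUE |` rows, which makes it straightforward to extract the failing checks from a saved copy of the output. The Python sketch below is illustrative only; `parse_status` is a hypothetical helper, and it assumes check names consist of upper-case letters and underscores as in the output above.

```python
import re

# A status row inside a summary box: | KEY : VALUE |
STATUS_RE = re.compile(r"^\|\s*(?P<key>[A-Z_]+)\s*:\s*(?P<value>.*?)\s*\|$")


def parse_status(text):
    """Map each check name to its reported value, in order of appearance."""
    checks = {}
    for line in text.splitlines():
        m = STATUS_RE.match(line.strip())
        if m:
            checks[m["key"]] = m["value"]
    return checks


SAMPLE = """\
NETWORK PROBLEM
| NODE_APPLICATION_CHECK : FAIL |
| NODE_CONNECTIVITY : FAIL |
| NTP_DAEMON_SLEW_OPTION_CHECK : FAIL |
OS OK
| MEM_USAGE_STATUS : OK |
TFA OK
| TFA_STATUS : RUNNING |
| SUMMARY_EXECUTION_TIME : 0H:1M:48S |
"""

checks = parse_status(SAMPLE)
failing = [name for name, value in checks.items() if value.startswith("FAIL")]
print(failing)
```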

Summary
### Entering in to SUMMARY Command-Line Interface ###
tfactl_summary>list
Components : Select Component -select [component_number|component_name]
1 => overview
2 => crs_overview
3 => asm_overview
4 => acfs_overview
5 => database_overview
6 => patch_overview
7 => listener_overview
8 => network_overview
9 => os_overview
10 => tfa_overview
11 => summary_overview
tfactl_summary>

Find events

alertsummary
-bash-4.1# tfactl alertsummary
Output from host : myserver69
------------------------------
Reading /scratch/app/oradb/diag/rdbms/apxcmupg/apxcmupg_2/trace/alert_apxcmupg_2.log
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
Oct 29 16:19:37 Database started
Summary: Ora-600=0, Ora-7445=0, Ora-700=0
~~~~~~~
Warning: Only FATAL errors reported
Warning: These errors were seen and NOT reported
Ora-12012 Ora-04063 Ora-06508 Ora-06512 Ora-15064 Ora-03113 Ora-15080
Ora-27061 Ora-00202 Ora-15081 Ora-27072 Ora-00206 Ora-00221 Ora-00345

alertsummary
Reading /scratch/app/oradb/diag/rdbms/ogg11204/ogg112041/trace/alert_ogg112041.log
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
------------------------------------------------------------------------
Aug 01 08:14:48 Database started
Sep 13 07:08:40 Ora-00700 [kgerev1] ogg112041_ora_31177.trc
Sep 13 07:08:40 Ora-00600 [] ogg112041_ora_31177.trc
Sep 13 08:09:49 Ora-00600 [ktfbtgex-7] ogg112041_ora_8881.trc
Sep 13 08:38:43 Ora-00600 [ktfbtgex-7] ogg112041_ora_24227.trc
Sep 13 10:17:18 Ora-00600 [ktfbtgex-7] ogg112041_ora_10150.trc
Sep 15 04:27:17 SystemStateDumped ogg112041_diag_4271_20180915042717.trc
------------------------------------------------------------------------
Sep 18 14:25:15 Database started
Oct 18 02:38:25 Ora-00600 [ksprcvsp2] ogg112041_ora_5001.trc
Oct 18 02:38:37 Ora-00600 [ktfbtgex-7] ogg112041_ora_5001.trc

alertsummary
------------------------------------------------------------------------
Summary: Ora-600=8, Ora-7445=0, Ora-700=1
~~~~~~~
Warning: Only FATAL errors reported
Warning: These errors were seen and NOT reported
Ora-00202 Ora-15081 Ora-27072 Ora-15080 Ora-27061 Ora-00206 Ora-00221
Ora-19815 Ora-29913 Ora-29400
Reading /scratch/app/oragrid/diag/asm/+asm/+ASM1/trace/alert_+ASM1.log
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
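Each alert log section ends with a `Summary:` line of fatal-error counts. A small Python sketch for totalling those counts across several alert logs follows; `fatal_counts` is a hypothetical name, and summing across logs is a choice made for this example rather than something the tool does itself.

```python
import re


def fatal_counts(text):
    """Sum the counts from every 'Summary:' line, keyed by ORA error number."""
    totals = {}
    for line in text.splitlines():
        if line.startswith("Summary:"):
            for code, count in re.findall(r"Ora-(\d+)=(\d+)", line):
                key = f"ORA-{code}"
                totals[key] = totals.get(key, 0) + int(count)
    return totals


SAMPLE = """\
Summary: Ora-600=0, Ora-7445=0, Ora-700=0
Warning: Only FATAL errors reported
Summary: Ora-600=8, Ora-7445=0, Ora-700=1
"""

print(fatal_counts(SAMPLE))
```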

calog

calog
# tfactl calog
2018-12-05 10:36:56.301000 : (:CLSGN01660:) CLSNS-00017: invalid status: 3
CLSGN-00524: NS query for subdomain "myserver370044.us.oracle.com" failed.
An error was received from an operating system API:
CLSU-00107: operating system function: getaddrinfo; failed with error data: 0; at location: SCLSIN01
CLSU-00101: operating system error message: Error 0
CLSU-00104: additional error information: node name or service name not known
CLSGN-00178: Resolution of name "GNSTESTHOST.myserver370044.us.oracle.com" failed. : 15426651416834114/2275/1 :
2018-12-05 10:46:58.421000 : (:CLSGN01660:) CLSNS-00017: invalid status: 3
CLSGN-00524: NS query for subdomain "myserver370044.us.oracle.com" failed.
An error was received from an operating system API:
CLSU-00107: operating system function: getaddrinfo; failed with error data: 0; at location: SCLSIN01

ls files

ls files
tfactl ls alert_
Output from host : myserver65
------------------------------
/u01/app/crsusr/diag/asm/+asm/+ASM1/trace/alert_+ASM1.log
Output from host : myserver66
------------------------------
/u01/app/crsusr/diag/asm/+asm/+ASM2/trace/alert_+ASM2.log
/u02/app/racusr/diag/rdbms/ratc1c/ratc1c_1/trace/alert_ratc1c_1.log
Output from host : myserver67
------------------------------
/u01/app/crsusr/diag/asm/+asm/+ASM3/trace/alert_+ASM3.log
/u01/app/crsusr/diag/rdbms/_mgmtdb/-MGMTDB/trace/alert_-MGMTDB.log
/u02/app/racusr/diag/rdbms/ratc1c/ratc1c_2/trace/alert_ratc1c_2.log

grep files

grep files
# tfactl grep 'ORA-15130: diskgroup "MGMT"' alert_
Output from host : myserver65
------------------------------
Searching 'ORA-15130: diskgroup "MGMT' in alert_
Searching /u01/app/crsusr/diag/asm/+asm/+ASM1/trace/alert_+ASM1.log
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
13087:ORA-15130: diskgroup "MGMT" is being dismounted
13917:ORA-15130: diskgroup "MGMT" is being dismounted
15677:ORA-15130: diskgroup "MGMT" is being dismounted

Copyright © 2018,Oracle and/or its affiliates. All rights reserved. |
grep files
Output from host : myserver66
------------------------------
Searching 'ORA-15130: diskgroupMGMT' in alert_
Searching /u01/app/crsusr/diag/asm/+asm/+ASM2/trace/alert_+ASM2.log
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
Searching /u02/app/racusr/diag/rdbms/ratc1c/ratc1c_1/trace/alert_ratc1c_1.log
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
Output from host : myserver67
------------------------------
Searching 'ORA-15130: diskgroupMGMT' in alert_
Searching /u01/app/crsusr/diag/asm/+asm/+ASM3/trace/alert_+ASM3.log
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
Searching /u01/app/crsusr/diag/rdbms/_mgmtdb/-MGMTDB/trace/alert_-MGMTDB.log
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
Searching /u02/app/racusr/diag/rdbms/ratc1c/ratc1c_2/trace/alert_ratc1c_2.log
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
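
The grep output above names each searched file and then lists matches as "line number:text". A rough Python sketch that collects the matches per file follows; it assumes exactly the layout shown on these slides, and `grep_hits` is a hypothetical helper name.

```python
import re

# A match line looks like "13087:ORA-15130: ..." (line number, colon, text).
MATCH = re.compile(r"^(\d+):(.*)$")


def grep_hits(output: str) -> dict:
    """Map each searched file to its list of (line_number, text) matches."""
    hits = {}
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("Searching /"):
            # "Searching <path>" starts the results for a new file.
            current = line[len("Searching "):]
            hits[current] = []
        elif current is not None:
            m = MATCH.match(line)
            if m:
                hits[current].append((int(m.group(1)), m.group(2)))
    return hits


sample = '''Output from host : myserver65
------------------------------
Searching /u01/app/crsusr/diag/asm/+asm/+ASM1/trace/alert_+ASM1.log
+-+-+-+-+-+-
13087:ORA-15130: diskgroup "MGMT" is being dismounted
13917:ORA-15130: diskgroup "MGMT" is being dismounted
Output from host : myserver66
------------------------------
Searching /u01/app/crsusr/diag/asm/+asm/+ASM2/trace/alert_+ASM2.log
+-+-+-+-+-+-
'''

hits = grep_hits(sample)
for path, found in hits.items():
    print(path, len(found))
```

Files with an empty list were searched but contained no match, which is the case for every file on myserver66 and myserver67 in the slides.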

tail files
-bash-4.1# tfactl tail alert
Output from host : myserver69
------------------------------
==> /scratch/app/11.2.0.4/grid/log/myserver69/alertmyserver69.log <==
2018-11-25 23:28:22.532:
[ctssd(5630)]CRS-2409:The clock on host myserver69 is not synchronous with the mean cluster time. No
action has been taken as the Cluster Time Synchronization Service is running in observer mode.
2018-11-25 23:58:22.964:
[ctssd(5630)]CRS-2409:The clock on host myserver69 is not synchronous with the mean cluster time. No
action has been taken as the Cluster Time Synchronization Service is running in observer mode.
2018-11-26 00:28:23.395:
==> /scratch/app/oradb/diag/rdbms/apxcmupg/apxcmupg_2/trace/alert_apxcmupg_2.log <==
Sun Nov 25 06:00:00 2018
VKRM started with pid=82, OS id=4903
Sun Nov 25 06:00:02 2018
Begin automatic SQL Tuning Advisor run for special tuning task "SYS_AUTO_SQL_TUNING_TASK"
Sun Nov 25 06:00:37 2018
End automatic SQL Tuning Advisor run for special tuning task "SYS_AUTO_SQL_TUNING_TASK"
Sun Nov 25 23:00:28 2018
Thread 2 advanced to log sequence 759 (LGWR switch)
Current log# 3 seq# 759 mem# 0: +DATA/apxcmupg/onlinelog/group_3.289.917164707
Current log# 3 seq# 759 mem# 1: +FRA/apxcmupg/onlinelog/group_3.289.917164707
==> /scratch/app/oradb/diag/rdbms/ogg11204/ogg112041/trace/alert_ogg112041.log <==
Clearing Resource Manager plan via parameter
Sun Nov 25 05:59:59 2018
Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter
Sun Nov 25 05:59:59 2018
Starting background process VKRM
Sun Nov 25 05:59:59 2018
VKRM started with pid=36, OS id=4901
Sun Nov 25 22:00:31 2018
Thread 1 advanced to log sequence 305 (LGWR switch)
Current log# 1 seq# 305 mem# 0: +DATA/ogg11204/redo01.log
==> /scratch/app/oragrid/diag/asm/+asm/+ASM1/trace/alert_+ASM1.log <==
Thu Nov 22 04:42:22 2018
NOTE: [ocrcheck.bin@myserver69 (TNS V1-V3) 2323] opening OCR file
Fri Nov 23 01:05:39 2018
NOTE: [ocrcheck.bin@myserver69 (TNS V1-V3) 16591] opening OCR file
Fri Nov 23 01:05:41 2018
NOTE: [ocrcheck.bin@myserver69 (TNS V1-V3) 16603] opening OCR file
Fri Nov 23 01:21:12 2018
NOTE: [ocrcheck.bin@myserver69 (TNS V1-V3) 1803] opening OCR file
Fri Nov 23 01:21:12 2018
NOTE: [ocrcheck.bin@myserver69 (TNS V1-V3) 1816] opening OCR file

vi files
-bash-4.1# tfactl vi alert
2018-11-25 19:58:19.481:
[ctssd(5630)]CRS-2409:The clock on host myserver69 is not synchronous with the mean cluster time. No
action has been taken as the Cluster Time Synchronization Service is running in observer mode.
2018-11-25 20:28:19.911:
[ctssd(5630)]CRS-2409:The clock on host myserver69 is not synchronous with the mean cluster time. No
action has been taken as the Cluster Time Synchronization Service is running in observer mode.
2018-11-25 20:58:20.346:
[ctssd(5630)]CRS-2409:The clock on host myserver69 is not synchronous with the mean cluster time. No
action has been taken as the Cluster Time Synchronization Service is running in observer mode.

tfactl Shell History
tfactl> history
05 Dec 18 02:37:17 PST INFO Started session
05 Dec 18 02:37:35 PST COMMAND param kernel.panic
05 Dec 18 02:37:45 PST COMMAND history
tfactl>

Monitor Database performance

oratop (Support Tools Bundle)
Near Real-Time Database Monitoring
•Single instance & RAC
•Monitoring current database activities
•Database performance
•Identifying contention and bottlenecks

oratop
<Oratop Options>:
-d : real-time (RT) wait events, section 3 (default is Cumulative)
-k : FILE#:BLOCK#, section 4 lt is (EVENT/LATCH)
-m : MODULE/ACTION, section 4 (default is USERNAME/PROGRAM)
-s : SQL mode, section 4 (default is process mode)
-c : database service mode (default is connect string)
-f : detailed format, 132 columns (default: standard, 80 columns)
-b : batch mode (default is text-based user interface)
-n : maximum number of iterations (requires number)
-i : interval delay, requires value in seconds (default: 5s)
e.g.:
tfactl oratop -database testdb1
tfactl oratop -database testdb1 -bn1
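
When oratop is driven from a script, the options listed above can be assembled into an argument list. The sketch below only builds the list and runs nothing. The helper name `oratop_command` is hypothetical, and passing `-n` and `-i` as separate tokens (rather than the combined `-bn1` form on the slide) is an assumption that should be checked against your TFA version.

```python
def oratop_command(database, *, realtime_waits=False, sql_mode=False,
                   batch=False, iterations=None, interval=None):
    """Build an argv list for 'tfactl oratop' from the options on the slide."""
    cmd = ["tfactl", "oratop", "-database", database]
    if realtime_waits:
        cmd.append("-d")            # real-time wait events in section 3
    if sql_mode:
        cmd.append("-s")            # SQL mode in section 4
    if batch:
        cmd.append("-b")            # batch mode instead of the text UI
    if iterations is not None:
        cmd += ["-n", str(iterations)]   # assumed: separate token accepted
    if interval is not None:
        cmd += ["-i", str(interval)]     # interval delay in seconds
    return cmd


print(" ".join(oratop_command("testdb1", batch=True, iterations=1)))
```

The resulting list can be handed to `subprocess.run` on a host where tfactl is installed.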

oratop
-bash-4.1# tfactl oratop -database ogg11204
-bash-4.1# tfactl oratop -database ogg11204 -d
-bash-4.1# tfactl oratop -database ogg11204 -s

Procwatcher (Support Tools Bundle)
Monitor & Examine Database Processes
•Single instance & RAC
•Generates session wait, lock and latch reports as well as call stacks from any problem process(es)
•Ability to collect stack traces of specific processes using Oracle Tools and OS Debuggers
•Typically reduces SR resolution time for performance-related issues
•Runs on ALL major UNIX Platforms
•MOS Note: 459694.1 – Procwatcher Install Guide

Procwatcher
-bash-4.1# tfactl prw start
Mon Nov 26 05:01:09 PST 2018: Starting Procwatcher as user root
Mon Nov 26 05:01:09 PST 2018: Thank you for using Procwatcher. :-)
Mon Nov 26 05:01:09 PST 2018: Please add a comment to Oracle Support Note 459694.1
Mon Nov 26 05:01:09 PST 2018: if you have any comments, suggestions, or issues with this tool.
Procwatcher files will be written to: /scratch/app/oragrid/tfa/repository/suptools/prw/root
Mon Nov 26 05:01:09 PST 2018: Started Procwatcher

-bash-4.1# tfactl prw log runtime
Mon Nov 26 05:01:44 PST 2018: ..SQL: Running SQLvwaitchains.sql on SID ASM1
Mon Nov 26 05:01:49 PST 2018: Saving SQL report data for SID apxcmupg_2
Mon Nov 26 05:01:50 PST 2018: No contention found on DB instance apxcmupg_2, no additional data collection needed
Mon Nov 26 05:01:50 PST 2018: Saving SQL report data for SID ogg112041
Mon Nov 26 05:01:51 PST 2018: No contention found on DB instance ogg112041, no additional data collection needed
Mon Nov 26 05:01:51 PST 2018: Saving SQL report data for SID ASM1
Mon Nov 26 05:01:52 PST 2018: No contention found on DB instance ASM1, no additional data collection needed
Mon Nov 26 05:01:55 PST 2018: SQL collection complete after 44 seconds (10 SQLs - average seconds: 4)
Mon Nov 26 05:01:55 PST 2018: Cycle complete after 44 seconds
Mon Nov 26 05:01:55 PST 2018: Sleeping 16 seconds until time to run again per the INTERVAL setting (60 seconds)
################################################################################
Mon Nov 26 05:02:12 PST 2018: Collecting SQL Data for SID apxcmupg_2
Mon Nov 26 05:02:14 PST 2018: ..SQL: Running SQLvwaitchains.sql on SID apxcmupg_2
Mon Nov 26 05:02:17 PST 2018: Collecting SQL Data for SID ogg112041
Mon Nov 26 05:02:18 PST 2018: ..SQL: Running SQLvwaitchains.sql on SID ogg112041
Mon Nov 26 05:02:20 PST 2018: Collecting SQL Data for SID ASM1
Mon Nov 26 05:02:23 PST 2018: ..SQL: Running SQLvwaitchains.sql on SID ASM1
Mon Nov 26 05:02:26 PST 2018: Saving SQL report data for SID apxcmupg_2
Mon Nov 26 05:02:27 PST 2018: No contention found on DB instance apxcmupg_2, no additional data collection needed
Mon Nov 26 05:02:27 PST 2018: Saving SQL report data for SID ogg112041
Mon Nov 26 05:02:29 PST 2018: No contention found on DB instance ogg112041, no additional data collection needed
Mon Nov 26 05:02:29 PST 2018: Saving SQL report data for SID ASM1
Mon Nov 26 05:02:30 PST 2018: No contention found on DB instance ASM1, no additional data collection needed
Mon Nov 26 05:02:33 PST 2018: Sleeping 38 seconds until time to run again per the INTERVAL setting (60 seconds)
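
In the runtime log above, each cycle saves a SQL report per SID and then states "No contention found" for that instance. A minimal sketch of a filter built on that pattern follows. Treating a SID that has a saved report but no "No contention found" line as worth a closer look is a heuristic assumption of this example, not documented Procwatcher behaviour, and the function name is hypothetical.

```python
import re

SAVED = re.compile(r"Saving SQL report data for SID (\S+)")
CLEAN = re.compile(r"No contention found on DB instance (\S+),")


def instances_to_review(log_text: str) -> list:
    """Return SIDs that were checked but never reported as contention-free."""
    checked, clean = set(), set()
    for line in log_text.splitlines():
        m = SAVED.search(line)
        if m:
            checked.add(m.group(1))
            continue
        m = CLEAN.search(line)
        if m:
            clean.add(m.group(1))
    return sorted(checked - clean)


log = """Mon Nov 26 05:02:26 PST 2018: Saving SQL report data for SID apxcmupg_2
Mon Nov 26 05:02:27 PST 2018: No contention found on DB instance apxcmupg_2, no additional data collection needed
Mon Nov 26 05:02:27 PST 2018: Saving SQL report data for SID ogg112041
Mon Nov 26 05:02:29 PST 2018: No contention found on DB instance ogg112041, no additional data collection needed
"""

print(instances_to_review(log))
```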

Monitor processes
ps of Processes
-bash-4.1# tfactl ps lmd
Output from host : myserver69
------------------------------
oragrid 6143 1 0 Oct29 ? 01:13:45 asm_lmd0_+ASM1
oradb 7903 1 0 Oct29 ? 00:55:38 ora_lmd0_apxcmupg_2
oradb 7905 1 0 Oct29 ? 01:04:42 ora_lmd0_ogg112041
Output from host : myserver70
------------------------------
oragrid 6089 1 0 Oct29 ? 01:16:48 asm_lmd0_+ASM2
oradb 7035 1 0 Oct29 ? 01:03:55 ora_lmd0_ogg112042
Output from host : myserver71
------------------------------
oragrid 8343 1 0 Dec03 ? 00:03:06 asm_lmd0_+ASM3

pstack of Processes
-bash-4.1# tfactl pstack lmd
Output from host : myserver69
------------------------------
# pstack output for pid: 6143
#0 0x000000341cedf0d8 in poll () from /lib64/libc.so.6
#1 0x00007fcd83fd38a8 in ssskgxp_poll() from /scratch/app/11.2.0.4/grid/lib/libskgxp11.so
#2 0x00007fcd83fcbec2 in sskgxp_selectex() from /scratch/app/11.2.0.4/grid/lib/libskgxp11.so
#3 0x00007fcd83f78b4a in skgxpiwait() from /scratch/app/11.2.0.4/grid/lib/libskgxp11.so
#4 0x00007fcd83f7720a in skgxpwaiti() from /scratch/app/11.2.0.4/grid/lib/libskgxp11.so
#5 0x00007fcd83fb79fe in skgxpwait() from /scratch/app/11.2.0.4/grid/lib/libskgxp11.so
#6 0x0000000003a27525 in ksxpwait()
#7 0x00000000082e3bc9 in ksliwat()
#8 0x00000000082e072d in kslwaitctx()
#9 0x00000000082ddc3b in kslwait()
#10 0x0000000003a568b3 in ksxprcv_int()
#11 0x0000000003a550cc in ksxprcvimd()
#12 0x00000000041ed075 in kjctr_rksxp()
#13 0x00000000041f0633 in kjctrcv()
#14 0x00000000041d04c0 in kjcsrmg()
#15 0x0000000004265c1f in kjmdm()
#16 0x00000000021c941f in ksbrdp()
#17 0x00000000023efdc7 in opirip()
#18 0x000000000169df21 in opidrv()
#19 0x0000000001c7591b in sou2o ()
#20 0x0000000000853206 in opimai_real()
#21 0x0000000001c7bc39 in ssthrdmain()
#22 0x00000000008530fd in main ()
# pstack output for pid: 7903
#0 0x000000341cedf0d8 in poll () from /lib64/libc.so.6
#1 0x00007fd85dc678a8 in ssskgxp_poll() from /scratch/app/oradb/product/11.2.0/dbhome_11204/lib/libskgxp11.so
#2 0x00007fd85dc5fec2 in sskgxp_selectex() from /scratch/app/oradb/product/11.2.0/dbhome_11204/lib/libskgxp11.so
#3 0x00007fd85dc0cb4a in skgxpiwait() from /scratch/app/oradb/product/11.2.0/dbhome_11204/lib/libskgxp11.so
#4 0x00007fd85dc0b20a in skgxpwaiti() from /scratch/app/oradb/product/11.2.0/dbhome_11204/lib/libskgxp11.so
#5 0x00007fd85dc4b9fe in skgxpwait() from /scratch/app/oradb/product/11.2.0/dbhome_11204/lib/libskgxp11.so
#6 0x0000000004ebacf5 in ksxpwait()
#7 0x00000000094daff9 in ksliwat()
#8 0x00000000094d7b5d in kslwaitctx()
#9 0x00000000094d506b in kslwait()
#10 0x0000000004eea083 in ksxprcv_int()
#11 0x0000000004ee889c in ksxprcvimd()
#12 0x00000000055d6049 in kjctr_rksxp()
#13 0x00000000055d9607 in kjctrcv()
#14 0x00000000055b9494 in kjcsrmg()
#15 0x000000000564ebf3 in kjmdm()
#16 0x00000000026abbe3 in ksbrdp()
#17 0x0000000002910a9b in opirip()
#18 0x0000000001afd845 in opidrv()
#19 0x00000000020db5cf in sou2o ()
#20 0x0000000000a29ab6 in opimai_real()
#21 0x00000000020e18ed in ssthrdmain()
#22 0x0000000000a299ad in main ()
# pstack output for pid: 7905
#0 0x000000341cedf0d8 in poll () from /lib64/libc.so.6
#1 0x00007ff6260528a8 in ssskgxp_poll() from /scratch/app/oradb/product/11.2.0/dbhome_11204/lib/libskgxp11.so
#2 0x00007ff62604aec2 in sskgxp_selectex() from /scratch/app/oradb/product/11.2.0/dbhome_11204/lib/libskgxp11.so
#3 0x00007ff625ff7b4a in skgxpiwait() from /scratch/app/oradb/product/11.2.0/dbhome_11204/lib/libskgxp11.so
#4 0x00007ff625ff620a in skgxpwaiti() from /scratch/app/oradb/product/11.2.0/dbhome_11204/lib/libskgxp11.so
#5 0x00007ff6260369fe in skgxpwait() from /scratch/app/oradb/product/11.2.0/dbhome_11204/lib/libskgxp11.so
#6 0x0000000004ebacf5 in ksxpwait()
#7 0x00000000094daff9 in ksliwat()
#8 0x00000000094d7b5d in kslwaitctx()
#9 0x00000000094d506b in kslwait()
#10 0x0000000004eea083 in ksxprcv_int()
#11 0x0000000004ee889c in ksxprcvimd()
#12 0x00000000055d6049 in kjctr_rksxp()
#13 0x00000000055d9607 in kjctrcv()
#14 0x00000000055b9494 in kjcsrmg()
#15 0x000000000564ebf3 in kjmdm()
#16 0x00000000026abbe3 in ksbrdp()
#17 0x0000000002910a9b in opirip()
#18 0x0000000001afd845 in opidrv()
#19 0x00000000020db5cf in sou2o ()
#20 0x0000000000a29ab6 in opimai_real()
#21 0x00000000020e18ed in ssthrdmain()
#22 0x0000000000a299ad in main ()
Output from host : myserver70
------------------------------
# pstack output for pid: 6089
#0 0x000000369a6df0d8 in poll () from /lib64/libc.so.6
#1 0x00007f85fab708a8 in ssskgxp_poll() from /scratch/app/11.2.0.4/grid/lib/libskgxp11.so
#2 0x00007f85fab68ec2 in sskgxp_selectex() from /scratch/app/11.2.0.4/grid/lib/libskgxp11.so
#3 0x00007f85fab15b4a in skgxpiwait() from /scratch/app/11.2.0.4/grid/lib/libskgxp11.so
#4 0x00007f85fab1420a in skgxpwaiti() from /scratch/app/11.2.0.4/grid/lib/libskgxp11.so
#5 0x00007f85fab549fe in skgxpwait() from /scratch/app/11.2.0.4/grid/lib/libskgxp11.so
#6 0x0000000003a27525 in ksxpwait()
#7 0x00000000082e3bc9 in ksliwat()
#8 0x00000000082e072d in kslwaitctx()
#9 0x00000000082ddc3b in kslwait()
#10 0x0000000003a568b3 in ksxprcv_int()
#11 0x0000000003a550cc in ksxprcvimd()
#12 0x00000000041ed075 in kjctr_rksxp()
#13 0x00000000041f0633 in kjctrcv()
#14 0x00000000041d04c0 in kjcsrmg()
#15 0x0000000004265c1f in kjmdm()
#16 0x00000000021c941f in ksbrdp()
#17 0x00000000023efdc7 in opirip()
#18 0x000000000169df21 in opidrv()
#19 0x0000000001c7591b in sou2o ()
#20 0x0000000000853206 in opimai_real()
#21 0x0000000001c7bc39 in ssthrdmain()
#22 0x00000000008530fd in main ()
# pstack output for pid: 7035
#0 0x000000369a6df0d8 in poll () from /lib64/libc.so.6
#1 0x00007f648acc88a8 in ssskgxp_poll() from /scratch/app/oradb/product/11.2.0/dbhome_11204/lib/libskgxp11.so
#2 0x00007f648acc0ec2 in sskgxp_selectex() from /scratch/app/oradb/product/11.2.0/dbhome_11204/lib/libskgxp11.so
#3 0x00007f648ac6db4a in skgxpiwait() from /scratch/app/oradb/product/11.2.0/dbhome_11204/lib/libskgxp11.so
#4 0x00007f648ac6c20a in skgxpwaiti() from /scratch/app/oradb/product/11.2.0/dbhome_11204/lib/libskgxp11.so
#5 0x00007f648acac9fe in skgxpwait() from /scratch/app/oradb/product/11.2.0/dbhome_11204/lib/libskgxp11.so
#6 0x0000000004ebacf5 in ksxpwait()
#7 0x00000000094daff9 in ksliwat()
#8 0x00000000094d7b5d in kslwaitctx()
#9 0x00000000094d506b in kslwait()
#10 0x0000000004eea083 in ksxprcv_int()
#11 0x0000000004ee889c in ksxprcvimd()
#12 0x00000000055d6049 in kjctr_rksxp()
#13 0x00000000055d9607 in kjctrcv()
#14 0x00000000055b9494 in kjcsrmg()
#15 0x000000000564ebf3 in kjmdm()
#16 0x00000000026abbe3 in ksbrdp()
#17 0x0000000002910a9b in opirip()
#18 0x0000000001afd845 in opidrv()
#19 0x00000000020db5cf in sou2o ()
#20 0x0000000000a29ab6 in opimai_real()
#21 0x00000000020e18ed in ssthrdmain()
#22 0x0000000000a299ad in main ()
Output from host : myserver71
------------------------------
# pstack output for pid: 8343
#0 0x00007f12631d63c8 in poll () from /lib64/libc.so.6
#1 0x00007f12653b18a8 in ssskgxp_poll() from /scratch/app/11.2.0.4/grid/lib/libskgxp11.so
#2 0x00007f12653a9ec2 in sskgxp_selectex() from /scratch/app/11.2.0.4/grid/lib/libskgxp11.so
#3 0x00007f1265356b4a in skgxpiwait() from /scratch/app/11.2.0.4/grid/lib/libskgxp11.so
#4 0x00007f126535520a in skgxpwaiti() from /scratch/app/11.2.0.4/grid/lib/libskgxp11.so
#5 0x00007f12653959fe in skgxpwait() from /scratch/app/11.2.0.4/grid/lib/libskgxp11.so
#6 0x0000000003a27525 in ksxpwait()
#7 0x00000000082e3bc9 in ksliwat()
#8 0x00000000082e072d in kslwaitctx()
#9 0x00000000082ddc3b in kslwait()
#10 0x0000000003a568b3 in ksxprcv_int()
#11 0x0000000003a550cc in ksxprcvimd()
#12 0x00000000041ed075 in kjctr_rksxp()
#13 0x00000000041f0633 in kjctrcv()
#14 0x00000000041d04c0 in kjcsrmg()
#15 0x0000000004265c1f in kjmdm()
#16 0x00000000021c941f in ksbrdp()
#17 0x00000000023efdc7 in opirip()
#18 0x000000000169df21 in opidrv()
#19 0x0000000001c7591b in sou2o ()
#20 0x0000000000853206 in opimai_real()
#21 0x0000000001c7bc39 in ssthrdmain()
#22 0x00000000008530fd in main ()
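
All six lmd stacks above show the same call chain, ending in poll(), so comparing them by eye is tedious. One possible way to do it in Python is to reduce each stack to its sequence of function names and group the pids that share a sequence. The helper name and the short sample are assumptions of this sketch; addresses are ignored on purpose because they differ between homes and hosts.

```python
import re
from collections import defaultdict

PID = re.compile(r"^# pstack\s*output for pid: (\d+)")
FRAME = re.compile(r"^#\d+\s+0x[0-9a-f]+ in (\w+)\s*\(")


def stack_signatures(text: str) -> dict:
    """Group pids by the tuple of function names in their stack."""
    stacks = defaultdict(list)
    pid = None
    for raw in text.splitlines():
        line = raw.strip()
        m = PID.match(line)
        if m:
            pid = int(m.group(1))
            continue
        m = FRAME.match(line)
        if m and pid is not None:
            stacks[pid].append(m.group(1))
    groups = defaultdict(list)
    for p, frames in stacks.items():
        groups[tuple(frames)].append(p)
    return dict(groups)


sample = """# pstack output for pid: 6143
#0 0x000000341cedf0d8 in poll () from /lib64/libc.so.6
#1 0x00007fcd83fd38a8 in ssskgxp_poll() from /scratch/app/11.2.0.4/grid/lib/libskgxp11.so
#2 0x0000000003a27525 in ksxpwait()
# pstack output for pid: 7903
#0 0x000000341cedf0d8 in poll () from /lib64/libc.so.6
#1 0x00007fd85dc678a8 in ssskgxp_poll() from /scratch/app/oradb/product/11.2.0/dbhome_11204/lib/libskgxp11.so
#2 0x0000000004ebacf5 in ksxpwait()
"""

groups = stack_signatures(sample)
for frames, pids in groups.items():
    print(pids, "->", " < ".join(frames))
```

A single group means every sampled process is in the same code path; a process that lands in its own group is the one to examine first.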

Analyse OS Metrics

OS Watcher (Support Tools Bundle)
Collect & Archive OS Metrics
•Executes standard UNIX utilities (e.g. vmstat, iostat, ps, etc.) at regular intervals
•Built-in Analyzer functionality to summarize, graph and report upon collected metrics
•Output is required for node reboot and performance issues
•Simple to install, extremely lightweight
•Runs on ALL platforms (except Windows)
•MOS Note: 301137.1 – OS Watcher Users Guide

Analyse OS Metrics
-bash-4.1# tfactl oswbb
Starting OSW Analyzer V8.1.2
OSWatcher Analyzer Written by Oracle Center of Expertise
Copyright (c) 2017 by Oracle Corporation
Parsing Data. Please Wait...
Scanning file headers for version and platform info...
Parsing file myserver69_iostat_18.11.24.0900.dat ...
Parsing file myserver69_iostat_18.11.24.1000.dat ...
……..
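
The archive file names being parsed above appear to follow a host_tool_yy.mm.dd.HHMM.dat pattern. That reading is inferred only from the two names shown on the slide. Under that assumption, a small Python sketch can split a name into host, collector and hour; `parse_osw_name` is a hypothetical helper.

```python
import re
from datetime import datetime

# Assumed layout: <host>_<tool>_<yy.mm.dd.HHMM>.dat
NAME = re.compile(
    r"^(?P<host>.+)_(?P<tool>[A-Za-z]+)_(?P<stamp>\d{2}\.\d{2}\.\d{2}\.\d{4})\.dat$"
)


def parse_osw_name(name: str):
    """Return (host, tool, datetime) for an OSWatcher archive file name."""
    m = NAME.match(name)
    if m is None:
        raise ValueError(f"unrecognised archive name: {name}")
    when = datetime.strptime(m["stamp"], "%y.%m.%d.%H%M")
    return m["host"], m["tool"], when


print(parse_osw_name("myserver69_iostat_18.11.24.0900.dat"))
```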
Enter 1 to Display CPU Process Queue Graphs
Enter 2 to Display CPU Utilization Graphs
Enter 3 to Display CPU Other Graphs
Enter 4 to Display Memory Graphs
Enter 5 to Display Disk IO Graphs
Enter GC to Generate All CPU Gif Files
Enter GM to Generate All Memory Gif Files
Enter GD to Generate All Disk Gif Files
Enter GN to Generate All Network Gif Files
Enter L to Specify Alternate Location of Gif Directory
Enter Z to Zoom Graph Time Scale (Does not change analysis dataset)
Enter B to Returns to Baseline Graph Time Scale (Does not change analysis dataset)
Enter R to Remove Currently Displayed Graphs
Enter X to Export Parsed Data to Flat File
Enter S to Analyze Subset of Data(Changes analysis dataset including graph time scale)
Enter A to Analyze Data
Enter D to Generate DashBoard
Enter Q to Quit Program
Please Select an Option:1

[Two OSW Analyzer graph slides for myserver69; images not reproduced]

Check OS / DB parameters
-bash-4.1# tfactl param kernel.panic
Output from host : myserver69
.-----------------------------------------.
|                 OSPARAM                 |
+---------------------------------+-------+
| PARAM                           | VALUE |
+---------------------------------+-------+
| kernel.panic                    | 60    |
+---------------------------------+-------+
| kernel.panic_on_io_nmi          | 0     |
+---------------------------------+-------+
| kernel.panic_on_oops            | 1     |
+---------------------------------+-------+
| kernel.panic_on_unrecovered_nmi | 0     |
+---------------------------------+-------+
Output from host : myserver70
.-----------------------------------------.
|                 OSPARAM                 |
+---------------------------------+-------+
| PARAM                           | VALUE |
+---------------------------------+-------+
| kernel.panic                    | 120   |
+---------------------------------+-------+
| kernel.panic_on_io_nmi          | 0     |
+---------------------------------+-------+
| kernel.panic_on_oops            | 1     |
+---------------------------------+-------+
| kernel.panic_on_unrecovered_nmi | 0     |
+---------------------------------+-------+
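
The two tables above differ in one value: kernel.panic is 60 on myserver69 and 120 on myserver70. Spotting that kind of drift can be automated. The following is a minimal sketch that parses the PARAM/VALUE rows per host and reports parameters whose values disagree; both helper names are assumptions, and the parser expects only the table layout shown here.

```python
def parse_params(output: str) -> dict:
    """Return {host: {param: value}} from 'tfactl param' style tables."""
    result = {}
    host = None
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("Output from host :"):
            host = line.split(":", 1)[1].strip()
            result[host] = {}
        elif host and line.startswith("|"):
            cells = [c.strip() for c in line.strip("|").split("|")]
            # Data rows have two cells; skip the title and header rows.
            if len(cells) == 2 and cells[0] != "PARAM":
                result[host][cells[0]] = cells[1]
    return result


def drift(params: dict) -> dict:
    """Return {param: {host: value}} for params that differ across hosts."""
    names = set().union(*(p.keys() for p in params.values()))
    out = {}
    for name in sorted(names):
        values = {h: p.get(name) for h, p in params.items()}
        if len(set(values.values())) > 1:
            out[name] = values
    return out


sample = """Output from host : myserver69
| OSPARAM |
| PARAM | VALUE |
| kernel.panic | 60 |
| kernel.panic_on_oops | 1 |
Output from host : myserver70
| OSPARAM |
| PARAM | VALUE |
| kernel.panic | 120 |
| kernel.panic_on_oops | 1 |
"""

differences = drift(parse_params(sample))
print(differences)
```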

Diagnose cluster health
-bash-4.1# chactl query diagnosis -db oltpacdb -start "2018-11-26 02:52:50.0" -end "2018-11-26 03:19:15.0"
2018-11-26 01:47:10.0 Database oltpacdb DB Control File IO Performance (oltpacdb_1) [detected]
2018-11-26 01:47:10.0 Database oltpacdb DB Control File IO Performance (oltpacdb_2) [detected]
2018-11-26 02:52:15.0 Database oltpacdb DB CPU Utilization (oltpacdb_2) [detected]
2018-11-26 02:52:50.0 Database oltpacdb DB CPU Utilization (oltpacdb_1) [detected]
2018-11-26 02:59:35.0 Database oltpacdb DB Log File Switch (oltpacdb_1) [detected]
2018-11-26 02:59:45.0 Database oltpacdb DB Log File Switch (oltpacdb_2) [detected]
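
Each diagnosis line above carries a timestamp, database, problem name, instance and state. To see the order in which problems first appeared, the lines can be parsed and reduced to a first-seen time per problem. This is a sketch under the assumption that fields are space-separated exactly as in this output; the helper names are hypothetical.

```python
import re
from datetime import datetime

LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+Database\s+(\S+)\s+"
    r"(.+?)\s+\((\S+)\)\s+\[(\w+)\]$"
)
FMT = "%Y-%m-%d %H:%M:%S.%f"


def parse_diagnosis(output: str) -> list:
    """Return (time, database, problem, instance, state) tuples."""
    events = []
    for raw in output.splitlines():
        m = LINE.match(raw.strip())
        if m:
            ts, db, problem, inst, state = m.groups()
            events.append((datetime.strptime(ts, FMT), db, problem, inst, state))
    return events


def first_seen(events: list) -> dict:
    """Map each problem name to the earliest time it was reported."""
    seen = {}
    for when, _db, problem, _inst, _state in sorted(events):
        seen.setdefault(problem, when)
    return seen


sample = """2018-11-26 01:47:10.0 Database oltpacdb DB Control File IO Performance (oltpacdb_1) [detected]
2018-11-26 02:52:15.0 Database oltpacdb DB CPU Utilization (oltpacdb_2) [detected]
2018-11-26 02:52:50.0 Database oltpacdb DB CPU Utilization (oltpacdb_1) [detected]
2018-11-26 02:59:35.0 Database oltpacdb DB Log File Switch (oltpacdb_1) [detected]
"""

timeline = first_seen(parse_diagnosis(sample))
for problem, when in timeline.items():
    print(when, problem)
```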

Problem: DB Control File IO Performance
Description: CHA has detected that reads or writes to the control files are slower than expected.
Cause: The Cluster Health Advisor (CHA) detected that reads or writes to the control files were slow because of an increase in disk IO.
The slow control file reads and writes may have an impact on checkpoint and Log Writer (LGWR) performance.
Action: Separate the control files from other database files and move them to faster disks or Solid State Devices.

Problem: DB CPU Utilization
Description: CHA detected larger than expected CPU utilization for this database.
Cause: The Cluster Health Advisor (CHA) detected an increase in database CPU utilization because of an increase in the database workload.
Action: Identify the CPU intensive queries by using the Automatic Diagnostic and Defect Manager (ADDM) and follow the recommendations given there. Limit the number of CPU intensive queries or relocate sessions to less busy machines. Add CPUs if the CPU capacity is insufficent to support the load without a performance degradation or effects on other databases.

Problem: DB Log File Switch
Description: CHA detected that database sessions are waiting longer than expected for log switch completions.
Cause: The Cluster Health Advisor (CHA) detected high contention during log switches because the redo log files were small and the redo logs switched frequently.
Action: Increase the size of the redo logs.

Find if anything has changed
Has anything changed recently?
-bash-4.1# tfactl changes
Output from host : myserver69
------------------------------
[Oct/17/2018 04:54:15.397]: Parameter: fs.aio-nr: Value: 95488 => 97024
[Oct/17/2018 04:54:15.397]: Parameter: fs.inode-nr: Value: 764974131561 => 740744131259
[Oct/17/2018 04:54:15.397]: Parameter: kernel.pty.nr: Value: 2 => 1
[Oct/17/2018 04:54:15.397]: Parameter: kernel.random.entropy_avail: Value: 189 => 158
[Oct/17/2018 04:54:15.397]: Parameter: kernel.random.uuid: Value: 36269877-9bc9-40a3-82e0-1619865096f2 => 7551c5e7-c59f-40fa-b55f-5bd170e8b1ab
[Oct/17/2018 05:46:15.397]: Parameter: fs.aio-nr: Value: 119680 => 122880
[Oct/17/2018 05:46:15.397]: Parameter: fs.inode-nr: Value: 1580316810036 => 1562320768555
[Oct/17/2018 05:46:15.397]: Parameter: kernel.pty.nr: Value: 19 => 18
[Oct/17/2018 05:46:15.397]: Parameter: kernel.random.uuid: Value: 37cc31aa-ee31-459e-8f2a-0766b34b1b64 => f5176cdc-6390-415d-882e-02c4cff2ae4e
Output from host : myserver70
------------------------------
[Oct/17/2018 04:54:15.397]: Parameter: fs.aio-nr: Value: 95488 => 97024
[Oct/17/2018 04:54:15.397]: Parameter: fs.inode-nr: Value: 764974131561 => 740744131259
[Oct/17/2018 04:54:15.397]: Parameter: kernel.pty.nr: Value: 2 => 1
[Oct/17/2018 04:54:15.397]: Parameter: kernel.random.entropy_avail: Value: 189 => 158
[Oct/17/2018 04:54:15.397]: Parameter: kernel.random.uuid: Value: 36269877-9bc9-40a3-82e0-1619865096f2 => 7551c5e7-c59f-40fa-b55f-5bd170e8b1ab
[Oct/17/2018 05:46:15.397]: Parameter: fs.aio-nr: Value: 119680 => 122880
[Oct/17/2018 05:46:15.397]: Parameter: fs.inode-nr: Value: 1580316810036 => 1562320768555
[Oct/17/2018 05:46:15.397]: Parameter: kernel.pty.nr: Value: 19 => 18
[Oct/17/2018 05:46:15.397]: Parameter: kernel.random.uuid: Value: 37cc31aa-ee31-459e-8f2a-0766b34b1b64 => f5176cdc-6390-415d-882e-02c4cff2ae4e
[Oct/17/2018 16:56:15.398]: Parameter: fs.aio-nr: Value: 97024 => 98560
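
Much of the change list above is runtime counters rather than configuration: kernel.random.uuid yields a new value on every read and kernel.random.entropy_avail fluctuates constantly. A rough Python sketch for filtering such entries follows. The ignore list is an assumption of this example and should be tuned per site, and `significant_changes` is a hypothetical helper name.

```python
import re

CHANGE = re.compile(r"^\[(.+?)\]: Parameter: (\S+): Value: (.*?) => (.*)$")

# Assumed noise list: values that change on their own, not through admin action.
NOISY = frozenset({"kernel.random.uuid", "kernel.random.entropy_avail"})


def significant_changes(output: str, ignore=NOISY) -> list:
    """Return (timestamp, parameter, old, new) tuples, skipping noisy ones."""
    changes = []
    for raw in output.splitlines():
        m = CHANGE.match(raw.strip())
        if m and m.group(2) not in ignore:
            changes.append(m.groups())
    return changes


sample = """[Oct/17/2018 04:54:15.397]: Parameter: fs.aio-nr: Value: 95488 => 97024
[Oct/17/2018 04:54:15.397]: Parameter: kernel.random.entropy_avail: Value: 189 => 158
[Oct/17/2018 04:54:15.397]: Parameter: kernel.random.uuid: Value: 36269877-9bc9-40a3-82e0-1619865096f2 => 7551c5e7-c59f-40fa-b55f-5bd170e8b1ab
[Oct/17/2018 05:46:15.397]: Parameter: kernel.pty.nr: Value: 19 => 18
"""

kept = significant_changes(sample)
for stamp, name, old, new in kept:
    print(stamp, name, old, "->", new)
```

Timestamps are kept as strings here because the month abbreviation would make parsing locale-dependent.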

Program Agenda
1 Overview & History
2 Installation and Configuration
3 Reactive Usage
4 Proactive Usage
5 Centralized Usage

Proactively Detect database issues

ORAchk/EXAchk email Notification
•Automatically started & configured to run Critical Health Checks
•You only need to configure your email for notification
tfactl orachk/exachk -set “[email protected]

ORAchk/EXAchk Report

Configure Diagnostic Collection email Notification
•Set notification email for any problem detected:
tfactl set [email protected]
•To set notification email for specific ORACLE_HOMEs include the OS home owner:
tfactl set notificationAddress=oracle:[email protected]

Event Notification

Analysis in MOS

Program Agenda
1 Overview & History
2 Installation and Configuration
3 Reactive Usage
4 Proactive Usage
5 Central Repository and UI

Deploy with Minimum Footprint and Maximum Manageability
Oracle 18c Domain Services Cluster
•Hosts Framework as Services
•Reduces local resource footprint
•Centralizes management
•Speeds deployment and patching
•Optional Shared Storage
•Supports multiple versions and platforms going forward
[Diagram: an Oracle Cluster Domain in which Application Member Clusters and Database Member Clusters use services hosted on the Oracle Domain Services Cluster: Management Repository Service, Trace File Analyzer Service, Grid Names Service, Storage Services, QoS Management Service, Rapid Home Provisioning Service]

Domain Services Cluster Already Has TFA User Interface
•Central TFA Repository utilizing ACFS Storage
•Member Clusters Send TFA Collections to the TFA Service on DSC
•TFA Service indexes the Collection and runs Analysers.
•New UI will be shipped in 19

Standalone User Interface
•TFA Collector will upload to central repository
•TFA UI analyses files and generates
–Events TimeLine
–Anomaly TimeLine using Applied Machine Learning
–Root Cause Analysis and Recommendations where available.
–Interface to easily access all files and analyser reports.
•Already used in Oracle Database Cloud.
•Coming On-Prem in 19

Maintenance Slot Identification
ORAchk/EXAchk results are automatically uploaded to TFA & automatically processed
New ORAchk Dashboard will be available in 19