The statspack feature should be installed by running the following SQL scripts as the SYS user:
This should be run from sqlplus as follows:
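A typical session, assuming a default installation layout where the scripts live under $ORACLE_HOME/rdbms/admin (the `?` shorthand expands to ORACLE_HOME inside sqlplus), looks like this:

```sql
-- Connect as SYS (SYSDBA privilege is required to create the PERFSTAT user):
--   sqlplus "/ as sysdba"
-- then run the statspack creation script:
SQL> @?/rdbms/admin/spcreate.sql
-- The script prompts for the PERFSTAT password, default tablespace
-- and temporary tablespace, then creates the statspack schema.
```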
The script installs the statspack schema, owned by a new user called PERFSTAT. It will prompt you for the PERFSTAT user’s password, default tablespace and temporary tablespace.
The default tablespace is the tablespace that will be used to store the statspack schema tables. It is therefore recommended to create a new tablespace for this purpose before running the spcreate.sql script.
A statspack snapshot is taken by running the following procedure as the PERFSTAT user:
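For example, a default (level 5) snapshot can be taken with:

```sql
-- Take a statspack snapshot at the default level (5), as PERFSTAT
SQL> EXECUTE statspack.snap;
```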
Snapshots can also be run using different levels of information. Level 5 is the default level; however, sometimes I use level 7 as this gives additional information regarding I/O performance. For example:
Execute statspack.snap(i_snap_level => 7);
The following table details the available snapshot levels:

Level 0 - captures general statistics, including rollback segment, row cache, SGA, system events, background events, session events, system statistics, wait statistics, lock statistics and latch information.
Level 5 - includes capturing high resource usage SQL statements, along with all data captured by lower levels.
Level 6 - includes capturing SQL plan and SQL plan usage information for high resource usage SQL statements, along with all data captured by lower levels.
Level 7 - captures segment level statistics, including logical and physical reads, row lock, ITL and buffer busy waits, along with all data captured by lower levels.
Level 10 - includes capturing child latch statistics, along with all data captured by lower levels.
For storing general performance information it is common to run the statspack.snap (default level of 5) process via a background job every hour.
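Oracle supplies the spauto.sql script for exactly this purpose; alternatively the job can be scheduled manually with DBMS_JOB. A sketch of the manual approach (the bind variable name is illustrative):

```sql
-- Schedule an hourly level-5 statspack snapshot via DBMS_JOB (as PERFSTAT)
VARIABLE jobno NUMBER;
BEGIN
  DBMS_JOB.SUBMIT(:jobno,
                  'statspack.snap;',
                  TRUNC(SYSDATE + 1/24, 'HH'),      -- first run: top of the next hour
                  'TRUNC(SYSDATE + 1/24, ''HH'')'); -- repeat hourly, on the hour
  COMMIT;
END;
/
```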
However, when investigating a specific performance problem this may not be frequent enough. For example, imagine that it takes you 1 hour to complete a 5 mile trip. This may sound fairly reasonable until further analysis shows that 45 minutes of the trip were spent driving down a 1 mile stretch of road where you had to stop at 7 traffic lights. Obviously this 1 mile was a big bottleneck in the journey, and if a quicker alternative route could be taken this would significantly cut the overall travel time.
Therefore, when investigating a specific performance issue, more frequent snapshots may be required to capture where the bottlenecks are taking place. Personally, if I were monitoring a task which took 1 hour to complete, I would take snapshots at least once every 15 minutes.
If you want to gather more frequent (less than hourly) statspack reports on an on-going basis, it may be advisable to run level 0 snapshots. For example, for a general idea of on-going performance levels you should be able to run a level 0 snapshot every 15 minutes via a background job without causing much overhead to general Oracle resources.
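As a sketch, such a job could be submitted as follows (again using DBMS_JOB; the interval expression simply adds 15 minutes, expressed as a fraction of a day):

```sql
-- Schedule a level-0 snapshot every 15 minutes (as PERFSTAT)
VARIABLE jobno NUMBER;
BEGIN
  DBMS_JOB.SUBMIT(:jobno,
                  'statspack.snap(i_snap_level => 0);',
                  SYSDATE,
                  'SYSDATE + 15/1440');  -- 15 minutes = 15/1440 of a day
  COMMIT;
END;
/
```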
Statspack reports are generated by running the script spreport.sql as the PERFSTAT user as follows:
The script will provide you with a list of available snapshots, with the id and time of each snapshot.
You will be asked to enter the snapshot id for the starting point of the report, the snapshot id for the ending point, and a name for the report file.
The script will then generate a file, based on the provided inputs, containing the statspack report.
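A typical run, assuming the default script location, looks like:

```sql
-- Generate a statspack report as PERFSTAT
SQL> @?/rdbms/admin/spreport.sql
-- You are then prompted for:
--   begin_snap : snapshot id for the start of the reporting period
--   end_snap   : snapshot id for the end of the reporting period
--   report_name: name of the output file
```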
Oracle performance is based on the formula:
Response Time = Service Time + Wait Time
Service Time is the value of “CPU used by this session” from the statspack report; this is calculated using the formula:
Service Time = CPU Parse + CPU Recursive + CPU Other
CPU Parse is the value of “parse time cpu” from the statspack report.
CPU Recursive is the value of “recursive cpu usage” from the statspack report.
Therefore CPU Other is calculated using the formula:
CPU Other = Service Time – CPU Parse – CPU Recursive
The following statspack statistics are measured in centiseconds (one hundredth of a second) - which is apparently also how long it takes for a stroke of lightning to strike:
CPU used by this session
parse time cpu
recursive cpu usage
From Oracle 9i, the “% Total Elapsed Time” value in the “Top 5 Timed Wait Events” section of the statspack report is a direct percentage of Response Time.
Top 5 Timed Wait Events
% Total Elapsed Time
wait for unread message on broadcast channel
db file sequential read
library cache load lock
ARCH wait on SENDREQ
The “CPU time” entry in the Top 5 Timed Wait Events equates to Service Time.
CPU Other (using the above example values) = 1,150,887 cs – 18,975 cs – 535,372 cs = 596,540 cs
CPU time can be broken down further using the calculation:
%CPU Statistic = CPU Statistic / Service Time * %Total Elapsed CPU Time
Of the 24.74% of total elapsed time attributed to CPU time:
% CPU Other = 596,540 cs / 1,150,887 cs x 24.74 % = 12.82%
% CPU Parse = 18,975 cs / 1,150,887 cs x 24.74 % = 0.41%
% CPU Recursive = 535,372 cs / 1,150,887 cs x 24.74 % = 11.51%
Although the majority (52.90%) of the overall wait time is on “wait for unread message on broadcast channel”, this is actually an idle event and as such is unlikely to cause any performance issues. The next significant values to investigate are therefore CPU Other and CPU Recursive.
If the % CPU Other value is significant, the next step is to look in the SQL ordered by Gets section of the statspack report to find queries with a high number of buffer gets (logical I/Os) and identify candidates for tuning.
If the % CPU Parse value is significant, this can be caused by cursors being repeatedly opened and closed every time they are executed, instead of being opened once, kept open for multiple executions and only closed when no longer required. The SQL ordered by Parse Calls section of the statspack report can help identify such cursors.
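To dig deeper than the report, the shared pool can also be queried directly. This sketch lists statements whose parse count is close to their execution count; the thresholds are illustrative:

```sql
-- Statements parsed on (nearly) every execution are candidates for
-- cursor reuse; run as a suitably privileged user.
SELECT hash_value, parse_calls, executions,
       SUBSTR(sql_text, 1, 60) AS sql_text
FROM   v$sqlarea
WHERE  executions  > 100                 -- illustrative threshold
AND    parse_calls > executions * 0.9    -- parsed on almost every execution
ORDER  BY parse_calls DESC;
```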
A significantly high % CPU Recursive value is caused by Oracle performing recursive calls, which can be defined as follows:
Sometimes, to execute a SQL statement, the Oracle Server must issue additional statements. Such statements are called recursive calls or recursive SQL statements. For example, if you insert a row into a table that does not have enough space to hold that row, the Oracle Server makes recursive calls to allocate the space dynamically if dictionary managed tablespaces are being used.
Recursive calls are also generated due to the unavailability of dictionary info in the dictionary cache, firing of database triggers, execution of DDL, execution of SQL within PL/SQL blocks, functions or stored procedures and enforcement of referential integrity constraints.
Probably a good starting point for tackling a significant % CPU Recursive value is to look in SQL ordered by Gets section of the statspack report to find queries with a high number of buffer gets (logical I/Os) and identify candidates for tuning.
Also look in the Dictionary Cache Stats section of the statspack report to check how efficiently the Dictionary Cache is being utilised.
Disk I/O related waits
If there is a significantly high value of disk related wait events (e.g. db file scattered read or db file sequential read) then check the SQL ordered by Reads section of the statspack report to find queries performing the most reads from disk and identify candidates for tuning.
There are also two sections that provide an analysis of how the underlying file systems are performing; these are “Tablespace IO Stats for DB” and “File IO Stats for DB”. If average I/O times are approaching or above 20 ms, this suggests a review of the file systems is required, to see whether they can be configured more optimally or whether faster disks should be deployed.
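Average read times can also be checked directly from V$FILESTAT (times are recorded in centiseconds when TIMED_STATISTICS is enabled, so multiply by 10 for milliseconds). A sketch:

```sql
-- Average read time per datafile, in milliseconds
SELECT d.name,
       f.phyrds,
       ROUND(f.readtim / GREATEST(f.phyrds, 1) * 10, 1) AS avg_read_ms
FROM   v$filestat f, v$datafile d
WHERE  f.file# = d.file#
ORDER  BY avg_read_ms DESC;
```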
Latch related waits
Statspack has two sections to help identify causes of Library Cache / Shared Pool latch issues: SQL ordered by Sharable Memory and SQL ordered by Version Count. Statements with many versions (multiple child cursors with the same parent cursor, i.e. identical SQL text but different properties such as the owning schema of objects, optimizer session settings, or types and lengths of bind variables) are un-sharable. This means they can consume excessive memory resources in the Shared Pool and cause performance problems related to parsing (e.g. Library Cache and Shared Pool latch contention) or lookup time (e.g. Library Cache latch contention).
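High version counts can also be spotted directly in the shared pool; for example (the threshold of 20 is illustrative):

```sql
-- Find potentially un-sharable statements with many child cursors
SELECT hash_value, version_count, sharable_mem,
       SUBSTR(sql_text, 1, 60) AS sql_text
FROM   v$sqlarea
WHERE  version_count > 20    -- illustrative threshold
ORDER  BY version_count DESC;
```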
Below is an extract from a statspack report showing the SQL with the highest number of buffer gets (logical I/Os) for the duration of the report.
The SQL statements in this section may appear here for different reasons; for example:
Divide the elapsed time by the number of executions to see how efficient the code is as a single execution.
The HASH VALUE can be used to identify the full SQL statement; for example, if the code hasn’t been aged out of the Shared Pool, you can run:
SELECT *
FROM   V$SQLTEXT
WHERE  HASH_VALUE = <HASH_VALUE>
ORDER  BY PIECE;
However, Oracle supplies another utility to view not only the full SQL but also the explain plan for any given HASH_VALUE in the statspack report. This report can be generated by running the sprepsql.sql script as the PERFSTAT user:
The sprepsql.sql script will ask you for the starting and ending snapshot id of the report as well as the hash_value of the relevant SQL statement and name of the report to be generated. Below is a sample of such a report using the hash_value 885803753:
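For example, assuming the default script location:

```sql
-- Generate a SQL-specific statspack report as PERFSTAT
SQL> @?/rdbms/admin/sprepsql.sql
-- Prompts for: begin snapshot id, end snapshot id,
--              hash_value (e.g. 885803753) and report name
```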
If you are taking statspack snapshots on a regular basis it is a good idea to clean up and remove old snapshots from the database to avoid excessive use of disk space.
Oracle provides a script to do this which can be run as the PERFSTAT user as follows:
The script will display a list of snapshot ids and ask you to select a starting and ending id; it will then purge all snapshots within that range.
Once complete, the transaction will still be open (uncommitted), so at that point you can either commit (to confirm the purge) or rollback (to undo it).
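The supplied purge script is sppurge.sql; a typical run, assuming the default script location:

```sql
-- Purge a range of old snapshots as PERFSTAT
SQL> @?/rdbms/admin/sppurge.sql
-- Enter the low and high snapshot ids when prompted, then:
SQL> COMMIT;    -- confirm the purge (or ROLLBACK to undo it)
```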