Hungry DBA .com for DBAs who like food

Installing Statspack

The statspack feature should be installed by running the following SQL scripts as the SYS user:

$ORACLE_HOME/rdbms/admin/spcreate.sql

This should be run from sqlplus as follows:

SQL> @?/rdbms/admin/spcreate.sql

The script will install the statspack schema owned by a new user called PERFSTAT.  The script will ask you to supply the PERFSTAT user’s password, default tablespace and temporary tablespace.

The default tablespace is the tablespace that shall be used to store the statspack schema tables.  It is therefore recommended to create a new tablespace for this purpose before running the spcreate.sql script. 

Running Statspack

A statspack snapshot is taken by running the following procedure as the PERFSTAT user:

Execute statspack.snap;

Snapshots can also be run using different levels of information.  Level 5 is the default level; however sometimes I use level 7 as this will give additional information regarding I/O performance.  For example:

Execute statspack.snap(i_snap_level => 7);

The following table details what other snap levels are available:

SNAP LEVEL

Definition

0

This level captures general statistics, including rollback segment, row cache, SGA, system events, background events, session events, system statistics, wait statistics, lock statistics, and Latch information.

5

This level includes capturing high resource usage SQL Statements, along with all data captured by lower levels.

6

This level includes capturing SQL plan and SQL plan usage information for high resource usage SQL Statements, along with all data captured by lower levels.

7

This level captures segment level statistics, including logical and physical reads, row lock, itl and buffer busy waits, along with all data captured by lower levels.

10

This level includes capturing Child Latch statistics, along with all data captured by lower levels.

For storing general performance information it is common to run the statspack.snap (default level of 5) process via a background job every hour.  

However, when investigating a specific performance problem this may not be frequent enough.  For example imagine that it takes you 1 hour to complete a 5 mile trip.  This may sound fairly reasonable until further analysis shows that 45 minutes of the trip was spent driving down a 1 mile strip of road where you had to stop at 7 traffic lights.  Obviously this 1 mile was a big bottleneck to the journey and if an alternative quicker route could be taken this would sufficiently cut down the overall travel time.

Therefore when investigating a specific performance issue more frequent snapshots may be required to capture where the bottlenecks are taking place; Personally if I was monitoring a task which took 1 hour to complete I would take snapshots at least once every 15 minutes.

If you want to gather more frequent (less than an hour) statspack reports on an on-going basis it may be advisable to run a level 0 snapshot.  For example for a general idea of on-going performance levels you should be able to run a level 0 snapshot every 15 minutes via a background job without causing much of an overhead to general Oracle resources.

Generating Statspack Reports

Statspack reports are generated by running the script spreport.sql as the PERFSTAT user as follows:

SQL> @?/rdbms/admin/spreport.sql

The report will you provide a list of available snapshots with a relevant id and time of the snapshot.

You will be asked to enter a snapshot id for the starting point of the report, snapshot id for the ending point of the report as well as the name of the report.

The script will then generate a file, based on the provided inputs, containing the statspack report.

Analysing Wait Events

Oracle performance is based on the formula:

Response Time = Service Time + Wait Time 

Service Time is the value of “CPU used by this session” from the statspack report; this is calculated using the formula:

Service Time = CPU Parse + CPU Recursive + CPU Other 

CPU Parse is the value of “parse time cpu” from the statspack report.

CPU Recursive is the value of “recursive cpu usage” from the statspack report.

Therefore CPU Other is calculated using the formula:

Service Time – CPU Parse – CPU Recursive

The following statspack statistics are measured in centiseconds (one hundredth of a second) - which is apparently also how long it takes for a stroke of lightning to strike:

Statistic

Example Value

CPU used by this session

1,150,887

parse time cpu

18,975

Recursive cpu usage

535,372

From Oracle 9i the “%Total Elapsed Time” from the “Top 5 Timed Wait Events” section of the statspack report is a direct % of Response Time. 

Top 5 Timed Wait Events 

Event

Waits

Time (s)

% Total Elapsed Time

wait for unread message on broadcast channel

24,457

24,603

52.90

CPU time

 

11,509

24.74

Db file sequential read

2,013,123

3,045

6.55

Library cache load lock

1,889

1,833

3.94

ARCH wait on SENDREQ

6,570

1,683

3.62

The CPU time in the top 5 timed wait events equates to Service Time.

CPU Other (using the above example values) = 596,540 cs

CPU time can be broken down further using the calculation:

%CPU Statistic = CPU Statistic / Service Time * %Total Elapsed CPU Time 

For example:

Of the 24.74% total elapsed time spent waiting on CPU time:

% CPU Other              = 596,540 cs   / 1,150,887 cs             x 24.74 %        = 12.82%

% CPU Parse               = 18,975 cs     / 1,150,887 cs             x 24.74 %        = 0.41%

% CPU Recursive        = 535,372 cs   / 1,150,887 cs             x 24.74 %        = 11.51%

Although the majority (52.90%) of the overall wait times is on “wait for unread message on broadcast channel”; this is actually an idle event and as such is unlikely to cause any performance issues.  So the next significant values to investigate are CPU Other and CPU Recursive. 

 Identifying Problem Areas from the Wait Event Data

CPU Other 

If % CPU Other value is significant the next step would be to look in the SQL ordered by Gets section of the statspack report to find queries with a high number of buffer gets (logical I/Os) and identify candidates for tuning.

CPU Parse

If % CPU Parse value is significant, this can be caused by cursors being repeatedly opened and closed every time they are executed instead of being opened once, kept open for multiple executions and only closed when they are no longer required.  The SQL ordered by Parse Calls section of the statspack report can help identify such cursors.

CPU Recursive 

A significantly high % CPU Recursive value is caused by Oracle performing Recursive calls, this can be defined as follows:

Sometimes, to execute a SQL statement, the Oracle Server must issue additional statements.  Such Statements are called recursive calls or recursive SQL statements.  For example, if you insert a row into a table that does not have enough space to hold that row, the Oracle Server makes recursive calls to allocate the space dynamically if dictionary managed tablespaces are being used.

Recursive calls are also generated due to the unavailability of dictionary info in the dictionary cache, firing of database triggers, execution of DDL, execution of SQL within PL/SQL blocks, functions or stored procedures and enforcement of referential integrity constraints.

Probably a good starting point for tackling a significant % CPU Recursive value is to look in SQL ordered by Gets section of the statspack report to find queries with a high number of buffer gets (logical I/Os) and identify candidates for tuning.

Also look in the Dictionary Cache Stats section of the statspack report to check how efficiently the Dictionary Cache is being utilised.

Disk I/O related waits 

If there is a significantly high value of disk related wait events (e.g. db file scattered read or db file sequential read) then check the SQL ordered by Reads section of the statspack report to find queries performing the most reads from disk and identify candidates for tuning.

There are also two sections that provide an analysis of how the underlying file systems are performing; these are “Tablespace IO Stats for DB” and “File IO Stats for DB”.  If IO rates are approaching or are above 20 ms then this suggests a review of the file systems are required to see if they can be configured more optimally or whether faster disks can be deployed.

Latch related waits 

Statspack has 2 sections to help identify causes of Library Cache / Shared Pool Latch issues; SQL ordered by Sharable Memory and SQL ordered by Version Count.  Statements with many versions (multiple child cursors with the same parent cursor i.e. identical SQL text but different properties such as owning schema of objects, optimizer session settings, types & lengths of bind variables, etc.) are un-sharable.  This means they can consume excessive memory resources in the Shared Pool and cause performance problems related to parsing e.g. Library Cache and Shared Pool latch contention or lookup time e.g. Library Cache latch contention. 

Analysing inefficient SQL within the statspack report 

The below is an extract from a statspack report showing the SQL with the highest amount of buffer gets (logical IOs) for the duration of the report.

SQL ordered by Gets for DB: PIPR  Instance: PIPR1  Snaps: 13890 -13907

-> End Buffer Gets Threshold:   10000

-> Note that resources reported for PL/SQL includes the resources used by

   all SQL statements called within the PL/SQL code.  As individual SQL

   statements are also reported, it is possible and valid for the summed

   total % to exceed 100

                                                     CPU      Elapsd

  Buffer Gets    Executions  Gets per Exec  %Total Time (s)  Time (s) Hash Value

--------------- ------------ -------------- ------ -------- --------- ----------

     27,582,160       15,533        1,775.7   11.0  2291.33   3554.56  883547115

Module: JDBC Thin Client

call pk_bdslplus_data.p_get_instrument_data(:1,:2,:3)

     24,534,650       15,534        1,579.4    9.8  2207.23   3288.46 2592888974

Module: JDBC Thin Client

SELECT 'true' POS_IND FROM C_ADHD_AS_DEALT_HOLDING ADHD, C_BF_BA

SE_FUND BF, C_IN_INSTRUMENT CIN WHERE NVL(ADHD.C_ADHD_INDIRECT_M

KR,'N') = 'N' AND BF.C_BF_CODE = ADHD.C_ADHD_BF_CODE AND CIN.C_I

N_ID = ADHD.C_ADHD_IN_ID AND ADHD.C_ADHD_EXP_VALUE IS NOT NULL A

ND CIN.C_IN_SEDOL = :B1 AND CIN.C_IN_INTERNAL_IND != 'Y'

     18,016,650            1   18,016,650.0    7.2   254.74    324.61  864021573

Module: txpu2250.exe@mgunix61a (TNS V1-V3)

SELECT tb1.a_tb_tf_code,        slh.a_slh_linkage_ind,        sl

h.a_slh_error_text,        tb1.a_tb_in_sedol,         tb1.a_tb_i

n_id,         NVL(tb1.a_tb_units_bal,0),         NVL(tb1.a_tb_82

_cost_bal,0),        NVL(tb1.a_tb_np_cost_bal,0),        NVL(tb1

.a_tb_unindexed_82_cost_bal,0),        tir1.c_tirt_bond_tax_basi

     17,557,086       15,345        1,144.2    7.0  1886.30   2127.02  885803753

Module: JDBC Thin Client

SELECT /*+ INDEX(cin c_in_x_2) */ CIN.C_IN_SEDOL "@sedol", TO_CH

AR(SYSDATE,'YYYY-MM-DD')||'T'||TO_CHAR(SYSDATE,'HH24:MI:SS')||'-

00:00' "@datetime", CIN.C_IN_ID "fodbId", NVL(CIN.C_IN_CLEAN_PRI

CE,0) "price", NVL(CIN.C_IN_PRICE_DATE,CTL.C_CTL_VALUATION_DATE)

 "priceDate", CIN.C_IN_PRICE_CR_CODE "priceCrCode", CIN.C_IN_MOD

 

The SQL statements in this section may appear there for different reasons; for example:

  • The SQL may be very efficient when executed individually; but still appears in the statspack report because it has a high number of executions.  The aim for these SQL statements is to see if the code can be changed to reduce the number of executions. 
  • The SQL has a low number of executions but has an expensive explain plan cost; for example perhaps the code is doing a full table scan on a table with many rows.  The aim for these SQL statements is to see if the code can be tuned more optimally.
  • A combination of above where the code is inefficient as well as executed many times.  The aim for these types of SQL statements is to reduce the number of executions while also seeing if the code can be tuned more optimally.
  • Divide the elapsed time by the number of executions to see how efficient the code is as a single execution.

    The HASH VALUE can be used to identify the full SQL statement; for example if the code hasn’t been aged out of the Shared Pool; you can run:

    SELECT * from V$SQLTEXT

    WHERE HASH_VALUE = <HASH_VALUE>

    ORDER BY PIECE;

    However, Oracle supplies another utility to view not only the full SQL but also the explain plan for any given HASH_VALUE in the statspack report.  This report can be generated by running the script as the PERFSTAT user:

    SQL> @?/rdbms/admin/sprepsql.sql

    The sprepsql.sql script will ask you for the starting and ending snapshot id of the report as well as the hash_value of the relevant SQL statement and name of the report to be generated.  Below is a sample of such a report using the hash_value 885803753:

    STATSPACK SQL report for Hash Value: 885803753  Module: JDBC Thin Client

    DB Name         DB Id    Instance     Inst Num Release     Cluster Host

    ------------ ----------- ------------ -------- ----------- ------- ------------

    PIPR           767093029 PIPR1               1 9.2.0.5.0   YES     mgunix60a

     Start Id     Start Time         End Id      End Time       Duration(mins) --------- ------------------- --------- ------------------- --------------     13890 19-Nov-07 22:00:05      13907 20-Nov-07 05:00:01          419.93

    SQL Statistics

    ~~~~~~~~~~~~~~

    -> CPU and Elapsed Time are in seconds (s) for Statement Total and in

       milliseconds (ms) for Per Execute

                                                           % Snap

                         Statement Total      Per Execute   Total

                         ---------------  ---------------  ------

            Buffer Gets:      17,557,086          1,144.2    7.02

             Disk Reads:          24,261              1.6     .22

         Rows processed:          15,343              1.0

         CPU Time(s/ms):           1,886            122.9

     Elapsed Time(s/ms):           2,127            138.6

                  Sorts:          15,344              1.0

            Parse Calls:          15,346              1.0

          Invalidations:               0

          Version count:               4

        Sharable Mem(K):             205

             Executions:          15,345

    SQL Text

    ~~~~~~~~

    SELECT /*+ INDEX(cin c_in_x_2) */ CIN.C_IN_SEDOL "@sedol", TO_CH

    AR(SYSDATE,'YYYY-MM-DD')||'T'||TO_CHAR(SYSDATE,'HH24:MI:SS')||'-

    00:00' "@datetime", CIN.C_IN_ID "fodbId", NVL(CIN.C_IN_CLEAN_PRI

    CE,0) "price", NVL(CIN.C_IN_PRICE_DATE,CTL.C_CTL_VALUATION_DATE)

     "priceDate", CIN.C_IN_PRICE_CR_CODE "priceCrCode", CIN.C_IN_MOD

    IFIED_DURATION "modifiedDuration", CIN.C_IN_GROSS_REDEM_YIELD "g

    rossRedemYield", NVL(PK_INST_PROP.F_GET_IPV_VALUE(CIN.C_IN_ID,'I

    C_CRI'), 'UNKNOWN') "sector", DECODE(CIN.C_IN_CHECK_CODE,'Y','tr

    ue','false') "checked", DECODE(CIN.C_IN_LIVE_DEAD_MKR,'L','false

    ','true') "dead", CIN.C_IN_PRICE_BASIS "priceBasis", NVL(CIN.C_I

    N_HOLDINGS_MKR,'N') "holdingsMkr", CIN.C_IN_UNITISED_CF_CODE "un

    itisedClientFundcode" , PK_BDSL_DATA.IN_FODB_FUND(:B1) "in_fodb_

    fund_ind", CIN.C_IN_PRICE_XD_MKR "priceXdMkr" FROM C_IN_INSTRUME

    NT CIN,C_CTL_CONTROL CTL WHERE C_IN_CURRENT_MKR = 'Y' AND CIN.C_

    IN_SEDOL = :B1

    All Optimizer Plan(s) for this Hash Value

    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    Shows all known Optimizer Plans for this Hash value, and the Snap Id's they

    were first found in the shared pool

    -> ordered by Snap Id

        Plan

     Hash Value  Snap Id        Cost Optimizer

    ------------ -------- ---------- --------------------

      2958655084    12579       289  CHOOSE

      2958655084    12756       291  CHOOSE

      2958655084    13122       326  CHOOSE

      2958655084    13516       329  CHOOSE

      2958655084    13867       240  CHOOSE

      2958655084    14212       241  CHOOSE

    Plans in shared pool between Begin and End Snap Ids

    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    Shows the Execution Plans found in the shared pool between the begin and end

    snapshots specified.  The values for Rows, Bytes and Cost shown below are those

    which existed at the time the first-ever snapshot captured this plan - these

    values often change over time, and so may not be indicative of current values

    -> Rows indicates Cardinality, PHV is Plan Hash Value

    -> ordered by Plan Hash Value

    --------------------------------------------------------------------------------

    | Operation                      | PHV/Object Name     |  Rows | Bytes|   Cost |

    --------------------------------------------------------------------------------

    |SELECT STATEMENT                |----- 2958655084 ----|       |      |      5 |

    |MERGE JOIN CARTESIAN            |                     |     1 |   47 |      5 |

    | TABLE ACCESS FULL              |C_CTL_CONTROL        |     1 |    8 |      2 |

    | BUFFER SORT                    |                     |     1 |   39 |      3 |

    |  TABLE ACCESS BY INDEX ROWID   |C_IN_INSTRUMENT      |     1 |   39 |      3 |

    |   INDEX RANGE SCAN             |C_IN_X_2             |     1 |      |      2 |

    --------------------------------------------------------------------------------

    End of Report

     

    Purging Statspack Snapshots from the Database 

    If you are taking statspack snapshots on a regular basis it is a good idea to clean up and remove old snapshots from the database to avoid excessive use of disk space.

     Oracle provides a script to do this which can be run as the PERFSTAT user as follows:

    SQL> @?/rdbms/admin/spurge.sql

    The script will provide a list of snapshot ids; and will ask you to select the starting and ending id for which it will purge all snapshots in between that range.

    Once complete the transaction will still be open (uncommitted) so you can either commit (confirm the action) or rollback (undo the action) at that point.