DBA Gang: Reading Statspack

In Oracle, Performance Tuning is based on the following formula:

Response Time = Service Time + Wait Time

Where

Service Time is time spent on the CPU
Wait Time is the sum of time spent on Wait Events i.e. non-idle time spent waiting for an event to complete or for a resource to become available.

Service Time is comprised of time spent on the CPU for Parsing, Recursive CPU usage (for PLSQL and recursive SQL) and CPU used for execution of SQL statements (CPU Other).

Service Time = CPU Parse + CPU Recursive + CPU Other

The above components of Service Time can be found from the following statistics:

· Service Time from CPU used by this session

· CPU Parse from parse time cpu

· CPU Recursive from recursive cpu usage

From these, CPU Other can be calculated as follows:

CPU other = CPU used by this session - parse time CPU - recursive CPU usage

Many performance-tuning tools (including Statspack) produce a list of the top wait events. For example, Statspack’s report contains the "Top 5 Wait Events" section.(Pre-Oracle9i Release 2).

It is a common mistake to start dealing with Wait Events first and not taking in consideration the corresponding response time. So always compare the time consumed by the top wait events to the 'CPU used by this session' and identify the biggest consumers.

Here is an example where CPU Other was found to be a significant component of total Response Time even though the report shows direct path read as top wait event:

Top 5 Wait Events

Events	Waits	Wait Time(cs)	% Total Wt Time
direct path read	4232	10827	52.01
db file scattered read	6105	6264	30.09
direct path write	1992	3268	15.70
control file parallel write	893	198	.95
db file parallel write	40	131	.63

Statistic	Total	Per Second	Per Trans
CPU used by this session	358806	130.5	12372.6
parse time cpu	38	0.0	1.3
recursive cpu usage	186636	67.9	6435.7

From these figures we can obtain:

· Wait Time = 10,827 x 100% / 52,01% = 20,817 cs

· Service Time = 358,806 cs

· Response Time = 358,806 + 20,817 = 379,623 cs

· CPU Other = 358,806 - 38 - 186,636 = 172,132 cs

If we now calculate percentages for the top Response Time components:

· CPU Other = 45.34%

· CPU Recursive = 49.16%

· direct path read = 2.85%

· etc. etc.

So we can see the I/O-related Wait Events actually are not a significant component of the overall Response Time. For us it makes sense concentrate our tuning effort on the service time component.

CPU Other is a significant component of Response Time, so a possible next step is to look at the CPU intensive SQL and not at direct path read wait event.

Starting with Oracle9i Release 2, Statspack presents Service Time (obtained from the statistic CPU used by this session) together with the top Wait Events in a section called Top 5 Timed Events, which replaces the section Top 5 Wait Events of previous releases.

Here is an example:

Top 5 Timed Events

Events	Waits	Time(s)	% Total Ela Time
library cache lock	141	424	76.52
db file scattered read	3367	96	17.4
CPU time		32	5.79
db file sequential read	161	1	.18
control file parallel write	40	0	.05

Statistic	Total	Per Second	Per Trans
CPU used by this session	3211	4.3	1605.5
parse time cpu	59	0.1	29.5
recursive cpu usage	232	0.3	116.0

These figures give us directly the percentages of the Wait Events against the total Response Time so no further calculations are necessary to assess the impact of Wait Events. Service Time is presented as CPU time in this section and corresponds to the total CPU utilisation. We can drill down to the various components of Service Time as follows:

· CPU Other = 3,211 - 59 - 232 = 2,920 cs

· CPU Other = 2,920 / 3,211 x 5.79% = 5.26%

· CPU Parse = 59 / 3,211 x 5.79% = 0.11%

· CPU Recursive = 232 / 3,211 x 5.79% = 0.42%

In this example, the main performance problem was an issue related to the Library Cache.
The second most important time consumer was waiting for physical I/O due to multiblock reads (db file scattered read).

Identifying problematic SQL’s from Statspack

From the above calculations you will get the significant components which caused the performance problem. Based on this components lets decide on the various Statspack section to identify the problematic SQL’s.

Other CPU

If this shows CPU other as being significant the next step will be to look at the SQL performing most block accesses in the SQL by Gets section of the Statspack report. A better execution plan for this statement resulting in fewer Gets/Exec will reduce its CPU consumption.

CPU Parse

If CPU Parse time is a significant component of Response Time, it can be because cursors are repeatedly opened and closed every time they are executed instead of being opened once, kept open for multiple executions and only closed when they are no longer required. The SQL ordered by Parse Calls can help find such cursors.

Disk I/O related waits.

Identifying SQL statements responsible for most physical reads from the Statspack section SQL ordered by Reads has similar concepts as for SQL ordered by Gets.
% Total can be used to evaluate the impact of each statement. Reads per Exec together with Executions can be used as a hint of whether the statement has a suboptimal execution plan causing many physical reads or if it is there simply because it is executed often. Possible reasons for high Reads per Exec are use of unselective indexes require large numbers of blocks to be fetched where such blocks are not cached well in the buffer cache, index fragmentation, large Clustering Factor in index etc.

Latch related waits.

Statspack has 2 sections to help find such unsharable statements, SQL ordered by Sharable Memory and SQL ordered by Version Count. This can help with Shared Pool and Library Cache/Shared Pool latch tuning. Statements with many versions (multiple child cursors with the same parent cursor i.e. identical SQL text but different properties such as owning schema of objects, optimizer session settings, types & lengths of bind variables etc.) are unsharable. This means they can consume excessive memory resources in the Shared Pool and cause performance problems related to parsing e.g. Library Cache and Shared Pool latch contention or lookup time e.g. Library Cache latch contention.

DBA Gang

Tuesday, January 13, 2009

Reading Statspack

No comments:

Post a Comment

Followers

Blog Archive

About Me