System Triage IV - Access Path Identification Part I
The intent of this blog is to not provide you with a detailed analysis of all of the access paths available in Oracle. It is intended to show you how to identify access paths using the various tools the Oracle system provides. It is important that I provide you with some background information on access paths before we begin using the various tools that display access path information. Part one of this blog will provide you with that background information. The blog will also provide you with additional resources that will help you have a better understanding of the different access paths that are available to the optimizer.
Optimization
and Access Paths
When an application submits a SQL statement to the database for execution, several
steps are performed before it begins accessing data. These steps are collectively
called "parsing". During the parse phase, the database ensures that
the statement is syntactically correct, that the objects it is accessing exist
in the environment, determines if the statement has executed before, etc., etc.
Generating a new access path is a relatively expensive process. If the statement has been executed before, Oracle will attempt to use the same access path that was created during the earlier parse. If the statement hasn't been executed before, or it is invalidated for some reason, Oracle will activate the optimizer to generate the statement's access path to the data. The access path is the physical steps Oracle will take to retrieve the data and return it to the calling application.
So little is known about the inner-workings of the cost based optimizer that is often called the Oracle database's "little black box." The optimizer's job is to analyze the statement being executed and determine the most efficient access path to the data for that statement. Craig Mullins provides an excellent description of a database optimizer in The DB2 Developer's Guide "the optimizer is equivalent to an expert system. An expert system is a standard set of rules when combined with situational data can return an expert opinion."
The cost based optimizer uses statistics generated by the DBMS_STATS procedure as the situational data when creating its expert opinion on which access path to the data is most optimal. These statistics are stored in the data dictionary and describe the objects space characteristics, data uniqueness and data distribution.
The cost-based optimizer is only as good as the statistics it uses as input. Statistics collections should be run on a regular basis to ensure that the statistics are current (representative of the data being accessed). The optimizer is then able to create a highly accurate access path that is based on the least cost. If statistics are not available, the optimizer uses a simple algorithm to calculate the statistics, which often leads to "less than optimal" access paths. In other words, Oracle guesses and it is sometimes not a very good guesser.
Object Statistics
Object statistics describe the contents of the data and how it is stored in
the physcal database objects. Before we continue, it may be beneficial for me
to provide you with a few examples:
If a table only contains a few hundred rows, queries may run faster if the optimizer chooses to read all of the blocks in the table as opposed to using an index. The I/O generated traversing the index blocks to get to the table row entries would be higher than if Oracle read just the blocks allocated to the table being accessed.
What if we access a table using a column with poor cardinality? Cardinality describes the uniqueness of a particular value stored in a column. Poor cardinality means that the column has many values that are the same. If our statement contains a WHERE clause that searches for a column value that is contained in 90% of the table's rows, it is best that Oracle, once again, read each and every row in that table.
Conversely if a WHERE clause searches for a column value that appears in 1% of the table rows, it would be beneficial for the optimizer to choose an index. The optimizer also uses object statistics to determine which join method should be used and the order in which the joined tables are accessed.
Here are just a few of the object statistcs that the optimizer uses as input during access path generation:
- Number of rows in the tables being accessed
- Number of blocks allocated to the table
- Average table row length
- Number of distinct values for a column
- Number of nulls in a column
- Number of leaf blocks allocated to indexes
- Number of leveles in the B-Tree
- Clustering factor - how the order of the index matches the order of the table rows
Access Path
Prediction vs The Access Path Actually Taken During Execution
It is important to understand that there can be a difference between the access
path that Oracle predicts a statement will take versus the actual access path
taken during execution.
Many years ago, I learned this fact the hard way. I was executing a performance analysis test plan on a database that we were migrating from Version 8 to Version 9. I kept using the Explain command (discussed in the next blog) to display the access path of a statement I was attempting to tune. The access path looked perfect to me. It was using the correct indexes, join methods, join sequence But the transaction continued to perform poorly. Describing the statement as running "poorly" is like describing the Titanic as having a "small leak". It ran forever.
I began to question my ability to tune, Oracle Version 9I and my chosen career path. I just couldn't figure out why this seemingly excellent access path was affecting the statement's performance. I decided it was time to run a SQL Trace (discussed in next blog) to generate more detailed performance information. I ran the TKPROF to format the output. I used the explain option to dump the access paths and reviewed the results. I stumbled across the titles "Row Source Operation" and "Execution Plan". I noticed that the access paths were different. Hmm, how could that be? I began to dig into the Oracle 9i documentation for TKPROF and found that "Row Source Operation" is the access path that was taken during the statement's execution and the "Explain Plan" is the access path that Oracle is predicting the statement will take before it executes.
Here's an excerpt of a recent TKPROF report showing two different access paths for the same statement. If you carefully review the two sets of access paths, it should become clear that they are different.
Bind Variables
As I stated, sometimes those access paths are the same and sometimes they aren't.
The times "they aren't" are the times that will complicate your performance
tuning analysis. Oracle will change the access path for a couple of different
reasons. One reason is Bind Variable Peeking.
Bind variables are used to simplify application program coding and improve the performance of the parsing process. Oracle matches statements being executed to statements that are already parsed and stored in the shared pool. If Oracle finds an exact match, it will use the statement in the shared pool. When the statement is found in the shared pool, it is described as being a soft parse. If Oracle does not find any matching SQL in the shared pool, it will perform a hard parse which requires more steps to be performed than a soft parse. As with mostly everything elese, less steps = faster performance. When Oracle looks for a matching SQL statement in the shared pool, the statements must match EXACTLY. That is, the statement must match in case, line breaks and spacing. If the SQL doesn't match exactly, Oracle executes a hard parse.
Here's an example of a series of SQL statements using hardcoded variables:
select first_name,
last_name, dept_id from scott.emp where emp_id = 103487;
select first_name, last_name, dept_id from scott.emp where emp_id = 989566;
select first_name, last_name, dept_id from scott.emp where emp_id = 436709;
Here's the same SQL statement using bind variables:
select first_name,
last_name, dept_id from scott.emp where emp_id = :empid;
select first_name, last_name, dept_id from scott.emp where emp_id = :empid;
select first_name, last_name, dept_id from scott.emp where emp_id = :empid;
The first set of SQL statements will all look different to the Oracle optimizer. As a result, Oracle will be forced to hard parse each execution of the statement. One of the steps in a hard parse is to generate the statement's access path. Regenerating a statement's access path each time it is executed is an extremely inefficient process and can drive the utilization of "finite CPU resources" to unacceptable levels. If you execute these exact statements again with the same values for emp_id (103847, 989566 and 436709), Oracle will find the matching statements (if they haven't been flushed or invalidated) and not have to perform a hard parse.
The second set of SQL statements highlights the use of bind variables. The application program will move the values 103847, 989566 and 436709 into the EMPID bind variable and send it to Oracle for parsing and execution. Because the bind variable makes the statement looks the same, Oracle will perform a hard parse for the first execution and a soft parse of the statement for subsequent executions. A soft parse will not regenerate the access path. This leads to faster statement execution and a reduction in resource utilization.
Using bind variables is not a panacea. If you want to learn the pros and cons of bind variables, read Jonathan Lewis's book titled Cost-Based Oracle Fundamentals. I have provided a link in the Access Path Education paragraph at the end of this blog.
Bind Variable
Peeking Potential Issues
Oracle 9I introduced bind variable peeking to improve the performance of statements
using bind variables. During the SQL statement's execution, Oracle will look
at the contents of the bind variable and use that information to generate the
access path. One example of where this helps is skewed data. Skewed data means
that one particular value in a column may occur a few times (high cardinality),
while another value may occur many times (poor cardinality). The best access
path for a value that occurs only a few times could differ from the access path
from the same statement that has a value in the bind variable that occurs many
times. For example, DBAs often use histograms on skewed data to provide more accurate situational data to the optimizer. This often causes issues when bind peeking occurs.
A bind variable peeking problem may occur when the access path generated during the hard parse uses "an unlucky set of values" as Jonathan Lewis describes it in Cost-Based Oracle Fundamentals. That "unlucky set of values" in the bind variables could generate an optimal access path for the initial execution of that statement but not be as optimal for future executions.
If we go back to my TKPROF output, the access path changes are a result of Oracle used bind peeking to identify the contents of the bind variable. Oracle then changed the access path the statement took during execution.
Bind Variable
Interpretation in SQL*PLUS
When
you explain in statements in SQL*PLUS, you also need to be aware of how the
tool interprets the bind variables. Bind variable interpretation will also affect
access path predictions. Tom Kyte has a discussion on bind
variable interpretation on his Ask Tom website.
Which Access
Path Information Can You Trust?
Because the access path that Oracle predicts a statement will use often differs
from the one that it actually uses, I will highlight which output you can trust
and which you can't. I'll also provide you with a few helpful hints and tips
on how to influence some of the tools to predict an access path that more closely
matches the access path taken during run time.
Access Path
Education
You can start your education on the different access paths that are available
to the optimizer by reading Oracle's Database Performance Tuning Guide that
is provided in each Oracle release's documentation. Before you buy third-party
books on any topic, I highly suggest that you read Oracle's documentation first.
If you want to learn how the optimizer works, I highly suggest that you read Jonathan Lewis's book titled Cost-Based Oracle Fundamentals. It is one of the most educational and informative books I have ever read on the cost-based optimizer. Here's my review of the book for DBAZine.
What's Next?
In my next blog, we'll begin our review of the various tools we can use to identify
the access path a given SQL statement will take.
I tried several scenerios like below but I can not find a demostration for this.
Best regards.
Following tests were done on Windows Xp with Oracle XE;
sql*plus >
-- creating a demo table
connect hr/hr
drop table tbl_bind_peeking purge;
create table tbl_bind_peeking nologging as
select rownum id, owner, name, line from dba_source
order by owner, name, line;
create index nui_owner on tbl_bind_peeking(owner) nologging ;
-- statistics without histogram
exec dbms_stats.gather_table_stats(USER, 'tbl_bind_peeking', CASCADE=>TRUE);
commit;
-- first with OUTLN
conn hr/hr
set autotrace traceonly statistics
variable x varchar2(16);
alter session set max_dump_file_size=unlimited;
ALTER session SET timed_statistics = true;
alter session set STATISTICS_LEVEL = ALL ;
alter session set tracefile_identifier = TRACE_NOHIST_OUTLN_FIRST ;
Alter session set SQL_Trace = true ;
exec :x := 'OUTLN';
select * from tbl_bind_peeking where owner = :x;
Alter session set SQL_Trace = false ;
-- than with SYS
conn hr/hr
set autotrace traceonly statistics
variable x varchar2(16);
alter session set max_dump_file_size=unlimited;
ALTER session SET timed_statistics = true;
alter session set STATISTICS_LEVEL = ALL ;
alter session set tracefile_identifier = TRACE_NOHIST_SYS_SECOND ;
Alter session set SQL_Trace = true ;
exec :x := 'SYS';
select * from tbl_bind_peeking where owner = :x;
Alter session set SQL_Trace = false ;
-- flush shared pool and try the reverse case
conn / as sysdba
alter system flush shared_pool;
-- this time first with SYS
conn hr/hr
set autotrace traceonly statistics
variable x varchar2(16);
alter session set max_dump_file_size=unlimited;
ALTER session SET timed_statistics = true;
alter session set STATISTICS_LEVEL = ALL ;
alter session set tracefile_identifier = TRACE_NOHIST_SYS_FIRST ;
Alter session set SQL_Trace = true ;
exec :x := 'SYS';
select * from tbl_bind_peeking where owner = :x;
Alter session set SQL_Trace = false ;
-- than with OUTLN
conn hr/hr
set autotrace traceonly statistics
variable x varchar2(16);
alter session set max_dump_file_size=unlimited;
ALTER session SET timed_statistics = true;
alter session set STATISTICS_LEVEL = ALL ;
alter session set tracefile_identifier = TRACE_NOHIST_OUTLN_SECOND ;
Alter session set SQL_Trace = true ;
exec :x := 'OUTLN';
select * from tbl_bind_peeking where owner = :x;
Alter session set SQL_Trace = false ;
-- creating histogram for the skewed column
conn hr/hr
exec dbms_stats.gather_table_stats(user, 'tbl_bind_peeking', method_opt => 'for all indexed columns size 254', cascade => true );
commit;
-- flush shared pool before tests with histograms
conn / as sysdba
alter system flush shared_pool;
-- first with OUTLN
conn hr/hr
set autotrace traceonly statistics
variable x varchar2(16);
alter session set max_dump_file_size=unlimited;
ALTER session SET timed_statistics = true;
alter session set STATISTICS_LEVEL = ALL ;
alter session set tracefile_identifier = TRACE_WITHHIST_OUTLN_FIRST ;
Alter session set SQL_Trace = true ;
exec :x := 'OUTLN';
select * from tbl_bind_peeking where owner = :x;
Alter session set SQL_Trace = false ;
-- than with SYS
conn hr/hr
set autotrace traceonly statistics
variable x varchar2(16);
alter session set max_dump_file_size=unlimited;
ALTER session SET timed_statistics = true;
alter session set STATISTICS_LEVEL = ALL ;
alter session set tracefile_identifier = TRACE_WITHHIST_SYS_SECOND ;
Alter session set SQL_Trace = true ;
exec :x := 'SYS';
select * from tbl_bind_peeking where owner = :x;
Alter session set SQL_Trace = false ;
-- flush shared pool for the reverse test
conn / as sysdba
alter system flush shared_pool;
-- this time first with SYS
conn hr/hr
set autotrace traceonly statistics
variable x varchar2(16);
alter session set max_dump_file_size=unlimited;
ALTER session SET timed_statistics = true;
alter session set STATISTICS_LEVEL = ALL ;
alter session set tracefile_identifier = TRACE_WITHHIST_SYS_FIRST;
Alter session set SQL_Trace = true ;
exec :x := 'SYS';
select * from tbl_bind_peeking where owner = :x;
Alter session set SQL_Trace = false ;
-- than with OUTLN
conn hr/hr
set autotrace traceonly statistics
variable x varchar2(16);
alter session set max_dump_file_size=unlimited;
ALTER session SET timed_statistics = true;
alter session set STATISTICS_LEVEL = ALL ;
alter session set tracefile_identifier = TRACE_WITHHIST_OUTLN_SECOND;
Alter session set SQL_Trace = true ;
exec :x := 'OUTLN';
select * from tbl_bind_peeking where owner = :x;
Alter session set SQL_Trace = false ;
cmd >
tkprof oracos_ora_452_trace_nohist_outln_first.trc oracos_ora_452_trace_nohist_outln_first.txt explain=hr/hr sys=no waits=yes sort=execpu
tkprof oracos_ora_452_trace_nohist_sys_second.trc oracos_ora_452_trace_nohist_sys_second.txt explain=hr/hr sys=no waits=yes sort=execpu
tkprof oracos_ora_2044_trace_nohist_sys_first.trc oracos_ora_2044_trace_nohist_sys_first.txt explain=hr/hr sys=no waits=yes sort=execpu
tkprof oracos_ora_2044_trace_nohist_outln_second.trc oracos_ora_2044_trace_nohist_outln_second.txt explain=hr/hr sys=no waits=yes sort=execpu
tkprof oracos_ora_1648_trace_withhist_outln_first.trc oracos_ora_1648_trace_withhist_outln_first.txt explain=hr/hr sys=no waits=yes sort=execpu
tkprof oracos_ora_3820_trace_withhist_sys_second.trc oracos_ora_3820_trace_withhist_sys_second.txt explain=hr/hr sys=no waits=yes sort=execpu
tkprof oracos_ora_2276_trace_withhist_sys_first.trc oracos_ora_2276_trace_withhist_sys_first.t explain=hr/hr sys=no waits=yes sort=execpu
tkprof oracos_ora_3572_trace_withhist_outln_second.trc oracos_ora_3572_trace_withhist_outln_second.trct explain=hr/hr sys=no waits=yes sort=execpu
Results Summary >
-- OUTLN_FIRST_NOHIST
Rows Row Source Operation
------- ---------------------------------------------------
21 TABLE ACCESS BY INDEX ROWID tbl_bind_peeking (cr=8 pr=2 pw=0 time=26898 us)
21 INDEX RANGE SCAN NUI_OWNER (cr=5 pr=2 pw=0 time=26725 us)(object id 50493)
-- SYS_SECOND_NOHIST
Rows Row Source Operation
------- ---------------------------------------------------
91991 TABLE ACCESS BY INDEX ROWID tbl_bind_peeking (cr=12819 pr=192 pw=0 time=374432 us)
91991 INDEX RANGE SCAN NUI_OWNER (cr=6315 pr=192 pw=0 time=183997 us)(object id 50493)
-- SYS_FIRST_NOHIST
Rows Row Source Operation
------- ---------------------------------------------------
91991 TABLE ACCESS BY INDEX ROWID tbl_bind_peeking (cr=12819 pr=0 pw=0 time=282003 us)
91991 INDEX RANGE SCAN NUI_OWNER (cr=6315 pr=0 pw=0 time=92007 us)(object id 50493)
-- OUTLN_SECOND_NOHIST
Rows Row Source Operation
------- ---------------------------------------------------
21 TABLE ACCESS BY INDEX ROWID tbl_bind_peeking (cr=8 pr=0 pw=0 time=122 us)
21 INDEX RANGE SCAN NUI_OWNER (cr=5 pr=0 pw=0 time=65 us)(object id 50493)
-- OUTLN_FIRST_WITHHIST
Rows Row Source Operation
------- ---------------------------------------------------
21 TABLE ACCESS BY INDEX ROWID tbl_bind_peeking (cr=8 pr=0 pw=0 time=111 us)
21 INDEX RANGE SCAN NUI_OWNER (cr=5 pr=0 pw=0 time=52 us)(object id 50493)
-- SYS_SECOND_WITHHIST << how to get TABLE ACCESS FULL here?
Rows Row Source Operation
------- ---------------------------------------------------
91991 TABLE ACCESS BY INDEX ROWID tbl_bind_peeking (cr=12819 pr=0 pw=0 time=282166 us)
91991 INDEX RANGE SCAN NUI_OWNER (cr=6315 pr=0 pw=0 time=92014 us)(object id 50493)
-- SYS_FIRST_WITHHIST
Rows Row Source Operation
------- ---------------------------------------------------
91991 TABLE ACCESS FULL tbl_bind_peeking (cr=7453 pr=0 pw=0 time=110765 us)
-- OUTLN_SECOND_WITHHIST << or how to get INDEX RANGE SCAN here?
Rows Row Source Operation
------- ---------------------------------------------------
21 TABLE ACCESS FULL tbl_bind_peeking (cr=1343 pr=0 pw=0 time=19359 us)