Access Path Scientific Analysis Part III

Now that we have an understanding of how we can influence access paths using hints and session parameter changes, let’s continue our discussion by reviewing the various types of indexes as well as indexing strategies that affect Oracle access path selection. We’ll complete this series next week when we use all of the information we have learned to perform our own scientific analysis on Oracle optimization.

Introduction to Oracle Indexes
Generally, the fastest way to access Oracle data is with an index. The Oracle database contains several different indexing types that are designed to provide complementary performance functionality.

While standard B-tree indexes are most effective for columns containing a high number of different values (high selectivity), bitmapped indexes are most appropriate for columns with a relatively limited number (low selectivity) of different values. The low selectivity statement above comes with several caveats. Instead of spending this entire blog on the bitmap index/low selectivity issue, please turn to Jonathan Lewis's DBAZine article titled "Bitmap Indexes Part 1". Oracle also provides function-based indexes to allow index access using SQL that contains column manipulations in the WHERE clause.

Administrators supporting large data stores use partitioned indexes to decompose large index structures into smaller, more manageable pieces called index partitions. Starting with the 8i release, Oracle places index data in the separate index partitions based on the index's partitioning key.

Before we begin our reviewing some of the more popular Oracle index types, let me provide you with a few thoughts on indexes in general.

How Many Indexes Can I Build?
This subject has always been a matter for great debate. The DBA must balance the performance of SELECT statements with their DML (INSERT, UPDATE and DELETE) counterparts. SELECT statements that return a limited number of rows from a table yet access non-indexed columns will suffer from poor performance. Conversely, if you have too many indexes on a particular table, DML statements may be adversely affected.

The DBA must take the business requirements, application processing workload and workload scheduling into consideration when determining how many indexes to build. If you compare the performance improvements an index makes on a SELECT statement to the negative affect it has on DML statements, you will find that the benefits of building the index usually far outweigh the performance drawbacks.

Indexes on columns in the WHERE clause of SELECT statements can reduce query times by minutes and even hours. The creation of additional indexes may add additional time to on-line transactions that execute DML statements. Additional indexes will have the greatest negative impact on DML statements that access a large number of rows. The more rows that are inserted, deleted or changed, the greater the negative impact will be. Traditionally, programs that process large volumes of rows are scheduled to execute during off-hours.

The DBA must also consider the needs of the business. What process is more important to the business unit - getting the data in or getting the data out? Who complains the most? Is it the business user that must wait minutes (or hours) for their transaction or report to retrieve data or the business user that is waiting an extra few seconds for their update transaction to complete? Is the nightly batch window tight on time?

The DBA will need to find out how much time the additional indexes add to programs that process large volumes of rows. In addition, the DBA must determine when these programs run. If they run at night or do not require high-performance, consider building the index.

If the transaction update performance requirements are excessive (dot com applications are one example), keep the number of indexes to a minimum. A good recommendation is to build a set of tables that have no indexes for lighting-fast update performance and move the data to historical tables (with proper indexing) during off-hours to improve retrieval performance.

Index Monitoring
Determining if an index will increase performance is a pretty straightforward process. The administrator is focusing their tuning efforts on a particular query and is able to gather the specific information necessary to assist in the decision making process.

Dropping unused indexes is also an important part of application tuning. We learned previously that indexes force Oracle to occur additional I/O every time a row is inserted or deleted into the table they are built upon. Every update of the table's columns incurs additional I/O to all indexes defined on those columns. Unused indexes also waste space and add unnecessary administrative complexity. Since unused indexes (excluding those used to enforce integrity constraints) do not add any benefits to the environment, why keep them?

Determining if indexes were being used in releases prior to Oracle9i was a time consuming and error-prone process. EXPLAIN plan and trace output could be used but there was no single mechanism that monitored index usage at the database level.

Starting with release 9i, the Oracle database simplifies the index usage monitoring process by providing the ALTER INDEX MONITOR USAGE command. The statement below turns monitoring on for the index SCOTT.EMPIDX while the second statement ends the monitoring session:

ALTER INDEX scott.empidx MONITORING USAGE;
ALTER INDEX scott.empidx NOMONITORING USAGE;

The V$OBJECT_USAGE table can then be accessed to determine if the index was used during the monitoring session. When the session is started, Oracle clears the information in V$OBJECT_USAGE for the index being monitored and enters a new start time identifying when the index monitoring session started. After the index monitoring session is concluded, the USED column in the V$OBJECT_USAGE table will contain the value 'YES' if the index was used during the monitoring session and the value 'NO' if it was not.

Parameters that Impact Index Usage
The parameters listed below influence the Oracle cost-based optimizer to favor or not favor index access. Please note that this is not an all inclusive list. It is a listing of parameters that I feel have the greatest chance to influence the optimizer to choose, or not choose, index access paths. We'll use these parameters during our scientific analysis.

OPTIMIZER_MODE = first_rows or first_rows_nnnn - The optimizer chooses the best plan for fast delivery of the first few rows or the first nnnn rows. The first_rows_nnn replaces the first_rows parameter in later Oracle releases. The first_rows is available for backward compatibility. More often than not, that access path will include an index. This optimizer mode tends to favor nested loop joins over hash and merge scan.

It is important to note that using this mode is not a switch. It won't change each and every table scan and hash join to index access and the nested loop join method. The optimizer, at times, will favor index access and nested loop joins. The reverse goes for the all_rows optimization mode below.
OPTIMIZER_MODE = all_rows - The optimizer chooses the best plan for fast delivery of all of the rows that queries return. The optimizer may decide to choose a full table scan over index access and hash joins instead of nested loop.
OPTIMIZER_INDEX_COST_ADJ = xxxx - This parameter lets you tune the optimizer to be more or less index "friendly." It allows the administrators to influence the optimizer to make it more or less prone to selecting an index access path over a full table scan.

The default for this parameter is 100. This setting tells the optimizer to evaluate index access paths at the regular cost. Any other value makes the optimizer evaluate the access path at that percentage of the regular cost. For example, a setting of 50 makes the index access path look half as expensive as normal.
OPTIMIZER_INDEX_CACHING = xxxx - You set this parameter to a value between 0 and 100 to indicate the percentage of the index blocks the optimizer should assume are in the cache. Setting this parameter to a higher value makes the indexes on the inner table of a nested loop joins look less expensive to the optimizer. The end result is that the optimizer may for favor a nested loop join using an index.

Index Affects on Access Paths
If a table only contains a few hundred rows, queries may run faster if the optimizer chooses to read all of the blocks in the table as opposed to using an index. The I/O generated traversing the index blocks to get to the table row entries would be higher than if Oracle read just the blocks allocated to the table being accessed.

What if we access a larger table using a column with poor selectivity? Selectivity describes the number of different values stored in a column. Poor selectivity means that the column has many values that are the same. If our statement contains a WHERE clause that searches for a column value that is contained in 90% of the table's rows, it is best that Oracle, once again, read each and every row in that table. Conversely if a WHERE clause searches for a column value that appears in 1% of the table rows, it would be beneficial for the optimizer to choose an index.

Here are some examples of index access paths that we will see during our testing:

Index only - Oracle is able to read all of the data required to satisfy the query's data needs from the index structure alone.
Index to table - Oracle searches through the index structure looking for the key value(s), then uses row identifiers to probe the table to satisfy the data request.
Index unique scans - The SQL statement accesses an index using a column (or columns) that are defined in a unique or primary key index with an equality condition.
Index range scans - Oracle scans a set of entries in the index to satisfy a query.
Index skip scans - Oracle is able to break down a multi-column index and view them as smaller subindexes. This is achieved by Oracle "skipping" the leading columns in the index and using columns that appear later in the index's definition.
Full scans - Oracle scans all of the index entries. Kind of like a tablescan on an index. Oracle drops down to the leaf blocks and traverses the leaf blocks using the leaf pointers.
Fast full scan - Oracle uses multi-block reads to read both leaf and non-leaf blocks. Non-leaf (branch blocks) are discarded.

But indexes influence more than just index-related access paths. Indexes can also impact the type of join operations used. Here's an "over the top" example to clarify. If you are joining two tables together in a SQL statement on join columns that have good selectivity (returns relatively few rows compared to the table size), Oracle will favor index access paths and a nested loop join. If you don't have indexes on the join columns, Oracle may choose table scans using a hash join instead of the nested loop join method.

Index Types
Let's continue our discussion by reviewing some of the more popular types of indexes: B-Tree, Bitmap and Function.

B-Tree Indexes
A traditional B-Tree index stores the key values and pointers in an inverted tree structure. The key to good B-Tree index performance is to build the index on columns having a lot of different values. Oracle describes this as "good selectivity" Oracle is able to quickly bypass rows that do not meet the search criteria when searching through indexes built on columns having a high degree of selectivity.

Bitmap Indexes
As I stated previously in this blog, there is a sense of confusion about bitmap indexes and the benefits they provide to columns with low selectivity. Instead of spending this entire blog on the bitmap index/low selectivity issue, please turn to Jonathan Lewis's DBAZine article titled "Bitmap Indexes Part 1".

The optimizer can be stubborn at times. It can be particularly stubborn when you want it to choose a single bitmapped index for an access path. A single bitmap index may not be chosen at all. The optimizer will be more inclined to choose bitmapped indexes as an access path if it can use multiple bitmapped indexes simultaneously. That's where the benefits of bitmaps are realized.

Bitmap Indexes and Concurrency
Anyone accustomed to database programming understands the potential for concurrency problems. When one application program tries to update data that is in the process of being changed by another, the DBMS must sometimes forbid access until the modification is complete in order to ensure data integrity.

Each entry in a B-Tree index entry contains a single ROWID. When the index entry is locked during an update, a single row is affected. A bitmap lock affects a range of entries which could have a negative impact on other transactions attempting to update rows already locked.

Locking issues affect data manipulation operations in Oracle. As a result, bitmapped indexes are not appropriate for OLTP applications that have a high level of concurrent insert, update and delete operations. Concurrency is usually not an issue in a data warehousing environment where the data is maintained by bulk loads, inserts and updates.

Function Based Indexes
Oracle 8i solved an indexing problem that had been affecting database performance for close to a decade. Before Oracle8i, any SQL statement that contained a function or expression on the columns being searched on in the WHERE clause could not use an index.

For example, the statement:

SELECT * FROM employee_table
WHERE Upper(first_name) = 'CHRIS';

would not use an index. A full table scan would be required to retrieve the desired result set. We now know that we are able to use B-tree and bitmap indexes to speed query performance.

In Oracle8i and later releases, we are able to build both bitmap and B-tree indexes on columns containing the aforementioned functions or expressions. The following index could be used to increase performance of the query:

CREATE INDEX upper_first_name on employee_table (upper(first_name));

Function based indexes will affect the performance of DML statements that manipulate the columns contained in the function based index. The more complex of the expression used, the more time the database will require to update the index.

Wrapup
The intent of this blog was not to provide you with an all-inclusive education on Oracle indexes. There are folks that spend dozens of hours learning about access paths and the affects that indexes, hints, statistics and parameters have on Oracle optimization. We are laying the groundwork to begin our own scientific analysis of Oracle access paths. As I stated previously, we need to begin experimenting on our own to fully understand the Oracle optimization process. In my next blog, we'll use the information we learned in this series to start our experimentation.

Monday, February 26, 2007 | Permalink | Comments (0)

trackback URL: http://www.dbazine.com/blogs/blog-cf/chrisfoot/blogentry.2007-02-24.7493837565/sbtrackback

DBAzine.com

Sections

Personal tools

Menu

Who Are You?

Access Path Scientific Analysis Part III