Configuring 10G R2 Grid Control Statistics Gathering Job Default Options
Maintenance
Windows
There are two critical components in 10G R2 Grid Control that affect statistics
gathering. First, we need to make sure that Oracle's maintenance windows are
configured optimally. In my
last blog, we learned how to set 10G R2 Grid Control's maintenance windows
to ensure that Oracle's routine maintenance tasks don't compete with application
programs for finite system resources. You don't want to be forced to tell an application
manager that his nightly batch job stream is running long and will be competing
with his application's on-line transactions because your statistics gathering
job made them run longer. Trust me, that happens.
Manual and Automatic
Statistics Gathering Options
Creating an optimal statistics gathering environment also requires us to review
the default statistics gathering options. These options will be used for Oracle's
automatic statistics gathering jobs. In addition, these options will be used
if you do not specifically set them in manually scheduled statistics gathering
jobs. We'll learn how to schedule statistics gathering jobs manually in my next
blog.
Navigating to
the Options Page
Let's begin by activating 10G R2 Grid Control and navigating to the Manage Optimizer
Statistics Home page. We accomplish that by selecting the 'Manage Optimizer
Statistics' link displayed on the middle, left hand side of 10G R2 Grid Control's
Database Administration Home page. 10G EM R2 responds by displaying the
Manage
Optimizer Statistics Home page.
On the right hand side of the panel, there will be a column name in blue titled 'Related Links'. Under that panel you'll see a navigation link titled 'Statistics Options'. When we click on the link, 10G R2 Grid Control responds by displaying the Gather Optimizer Statistics Default Options page.
Some General
Advice
Before we begin reviewing the options on this page, let me provide you with
some general advice. A lot of the options provide an Oracle recommended setting
called "Auto". I would suggest that you leave the options set to "Auto"
and only change them when you feel that a manual setting will produce a more
desirable result. Remember that Oracle is building more and more intelligence
into the 10G Grid Control tool set. The product is really becoming an expert
system. But the default settings are intended to create the most desirable statistics
gathering environment for the general database population. It could be that
these generic settings may need to be adjusted to fit your environment's specific
needs.
You monitor the efficiency of the automatic statistics jobs by looking at the statistics displayed on the Manage Optimizer Statistics Home page. The information on this panel tells us:
- If the database's automatic statistics gathering job is enabled
- When the statistics gathering job is scheduled to run next
- The total number of previous runs
- The date of the last run
- If the last run was successful
- Elapsed time
- The number of objects it analyzed
If you look closely at the number displayed in the Objects Analyzed display line (in this case 344), you'll notice that it is also a link. Clicking on the link will tell Oracle to display the Objects Analyzed report, which lists all of the objects that have been analyzed in the last run. The report also provides the reason why each object was analyzed (stale or missing statistics). You can use this report as well as the information provided by the LAST_ANALYZED column in DBA_TABLES and DBA_INDEXES to determine if Oracle's automatic statistics gathering job is analyzing your tables as often as you would like.
If the automatic statistics gathering job is running longer than you would like, or not gathering statistics as often as you would like, it may be beneficial to set some of the options manually.
Option Review
Let's begin our review at the top left hand side of the page. As I stated previously,
I will provide you with general advice for each option. As we know, there are
a myriad of configuration combinations available. If the database has been upgraded
from a previous release, a common practice in the field is to set the options
to match the manual statistics gathering jobs that were run previously. Administrators
feel safe knowing that the automatic statistics gathering job's configuration
closely resembles the old manual statistics gathering jobs that worked for them
in the past.
Retention Period
The first setting we see is for the Retention Period. Retention period determines
how many day's worth of optimizer statistics will be retained in the repository.
In an upcoming blog, I'll show you how to restore a set of saved statistics.
If your SQL statement's performance degrades due to access paths changes, you
can restore the statistics to a day when you know the statements were running
normally to determine if statistics changes were the cause.
Estimate Percentage
Seasoned DBAs know that statistics gathering jobs run on large tables can take
a loooonnng time. It is a very common practice to reduce the amount of time
the jobs take by reducing the amount of data they analyze. It is important to
note, that in most cases, the higher the percentage of the object you analyze,
the more intelligent the optimizer's decisions will be.
You can choose Auto if you would like Oracle to choose for you. Many DBAs choose 20% as the percentage of estimation. If the database consists of many small tables, you can set the percentage higher to ensure that the optimizer knows as much about the data as possible.
Recommendation - Choose Auto and monitor SQL statement performance. If the optimizer begins to make poor access path decisions, set the percentage manually and monitor again.
Granularity
The granularity determines what type of partitioned objects will be analyzed.
The options are Auto, Global, Global and Partition, All, Partition and Sub-Partition.
Do you want to analyze just the partitions and sub-partitions or do you want
to analyze just global objects, like indexes, that access more than one partition?
Recommendation - Choose auto. Oracle will determine which partitioned objects have had enough data changes made to them to warrant their analysis.
Cascade
Cascade determines if just the table will be analyzed or the table and all of
its indexes.
Recommendation - Choose Auto or True. In most cases, Oracle will choose to analyze the indexes along with the table when Auto is chosen. If you want to make sure that all of the table's indexes are analyzed, choose TRUE.
Histograms
This is the only option that I am a little hesitant to make a recommendation
on. Histograms is a complex topic that I can't cover in a few paragraphs. Oracle's
default setting is to build histograms. Histograms improve SQL statement plans
that access skewed data. Before you begin deviating from Oracle's recommendation,
you need to fully understand what histograms are. I recommend that you start
your education by reading the Oracle documentation. If highly recommend that
you also buy Jonathan Lewis's book titled Cost
Based Oracle Fundamentals. I have learned more about histograms from that
one book than I have from all other sources combined.
Recommendation - If you are experienced in Histograms and know you have skewed data, build the histograms. If you don't have skewed data, don't build them. If you don't have enough experience or education, build the Histograms, read the Oracle manuals, buy Jonathan's book and make an educated decision at a later time.
Degree of Parallelism
Determines how many parallel processes that will be generated. Oracle's statistics
gathering jobs can create multiple child processes that analyze the data in
parallel. The child processes access the data and report back to the parent
process when complete. The more parallel processes that are generated, the faster
the statistics job will run and the more system resources it will consume.
The administrator
must balance statistics gathering job performance with the amount of resources
the job consumes. DBAs use the parallel setting in conjunction with the Estimate
Percentage option to maximize statistics gathering job performance.
The options are Auto (Oracle chooses), Default (based on initialization parameters),
Table Default (parallel option set at the object level), Degree (number of parallel
processes set manually).
Recommendation - Choose Auto once again. Many DBAs feel more comfortable setting the value manually (Degree option) because that is what it was set to before. We have manually set our Degree to 12 on most of our data warehouse databases. We have historical performance statistics that tell us that generating that number of parallel processes to gather statistics provides an optimal balance between the statistics gathering job's performance and the percentage of resources it consumes on the host platform.
Cursor Invalidation
A cursor is invalidated when one of the objects it depends on is changed. This
causes the cursor to be reparsed. New statistics may change access paths and,
as a result, the statements accessing those newly analyzed objects will need
to be reparsed.
Recommendation - Choose Auto or Immediate. I don't have enough information yet to determine the quality of Oracle's Auto option. Most of our statistics gathering jobs run during the application's nightly maintenance windows, so the performance degradation that occurs as a result of cursor invalidation does not have a significant impact. We have chosen Immediate for most of these environments.
Target Object
Class
The cost-based optimizer is only as good as the statistics it uses as input.
Statistics collections should be run on a regular basis to ensure that the statistics
are current (representative of the data being accessed). In Oracle9i, the GATHER
AUTO option of the DBMS_STATS procedure could be used to help determine if statistics
generation was required. If more than 10% of the rows changed in the table since
the last analyze was performed, the DBMS_STATS procedure (with the GATHER_AUTO
option activated) analyzed the table.
If you choose Auto, 10G R2 Grid Control uses the same methodology. It will identify the percentage of data that is changed for each object under analysis and determine if new statistics are warranted. Choosing All will gather statistics for all objects in the database and choosing Oracle will gather statistics for system component schemas only.
Recommendation - Choose Auto or All in most cases. If the application has numerous tables (our Peoplesoft application has tens of thousands of database objects), use the auto option to ensure that you only gather statistics on objects that have changed. Other applications that have a smaller number of objects or have the majority of objects changed by application programs and are performance sensitive, choose All.
Summary
I hope you enjoyed this blog. Please remember that my personal recommendations
are just that - my personal opinion.
Thanks for reading.