There is an odd contradiction that we encounter with most databases, especially those used predominantly during the business day. It arises from the way we obtain and use optimizer statistics on those databases. The contradiction runs like this:
We can demonstrate this easily with the following script run in 18c.
One of the talks I gave recently on the Groundbreaker Yatra tour was on the changes that arrive with the optimizer when you move to 12c and above. We talked about changes to histograms, all of the adaptive features, etc., but also about one of the simplest but coolest optimizer improvements when it comes to loading data. When you perform a direct-path insert into an empty table, or a create-table-as-select (CTAS) operation, there is no need to perform an additional DBMS_STATS call to gather statistics at the conclusion of the load. The load process itself handles that for you. Here’s a simple example of that in action:
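A minimal sketch of what such a demonstration might look like (the table and column names are illustrative):

-- Create-table-as-select on 12c and above: statistics are gathered as part
-- of the load itself, with no separate DBMS_STATS call.
create table t1 as
  select rownum as id, rpad('x',100) as padding
  from   dual
  connect by level <= 100000;

-- NUM_ROWS and LAST_ANALYZED are already populated immediately after the CTAS.
select table_name, num_rows, last_analyzed
from   user_tab_statistics
where  table_name = 'T1';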
Way back in Oracle 8.0 we introduced the “Object-Relational” database, which was “the next big thing” in the database community back then. Every vendor was scrambling to show just how well their database technology fitted the object-oriented programming paradigm.
Don’t get me wrong: using the Oracle database object types and the features associated with them has made my programming life a lot easier over the years. But for me, it has always been pretty much limited to that, i.e., programming, not actually using the object types in a database design as such. Nevertheless, using objects as columns, or even creating tables of objects, is supported by the database. For example, I can create an object type called MY_OBJECT (which could itself be made up of objects) and then have a table that is not merely a table with that object as a column, but actually a table of that object.
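A minimal sketch of the idea, assuming an illustrative type definition:

-- An object type, which could itself contain other object types
create type my_object as object (
  id    number,
  name  varchar2(30)
);
/

-- A table *of* the object type, rather than a relational table with an
-- object-typed column
create table my_object_table of my_object;

insert into my_object_table values (my_object(1, 'DEMO'));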
Just a short post today on something that came in as a question for the upcoming Office Hours session, which I thought could be covered quickly in a blog post rather than needing the longer discussion for which Office Hours is better suited.
The question was:
“When I gather statistics using DBMS_STATS, can I just create a statistics table and pass that as a parameter to get the results of the gather?”
And the answer, quite simply, is “No”, but let me clear up the confusion.
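To illustrate the source of the confusion, here is a minimal sketch (the table and stat-table names are illustrative). The STATTAB parameter of GATHER_TABLE_STATS receives a backup of the statistics that existed before the gather; the newly gathered statistics always go straight into the data dictionary.

begin
  -- a user statistics table created via DBMS_STATS
  dbms_stats.create_stat_table(ownname => user, stattab => 'MY_STATS');

  -- STATTAB here receives the *old* statistics, not the results of this gather
  dbms_stats.gather_table_stats(
    ownname => user,
    tabname => 'T1',
    stattab => 'MY_STATS');
end;
/

-- The results of the gather are visible in the dictionary as usual
select num_rows, last_analyzed
from   user_tab_statistics
where  table_name = 'T1';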
I was told about a PeopleSoft customer experiencing an Oracle error when collecting statistics during an Application Engine program.
ORA-06533: Subscript beyond count
ORA-06512: at "SYS.DBMS_STATS"…
It is possibly a manifestation of a database bug. The workaround was not to use AUTO_SAMPLE_SIZE; instead, this customer initially coded an explicit call to DBMS_STATS.GATHER_TABLE_STATS with a specific sample size.
This blog is not about the bug, but how to manage the workaround.
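One way such a workaround might be managed (a hedged sketch, not necessarily the approach taken in the post; the schema and table names are hypothetical) is to record it as a table-level preference rather than hard-coding a sample size into every call:

begin
  dbms_stats.set_table_prefs(
    ownname => 'SYSADM',            -- hypothetical PeopleSoft owner
    tabname => 'PS_EXAMPLE_TMP',    -- hypothetical table
    pname   => 'ESTIMATE_PERCENT',
    pvalue  => '1');
end;
/

-- Subsequent gathers pick up the preference without any code change
exec dbms_stats.gather_table_stats(ownname => 'SYSADM', tabname => 'PS_EXAMPLE_TMP')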
I gave this presentation at the UKOUG PeopleSoft Roadshow 2018.
PeopleSoft presents some special challenges when it comes to collecting and maintaining the object statistics used by the cost-based optimizer.
This is just a prototype of a queue-based concurrent statistics implementation, using the same basic implementation I used a couple of years ago to create indexes concurrently. There are reasons why such an implementation might be useful: in 11.2.0.x the built-in Concurrent Stats feature might turn out to be not really that efficient, because it creates lots of jobs that potentially attempt to gather statistics for different sub-objects of the same table at the same time, which can lead to massive contention at the Library Cache level due to the exclusive Library Cache locks required by DDL / DBMS_STATS calls. In 12.1 the Concurrent Stats feature obviously got a major re-write, using some more intelligent processing of what should be processed concurrently and how. Some of the details are exposed via the new view DBA_OPTSTAT_OPERATION_TASKS, but again I've seen it running lots of very
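For reference, a minimal sketch of the built-in feature the prototype is being compared against (the preference value and the view columns vary by release and are assumptions here):

-- Enable the built-in concurrent statistics gathering
exec dbms_stats.set_global_prefs('CONCURRENT', 'TRUE')

exec dbms_stats.gather_schema_stats('MY_SCHEMA')

-- In 12.1 the individual tasks of a concurrent operation are exposed here
select opid, target, target_type, job_name, status
from   dba_optstat_operation_tasks
order  by opid;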
This is just a short 12c update on my post from some time ago about gathering statistics on tables with many columns. I'm currently investigating the "Incremental Statistics" feature in 12.1.0.2 for a client, which will probably be worth one or more further posts, but since we're still in the process of evaluating and installing various patches it's too early to write about that. As part of the investigation I've noticed a significant change in behaviour in 12.1.0.2 compared to previous versions when it comes to gathering statistics on tables with many columns, hence this post. The key message of the original post was that DBMS_STATS potentially needs several passes when gathering statistics on tables with many columns, which meant a significant increase in overall work and resource consumption, exaggerated by the fact that tables with that many colu
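A minimal sketch of how the behaviour can be observed (the table name, column count and trace level are illustrative assumptions): generate a table with a few hundred columns, then trace the gather call and count the passes DBMS_STATS makes over the table in the trace file.

declare
  l_sql varchar2(32767) := 'create table many_cols (c1 number';
begin
  for i in 2 .. 300 loop
    l_sql := l_sql || ', c' || i || ' number';
  end loop;
  execute immediate l_sql || ')';
end;
/

insert into many_cols (c1) select rownum from dual connect by level <= 10000;
commit;

-- Trace the gather, then count the SELECTs against MANY_COLS in the trace file
alter session set events '10046 trace name context forever, level 4';
exec dbms_stats.gather_table_stats(user, 'MANY_COLS')
alter session set events '10046 trace name context off';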
It appears that, somewhere in the 10.2.0.5 and 11.2.0.2 patchsets, Oracle introduced some additional unwanted functionality to the “GATHER_*_STATS” procedures in the DBMS_STATS package.
I have been working on a customer’s 11.2.0.3.0 database supporting a data-mart application where the data loading programs call DBMS_STATS.GATHER_TABLE_STATS as a concluding part of load processing, which means that the procedure gets called a *lot*.
We noticed that some calls to the same procedure were waiting excessively on the event “enq: TX – allocate ITL entry” and being blocked by sessions calling similar DBMS_STATS procedures, themselves in turn waiting excessively on the event “row cache lock” on data dictionary tables like SYS.CON$ (i.e. underlying the DBA_CONSTRAINTS view) and SYS.WRI$_OPTSTAT_SYNOPSIS_HEAD$.
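A minimal sketch of how such blocking might be seen, assuming ASH is licensed and available (the query is illustrative and not taken from the original investigation):

select event, blocking_session, count(*) as samples
from   v$active_session_history
where  event in ('enq: TX - allocate ITL entry', 'row cache lock')
group  by event, blocking_session
order  by samples desc;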
I have been considering how to collect optimizer statistics for a PeopleSoft system running on an Oracle 11g database. Despite 11g being several years old, most of my current customers are still using 10g, though some are looking at the upgrade to 11g. I believe a slightly different approach is required.
In 2009, I wrote a series of blog postings on the subject of collecting statistics. However, these were all based on Oracle 10g. I proposed a PL/SQL package that would use metadata in a database table to determine how to collect statistics on a table, or to deliberately suppress collection of statistics.
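A minimal, hypothetical sketch of the metadata-driven idea (all names are illustrative, not the actual package): a control table records how, or whether, statistics should be gathered for each table, and a wrapper procedure consults it before calling DBMS_STATS.

create table stats_control (
  table_name       varchar2(128) primary key,
  gather_mode      varchar2(10) check (gather_mode in ('GATHER','SUPPRESS')),
  estimate_percent number
);

create or replace procedure wrapper_gather_stats(p_table in varchar2) as
  l_ctl stats_control%rowtype;
begin
  begin
    select * into l_ctl from stats_control where table_name = upper(p_table);
  exception
    when no_data_found then
      l_ctl.gather_mode := 'GATHER';   -- default when no metadata row exists
  end;

  if l_ctl.gather_mode = 'SUPPRESS' then
    return;                            -- deliberately skip collection
  end if;

  dbms_stats.gather_table_stats(
    ownname          => user,
    tabname          => upper(p_table),
    estimate_percent => nvl(l_ctl.estimate_percent, dbms_stats.auto_sample_size));
end;
/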