Search

OakieTags

Who's online

There are currently 0 users and 32 guests online.

Recent comments

Affiliations

11g

How to Get Started with Amazon EC2 (Oracle 11g XE example)

I’ve just published Oracle Database 11g Express Edition Amazon EC2 image (AMI) but most of you have never used Amazon EC2… Not until now! This is a guide to walk you thorough the process of getting your very first EC2 instance up and running. Buckle up — it’s going to be awesome!

  1. Go to Amazon Web Services and open an account. You could use one that you buy your books with.
  2. Go to AWS Management Console for EC2 and sign up for Amazon EC2. You will need your credit card for this. You will not be charged anything unless you are either start using EC2 instances or allocate EBS storage and other related items. The sign-up page shows you all the pricing. You will especially like “Free tier for new AWS customers” section that gives you 750 hours of Micro instance uptime, 10 GB of EBS storage some bandwidth and few small goodies. This mean that you will not be charged anything in the beginning of your experiments. They will also do phone verification — I can’t remember I’ve seen it last time so it must be reasonable new. Works for cell phones too. Activation usually takes just few minutes and you’ll get an email confirmation and you get access to EC2, VPC, S3 and SNS. Direct link to AWS Management Console for EC2
  3. Now you can launch your first instance. So let’s start Oracle 11g XE beta image that I published just recently. Click “Launch Instance” then select “Community AMIs” tab. It will start loading AMIs list and it will take ages so don’t wait for it to finish and search for “pythian” – you will get pythian-oel-5.6-64bit-Oracle11gXE-beta image with AMI ID ami-e231cc8b the latest at the time of this writing.

    Select that image.
  4. On the next tab choose instance size. It’s enough to use Micro instance to start playing with Oracle 11g XE but be prepared that Micro instance doesn’t guarantee any CPU capacity so it might be “bursty” but, hey — it’s free or costs peanuts if you run out of free time. You could also choose an availability zone closer to you.
  5. On the next screen leave everything by default. You could select what to do when you shutdown the instance from inside the instance. Stop will keep your instance and EBS storage allocated and you can start it and all your changes will persist. However, you will be charged for allocated EBS storage (if you go beyond free 10GB) but it’s very little. “Terminate” will actually release EBS storage if you shutdown your instance. Note that you can always stop and terminate instances from AWS Management Console. I usually leave option on “Stop” to avoid accidental data loss. You can skip defining any tags — this is optional metadata so you can orient better in your instances. I recommend you at least specify a descriptive name to make sure you clearly distinguish multiple running instances later.
  6. If you didn’t have a Key Pair created in the past, you will do that at the next step. This is basically public / private key pair and you get to download private part — save it and keep it safe and don’t share this .pem file with anybody. Someone with access to it can gain root access to your images! You can always create more than one Key Pair by the way.
  7. Next, you will need to either select an existing Security Group or create a new one. Default security group doesn’t fit us because you want to open other ports to access you 11g XE database. You can keep default group and only access by SSH if local access from SQL*Plus command prompt is all you need. It’s also the safest way but for your playground, you might want more flexibility. For 11gXE instance you will probably want SSH access (port 22), SQL*Net access (port 1521) and APEX access (port 8080). I also like to open ICMP for ping. Be sure you understand what you are doing if you will be placing any sensitive data there. I also open it to the world (source 0.0.0.0/0) so anybody who knows the passwords or have correct shard keys setup, can get on your instance. You can limit it to your current IP only (and you can change the policy online if you IP changes later — use AWS Management Console). There are bunch of site that would tell you your public IP (providing you don’t use a proxy coming from another IP) like this one. To limit access from that IP only enter it in the source as xxx.xxx.xxx.xxx/32. Of course, you can enter subnets too if you know what I’m talking about.
  8. That’s it — all that’s left is click the “Launch” button.
  9. You will then see your image as “pending” in the console and usually just seconds later it switches into “running” state. Note that it will take a minute or so to boot and launch sshd daemon so you can connect via SSH. You can also check console log by choosing “Get System Log” from the context menu (it does take few minutes usually so it will come back empty until then). The easiest way to connect is to choose “Connect” from the context menu — it will present you instructions to connect as root using the .pam key file you downloaded when creating your Key Pair earlier on. Note that if you are on Unix, you will need to set proper permission for your key to ensure safeguarding — chmod 600 AlexG.pem.

    You can also get the public IP alias from instance details as “Public DNS” – just select and instance and scroll details in the bottom pane. For that particular image, I also enable public key authentication so you can simply add your public key to oracle’s ~/.ssh/authorized_keys file — it’s already there with correct permissions. This way I don’t have to go via root every time.
    If you are a Windows user using Putty, you can convert your .pem file into Putty Private Key (.ppk) file following Marcin’s comment.
  10. Database and listener will auto-start. You can open 11g XE web interface. In my example it’s http://ec2-50-17-156-24.compute-1.amazonaws.com:8080/apex/apex_admin for Administration and http://ec2-50-17-156-24.compute-1.amazonaws.com:8080/apex for APEX web user interface. Note that it’s not SSL connection so you don’t want to use it for any sensitive data unless you reconfigure to https. This is also the time you want to change passwords from default ones.
  11. You can access your database over SQL*Net via sqlplus, SQL Developer or any other tool.
  12. You will see the instance and EBS volume attached in your AWS Management Console. If you stop the instance, you will see that the EBS volume is still attached so you data is still there when you start it. If you terminate the instance, all you changes and data will be gone since the EBS volume will be detached and deleted. You can, however, launch another instance as many time as you want from the same AMI. Just make sure you change the passwords after the launch!

That’s all — you can now start playing with Oracle 11g XE without paying a penny (or very little), without consuming any resources on your own laptop/desktop and have as many of them running as you want. And you can always start from scratch if you screw something up.

Oracle Database 11g XE Beta — Amazon EC2 Image

That’s right folks! Playing with latest beta of free Oracle Database 11g Express Edition couldn’t be any easier than that. If you are using Amazon EC2, you can have a fully working image with 64 bit Oracle Linux and Oracle 11g XE database running in a matter of few clicks and a minute to get the instance to boot.

Image — ami-ae37c8c7
Name — pythian-oel-5.6-64bit-Oracle11gXE-beta-v4
Source — 040959880140/pythian-oel-5.6-64bit-Oracle11gXE-beta-v4

You can find it in public images and at this point it’s only in US East region.

If you never used Amazon EC2 before, see detailed step-by-step guide on how to get started with EC2 on the example of this 11g XE image.

This image works great with Amazon EC2 Micro instance and I configured it specifically for Micro instance. Micro instance costs you only 2 cents per hour to run or even less than 1 cent if you are using spot instance requests (and there is free offer for new AWS users as Niall mentioned in the comments).

So what’s there?

  • Oracle Enterprise Linux 5.6 64 bit (I started with 5.5 and updated to the latest)
  • Oracle Database 11g XE Beta (oracle-xe-11.2.0-0.5.x86_64)
  • Database created and configured to start on boot
  • APEX coming with 11g XE configured on port 8080 and remote access enabled
  • 10GB root volume on EBS with 5+GB free for user data. You could store up to 11GB of data in 11g XE and there is a way to grow volumes if you need but for more critical use then playground, I’d allocate separate EBS volumes anyway.


Few things worth to mention:

  • I enabled public key authentication (“PubkeyAuthentication yes” in /etc/ssh/sshd_config) so you can setup shared key to login directly as oracle OS user – just copy your public key to /home/oracle/.ssh/authorized_keys.
  • SYS and SYSTEM password is “pythian”. Change it!
  • ADMIN password in APEX is “pythian” — change it on the first login.
  • Micro instance has 613 MB of RAM and no swap — no instance (ephemeral) storage.
  • Oracle database and listener autostarts on boot. You can use /etc/init.d/oracle-xe stop/start as root too.
  • listener.ora has been modified to include (HOST=) so that it starts on any hosname/IP.
  • APEX remote access is enabled! DBMS_XDB.SETLISTENERLOCALACCESS(FALSE)
  • Ports 1521 and 8080 are open to the world on local iptables firewall. You still need to configure proper Security Group to be able to access those ports.
  • Access APEX on http://{public-ec2-ip}:8080/apex and admin on http://{public-ec2-ip}:8080/apex/apex_admin. There is currently an issue that APEX stops working after few minutes of run-time returning 404 code. Might be a bug in beta or installation issue (for example, I run it with no swap on Micro instance).

I will be keeping the AMI up to date as things develop so AMI id could change — check back here of just search public AMIs for the latest image. I setup short URL for this page — http://bit.ly/Oracle11gXE.

If you don’t know how to use Amazon EC2 – I recommend to read the second chapter of Expert Oracle Practices: Oracle Database Administration from the Oak Table. This chapter was written by Jeremiah Wilton who’s been long time playing with Amazon EC2 for Oracle before any of us even thought of it.

When few folks confirm that it works, I’ll submit an image vi http://aws.amazon.com/amis/submit.


Update 4-Apr-2011: Create v3 image – fixed typo in database passwords, fixed retrieval of public key for ssh login as root, changed startup sequence so that ssh keys are initialized earlier as well public key retrieval.
Update 4-May-2011: Created v4 image – Increased SGA size to 212M. Set large_pool to 32M (Automatic SGA management doesn’t do it’s job properly – this is why APEX was not working – not enough large pool memory allocated). Enabled DIRECT IO and ASYNC IO for filesystem – buffered IO slowed down things a lot. Now APEX is actually pretty usable on Micro instance. Remember that you can run it on large instance to run in comfort but you are overpaying since there is 2 CPUs in large instance and 7.5GB of RAM while you can’t use more than 1GB. Of course, you could disable Direct IO and use OS buffering to take advantage of more RAM but can’t leverage both cores with APEX (it limits capacity to a single core).
Update 23-Jul-2011: If you need to use networking services from APEX (like web-service, sending emails and etc) then you need to configure network ACLs for APEX_040000 user.

Oracle11g: Analyze Table Validate Structure Cascade “FAST” (Slow Burn)

I always take notice when Oracle introduces a new “FAST” option, so it was with some excitement when I first noticed in Oracle 11g Rel 1 there was a new FAST option when running the ANALYZE TABLE CASCADE VALIDATE STRUCTURE command.   This was described in the manuals as introducing a hashing scheme that was significantly [...]

Things worth to mention and remember (II) - Parallel Execution Control 2

Continuing from the previous installment of this series I'll cover in this post some of the inevitable classics regarding Parallel Execution Control. So forgive me if you're bored by the repetition of known facts - however I still see these things too often used incorrectly, therefore I decided: This is worth to mention and remember!

- Up to and including version 10.2 PARALLEL (without any parameters) and NOLOGGING are valid keywords only in DDL commands

- Applies to all versions: NOLOGGING can not be used as a hint. It can only be specified as part of DDL, for example ALTER INDEX ... REBUILD PARALLEL NOLOGGING.

Oracle11g: Zero Sized Unusable Indexes Part II (Nathan Adler)

In my previous post, I discussed how Oracle from 11g R2 onwards will automatically drop the segment and associated storage from unusable index objects. Mohamend Houri asked in the comments section the excellent question of just how useful this feature will be in real life cases when typically indexes are not left in an unusuable state for a [...]

Parallel DML - Conventional (non-direct-path) Inserts As Select

In a recent discussion I've mentioned that I thought to remember that the DML part of conventional load as select inserts will always be executed serially, even with parallel DML enabled and requesting parallel DML execution. It's important to understand in this context that this is not the same as the parallel query execution of the SELECT part, which is possible independently from the parallel DML part.

After that discussion I realized that it was quite some time ago that I tested this scenario, probably it was back then with some 10.2 version.

So I quickly put together a small test case that I ran on 11g versions and the results were quite surprising which motivated me to take a closer look.

Oracle11g: Zero Sized Unusable Indexes (Zeroes)

Following on from my previous discussion on “Create On Demand” segments, Oracle 11g R2 has also introduced storage saving initiatives in relation to useable indexes.  Starting with a simple Oracle 10g example, we create a table and associated index:        If we now make the index unusable:        We notice that [...]

Concurrent Index Creation

When I read the recent post by the optimizer group about the new concurrent gather stats feature added in 11.2.0.2 it reminded me of the fact that I intended to publish something based on the same idea already some time ago.

The Problem

It was motivated by a client's regular need during a transition phase from non-Exadata to Exadata to create literally thousands of indexes with potentially a multitude of (sub-)partitions as fast as possible - as part of a full datapump import job of a multi-terabyte database running 11.1.0.7 and 11.2.0.1 (Exadata V2).

There are actually two issues regarding the index creation part of a large database import:

1. The datapump import performs the index creation only by a single worker thread even when using the PARALLEL worker thread import feature. Although an index could be created in parallel if you have thousands of smaller index objects this single worker thread potentially does not make efficient use of the available hardware resources with high-end configurations, including and in particular Exadata.

2. There is a nasty bug 8604502 that has been introduced with 11.1.0.7 that affects also 11.2.0.1 (fixed in 11.2.0.2 and a generic one-off patch is available on My Oracle Support for 11.1.0.7 and 11.2.0.1): The IMPDP creates all indexes serially, even those supposed to be created in parallel, and only after the creation ALTERs them to the defined PARALLEL degree. Note that the fix actually only fixes the problem at actual execution time, even with the fix installed (and in 11.2.0.2) the SQLFILE option of IMPDP still generates CREATE INDEX DDLs that will always have the parallel degree set to PARALLEL 1 (see MOS document 1289032.1 and bug 10408313 - INDEXES ARE CREATED WITH PARALLEL DEGREE 1 DURING IMPORT which has been closed as not being a bug). This "not-being-a-bug" also affects all other versions that support the datapump utility - the SQLFILE option always generates CREATE INDEX scripts with the parallel degree set to 1 no matter what the actual degree of the index is supposed to be. It's only the ALTER INDEX DDL command following the CREATE INDEX command that sets the parallel degree correctly.

These two issues in combination meant to them that a full database import job took ages to complete the index creation step after loading quite quickly the vast amount of table data in parallel.

In case of partitioned indexes there is another complication independently from the mentioned issues: Oracle uses only one parallel slave per partition for creation - in case of large and/or few partitions this again doesn't make efficient use of the available resources.

Oracle therefore provides several means to speed up index creation and rebuild tasks, in particular the documented DBMS_PCLXUTIL package that is around since the Oracle 8 days to overcome the above mentioned limitation of partitioned index creation by spawning multiple jobs each rebuilding an index partition in parallel.

Another, undocumented feature is the DBMS_INDEX_UTL package that is obviously used internally as part of several maintenance operations, for example those DDLs that include the "UPDATE INDEXES" clause. According to the spec it allows to rebuild multiple indexes concurrently by spawning multiple jobs - however since it is undocumented it might not be safe to use in production-like configurations - furthermore it might be changed in future releases without further notice and therefore is potentially unreliable.

A Solution

Since the client wanted a quick solution that ideally addressed all of the above issues I came up with a simple implementation that uses Advanced Queueing and background jobs to create as many indexes as desired concurrently.

The solution is targeted towards the client's scenario, so the following is assumed:

- There is a SQL file that contains the CREATE INDEX statements. This can easily be generated via IMPDP based on the dump files using the SQLFILE option.

- To address the CREATE INDEX (not-being-a-)bug (the bugfix for the bug 8604502 still generates incorrect CREATE INDEX DDLs with the SQLFILE option of IMPDP as mentioned above) I've created a combination of "sed" and "awk" unix scripts that take the IMPDP SQLFILE potentially including all DDLs commands as input and create a output file that consists solely of the CREATE INDEX commands with correct PARALLEL clauses based on the ALTER INDEX command following the CREATE INDEX in the script

- To address the lengthy index creation process I've created a small PL/SQL package that sets up the required AQ infrastructure, takes the CREATE INDEX DDL file as input, populates a queue with the index creation commands and spawns as many worker threads as specified that will take care of the actual index creation (that in turn might be a parallel index creation)

As a side note it is interesting that Oracle actually allows to build several indexes concurrently on the same segment (which makes totally sense but does probably not happen too often in practice).

Note that in principle this code could be used as a general template to execute arbitrary DDLs concurrently (of course with corresponding modifications).

The following link allows to download an archive that contains the following subdirectories:

- correct_parallel_clause: This directory contains the Unix scripts mentioned above that allow to process a SQLFILE generated by IMPDP and output a DDL file that solely consists of the CREATE INDEX commands contained in the SQLFILE. The generated CREATE INDEX statements also use a correct PARALLEL clause - the degree is taken from the ALTER INDEX DDL command following the CREATE INDEX in the SQLFILE. For further details refer to the README.txt in that directory. Note that the script at present does not handle Domain Indexes, only conventional and bitmap.

- source: Contains the package source for the concurrent index creation, furthermore a package that is required by the provided automated unit testing (see below for more details) and a script that prompts for the required details to initiate a concurrent index creation. The README.txt in that directory provides a quick start guide how to use the concurrent index creation.

- test: Contains two flavours of test harnesses for automated unit testing of the package. One based on the unit testing feature implemented in SQLDeveloper 2.1.1, and another one based on "dbunit", an open-source unit testing framework based on jUnit. The README.txt in the respective subdirectories explain how to use these unit tests.

How to use it

The usage is split into two parts: The first part deals with preparing a suitable text file that consists of the CREATE INDEX commands, the second part is about processing this text file with as many worker threads as desired.

Preparing the file is straightforward: You can use the "transform_all_sql.sh" script to generate the required CREATE INDEX script from a DDL script created via IMPDP SQLFILE.

The script has been tested primarily with bash, sed and awk under Cygwin 1.7.1 and OEL5, different Unix flavors might have different versions of the shell, awk or sed and therefore might behave differently.

Simply put all four Unix scripts in the "correct_parallel_clause" directory into the same directory, mark them as executable and run the "transform_all_sql.sh" like that:

./transform_all_sql.sh < input_file > output_file

where "input_file" is the file generated via IMPDP SQLFILE option and "output_file" will be the result.

In order to perform the parallel index creation, you need an account that has suitable privileges granted. Since it is assumed that the indexes will have to be created in different schemas this account will have to have extended privileges granted. The package is implemented using invoker's rights so granting these privileges via roles is sufficient. A quick and dirty solution could be creating a temporary account and granting simply the DBA role to it (this is what I used to do to test it). Note that the account also requires EXECUTE privileges on the DBMS_AQ and DBMS_AQADM packages for the AQ stuff. It also needs a simple logging table where errors and progress will be written to as well as a type that is used as payload of the queue. Obviously the account also needs to be able to create jobs - in this version of the package this is done via DBMS_SCHEDULER. At execution time the package is going to create a queue plus queue table that also needs to be stored in a tablespace - so you should make sure that the account (or at least the database) that executes the index creation has an appropriate default tablespace defined.

You can simply run the "pk_create_index_concurrent.sql" script (located in the "source" directory) in such a suitable account which will deinstall/install all required objects.

The execution of the index creation is then straightforward (taken from the package specification):

/**
* The main entry point to create indexes via parallel threads / AQ
* @param p_directory_name The directory where the file resides that contains the CREATE INDEX DDLs
* @param p_file_name The file name in the directory above
* @param p_parallel_degree_set_1 The number threads to start for the worker thread 1 which usually
represents the SERIAL_INDEX threads - G_AUTO_PARALLEL_DEGREE means use the CPU_COUNT and
CLUSTER_DATABASE_INSTANCES parameter to determine number of threads automatically
* @param p_parallel_degree_set_2 The number threads to start for the worker thread 2 which usually
represents the PARALLEL_INDEX threads - G_AUTO_PARALLEL_DEGREE means get the CPU_COUNT and
CLUSTER_DATABASE_INSTANCES parameter to determine number of threads automatically,
however 1 is the default here since we assume that these indexes use parallel DDL
* @param p_job_submit_delay The number of seconds each job will be delayed to allow Oracle
proper load balancing in a cluster, default 30 seconds (commented out at present due to
odd locking issues on the queue table in RAC environments)
* @param p_sleep_seconds The number of seconds to wait for the threads to startup
before attempting to teardown the AQ infrastructure again
* @param p_optional_init Optionally a SQL can be passed usually used to initialize the session
for example forcing a particular parallel degree
* @param p_worker_set_id_1
The character identifier used to identify the indexes to process by the first worker thread set
Default value is "SERIAL_INDEX"
* @param p_worker_set_id_2
The character identifier used to identify the indexes to process by the second worker thread set
Default value is "PARALLEL_INDEX"
**/
procedure create_index_concurrent(
p_directory_name in varchar2
, p_file_name in varchar2
, p_parallel_degree_set_1 in integer default G_AUTO_PARALLEL_DEGREE
, p_parallel_degree_set_2 in integer default 1
, p_job_submit_delay in integer default 30
, p_sleep_seconds in integer default 10
, p_optional_init in varchar2 default null
, p_worker_set_id_1 in varchar2 default G_WORKER_SET_ID_1
, p_worker_set_id_2 in varchar2 default G_WORKER_SET_ID_2
);

Note that the "p_job_submit_delay" parameter is currently not used - there were some odd locking issues on the AQ table in case of a RAC environment when using that option so I have commented out its usage at present - I haven't had a chance yet to investigate further what the problem actually was.

So the only required input to the CREATE_INDEX_CONCURRENT procedure is the name of the directory object that points to the directory where the file to process resides and the name of the file itself.

You probably want to specify the number of worker threads for the two sets: The idea here is to distinguish between the creation of serial and parallel indexes. The first parameter specifies the number of worker threads used for serial indexes, the second one the number of concurrent threads for parallel indexes.

The default is CPU_COUNT * INSTANCES threads for serial indexes and a single thread for parallel indexes.

If you don't want/need this separation of serial and parallel indexes simple use the same "worker_set_id" for both parameters "p_worker_set_id_1" and "p_worker_set_id_2" and specify the desired total parallel degree in one of the degree parameters and set the other one to 0 (the 0 is required otherwise one of the DBMS_SCHEDULER.CREATE_JOB calls will fail with a "duplicate job name/job name already exists").

The "p_sleep_seconds" parameter is only used to allow the jobs spawned to put a lock on the queue table - the teardown is then going to wait until all locks have been removed and therefore all queue processing has ended. The default of 10 seconds was sufficient in all cases I've encountered.

Since the package requires as prerequisite a directory where the file to process resides, I've prepared the script "create_index_concurrent.sql" that guides through the required inputs and takes care of that step as well.

It takes the full O/S path to the file and the file name as input, creates a directory CREATE_INDEX_CONCURRENT_DIR pointing to that directory and prompts then for the two degrees as input and the names of the two worker thread sets before calling the CREATE_INDEX_CONCURRENT stored procedure.

Caveats

Please note that you should double-check not to pass a non-transformed SQLFILE generated via IMPDP to the procedure - the results may be dire since the generated SQLFILE always contains much more than the bare CREATE INDEX commands, no matter what options you use for IMPDP. Always use the provided Unix scripts to post-process the SQLFILE before initiating the index creation.

Furthermore you need to be aware of the current limitation of the package that it does not attempt to tokenize the file contents. It simply uses a semicolon as delimiter to separate the DDL commands. This should be sufficient for most cases, but in case you have a function-based index using a string expression containing a semicolon as part of the index definition this will not work as expected. Also if you plan to use this package for other DDL execution activities like CTAS statements you might again hit this limitation if the DDL text contains semicolons.

Note that creating indexes using this tool results potentially in different index statistics than creating the indexes using IMPDP since IMPDP by default also imports the index statistics whereas the indexes created using this tool will end up with the current index statistics automatically generated during index creation (from 10g onwards, and the code requires at least 10.2). If you want to have the index statistics imported you can run IMPDP after the index creation using the INCLUDE=INDEX_STATISTICS option. This should complete fairly quickly and will import the index statistics only.

If you have SERVEROUTPUT enabled by default then you will very likely see some errors that will be printed by the initial attempt to tear down the AQ infrastructure. These errors are expected if the previous run was completed successfully or in case of the initial run and can be ignored (and will be catched/ignored by the default implementation).

Note also that all provided scripts except for the Unix shell scripts use DOS file format - under OEL this isn't a problem but it might be on your platform.

Finally the inevitable disclaimer: Although this has been tested thoroughly it comes with absolutely no warranty. Use it at your own risk and test it in your environment before attempting any runs against anything important.

Monitoring the execution

The code logs errors and progress into the table CREATE_INDEX_CONCURRENT_LOG. At present the code logs every attempt to execute DDL into the table as well as any errors that are raised during that DDL execution.

So the table can be used for both, monitoring the progress as well as checking for errors. The code currently continues the execution in case of errors encountered using the dreaded WHEN OTHERS THEN NULL construct, but the code is already prepared for a more granular error handling if required - see the defined exceptions and commented out exception handler.

You can view the queue contents in the corresponding queue view created by the AQ setup (AQ$CREATE_INDEX_QUEUE) in order to see the data to process. Note that due to the fact that all worker threads do not commit the queue transaction you won't be able to see the progress in the queue table until all worker threads committed. If you don't like that you can remove the wait and "teardown_aq" call at the end of the main procedure "create_index_concurrent" and uncomment the dequeue option "visibility=immediate" in the "create_index_thread" procedure. You would need then to call "teardown_aq" in a separate step as desired. With this modification you can monitor the progress by monitoring the queue, but the provided automated unit testing won't work with that variant since it relies on the main call to wait for all worker threads to complete before validating the results.

However you can see the progress also in the log table using the following sample query:

select
to_char(log_timestamp, 'DD-MON-YYYY HH24:MI:SS.FF') as log_timestamp
, sql_statement
, message
from
create_index_concurrent_log
order by
log_timestamp desc;

If you want to perform more sophisticated queries on the that table you might need to use some casts similar to the following, because the text columns are defined as CLOBs in order to be able to hold the complete DDLs and error messages in case of errors. The casts allow you to perform for example GROUP BYs etc.

select
to_char(log_timestamp, 'DD-MON-YYYY HH24:MI:SS.FF') as log_timestamp
, cast(substr(sql_statement, 1, 30) as varchar2(30)) as index_name
, cast(substr(message, 1, 128) as varchar2(128)) as worker_set_id
from
create_index_concurrent_log
order by
log_timestamp desc;

The Unit Testing

Here we come to a completely different issue that is off-topic for this post, however in my experience so far it seems to be a very important one and I hopefully will have the time to cover it in the future with separate posts.

Generally speaking I've seen to many shops that don't follow best-practice when it comes to database deployment and development, therefore here is what you should know/do about it ideally - in a nutshell:

- Treat your database like source code, which means put everything related to the database under version control. This includes not only the obvious database source code but also DDL and DML scripts for schema evolution
- Use unit testing to test database code. Automate this unit testing
- Automate the deployment of your database related changes
- Install a continuous integration environment that runs the automated deployment and unit tests regularly, for example every night
- Automate deployment everywhere - starting from the development databases up to the production environment
- Follow your guidelines strictly - for example any hotfix-like adhoc change should still go through the established processes - code changes, testing, deployment etc.

I've helped several clients in the past to setup corresponding tools and processes for implementing above - if you are interested, get in touch with me.

So as a bonus, if you haven't spent too much time yet with above mentioned topics, in order to get you started at least with automated unit testing, I've included two different examples for this small source provided, one using the built-in unit test feature of SQLDeveloper and the other one using "dbunit". You can find both in the corresponding subdirectories of the "test" folder in the archive.

The unit testing is based on the "pk_create_index_concur_test.sql" package that is used to setup and teardown the environment for running the unit test. It assumes at present the existence of a directory "C:\app\oracle\admin\orcl112\dpdump" on O/S level. It will create a directory object for the path and attempt to create/write a file used for the unit test runs. You can pass any valid O/S directory path to the "pk_create_index_concur_test.setup" procedure if you want/need to use a different one.

All provided automated tests assume that both scripts, "pk_create_index_concurrent.sql" and "pk_create_index_concur_test.sql" have been run in the schema that should be used for test runs.

You can use the SQLDeveloper Unit Test feature to run the provided Unit Test. You can either use the GUI to import and run the test, or you can use a command line version that is actually using ANT to run the UTUTIL command line tool that comes with SQLDeveloper. You can read and follow the instructions in the "README.txt" in the test/SQLDeveloper directory how to do so. You'll need to setup a unit test repository initially if you want to use SQLDeveloper's unit testing feature either way (GUI or UTUTIL command line). See the SQLDeveloper's user's guide or online help how to do that (Hint: Menu item "Extras->Unit Testing" gets you started).

If you don't like the SQLDeveloper unit test approach or you are simply to lazy to install the tool, the unit test repository etc., you can alternatively try the automated unit testing using "dbunit". Follow the instructions in the "README.txt" in the test/dbunit directory how to run the unit tests using "dbunit".

This version of the package has successfully been tested using these unit tests on 10.2.0.4, 10.2.0.5, 11.1.0.7, 11.2.0.1 and 11.2.0.2 (after all it's dead easy with automated unit testing :-).

Summary

The provided tool set should represent a solid foundation for the given task of concurrent index creation. In particular it has been designed with the following in mind:

- Efficient use of privileges granted via roles: The package uses invoker's rights and most operations use dynamic SQL to avoid compilation issues, therefore granting the required privileges to the account used via roles should be sufficient

- The Unix scripts should be able to deal with table-, schema- and database-level datapump formats from Oracle 10g and 11g (all these variants use slightly different texts to identify the relevant sections of the generated SQLFILE by IMPDP)

- Optional use of two separate worker thread sets: This allows the concurrent creation of a multitude of indexes, be it serial or parallel, with clear distinction between the handling of serial (possibly many worker threads) and parallel indexes (usually only a few worker threads)

- Support for arbitrarily sized SQL: The DDL commands for (sub-)partitioned indexes can become quite large due to the way the Oracle meta data API generates the SQL. Therefore these generated SQLs can easily exceed the usual 32KB limit for PL/SQL character strings. The implementation uses CLOBs for the processed SQLs (and DBMS_SQL in versions lower than 11 to handle these piecewise) to support these potentially very large SQLs

- RAC/Grid/Cluster support via DBMS_SCHEDULER: The usage of DBMS_SCHEDULER allows a fine grained control of the resource consumption by the optional use of job classes (not implemented yet but can easily be added - it is a simple additional parameter to the CREATE_JOB procedure) that allow to specify a resource consumer group and a specific service name for the spawned worker threads

- Automated Unit Testing support: The provided unit test harness allows for easy testing of modifications to the code

Oracle11g Creation On Demand Indexes (Invisible Touch)

Prior to Oracle11g Release 2, the default and minimum size of a segment is one extent. So in the below example, where we create a table and five associated indexes:     Each of the segments has been allocated an extent, including each of the indexes.   However, since Oracle11g Release 2, this default behaviour has changed. [...]