Search

Top 60 Oracle Blogs

Recent comments

Oakies Blog Aggregator

“Let’s Talk Database” is Back !! Canberra/Sydney/Melbourne

Due to popular demand, I’ve been asked by Oracle to again run some “Let’s Talk Database” events this month. Dates and venues are as follows: Wednesday, 23 May – Canberra (Cliftons Canberra, 10 Moore St): Registration Link. Tuesday, 29 May – Melbourne (Oracle Melbourne Office, 417 St Kilda Road): Registration Link. Wednesday, 30 May – Sydney […]

A look into oracle redo, part 11: log writer worker processes

Starting from Oracle 12, in a default configured database, there are more log writer processes than the well known ‘LGWR’ process itself, which are the ‘LGnn’ processes:

$ ps -ef | grep test | grep lg
oracle   18048     1  0 12:50 ?        00:00:13 ora_lgwr_test
oracle   18052     1  0 12:50 ?        00:00:06 ora_lg00_test
oracle   18056     1  0 12:50 ?        00:00:00 ora_lg01_test

These are the log writer worker processes, for which the minimal amount is equal to the amount public redo strands. Worker processes are assigned to a group, and the group is assigned to a public redo strand. The amount of worker processes in the group is dependent on the undocumented parameter “_max_log_write_parallelism”, which is one by default.

The actual usage of the worker processes is dependent in the first place on the value of the undocumented parameter “_use_single_log_writer”, for which the default value is ‘ADAPTIVE’, which means it’s switching automatically between ‘single log writer mode’, which is the traditional way of the LGWR process handling everything that the log writer functionality needs to do, and the ‘scalable log writer mode’, which means the log writer functionality is presumably using the log writer worker processes.

Other values for “_use_single_log_writer” are ‘TRUE’ to set ‘single log writer mode’, or ‘FALSE’ to set ‘scalable log writer mode’ fixed.

I assume most readers of this blog will know that the master log writer idle work cycle is sleeping on a semaphore (semtimedop()) under the wait event ‘rdbms ipc message’ for 3 seconds, then performs some “housekeeping”, after which it’ll sleep again repeating the small cycle of sleeping and housekeeping. For the log writer worker processes, this looks different if you look at the wait event information of the log writer worker processes:

135,59779,@1    14346                    DEDICATED oracle@memory-presentation.local (LGWR)	    time:1909.44ms,event:rdbms ipc message,seq#:292
48,34282,@1     14350                    DEDICATED oracle@memory-presentation.local (LG00)	    time:57561.85ms,event:LGWR worker group idle,seq#:150
136,24935,@1    14354                    DEDICATED oracle@memory-presentation.local (LG01)	    time:112785.66ms,event:LGWR worker group idle,seq#:74

The master log writer process (LGWR) has been sleeping for 1.9s when I queried the database, and it will sleep for 3 seconds, and then do some work and sleep again. However, the log writer worker processes have been sleeping for much longer: LG00 for 57.6s and LG01 for 112.8s, and the event is different: ‘LGWR worker group idle’. How is this implemented? Let’s look!

$ strace -p $(pgrep lg01)
strace: Process 14354 attached
semtimedop(360448, [{27, -1, 0}], 1, {3, 0}) = -1 EAGAIN (Resource temporarily unavailable)
semtimedop(360448, [{27, -1, 0}], 1, {3, 0}) = -1 EAGAIN (Resource temporarily unavailable)

I used strace on the LG01 process, and it’s still doing the same as most idle background processes are doing: sleeping on a semaphore for 3 seconds. But, it does not end its wait like LGWR does, the event the log writer worker processes are waiting in keeps on being timed.

Using a pin tools debugtrace shows the following:

 | | < semtimedop+0x000000000023 returns: 0xffffffffffffffff
 | | > __errno_location(0x38000, 0x7ffce278c328, ...)
 | | | > fthread_self(0x38000, 0x7ffce278c328, ...)
 | | | < fthread_self+0x000000000024 returns: 0
 | | < __errno_location+0x000000000010 returns: 0x7f7e930a26a0
 | < sskgpwwait+0x00000000014e returns: 0
 < skgpwwait+0x0000000000e0 returns: 0
 > ksuSdiInProgress(0x19e80, 0x19e80, ...)
 < ksuSdiInProgress+0x000000000035 returns: 0
 > sltrgftime64(0x19e80, 0x19e80, ...)
 | > clock_gettime@plt(0x1, 0x7ffce278c3a0, ...)
 | | > clock_gettime(0x1, 0x7ffce278c3a0, ...)
 | | < clock_gettime+0x000000000069 returns: 0
 | < clock_gettime+0x00000000003a returns: 0
 < sltrgftime64+0x00000000004c returns: 0x19c253f3ff
 > kslwo_getcbk(0xa2, 0xd80fa62, ...)
 < kslwo_getcbk+0x000000000017 returns: 0
 > kgslwait_last_waitctx_time_waited_usecs(0x7f7e930a29a0, 0x6dfd01c0, ...)
 < kgslwait_last_waitctx_time_waited_usecs+0x000000000045 returns: 0x25e5e80
 > kskiorm(0x6d1854a8, 0, ...)
 < kskiorm+0x00000000001e returns: 0
 > kfias_iswtgon_ksfd(0x6d1854a8, 0, ...)
 < kfias_iswtgon_ksfd+0x00000000002b returns: 0
 > kxdbio_has_work(0x7ffce278c3c4, 0x6003d010, ...)
 < kxdbio_has_work+0x000000000027 returns: 0
 > skgpwwait(0x7ffce278c630, 0x7f7e930a7ca0, ...)
 | > kslwait_conv_wait_time(0x2dc6c0, 0x7f7e930a7ca0, ...)
 | < kslwait_conv_wait_time+0x000000000027 returns: 0x2dc6c0
 | > sskgpwwait(0x7ffce278c630, 0x7f7e930a7ca0, ...)
 | | > semtimedop(0x38000, 0x7ffce278c328, ...)
 | | < semtimedop+0x000000000023 returns: 0xffffffffffffffff

And a full stack trace of a log writer worker look like this:

$ pstack $(pgrep lg01)
#0  0x00007feda8eaebda in semtimedop () at ../sysdeps/unix/syscall-template.S:81
#1  0x0000000010f9cca6 in sskgpwwait ()
#2  0x0000000010f9a2e8 in skgpwwait ()
#3  0x0000000010a66995 in ksliwat ()
#4  0x0000000010a65d25 in kslwaitctx ()
#5  0x00000000031fb4d0 in kcrfw_slave_queue_remove ()
#6  0x00000000031fad2a in kcrfw_slave_group_main ()
#7  0x00000000012160fa in ksvrdp_int ()
#8  0x000000000370d99a in opirip ()
#9  0x0000000001eb034a in opidrv ()
#10 0x0000000002afedf1 in sou2o ()
#11 0x0000000000d0547a in opimai_real ()
#12 0x0000000002b09b31 in ssthrdmain ()
#13 0x0000000000d05386 in main ()

If you combine the pstack backtrace and the debugtrace information, you see that the idle cycle does not leave the ‘ksliwat’ function, so the wait event is not finished. Quickly looking at the other functions, it’s easy to spot it reads the system clock (sltrgftime64), updates some information (kgslwait_last_waitctx_time_waited_usecs) and then performs some proactive IO checks (kskiorm, kfias_iswtgon_ksfd, kxdbio_has_work) after which it calls the post/wait based functions to setup the semaphore again.

Conclusion so far is the log writer workers do perform a 3 second sleep just like the master log writer, however the wait event ‘LGWR worker group idle’ is not interrupted like ‘rdbms ipc message’ is for the master log writer. This means the wait time for the event for each worker process indicates the last time the worker process actually performed something. A next logical question then is: but what do the log writer worker processes perform? Do they entirely take over the master log writer functionality, or do they work together with the master log writer?

In order to fully understand the next part, it is very beneficial to read up on how the log writer works in ‘single log writer’ mode, where the master log writer handling the idle and work cycle itself:
https://fritshoogland.wordpress.com/2018/02/20/a-look-into-into-oracle-redo-part-4-the-log-writer-null-write/
https://fritshoogland.wordpress.com/2018/02/27/a-look-into-oracle-redo-part-5-the-log-writer-writing/

If you want to perform this investigation yourself, make sure the database is in ‘scalable log writer’ mode, by setting “_use_single_log_writer” to FALSE. This is exactly what I did in order to make sure a log write is done in ‘scalable log writer’ mode.

Now let’s first apply some logic. Above the idle cycle of a log writer worker process is shown. Based on the ‘log writer null write’ blog post, we know that the log writer does advance the LWN and On-disk SCN every 3 seconds. Clearly, the log writer worker process does not do that. So that must mean the master log writer is still performing that function. It would also make very much sense, because it doesn’t matter for scalability if the master log writer performs the function of advancing the LWN and On-disk SCN or a worker process, nothing is waiting on it. Plus, if the master log writer performs most of its functions just like in ‘single log writer’ mode, the change to scalable mode would mean no change for client processes, any committing process must semop() the log writer to start writing.

Let’s look at the relevant debugtrace output of the master log writer in scalable log writer mode:

 | > kcrfw_redo_write_driver(0, 0, ...)
 | | > kcrfw_handle_member_write_errors(0, 0, ...)
 | | < kcrfw_handle_member_write_errors+0x000000000020 returns: 0x600161a0
 | | > kcmgtsf(0, 0, ...)
 | | | > sltrgatime64(0, 0, ...)
 | | | | > sltrgftime64(0, 0, ...)
 | | | | | > clock_gettime@plt(0x1, 0x7fff1fe13010, ...)
 | | | | | | > clock_gettime(0x1, 0x7fff1fe13010, ...)
 | | | | | | < clock_gettime+0x000000000069 returns: 0
 | | | | | < clock_gettime+0x00000000003a returns: 0
 | | | | < sltrgftime64+0x00000000004c returns: 0x53747fe42
 | | | < sltrgatime64+0x00000000003e returns: 0x155d4fd
 | | < kcmgtsf+0x00000000032f returns: 0x3a182314
 | | > kcrfw_slave_adaptive_updatemode(0, 0x600161a0, ...)
 | | < kcrfw_slave_adaptive_updatemode+0x000000000080 returns: 0x7efe34d1f760
 | | > kcrfw_defer_write(0, 0x600161a0, ...)
 | | < kcrfw_defer_write+0x000000000038 returns: 0x7efe34d1f760
 | | > kcrfw_slave_queue_find(0, 0x600161a0, ...)
 | | < kcrfw_slave_queue_find+0x0000000000f1 returns: 0
 | | > kcrfw_slave_queue_setpreparing(0, 0x1, ...)
 | | < kcrfw_slave_queue_setpreparing+0x000000000021 returns: 0
 | | > kcrfw_slave_group_switchpic(0, 0x1, ...)
 | | < kcrfw_slave_group_switchpic+0x000000000050 returns: 0x699b4508
 | | > skgstmGetEpochTs(0, 0x1, ...)
 | | | > gettimeofday@plt(0x7fff1fe13070, 0, ...)
 | | | < __vdso_gettimeofday+0x0000000000fe returns: 0
 | | < skgstmGetEpochTs+0x000000000049 returns: 0x20debfd6192e5
 | | > kcsnew3(0x600113b8, 0x7fff1fe13228, ...)
 | | | > kcsnew8(0x600113b8, 0x7fff1fe13070, ...)
 | | | | > kslgetl(0x60049800, 0x1, ...)
 | | | | < kslgetl+0x00000000012f returns: 0x1
 | | | | > kslfre(0x60049800, 0x1, ...)
 | | | | < kslfre+0x0000000001e2 returns: 0
 | | | < kcsnew8+0x000000000117 returns: 0
 | | | > ub8_to_kscn_impl(0x66c3c7, 0x7fff1fe13228, ...)
 | | | < ub8_to_kscn_impl+0x000000000031 returns: 0
 | | < kcsnew3+0x00000000006f returns: 0x8000
 | | > ktfwtsm(0x3a182314, 0x7fff1fe13228, ...)
 | | | > kcmgtsf(0x2, 0x7fff1fe13228, ...)
 | | | | > sltrgatime64(0x2, 0x7fff1fe13228, ...)
 | | | | | > sltrgftime64(0x2, 0x7fff1fe13228, ...)
 | | | | | | > clock_gettime@plt(0x1, 0x7fff1fe12fe0, ...)
 | | | | | | | > clock_gettime(0x1, 0x7fff1fe12fe0, ...)
 | | | | | | | < clock_gettime+0x000000000069 returns: 0
 | | | | | | < clock_gettime+0x00000000003a returns: 0
 | | | | | < sltrgftime64+0x00000000004c returns: 0x537484a6d
 | | | | < sltrgatime64+0x00000000003e returns: 0x155d511
 | | | < kcmgtsf+0x0000000001b2 returns: 0x3a182314
 | | | > kcmtdif(0x3a182314, 0x3a182314, ...)
 | | | < kcmtdif+0x00000000001b returns: 0
 | | | > ksl_get_shared_latch_int(0x60050340, 0x6ddb1408, ...)
 | | | < ksl_get_shared_latch_int+0x00000000016b returns: 0x1
 | | <> kslfre(0x60050340, 0x66c3c7, ...)
 | | < kslfre+0x0000000001e2 returns: 0
 | | > kcn_stm_write(0x7fff1fe13228, 0x66c3c7, ...)
 | | | > kstmgetsectick(0x7fff1fe13228, 0x66c3c7, ...)
 | | | < kstmgetsectick+0x00000000003a returns: 0x5ae4c494
 | | | > ksl_get_shared_latch_int(0x6004ee40, 0x6ddb1408, ...)
 | | | < ksl_get_shared_latch_int+0x00000000016b returns: 0x1
 | | <> kslfre(0x6004ee40, 0x2244, ...)
 | | < kslfre+0x0000000001e2 returns: 0
 | | > kcrfw_redo_write_initpic(0x699b4508, 0x7fff1fe13228, ...)
 | | | > kscn_to_ub8_impl(0x7fff1fe13228, 0x7fff1fe13228, ...)
 | | | < kscn_to_ub8_impl+0x00000000003e returns: 0x66c3c7
 | | < kcrfw_redo_write_initpic+0x0000000000dc returns: 0x3a182314
 | | > kscn_to_ub8_impl(0x7fff1fe13228, 0, ...)
 | | < kscn_to_ub8_impl+0x00000000003e returns: 0x66c3c7
 | | > kcrfw_gather_lwn(0x7fff1fe13268, 0x699b4508, ...)
 | | | > kslgetl(0x6abe4538, 0x1, ...)
 | | | < kslgetl+0x00000000012f returns: 0x1
 | | | > kcrfw_gather_strand(0x7fff1fe13268, 0, ...)
 | | | < kcrfw_gather_strand+0x0000000000c2 returns: 0
 | | | > kslfre(0x6abe4538, 0x17d5f, ...)
 | | | < kslfre+0x0000000001e2 returns: 0
 | | | > kslgetl(0x6abe45d8, 0x1, ...)
 | | | < kslgetl+0x00000000012f returns: 0x1
 | | | > kcrfw_gather_strand(0x7fff1fe13268, 0x1, ...)
 | | | < kcrfw_gather_strand+0x0000000000c2 returns: 0
 | | | > kslfre(0x6abe45d8, 0x137, ...)
 | | | < kslfre+0x0000000001e2 returns: 0
 | | < kcrfw_gather_lwn+0x00000000065c returns: 0xffffffff
 | | > krsh_trace(0x1000, 0x200, ...)
 | | < krsh_trace+0x00000000005d returns: 0
 | | > kspgip(0x71e, 0x1, ...)
 | | < kspgip+0x00000000023f returns: 0
 | | > kcrfw_slave_queue_setpreparing(0, 0, ...)
 | | < kcrfw_slave_queue_setpreparing+0x000000000021 returns: 0
 | | > kcrfw_slave_queue_flush_internal(0x1, 0, ...)
 | | < kcrfw_slave_queue_flush_internal+0x0000000000d7 returns: 0x1
 | | > kcrfw_do_null_write(0, 0, ...)
 | | | > kcrfw_slave_phase_batchdo(0, 0, ...)
 | | | | > kcrfw_slave_phase_enter(0, 0x9b, ...)
 | | | | < kcrfw_slave_phase_enter+0x000000000449 returns: 0
 | | | <> kcrfw_slave_phase_exit(0, 0x9b, ...)
 | | | < kcrfw_slave_phase_exit+0x00000000035a returns: 0
 | | | > kcrfw_post(0, 0, ...)
 | | | | > kcrfw_slave_single_getactivegroup(0, 0, ...)
 | | | | < kcrfw_slave_single_getactivegroup+0x000000000047 returns: 0x6a9a0718
 | | | | > kspGetInstType(0x1, 0x1, ...)
 | | | | | > vsnffe_internal(0x19, 0x1, ...)
 | | | | | | > vsnfprd(0x19, 0x1, ...)
 | | | | | | < vsnfprd+0x00000000000f returns: 0x8
 | | | | | | > kfIsASMOn(0x19, 0x1, ...)
 | | | | | | <> kfOsmInstanceSafe(0x19, 0x1, ...)
 | | | | | | < kfOsmInstanceSafe+0x000000000031 returns: 0
 | | | | | < vsnffe_internal+0x0000000000a7 returns: 0
 | | | | | > kspges(0x115, 0x1, ...)
 | | | | | < kspges+0x00000000010f returns: 0
 | | | | < kspGetInstType+0x0000000000b1 returns: 0x1
 | | | | > kcrfw_slave_phase_enter(0x1, 0x9b, ...)
 | | | | < kcrfw_slave_phase_enter+0x00000000006f returns: 0x9b
 | | | | > kcscu8(0x60016290, 0x7fff1fe12f98, ...)
 | | | | < kcscu8+0x000000000047 returns: 0x1
 | | | | > kcsaj8(0x60016290, 0x7fff1fe12f38, ...)
 | | | | < kcsaj8+0x0000000000dc returns: 0x1
 | | | | > kcrfw_slave_phase_exit(0x1, 0x9b, ...)
 | | | | < kcrfw_slave_phase_exit+0x00000000008e returns: 0
 | | | | > kslpsemf(0x97, 0, ...)
 | | | | | > ksl_postm_init(0x7fff1fe0ac30, 0x7fff1fe12c50, ...)
 | | | | | < ksl_postm_init+0x00000000002b returns: 0
 | | | | < kslpsemf+0x0000000006b5 returns: 0x1f
 | | | | > kcrfw_slave_barrier_nonmasterwait(0x6a9a0720, 0x4, ...)
 | | | | < kcrfw_slave_barrier_nonmasterwait+0x000000000035 returns: 0x600161a0
 | | | < kcrfw_post+0x000000000c1c returns: 0xd3
 | | < kcrfw_do_null_write+0x0000000000b2 returns: 0xd3
 | < kcrfw_redo_write_driver+0x000000000535 returns: 0xd3

The highlighted functions are extra functions executed when the instance is set to scalable log writer mode, or when adaptive mode has set the instance to scalable log writer mode. This means that the changes between the modes is minimal when there’s no writes, and outside of a few extra functions, the log writer does exactly the same.

The absence of any spectacular changes in the behaviour of the log writer when in scalable log writer mode when there are no writes does hint what the actual changes will be of the scalable mode, which is how writing is handled. In single log writer mode, the most time the log writer is process is likely to spend on is writing the change vectors into the online redologfiles, and maybe, if you have a bad application (!) semop()-ing foreground sessions will be second, if there are a large number of processes committing, because every process needs to be semop()-ed individually. These two functions, along with some other functionality are exactly what the log writer worker processes are doing.

This means that foreground processes do nothing different in scalable log writer mode, they signal (semop) the master log writer, which will investigate the public redo strands, and if the master log writer finds change vectors to write, it will assign log writer worker processes to perform the write, and the log writer worker process will semop() the foreground sessions to indicate the redo has been written when the instance is in post/wait mode, or do not semop() when the instance is in polling mode.

This is the entire function flow of a write when the instance is in scalable log writer mode:

 | > kcrfw_slave_queue_insert(0, 0xd3, ...)
 | | > kcrfw_slave_group_setcurrsize(0, 0, ...)
 | | < kcrfw_slave_group_setcurrsize+0x0000000001d1 returns: 0x1
 | | > _intel_fast_memcpy(0x6a9a05f8, 0x7ffdae335fa0, ...)
 | | <> _intel_fast_memcpy.P(0x6a9a05f8, 0x7ffdae335fa0, ...)
 | | <> __intel_ssse3_rep_memcpy(0x6a9a05f8, 0x7ffdae335fa0, ...)
 | | < __intel_ssse3_rep_memcpy+0x000000002798 returns: 0x6a9a05f8
 | | > kcrfw_slave_group_postall(0, 0xf0, ...)
 | | | > ksvgcls(0, 0xf0, ...)
 | | | < ksvgcls+0x000000000021 returns: 0
 | | | > ksl_post_proc(0x6ddb32f0, 0, ...)
 | | | <> kskpthr(0x6ddb32f0, 0, ...)
 | | | <> kslpsprns(0x6ddb32f0, 0, ...)
 | | | | > ksl_update_post_stats(0x6ddb32f0, 0, ...)
 | | | | | > dbgtTrcData_int(0x7f464c0676c0, 0x2050031, ...)
 | | | | | | > dbgtBucketRedirect(0x7f464c0676c0, 0x7ffdae335338, ...)
 | | | | | | < dbgtBucketRedirect+0x000000000050 returns: 0x1
 | | | | | | > dbgtIncInMemTrcRedirect(0x7f464c0676c0, 0x6fa, ...)
 | | | | | | < dbgtIncInMemTrcRedirect+0x000000000035 returns: 0x1
 | | | | | | > skgstmGetEpochTs(0x7f464c0676c0, 0x6fa, ...)
 | | | | | | | > gettimeofday@plt(0x7ffdae334e40, 0, ...)
 | | | | | | | < __vdso_gettimeofday+0x0000000000fe returns: 0
 | | | | | | < skgstmGetEpochTs+0x000000000049 returns: 0x20e067375b55d
 | | | | | | > dbgtrRecAllocate(0x7f464c0676c0, 0x7ffdae3352e0, ...)
 | | | | | | | > dbgtrPrepareWrite(0x7f464c0676c0, 0x65accba0, ...)
 | | | | | | | < dbgtrPrepareWrite+0x00000000011c returns: 0x4
 | | | | | | < dbgtrRecAllocate+0x000000000144 returns: 0x1
 | | | | | | > _intel_fast_memcpy(0x65acda30, 0x7ffdae3353d8, ...)
 | | | | | | <> _intel_fast_memcpy.P(0x65acda30, 0x7ffdae3353d8, ...)
 | | | | | | <> __intel_ssse3_rep_memcpy(0x65acda30, 0x7ffdae3353d8, ...)
 | | | | | | < __intel_ssse3_rep_memcpy+0x000000002030 returns: 0x65acda30
 | | | | | | > dbgtrRecEndSegment(0x7f464c0676c0, 0x7ffdae3352e0, ...)
 | | | | | | < dbgtrRecEndSegment+0x00000000011c returns: 0x77c000a4
 | | | | | < dbgtTrcData_int+0x000000000323 returns: 0x77c000a4
 | | | | < ksl_update_post_stats+0x00000000024f returns: 0x77c000a4
 | | | | > skgpwpost(0x7ffdae335480, 0x7f464c0acca0, ...)
 | | | | <> sskgpwpost(0x7ffdae335480, 0x7f464c0acca0, ...)
 | | | | | > semop@plt(0xc0000, 0x7ffdae335410, ...)
 | | | | | < semop+0x00000000000f returns: 0
 | | | | < sskgpwpost+0x00000000009a returns: 0x1
 | | | < kslpsprns+0x0000000001c3 returns: 0
 | | < kcrfw_slave_group_postall+0x0000000000a8 returns: 0
 | < kcrfw_slave_queue_insert+0x0000000001b6 returns: 0x667bc540

After the instance has established there are change vectors in kcrfw_gather_lwn, in single log writer mode, the function kcrfw_redo_write is called, which will call kcrfw_do_write which handles the writing, and kslpslf to semop any waiting processes among other things. Now in scalable log writer mode, kcrfw_slave_queue_insert is called which assigns work to worker processes, and then kcrfw_slave_group_postall is called to semop one or more worker processes.

The worker processes are sleeping on a semaphore, and if a process gets signalled, it exits the kcrfw_slave_queue_remove function, ends the wait event, and calls kcrfw_redo_write, just like the master log writer process would call in single log writer mode, which includes doing the write (kcrfw_do_write) and posting the foregrounds (kslpslf), exactly all the functions.

Conclusion.
The adaptive scalable log writer processes function has been silently introduced with Oracle 12, although a lot of the used functionality has been available more or less in earlier versions. It is a fully automatic feature which will turn itself on and off based on heuristics. The purpose of this article is to explain how it works and what it is doing. Essentially, all the functionality that surrounds a log writer write has been moved to a worker process, which means the work can be done in parallel with multiple processes, whilst all the work outside of the work around the write, which is not performance critical, is left with the master log writer.

18c Scalable Sequences Part I (Saviour Machine)

One of the problems with populating the value of a Primary Key (or some such), is that they can cause contention, especially in relation to indexes. By having an indexed value that monotonically increases, the right-hand most index leaf block is continually being accessed, which results in a “hot block”, which in turn results in […]

IOUG Collaborate 18

IMG_5325
The IOUG Collaborate 18 is now done.
I presented 2 sessions there:

From Transportable Tablespaces to Pluggable Databases

The introduction comes from a 5 minutes talk at Oracle Open World 2016 in the ‘EOUC Database ACES Share Their Favorite Database Things’, on the history of having tablespaces self-contained (with a relative file number in 8.0 and locally managed tablespaces in 8.1). I’ve added a demo on a feature that is not well known – using RMAN to transport tablespaces without the need to have the source in read-only, available since 10g. And I demoed all PDB movement features in 12cR2, 12cR2 and 18c: remote clone, refreshable clones, PDB switchover, Online relocation,…

A full article on the topic is available on Oracle Scene: http://viewer.zmags.com/publication/07098028#/07098028/8 and feel free to gove feedback here if you are using those features. Lot of interesting comments went after the session.

IMG_5324

12.2 Multitenant New Security Features to Clarify DevOps and DBA role separation

This session is basically a demo of lockdown profiles and resource manager settings at PDB level. With an introduction on DevOps because the goal of those features is to be able to lower the roundtrips between Dev and Ops by giving nearly full privileges on the PDB. Those features were developed by Oracle for their own managed cloud services: Exadata Express Cloud Service and Autonomous Data Warehouse. You are the administrator of your PDB there, but locked down to what cannot break the CDB, and limited to the resources you pay for.

I’ll give this session next Month in Dusseldorf at DOAG Datenbank: https://programm.doag.org/datenbank/2018/#/scheduledEvent/558645, so you still have the occasion to see how this Autonomous Data Warehouse Cloud service works from command line.

This is clearly an alternative to having Oracle Database on Docker, where containers have a clear separation between the user data and metadata (in the PDB) and the software distribution and data (in ORACLE_HOME, and in CDB$ROOT). But experience shows a slow adoption of multitenant, and developers are asking for Docker containers. But the separation is not so easy: it is clear that the user data must be in an external volume and the software (the Oracle Home – or at least the minimal part of it required to run the database and options). But a big part of the software (for example the dbms_… packages) is also in the database, in CDB$ROOT. Here again feel free to comment.

 

Cet article IOUG Collaborate 18 est apparu en premier sur Blog dbi services.

Europe June 2018 Seminars: Confirmed Venues for Oslo, Munich and Zürich

I’m very excited to confirm all venues for the European June 2018 dates of my popular “Oracle Indexing Internals and Best Practices” seminar. This is a must attend seminar of benefit to not only DBAs, but also to Developers, Solution Architects and anyone else interested in designing, developing or maintaining high performance Oracle-based applications. It’s […]

Collaborate 18 Session: Networking for the Cloud DBA

Thank you all for coming to my session today. In my years of experience managing DBAs, I find that the basic concepts of networking are always the weakest link in the knowledge to get things together, at least for the DBAs. In this session you learned how netmask, CIDR, routing table works, how that fits together in a cloud world, where a knowledge of networking is not just nice to have; but an absolute must.

Download the preso here. As always I highly appreciate your comments, here as comments or directly to me via twitter @ArupNanda or by email arup@proligence.com.

Collaborate 18 Session: Microservices Using Python and Oracle

Many thanks to all those who came to my session at Collaborate 18 today. I sincerely hope you enjoyed the session and learned something new in the process.

Here are the files you can download:

1. The presentation
2. The list_avail template
3. The hotel.py python program (download the file and remove the .txt extension)
4. The hoteld.py python program

In case you wanted to visit the Python learning series I wrote, here is where you can find it. bit.ly/python4plsql

As I mentioned, I presented about Oracle and Python at Oracle Code event a couple of months ago in New York City. Here is where you can get the scripts and the presentation.

Enterprise Manager Silent Installation

Introduction

This is another post in my series on silent installations using response files. In previous posts, I covered silent installations for the Grid Infrastructure, RDBMS, post installation configuration steps, and creating the listener and database using response files. In this post, I’ll cover installing Enterprise Manager using a response file.

Generating the Response Files

Unlike all the other installations and configurations using response files, for some reason Oracle Corporation in its wisdom has seen fit not to simply provide RSP files for an Enterprise Manager installation – they need to be generated (which probably just extracts them from the zip files, anyway, or am I being far too cynical?!) To do this, use a command like the following:

[oracle@host1 em]$ ./em13200p1_linux64.bin -getResponseFileTemplates -outputLoc /u01/app/em
-bash: ./em13200p1_linux64.bin: Permission denied

Of course, I had FTP’ed the files to this host, and there was no execute permission on the em13200p1_linux64.bin file:

[oracle@host1 em]$ ls -al
total 6808580
drwxr-xr-x. 2 oracle oinstall       4096 Apr 19 11:04 .
drwxrwxr-x. 6 oracle oinstall       4096 Apr 20 10:39 ..
-rw-r--r--. 1 oracle oinstall  564264960 Apr 13 07:02 em13200p1_linux64-2.zip
-rw-r--r--. 1 oracle oinstall  741526563 Apr 13 12:03 em13200p1_linux64-3.zip
-rw-r--r--. 1 oracle oinstall 2084231936 Apr 13 13:24 em13200p1_linux64-4.zip
-rw-r--r--. 1 oracle oinstall  109191154 Apr 13 11:39 em13200p1_linux64-5.zip
-rw-r--r--. 1 oracle oinstall 2146696423 Apr 13 14:47 em13200p1_linux64-6.zip
-rw-r--r--. 1 oracle oinstall  771426157 Apr 13 12:32 em13200p1_linux64-7.zip
-rw-r--r--. 1 oracle oinstall  554606940 Apr 13 11:49 em13200p1_linux64.bin
[oracle@host1 em]$ chmod u+x *.bin
[oracle@host1 em]$ ls -al
total 6808580
drwxr-xr-x. 2 oracle oinstall       4096 Apr 19 11:04 .
drwxrwxr-x. 6 oracle oinstall       4096 Apr 20 10:39 ..
-rw-r--r--. 1 oracle oinstall  564264960 Apr 13 07:02 em13200p1_linux64-2.zip
-rw-r--r--. 1 oracle oinstall  741526563 Apr 13 12:03 em13200p1_linux64-3.zip
-rw-r--r--. 1 oracle oinstall 2084231936 Apr 13 13:24 em13200p1_linux64-4.zip
-rw-r--r--. 1 oracle oinstall  109191154 Apr 13 11:39 em13200p1_linux64-5.zip
-rw-r--r--. 1 oracle oinstall 2146696423 Apr 13 14:47 em13200p1_linux64-6.zip
-rw-r--r--. 1 oracle oinstall  771426157 Apr 13 12:32 em13200p1_linux64-7.zip
-rwxr--r--. 1 oracle oinstall  554606940 Apr 13 11:49 em13200p1_linux64.bin
[oracle@host1 em]$ ./em13200p1_linux64.bin -getResponseFileTemplates -outputLoc /u01/app/em
** The directory "/tmp" does not have enough space. At least 10241 MB of free space are required.
Please input another directory or [Exit]: /u01/app/em
0%.....[/u01/app/em/em13200p1_linux64-2.zip]
  End-of-central-directory signature not found.
Failed to extract disk 2 (rc 2)
......................................................100%
Launcher log file is /u01/app/em/OraInstall2018-04-20_11-26-31AM/launcher2018-04-20_11-26-31AM.log.
Copying response file template(s)
to /u01/app/em ...
  Copying response file template upgrade.rsp
  Copying response file template software_only.rsp
  Copying response file template new_install.rsp
Finished copying response file template(s)

The log(s) can be found here: /u01/app/em/OraInstall2018-04-20_11-26-31AM.

So what went wrong this time? This is the line that tells you:

End-of-central-directory signature not found.

The first zip file didn’t copy correctly. You can also see that from the size of the file, around 540 MB instead of the nearly 2 GB it should be. Notice it did extract three response files anyway, but I couldn’t be sure they were complete (and needed to reload the second file anyway!). So I loaded it again, and again three response files were generated. The Advanced Installation and Configuration Guide tells you to only use new_install.rsp, so ignore the other two files for now. Also, notice the spurious error message about /tmp? If I look at that it would seem to have plenty of free space:

[oracle@host1 em]$ df -h /tmp
Filesystem                Size  Used Avail Use% Mounted on
/dev/mapper/rootvg-tmplv  2.0G   66M  1.8G   4% /tmp

Anyway, I told it another directory (/u01/app/em/stage) which I had to create before running the command successfully.

Using the Response File

Of all the response files I’ve worked with during this series of posts, I have to say the EM response file was the pickiest, most annoying file of the lot! When you look at the file, you’ll see a lot of variables associated with a value. When you start editing the file, they are of this sort of format:

UNIX_GROUP_NAME=

Make sure you have either changed all the parts to something else or get rid of them entirely, as the stupid installer will complain about them otherwise! A couple of other issues I found along the way:

Collaborate18 Session: Real Life DBA Best Practices

Thank you all those came for my session at IOUG Collaborate 18 today. I hope you learned something new and inspired by something to make your task a bit easier, a bit more efficient and a whole lot enjoyable.

You can download the presentation here.

Remember, a best practice is something that can be explained as:
1) why it is good, or what specific good comes by following it
2) what happens if you don't follow it
3) what circumstances it doesn't apply, if any

If you hear a "best practice" without any of these, reject it politely. And most of all, create your own best practices and promote them. We all deserve to learn something new to make our collective life a bit easier.

Friday Philosophy – If I Was a Computer, I Might Be An IBM System 360

So today I turn 50. I spent a few minutes looking up what your typical computer looked like then and I decided the IBM System/360 was about right. It was a business machine. Personal computers did not exist then as, well, it was almost the dark ages…

{Note, it has been pointed out to me that I should say “If I were a computer”. Seems that even at my age I still have things still to learn properly… I’ll leave it as is to remind myself…}.

Technology changes and we who work with it change. With tech, after a while any individual part (say a specific computer) becomes obsolete and we throw it away. The technology itself changes and we stop using the parts that are superceeded. I don’t remember the last time I used any sort of floppy disc or a Zip Drive. Ironically we still often use a floppy disc icon to identify the Save button.

But us? Do we who work with I.T. become obsolete? Yes. We do so when we stop changing with the times (or “stop working”, but this is not an “I’m old and considering my mortality” type post, you will be relieved to know). I think that if you lose your passion to keep learning something new in computing and/or programming, it’s time to move out of the arena; as, worryingly soon, you are going to become like those old systems that never get updates and you know will be thrown out if they develop a serious flaw or just become too expensive to keep on the payroll – err, I mean plugged in.

I nearly lost interest about 8,10 years ago. I think it was partly that I found myself doing the same things over & over again and having the same stupid arguments (sorry, “discussions”) about how not doing things correctly was going to just make everyone’s life harder in the long run. I don’t mean doing things the same, I mean doing the things that make a good system – ensuring it fits the business need, that it is tested before it gets released, and you do this crazy thing called design. This time it was not that I needed to alter along with the tech, I needed to alter myself a little. I finally realised that, although it was getting worse, the I.T. world has always been a bit like that and part of the trick to this business is simply walking away from places that are too bad and looking for those who are a bit better. I’m lucky to be able to do that moving about (don’t get me wrong, I did have to put effort into it and I think that is where some people go wrong, they seem to almost expect an external agent to make things better for them) but then I’m 50 and still in the business. I’ve seen lots of people simply leave the industry when they could not affect that change.

However, doing a bit of the introverted-navel-gazing that comes with Significant Birthdays, I find it interesting that at 20, 25, 30, 35,… 42 (very significant that one) things have always been changing for me.

At 20 I was studying Genetics & Zoology at college and thought I would be a lab scientist. A career in I.T. was not even a consideration.
By 25 I’d taken up computing and I had fallen into this company called Oracle and I reckoned I would be with them for a good while, building systems with Forms, ReportWriter. PL/SQL and whatever came next. Oracle would not last for ever…
When I was 30 I was self employed, touting my services to various companies and mostly doing systems design and performance work.
Come 35 and I was back full-time employed (that was a surprise) working in science organisation (even more of a surprise) using my degree to some, well, degree (an utter surprise). And presenting at Oracle user group conferences.
At 40 I was self-employed again, but now totally focused on performance and and Oracle RDBMS Subject Matter Expert (someone who knows a bit about most of it and most of a bit of it).
42. 42 is a great age. You are the answer to everything…
At 45 I was retired. Except when I was not. OK, I had become a Consultant, doing short jobs for different clients. And doing all this User Group stuff. Me! Antisocial, miserable, slightly-autistic me!
Now at 50, I have to confess I am not keeping on top of the technical details of my chosen sphere the way I probably should, if my career is still in this area. But I’m not doing bad and my “job” is now even more as a presenter and a member of the User Group community. I need new skills for that.

So it keeps changing. Sometimes I chose the change and sometimes changes just dropped on me. But I’ll look at the options as they come up. And if no options are coming up and I am not happy in my job, I go look for options. I won’t say I always choose the best option but, heck, it’s worked OK so far.

I wonder what I’ll be doing at 55 and 60? I just hope I am not stuck in a museum with a “do not touch” sign next to me, like all the remaining IBM System/360s