One of the things you can do with Pin, is profile memory access. Profiling memory access using the pin tool ‘pinatrace’ is done in the following way:
$ cd ~/pin/pin-3.0-76991-gcc-linux $ ./pin -pid 12284 -t source/tools/SimpleExamples/obj-intel64/pinatrace.so
The pid is a pid of an oracle database foreground process. Now execute something in the session you attached pin to and you find the ‘pinatrace’ output in $ORACLE_HOME/dbs:
This blogpost is an introduction to Intel’s Pin dynamic instrumentation framework. Pin and the pintools were brought to my attention by Mahmoud Hatem in his blogpost Tracing Memory access of an oracle process: Intel PinTools. The Pin framework provides an API that abstracts instruction-set specifics (on the CPU layer). Because this is a dynamic binary instrumentation tool, it requires no recompiling of source code. This means we can use it with programs like the Oracle database executable.
The Pin framework download comes with a set of pre-created tools called ‘Pintools’. Some of these tools are really useful for Oracle investigation and research.
This blogpost is about the Oracle redo log structures and redo efficiency in modern Oracle databases. Actually, a lot of subtle things changed surrounding redo (starting from Oracle 10 actually) which have gone fairly unnoticed. One thing the changes have gone unnoticed for is the Oracle documentation, the description of redo in it is an accurate description for Oracle 9, not how it is working in Oracle 10 or today in Oracle 22.214.171.124.
My test environment is a virtual machine with Oracle Linux 7.2 and Oracle 126.96.36.199.161018, and a “bare metal” server running Oracle Linux 6.7 and Oracle 188.8.131.52.160419. Versions are important, as things can change between versions.
There are many posts about the amount of memory that is taken by the Oracle database executables and the database SGA and PGA. The reason for adding yet another one on this topic is a question I recently gotten, and the complexities which surrounds memory usage on modern systems. The intention for this blogpost is to show a tiny bit about page sharing of linux for private pages, then move on to shared pages, and discuss how page allocation looks like with Oracle ASMM (sga_target or manual memory).
The version of linux in this blogpost is Oracle Linux 7.2, using kernel: 4.1.12-37.6.3.el7uek.x86_64 (UEK4)
The version of the Oracle database software is 184.108.40.206.160719 (july 2016).
In a previous article called ‘memory allocation on startup’ I touched on the subject of NUMA; Non Uniform Memory Access. This article is about how to configure NUMA, how to look into NUMA usage and a real life case of NUMA optimisation using in-memory parallel execution.
At this point in time (start of the summer of 2016) we see that the CPU speed competition has stagnated and settled at somewhere below maximally 4 gigahertz, and instead the number of core’s and the size of memory is growing. The common used server type in the market I am in is a two socket server. It is not unthinkable that in the near future servers with more than two sockets will increase in popularity, which facilitates the increase in (parallel) computing capacity and maximal amount of memory.
Recently I have been presenting on what running on a large intel based NUMA system looks like (OTN EMEA tour in Düsseldorf and Milan, and I will be presenting about this at the Dutch AMIS 25th anniversary event in june). The investigation of this presentation is done on a SGI UV 300 machine with 24 terabyte of memory, 32 sockets (=NUMA nodes), 480 core’s and 960 threads.
Recently I have been given access to a new version of the UV 300, the UV 300 RL, for which the CPU has improved from Ivy Bridge to Haswell, and now has 18 core’s per socket instead of 15, which means the number of core’s on a fully equipped system is 576, which makes 1152 threads.
The intention of this blogpost is to show the Oracle wait time granularity and the Oracle database time measurement granularity. One of the reasons for doing this, is the Oracle database switched from using the function gettimeofday() up to version 11.2 to clock_gettime() to measure time.
This switch is understandable, as gettimeofday() is a best guess of the kernel of the wall clock time, while clock_gettime(CLOCK_MONOTONIC,…) is an monotonic increasing timer, which means it is more precise and does not have the option to drift backward, which gettimeofday() can do in certain circumstances, like time adjustments via NTP.
The first thing I wanted to proof, is the switch of the gettimeofday() call to the clock_gettime() call. This turned out not to be as simple as I thought.
This is the second blogpost on using PL/SQL inside SQL. If you landed on this page and have not read the first part, click this link and read that first. I gotten some reactions on the first article, of which one was: how does this look like with ‘pragma udf’ in the function?
Pragma udf is a way to speed up using PL/SQL functions in (user defined function), starting from version 12. If you want to know more about the use of pragma udf, and when it does help, and when it doesn’t, please google for it.
create or replace function add_one( value number ) return number is pragma udf; l_value number(10):= value; begin return l_value+1; end; / select sum(add_one(id)) from t2;
As you can see, really the only thing you have to do is add ‘pragma udf’ in the declaration section of PL/SQL.
Whenever you use PL/SQL in SQL statements, the Oracle engine needs to switch from doing SQL to doing PL/SQL, and switch back after it is done. Generally, this is called a “context switch”. This is an example of that:
-- A function that uses PL/SQL create or replace function add_one( value number ) return number is l_value number(10):= value; begin return l_value+1; end; / -- A SQL statement that uses the PL/SQL function select sum(add_one(id)) from t2;
Of course the functionality of the function is superfluous, it can easily be done in ‘pure’ SQL with ‘select sum(id+1) from t2’. But that is not the point.
Also, I added a sum() function, for the sake of preventing output to screen per row.
There’s been a lot of work in the area of profiling. One of the things I have recently fallen in love with is Brendan Gregg’s flamegraphs. I work mainly on Linux, which means I use perf for generating stack traces. Luca Canali put a lot of effort in generating extended stack profiling methods, including kernel (only) stack traces and CPU state, reading the wait interface via direct SGA reading and kernel stack traces and getting userspace stack traces using libunwind and ptrace plus kernel stack and CPU state. I was inspired by the last method, but wanted more information, like process CPU state including runqueue time.