After reading my blog entry about a performance issue due to excessive HCC decompression ( Accessing HCC compressed objects using index access path, a reader asked me about the CPU profiling method I mentioned in that blog entry. I started responding to that comment, and realized that the response was too big for a comment. So, in this blog entry, I will cover basics of the CPU profiling in Linux. Other platform provides similar utilities, for example, Solaris provides an utility dtrace.
While presenting at Oaktable World 2014 in San Fransisco, I discussed the in-memory pre-population speed and indicated that it takes about 30 minutes to 1 hour to load ~300GB of tables. Someone asked me “Why?” and that was a fair question. So, I profiled the in-memory pre-population at startup.
I profiled all in-memory worker sessions using Tanel’s snapper script and also profiled the processes in OS using Linux perf tool with 99Hz sample rate. As there is no other activity in the database server, it is okay to sample everything in the server. Snapper output will indicate where the time is spent; if the time is spent executing in CPU, then the perf report output will tell us the function call stack executing at that CPU cycle. Data from these two profiling methods will help us to understand the root cause of slowness.