Welcome to my first in a series of posts in response to Tim Ford's #EntryLevel Challenge, which I learned about indirectly from reading a post by Steve Hood. SQL is what I'm good at, so I will be focusing on SQL and T-SQL in this series.
Learning or practicing SQL requires two things: a database engine and some example data. Microsoft SQL Server Express is a freely available and easy-to-install engine. Microsoft's Adventure Works example database provides a good set of tables with data designed to show off all that SQL is capable of doing. Put SQL Server Express together with Adventure Works, and you have a nice platform on which to learn and practice the SQL language.
SQL Server Express 2014 is the current version of Microsoft's free database engine. There are five editions of Express, and you can learn about them on the MSDN download page in Figure 1:
The edition I recommend is called "Express with Tools". Begin the download process from the page in Figure 1. You'll need to fill out the form in Figure 2. You'll also need a Microsoft Live account.
Figure 1. Choose "Express with Tools"
Figure 2. Fill out the form
Download the installer. Run it. Take all the defaults, and just click Next, Next, Next in the usual manner. The defaults are suitable for a learning and practice environment.
When the install is over, find and run the program named SQL Server 2014 Management Studio. Search on "Management Studio" from the Start menu, and you should find it.
Log in as shown in Figure 3. Select the SQLEXPRESS server name from the dropdown menu in the dialog. Management Studio will open. Click the plus sign (+) to drill into your databases. You should see just the System databases as shown in Figure 4.
Figure 3. Logging in the first time
Figure 4. There's no example data yet
There's no adventure yet. But we're working on it. Installing Adventure Works is next.
Microsoft makes the Adventure Works database available from their SQL Server Product Examples page at msftdbprodsamples.codeplex.com. You can download a set of scripts to create the database, but I prefer to grab the full backup and restore from that.
Choose the Adventure Works year as shown in Figure 5. I chose 2014 to go along with my 2014 install of Express. Then choose the option to download a database backup as in Figure 6. Finally, it's worth grabbing the Readme file shown later in Figure 7. The readme is at the very bottom of the download page.
Figure 5. Choose the year of Adventure Works
Figure 6. Choose the database backup
Here is where things get a tad tricky. You want to unzip the downloaded archive and copy the .BAK file that's within it to the Backup folder under your SQL Server Express install. Figure 8 shows the Backup folder and its location on my own system.
Be aware that the Readme file from Figure 7 provides the path for the non-Express edition of SQL Server. Your path will differ slightly: it descends through MSSQL12.SQLEXPRESS.
Figure 7. Grab the Readme file
Figure 8. Put the .BAK file into your Backup folder
The reason for moving the .BAK file is to avoid permissions issues during the restore. SQL Server has access to its own folders, but not to other folders on your system outside the install. So copy the .BAK file to your Backup folder, and save yourself the trouble of sorting out a file access error during the restore operation.
Note: You might receive a permissions error when navigating to your Backup folder in order to make the copy. Just click Continue when that happens to give yourself access to the folder.
Now run a RESTORE DATABASE statement, and it's "job done". Following is a script showing the statement that I ran on my own system. Find the command in Figure 7's Readme file. It'll be in the section on restoring from a backup. Then modify the paths to be correct for your system.
USE [master]
RESTORE DATABASE AdventureWorks2014
FROM disk = 'C:\Program Files\Microsoft SQL Server\MSSQL12.SQLEXPRESS\MSSQL\Backup\AdventureWorks2014.bak'
WITH MOVE 'AdventureWorks2014_data' TO 'C:\Program Files\Microsoft SQL Server\MSSQL12.SQLEXPRESS\MSSQL\DATA\AdventureWorks2014.mdf',
MOVE 'AdventureWorks2014_Log' TO 'C:\Program Files\Microsoft SQL Server\MSSQL12.SQLEXPRESS\MSSQL\DATA\AdventureWorks2014.ldf',
REPLACE
Be sure the FROM path points to your .BAK file. The other two paths should point to your DATA directory underneath your SQL Server Express install.
The change you are most likely to need to make—if you copy and paste from the Readme file in Figure 7—is to change "MSSQL12.MSSQLSERVER" to "MSSQL12.SQLEXPRESS" in all three paths. That's the one change I made before executing the statement.
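If you would rather not edit the three paths by hand, the substitution can be scripted. The following is only a sketch in Python: the helper name retarget_paths is hypothetical, and the readme_statement text is my assumption of what the Readme's command looks like (the statement above with the default-instance folder name), not a verbatim copy of the Readme.

```python
# Sketch: swap the default-instance folder name for the Express one
# in every path of a RESTORE DATABASE statement.
DEFAULT_INSTANCE = "MSSQL12.MSSQLSERVER"   # folder name used in the Readme
EXPRESS_INSTANCE = "MSSQL12.SQLEXPRESS"    # folder name of an Express install


def retarget_paths(statement, old=DEFAULT_INSTANCE, new=EXPRESS_INSTANCE):
    """Return the statement with every occurrence of `old` replaced by `new`."""
    return statement.replace(old, new)


# Assumed shape of the Readme's statement (illustrative, not a verbatim copy).
readme_statement = r"""USE [master]
RESTORE DATABASE AdventureWorks2014
FROM disk = 'C:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\Backup\AdventureWorks2014.bak'
WITH MOVE 'AdventureWorks2014_data' TO 'C:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\DATA\AdventureWorks2014.mdf',
MOVE 'AdventureWorks2014_Log' TO 'C:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\DATA\AdventureWorks2014.ldf',
REPLACE"""

express_statement = retarget_paths(readme_statement)
print(express_statement)
```

Check the printed statement against your own install before running it; a named instance other than SQLEXPRESS would need a different replacement value.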
Now it's time to execute the RESTORE DATABASE statement. Click Databases in the tree in the left pane of the Management Studio window. Then click the New Query toolbar button to open a query window. Paste the USE and RESTORE DATABASE commands into the query box as shown in Figure 9, and press the Execute button in the toolbar to create the example Adventure Works database shown in Figure 10.
Figure 9. Restoring from the backup
Figure 10. Adventure Works is available
The Adventure Works database will be restored from the .BAK file and made available in your environment. Right-click Databases in the left pane. Select Refresh. You should see the Adventure Works database. Expand that entry and you can drill down to see what tables, columns, indexes, views, and other structures are available.
Congratulations! You've just installed one of the best environments for learning about the SQL language and how it can be used to query and analyze data. It's an environment shared by some of the best and most helpful database professionals whom it is my pleasure to know. Look for more posts from me this year on SQL.
And get to know the community! A good way to do that is by attending one of the many SQL Saturday events held locally around the planet.
Welcome! To SQL Server.
YesSQL Summit 2016 sponsored by O'Reilly and Axxana was held by the Northern California Oracle Users Group on January 26–28 at the Oracle conference center in Redwood City, California in conjunction with BIWA Summit 2016 and Spatial Summit 2016. The grand raffle prize sponsored by O'Reilly was a full pass to Strata + Hadoop World on March 28–31 in San Jose, California. Save 20% on Strata + Hadoop World conference passes with discount code UGNOCOUG. YesSQL Summit will return to the Oracle conference center on January 31, 2017.
RMOUG volunteers compile notifications of webinars, meetings, and meetups from the internet and post them here for everyone to use.
The information technology (IT) industry is always evolving and therefore always changing.
Free webinars, even if they seem too commercial at times, always have at least one nugget of solid information that can make the difference in a professional’s life and career.
You never know when the need for new information that nobody else knows is going to come in handy.
Stay a step ahead…
The latest video on my YouTube Channel is a run through of using the Database as a Service (DBaaS) offering on Oracle Cloud.
There have been a few minor changes in the interface since I last ran through capturing images, so the related article has been brought up to date.
I used my dad for the cameo in this video. Hopefully this will help him get a little more recognition, as he’s pretty much a nobody on the Oracle scene at the moment. With your help this could change!
Update: Almost as soon as I released this blog post the footage was out of date as Oracle released some minor changes to the interface. I rerecorded the video and re-uploaded it, so it is up to date as of now. All links from my website and this blog post point to the new video. If you have read this post via an RSS reader, you may still be seeing the old version of the post, and as a result see the link to the video as broken. But in that case, you won’t be able to read this either.
This is just a short blog post about a simple DTrace script (dtrace_kghal_pga_code) that I recently wrote and published following a PGA memory leak troubleshooting assignment. A client of mine noticed a major PGA memory increase after upgrading to Oracle 12c. The PL/SQL code did not change - just the database release. He had already investigated the issue with the help of Tanel Poder's blog post "Oracle Memory Troubleshooting, Part 4: Drilling down into PGA memory usage with V$PROCESS_MEMORY_DETAIL" and identified the corresponding heap and allocation reason. However, there is one problem with looking only at V$PROCESS_MEMORY_DETAIL or heap dumps - you cannot easily trace the allocation back to the PL/SQL or SQL code that caused it.
The basic idea behind the DTrace script is to check the memory allocation reason on probe entry level (for KGHAL memory allocator functions) and stop the Oracle process at this point if the check is true. After the Oracle process has been stopped by DTrace you can request an error stack with oradebug and continue the Oracle process. As the process continues you get an errorstack trace which includes the responsible code for this particular memory allocation request (in PGA).
The following demo is run with Oracle 220.127.116.11 on Solaris 11.2 x86. I also used Frits Hoogland's PL/SQL code from here and put it into a PL/SQL function called mem_alloc to allocate a lot of PGA memory.
SQL> select mem_alloc() from dual;
After a few seconds Tanel Poder's script smem_detail.sql can be used to determine the detailed PGA memory usage.
       SID CATEGORY        NAME                       HEAP_NAME            BYTES ALLOCATION_COUNT
---------- --------------- -------------------------- --------------- ---------- ----------------
        25 PL/SQL          pmuccst: adt/record        koh-kghu call    896860424            54901
        25 PL/SQL          pl/sql vc2                 koh-kghu call    112961768             6915
        25 PL/SQL          pmucalm coll               koh-kghu call     31622336             1936
        25 Other           permanent memory           pga heap            263952               32
Now you know that most memory is allocated from private heap "koh-kghu call" due to reason "pmuccst: adt/record", but you still don't know which PL/SQL code line is responsible for requesting this memory. In this demo case it is pretty simple, as Frits's code allocates this much memory in only one place, but this is just for demonstration purposes. In reality the PL/SQL code is usually much larger and much more complex - so there is no easy way to determine the corresponding code line.
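To make "most memory" concrete, the four rows of output above can be ranked by their share of the listed bytes. This is only a rough sketch in Python: the figures are copied by hand from the output shown, and the variable names are hypothetical and not part of smem_detail.sql.

```python
# Rows copied from the smem_detail.sql output above:
# (allocation reason, heap name, bytes, allocation count)
rows = [
    ("pmuccst: adt/record", "koh-kghu call", 896860424, 54901),
    ("pl/sql vc2", "koh-kghu call", 112961768, 6915),
    ("pmucalm coll", "koh-kghu call", 31622336, 1936),
    ("permanent memory", "pga heap", 263952, 32),
]

# Total of the listed rows only, not necessarily the whole PGA.
total_bytes = sum(row[2] for row in rows)

# Largest consumer first, each paired with its share of the listed total.
ranked = sorted(rows, key=lambda row: row[2], reverse=True)
shares = [(name, heap, nbytes / total_bytes) for name, heap, nbytes, _ in ranked]

top_name, top_heap, top_share = shares[0]
for name, heap, share in shares:
    print(f"{name:<22} {heap:<15} {share:6.1%}")
```

With these numbers the "pmuccst: adt/record" reason accounts for roughly 86% of the listed bytes, which is why it is the one worth intercepting.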
Let's use the DTrace script "dtrace_kghal_pga_code" to intercept the process when it requests memory from heap "koh-kghu call" due to "pmuccst: adt/record".
shell> ./dtrace_kghal_pga_code 1502 kghalf "pmuccst: adt/record"
dtrace: description 'pid$target::kghalf:entry
' matched 1 probe
dtrace: allowing destructive actions
CPU ID FUNCTION:NAME
0 78530 kghalf:entry
Process is stopped due to memory allocation reason "pmuccst: adt/record" from heap "koh-kghu call ".
Please run the following commands in separate SQL*Plus to dump an errorstack:
SQL> ORADEBUG SETOSPID 1502
SQL> ORADEBUG DUMP ERRORSTACK 3 <<<< ORADEBUG will hang until process is continued by prun - works as designed
Please run the following command in separate shell after executing ORADEBUG:
shell> /usr/bin/prun 1502
I think the DTrace output is self-explanatory and you just have to follow the instructions.
Received ORADEBUG command (#1) 'DUMP ERRORSTACK 3' from process '1519'
*** 2016-01-30 09:04:30.651
dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=3, mask=0x0)
----- Error Stack Dump -----
----- Current SQL Statement for this session (sql_id=8y13z00p0sgc6) -----
SELECT * FROM T2
----- PL/SQL Stack -----
----- PL/SQL Call Stack -----
object line object
handle number name
199ddab20 14 function TEST.MEM_ALLOC
----- Call Stack Trace -----
calling call entry
location type point
-------------------- -------- --------------------
*** 2016-01-30 09:04:30.946
ksedst()+307 call skdstdst()
__sighndlr()+6 call ssprtmin()
call_user_handler()+718 call __sighndlr()
sigacthandler()+219 call call_user_handler()
kghalf()+2 signal sigacthandler()
kohalmc()+214 call kghualloc()
kohalc()+145 call kohalmc()
pmuocon2_con_recur()+145 call kohalc()
pmuocon()+127 call pmuocon2_con_recur()
pfrrun_no_tool()+298 call pfrinstr_EXECC()
pfrrun()+1216 call pfrrun_no_tool()
plsql_run()+648 call pfrrun()
peidxr_run()+317 call plsql_run()
ssthrdmain()+558 call opimai_real()
main()+164 call ssthrdmain()
_start()+123 call main()
Checking the call stack trace in the generated trace file reveals that the errorstack trace is created at the right time.
The function kghalf() is the last executed function right before the signal handler (by oradebug) kicks in: "sigacthandler()+219<-kghalf()+2". You also can see the PL/SQL call stack and the currently running SQL statement (if a "plain" SQL statement runs at this time - otherwise you see the "entry" PL/SQL call). PL/SQL code line 14 in function TEST.MEM_ALLOC is exactly related to "select * bulk collect into c_tmp from t2;".
I hope you may find this DTrace script a little bit useful in case of troubleshooting unclear PGA memory allocation.
Just let me know if you have any issues with it. If you have any further questions, please feel free to ask, or get in contact directly if you need assistance with troubleshooting Oracle database (performance) issues. You can also follow me or drop me a note on Twitter, of course.
Spring conference season is upon us. Planning for COLLABORATE16 in April has gotten me to thinking about conferences and missed opportunities.
Do you remember the first time you didn't attend an industry conference? That's a strange question, isn't it? I don't remember my first time not attending an event, but I do remember one event in particular as a forever-missed opportunity. The event was the International Community for Auditory Display's (ICAD's) 1998 conference in Glasgow, Scotland.
Now a trip to Glasgow might have been out of my reach at the time. But the 2000 event was held in Georgia, and I missed that one too. I didn't just miss it. I chose to miss it. Because I was cheap. Because I did not comprehend the benefits. Because no one had ever clued me in to why I should spend on myself like that. I missed the event, because I missed seeing its value.
I've missed other events over the years that I wish I had made the effort to attend. Interbike last year is an example. The Interbike International Bicycle Exposition is the largest bicycle industry event in North America, and the opportunity to attend as an industry insider was before me. But I missed the opportunity—not because of money this time, but because I let short-term work pressures get in the way of my long-term interests and personal growth.
So COLLABORATE16 is upon us. Now is the time to ante up if you're interested in attending. Or if not that event, maybe it's some other conference this year that is a better fit. Whatever the case, whatever event is the best fit, consider investing in yourself and your career by attending and meeting people and becoming involved in a larger community.
Because it's kind of a bummer to look back at missed opportunities and wonder what other course life might be taking right now had you chosen to attend an event rather than to miss it.
Because investing in your career pays in the long run.
Because sometimes we need to indulge our dreams.
I saw a tweet that made me smile a few days ago:
Like many of you, I started out my working life in IT as a computer programmer. Like some of you, I still am a computer programmer from time to time. I don’t think I’ve had a year of my working life when I did not do either some application development programming or some database infrastructure programming. I am constantly writing small (or not so small) SQL or PL/SQL programs to do what I do in Oracle.
I started programming in school, I did an “O” level in computer studies (the exams we sat in the UK aged 16, up until 1988!), and I was pretty good at the programming as compared to my fellow classmates. My first “real” program played Noughts and Crosses (tic-tac-toe to our American cousins and maybe others) and version 2 was unbeatable. Which at the time I thought was pretty cool, but Wikipedia now tells me is pretty easy :-).
I also remember someone in the year above me unrolling some huge printout of the role-playing game he was writing (you know, the old textual “you have walked into a room where there is a lion, a bar of soap and a chandelier, what do you want to do?” sort of thing) and telling me I would never be able to do it. I just looked at the code and thought: Why have you hard-coded every decision and used all those GOTOs? Some sort of loop and a data block to look up question, answers and consequences would be much smaller and easy to extend? I don’t think he liked me voicing that opinion…
I did not do any programming of any consequence as part of my college course but after that I started work as a computer programmer (sorry “analyst programmer”) in the National Health Service. Again, I seemed better at it than most of those around me, fixing bugs that others had given up on and coding the tricky stuff no one else wanted to touch. And after a year or so, I was convinced I was a programming god!
I wasn’t of course. Part of it was my juvenile, naive ego and the other part was that, fundamentally, many of those around me were bad programmers. Anybody decent either did not join in the first place or got a better job elsewhere that paid more than the NHS did. I eventually did that myself and joined Oracle. Where I realised that (a) SQL confused the hell out of me and (b) when I started using PL/SQL there were plenty of people around me who were better at traditional programming than I.
I think it took me about a year to feel I was damned good at both of them. Guess what? I was wrong. I was simply competent. But after a year or two more I did two things that, for me, finally did make me into a good programmer:
Since then, I think my own opinion of my coding skills has generally dropped year on year, even though I would like to think I continue to get better at actually constructing computer programs and suites of programs.
So yes, I wish I was as good a programmer now as I thought I was aged 22. And after 25 years at it (actually, pretty much 35 years at it on and off!) just like Rich Rogers (or is it John D Cook? I can’t quite decide if it is a quotation or not) I think I am finally getting moderately good at writing programs. If I continue to follow this trend, on my 65th birthday I will be convinced I can’t program for toffee and yet will finally be a Good Programmer.
I wonder if anyone would still employ me to do it by then?
It’s amazing how easy it is to interpret a number incorrectly until the point comes where you have to look at it closely – and then you realise that there was a lot more to the number than your initial casual assumption, and you would have realised it all along if you’d ever needed to think about it before.
Here’s a little case in point. I have a simple (i.e. non-partitioned) heap table t1 which is basically a clone of the view dba_segments, and I’ve just connected to Oracle through an SQL*Plus session then run a couple of SQL statements. The following is a continuous log of my activity:
SQL> select table_name, partitioned, blocks from user_tables;

TABLE_NAME           PAR     BLOCKS
-------------------- --- ----------
T1                   NO         958

1 row selected.

SQL> select name, value from v$mystat ms, v$statname sn where ms.statistic# = sn.statistic# and name like '%table scan%';

NAME                                  VALUE
-------------------------------- ----------
table scans (short tables)                9
table scans (long tables)                 0
table scans (rowid ranges)                0
table scans (cache partitions)            0
table scans (direct read)                 0
table scan rows gotten                 4188
table scan blocks gotten                 14

7 rows selected.

SQL> select count(extents) from t1;

COUNT(EXTENTS)
--------------
          5143

1 row selected.

SQL> select name, value from v$mystat ms, v$statname sn where ms.statistic# = sn.statistic# and name like '%table scan%';
So here’s the question: how many “table scans (long tables)” will Oracle record against my session when I re-run that query against v$mystat ?
Warning – this IS a trick question.
I said it was a trick question and, as you will have seen if you’ve had time to read the comments, the answer is going to depend on various configuration options. Here’s what I got in my test – and I’ve reported not just the session stats, but the segment statistics (v$segment_statistics):
NAME                                  VALUE
-------------------------------- ----------
table scans (short tables)                9
table scans (long tables)                26
table scans (rowid ranges)               26
table scans (cache partitions)            0
table scans (direct read)                 0
table scan rows gotten                 9331
table scan blocks gotten                954

7 rows selected.

SQL> select owner, object_name, object_type, value from v$segment_statistics where owner = 'TEST_USER' and statistic_name = 'segment scans' and value != 0;

OWNER           OBJECT_NAME          OBJECT_TYPE             VALUE
--------------- -------------------- ------------------ ----------
TEST_USER       T1                   TABLE                      26

1 row selected.
I recorded 26 long table scans – and that 26 was echoed in the segment statistics. (Note: I don’t normally use v$segment_statistics, which is a join between a couple of in-memory structures and three real tables; I usually query v$segstat.) It’s the segment statistics that made me pause in a recent problem review; up to that moment I had been using the “Segments by Table Scans” section of the AWR Report as a useful (but badly named – since it also counts (most) index fast full scans) indicator of a potential threat, then I suddenly realised that there was yet another important detail I had to check before I could determine what the numbers were telling me.
So the next question is WHY has Oracle reported 26 tablescans ?
Martin Preiss (see comments) has been working hard to investigate this, and managed to produce a couple more variations in statistics for “the same” problem. In my case I had declared the table as “parallel 2”. For a table of degree N Oracle is very likely to break the table into 13*N chunks – there’s a historic reason for 13 – and that’s why I see 26 “rowid range” scans which, unfortunately, are also recorded as “long” table scans even though each is individually well short of the short table limit. This is really annoying when you start to look at the “Segments by Table Scans” report because you now don’t know how many times a table really was scanned unless you know something about the run-time degree of parallelism involved.
If you check the hidden parameters you will find several parameters relating to this chunking, in particular:
_px_min_granules_per_slave    minimum number of rowid range granules to generate per slave (default 13)
_px_max_granules_per_slave    maximum number of rowid range granules to generate per slave (default 100)
Technically, therefore, Oracle MIGHT get to 100 granules per parallel slave, and a tablescan at degree 2 could be recorded as 200 tablescans!
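The arithmetic behind those figures is easy to sketch. The Python below is only an illustration under a simplifying assumption: that one parallel tablescan at degree N is recorded as somewhere between 13*N and 100*N "scans", using the two defaults quoted above. The function names are hypothetical, and the number of granules Oracle actually picks at run time is not modelled.

```python
# Defaults quoted above for the two hidden parameters.
MIN_GRANULES_PER_SLAVE = 13    # _px_min_granules_per_slave
MAX_GRANULES_PER_SLAVE = 100   # _px_max_granules_per_slave


def reported_scan_range(degree):
    """Lowest and highest 'tablescan' counts one real scan could be recorded as."""
    return (MIN_GRANULES_PER_SLAVE * degree, MAX_GRANULES_PER_SLAVE * degree)


def real_scan_bounds(reported, degree):
    """Bounds on the number of real scans behind a reported count at a known degree."""
    low, high = reported_scan_range(degree)
    return (reported / high, reported / low)


low, high = reported_scan_range(2)
print(f"degree 2: one real scan may be reported as {low} to {high} scans")

fewest, most = real_scan_bounds(26, 2)
print(f"26 reported scans at degree 2: between {fewest:.2f} and {most:.2f} real scans")
```

The second function is the practical point: without knowing the run-time degree of parallelism there is no way to turn a reported count back into a number of real scans.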
Martin asked why my 26 tablescans didn’t show up as “table scans (direct read)” – but that’s 11g for you, it allows a serial tablescan to use direct path reads, and it allows parallel tablescans to read into the cache, and the combination of cache size and table size meant that my example just happened to read into the cache.
And that introduces ANOTHER interpretation problem – what annoying things might I discover if I declare the table with the CACHE option (as Ivica suggested in his first comment below)?
Ivica has been busy checking the CACHE option and has shown that there are a number of cases to consider (more, in fact, than I was planning to mention – and the variation in the combination of NOPARALLEL and CACHE should give you some pause for thought). The only point I wanted to make was the effect of enabling PARALLEL and CACHE; I don’t think that this is a combination that is very likely to appear in a production system, but boundary conditions (and accidents) do occur. With my little sample – even after I grew the table to be MUCH larger – the tablescan DIDN’T get reported: here are some session stats and segment stats (taken using my snapshot code) of a single tablescan running parallel 2 when the table was cached:
Name                                   Value
----                                   -----
table scans (rowid ranges)                26
table scans (cache partitions)            26
table scan rows gotten               164,672
table scan blocks gotten              29,611

Statistic                              Value
---------                       ------------
T1
  logical reads                       30,272
  physical reads                      29,614
  physical read requests                 261
I know I’ve done a tablescan in 26 pieces (parallel 2) – and scanned 29,000+ blocks doing it; but according to the segment stats AND session stats I haven’t done a tablescan. Fortunately, of course, I can see the parallel tablescan in the session stats, and in this isolated case I can see from the “cache partitions” statistics that that tablescan was on a cached table. But if I’ve been depending on the segment stats to tell me about which tablescans happen most frequently and do most work I’ve found another reason why I can’t trust the stats and have to do more work cross-checking different parts of the AWR for self-consistent results.
I don’t expect to add any more comments about this mixture of tablescans and parallelism, with the varying effects on the session and segment statistics – but there’s always the possibility that one day I’ll start to worry about how the KEEP and RECYCLE caches (db_keep_cache_size and db_recycle_cache_size) could confuse things further.
It’s probably worth pointing out that the segment statistics have never recorded SHORT tablescans, they’ve only ever captured details of LONG tablescans. There is a slight inconsistency here, though, since they capture all the “index fast full scans (full)” reported in the session stats whether they are short or long – not that the session stats record the difference; unfortunately, another flaw creeps in: parallel index fast full scans “disappear” from the stats, although they show up as “index fast full scans (rowid ranges)” and “index fast full scans (direct)” with the same multiplier of 13 that we see for parallel table scans.