Oakies Blog Aggregator

PowerPoint 2013 Always Starts Minimized

This post has nothing to do with Oracle or Enterprise Manager at all, so if that’s all you’re interested in you can stop reading now. :)

Yesterday I ran into the situation where PowerPoint would only open minimized on the task bar and nothing I could do would get it to budge from there. The only way I could get PowerPoint to start and be displayed on any monitor was to start it in safe mode. [As an aside, to start PowerPoint (or indeed any Office product) in Safe Mode you just need to hold down the Control key while double-clicking either the PowerPoint program or any PowerPoint document.] After starting PowerPoint successfully in Safe Mode, I could see PowerPoint on my screen, but when I closed it down and tried to restart in normal mode, it would always start minimized.

So one answer is simply to start always in Safe Mode, which of course means a variety of things are disabled or don’t work fully. Not a very good answer. :) I searched a bit further afield, and found some Microsoft notes explaining that the stored X-Y screen coordinates for PowerPoint may be corrupt (actually the notes were talking about Microsoft Word, not PowerPoint, but the explanation applies in either case). The support note suggests three methods to fix the issue:

  1. Maximize the program
  2. Use the move feature to reposition the program into view
  3. Edit the registry

Obviously you should work through these from top to bottom, but I had already found that the first two did nothing in my situation, so I was left with editing the registry.

IMPORTANT: As always, as the Microsoft note says, “Before you modify the registry, make sure to back it up and make sure that you understand how to restore the registry if a problem occurs. For information about how to back up, restore, and edit the registry, click the following article number to view the article in the Microsoft Knowledge Base: 256986 Description of the Microsoft Windows Registry”. In other words, do the rest of this at your own risk!!!

The note that I found from Microsoft talked about removing a registry key that for me didn’t exist. Of course, it was somewhat dated and there may be a more up to date version that I hadn’t found. In any case, here’s how I fixed the problem:

  • Run regedit.exe to start the Registry Editor
  • Navigate through the tree to HKEY_CURRENT_USER\Software\Microsoft\Office\15.0\PowerPoint
  • Right-click on “Options” and select “Export” to save a backup in case I needed it (which in fact I did – see below for the reason)
  • Delete the “Options” key completely

Voila! PowerPoint now started and displayed on the screen as I expected.

NOTE: I might also have been able simply to delete the entries Bottom, Left, Right and Top in the Options key, but I didn’t try that directly. Maybe if someone else who runs into this situation can test that and leave a comment on this post, we can see if that solution works as well. There were other options listed, such as PersonalTemplates, ToolbarConfigSaved and so on, that were useful to keep. I added those back by editing the export I had taken beforehand to remove the Bottom, Left, Right and Top entries, then opening the Registry Editor and choosing File -> Import, which merged the entries from my backup into the Options key.
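As a concrete sketch of that edit-and-re-import approach, an export of the Options key might look something like the fragment below. The value names Bottom, Left, Right and Top are the ones discussed above, but the data types and values shown here are invented, and the other entries in your own export will differ:

```reg
Windows Registry Editor Version 5.00

[HKEY_CURRENT_USER\Software\Microsoft\Office\15.0\PowerPoint\Options]
; hypothetical window-position values - delete these four lines before re-importing
"Top"=dword:0000f000
"Bottom"=dword:0000f320
"Left"=dword:0000f000
"Right"=dword:0000f480
; settings worth keeping (names from the real key, values invented) stay in the file
"ToolbarConfigSaved"=dword:00000001
```

Importing the trimmed file via File -> Import then merges only the values you kept back into the recreated Options key.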

Who Plans The Content of UKOUG Tech15?


When you go to a conference like UKOUG Tech15 there are hundreds of talks given over several days and a dozen or so streams. Who decides what is presented and how do they decide?

You do. Well, I’d say you have about 60-70% of the input if you are a member of the UKOUG (and I know many reading this are not – but you are probably members, via your employer if not personally, of other user groups; and if you are not, you could probably benefit from joining one). The point is, the members of the UK Oracle User Group have a fair say in what gets talked about at the UKOUG conferences. And, though not all are run in the same way, I know several of the large Oracle conferences run on similar principles. You also provide the raw material: the proposed talks. That is open to each and every one of you, member or not. Anyone can offer a talk.

What about the other 30-40% of the input? Well, that would be me :-). {Note: British ironic humour}. As I mentioned in my first post about organising Tech15, I am the Lead for the database area this year, and some people did blame me last year for the content – but being the Lead does not put me in charge. There is a technical committee that decides what it feels should be the overall structure of the conference and has the final 30-40% say in what talks are given.

I’ll go into more detail about aspects of the paper selection process in future posts, but the general structure is this:

  • The steering committee meet for a kick-off meeting and decide on:
    • Who is in which committee (though this is pretty much sorted out before the meeting).
    • the general structure of the event – The major areas (Database, Middleware, Development, Business Analytics and Hardware/OS/Engineered), the number of streams each major area gets each day, the length of sessions and if anything is happening outside the main 3 days of the conference.
    • How we handle the labeling of topics in our streams (endless discussions there!).
    • Topics and considerations that we feel are important to our streams that should be mentioned in the call for papers.
    • How we will run the sub-committees and overall committee – again, this is generally known but we look at what we learnt the prior year and change accordingly.
  • The call for papers goes out (it will be the 13th April to 10th May this year). This is advertised by the UKOUG: it is sent to previous paper submitters and to User Group members, announced in the UKOUG mailings, tweeted, and spread through several other avenues. The committee will have suggested areas to submit for, but what is submitted is up to the presenting community – and this can alter our thoughts on content.
  • Judging – From 20th April to close to the Agenda Planning Day, volunteers and members of UKOUG are asked to judge the paper abstracts. These scores are important for the next step…
  • Agenda Planning Day – the steering committee members get together and spend pretty much a whole day reviewing the abstracts, the judging scores, the slots available, what we know of the speakers and presentations, the spread of topics, the percentage of established and new speakers and half a dozen other things to come up with the rough agenda. It’s a bit of a bun fight, but we get there in the end. Every abstract is looked at, along with its judging score.
  • Speakers are informed if their papers are accepted, rejected or we would like them as reserves – and the speakers confirm or decline acceptance or reserves (and occasionally question rejections). Sometimes a speaker will be asked if they would modify a submission.
  • The technical committees may well try and source some papers where we feel a topic is under-represented or to fit with some other aim (like a stream at a given level).
  • Reserves are slotted in to replace any speakers who decline and any clashes, alterations and agenda tweaks are dealt with as they arise.
  • The agenda is launched (ie we say what is on it) mid July.
  • From the agenda launch to the start of the conference, any paper changes are handled as they come up – usually a speaker pulling out or needing to change dates but occasionally other issues.

Why is it called “Paper Selection” when people are talking? Why do we talk about abstracts? Well, conferences pretty much started off as scientific conferences: you would submit your scientific paper and then read it out to the conference. The abstract is a brief “why you should read my 35-page argument, with long, impressive words, for why I think hyaenas are more closely related to cats than dogs” {they are}. We inherit those terms.

So you can see that the steering committee has a fair input, so how do WE get chosen? Fundamentally, it is via a call for volunteers from the UKOUG community. The UKOUG ask people to volunteer in their regular emails to members/volunteers. (Volunteers have to be members of the UKOUG but the membership may well belong to a company. The UKOUG keeps track of the nominated contacts for an organisation, who are responsible for the membership management, but also the individuals who have helped out at any time under that membership. As an example, someone in purchasing or HR may be the nominated contact for the memberships a company has with UKOUG, but it is members of the technical IT staff who come to the events and may start helping out).
The UKOUG office staff/board members may well ask one or two of the experienced volunteers known to them to take a lead and help choose which volunteers to accept. Or, more commonly, to go and pester people they know to step up and volunteer! New volunteers are always part of the mix; we recognise that without new people and perspectives we will stagnate, and they challenge us when we say “we always do it this way”.

I have not mentioned Oracle Corporation involvement. Strictly speaking, people from Oracle are not volunteers and are certainly not members. They are Oracle Liaisons. The UKOUG gets good support from Oracle: we have talks from them, and we have some SIG meetings in their offices. Oracle Corporation is of course happy to talk about the latest/greatest aspects of Oracle, and if they can get us all fired up about an extra-cost option, so much the better for them. But the relationship is generally balanced and varies over the years – and is influenced by individuals. Some people who work for Oracle will push to be allowed to help out the UKOUG, and some product managers are more than happy to come and give talks about free, standard or old features as well as the shiny new stuff. Others, I am sure, see us as an annoyance. The input we get from the Oracle Liaisons is very helpful and appreciated – but don’t think it buys acceptance of whatever Oracle Corp want. I had to help deal with an Oracle product manager last year who was upset that their area had very few talks. It got as far as them almost demanding some slots. However, the low number of talks submitted and the poor judging scores for the few that were submitted told us on the committee that the user community were not currently interested in that topic. So no talks. Faye and I talked it over, I gave the logic and reason, and she was good enough to then deal with Upset Product Manager.

I have helped with the agenda planning day a couple of times – I think I got pestered to help way back in 2007 or 2008! – and I have been a SIG chair and deputy chair as well as a regular presenter, so I am a known soft touch for helping the UKOUG. A key aspect of my being the Lead is simply that I have more free time than most other volunteers, so I can be got hold of and can spend a bit of time thinking about things and making decisions. This can be important in the run-up to the actual event, as you sometimes need to make decisions quickly and a group discussion may not be the best way to do it. I might check with a couple of others (and I usually do) but the key thing is to make a decision in the timeframe allowed.

So that is who the Agenda Planning committee are and where we fit in. We are volunteers, filtered and guided by some old hands but with new blood each year. We aim to guide and give structure but the talks submitted are what anyone wants to submit. Judging scores by the community are key to paper selection and though Oracle Corp supports they don’t get to dictate.

And if all else fails, blame the committee Leads.

12c MView refresh

Some time ago I wrote a blog note describing a hack for refreshing a large materialized view with minimum overhead by taking advantage of a single-partition partitioned table. This note describes how Oracle 12c now gives you an official way of doing something similar – the “out of place” refresh.

I’ll start by creating a materialized view and a couple of indexes on the resulting underlying table; then show you three different calls to refresh the view. The materialized view is based on all_objects so it can’t be made available for query rewrite (ORA-30354: Query rewrite not allowed on SYS relations), and I haven’t created any materialized view logs so there’s no question of fast refreshes – but all I intend to do here is show you the relative impact of a complete refresh.


create materialized view mv_objects nologging
build immediate
refresh on demand
as
select
        *
from
        all_objects
;

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'mv_objects',
		method_opt 	 => 'for all columns size 1'
	);
end;
/

create index mv_obj_i1 on mv_objects(object_name) nologging compress;
create index mv_obj_i2 on mv_objects(object_type, owner, data_object_id) nologging compress 2;

This was a default install of 12c, so there were about 85,000 rows in the view. You’ll notice that I’ve created all the objects as “nologging” – this will have an effect on the work done during some of the refreshes.

Here are the three variants I used – all declared explicitly as complete refreshes:


begin
	dbms_mview.refresh(
		list			=> 'MV_OBJECTS',
		method			=> 'C',
		atomic_refresh		=> true
	);
end;
/

begin
	dbms_mview.refresh(
		list			=> 'MV_OBJECTS',
		method			=> 'C',
		atomic_refresh		=> false
	);
end;
/

begin
	dbms_mview.refresh(
		list			=> 'MV_OBJECTS',
		method			=> 'C',
		atomic_refresh		=> false,
		out_of_place		=> true
	);
end;
/

The first one (atomic_refresh=>true) is the one you have to use if you want to refresh several materialized views simultaneously and keep them self consistent, or if you want to ensure that the data doesn’t temporarily disappear if all you’re worried about is a single view. The refresh works by deleting all the rows from the materialized view then executing the definition to generate and insert the replacement rows before committing. This generates a lot of undo and redo – especially if you have indexes on the materialized view as these have to be maintained “row by row” and may leave users accessing and applying a lot of undo for read-consistency purposes. An example at a recent client site refreshed a table of 6.5M rows with two indexes, taking about 10 minutes to refresh, generating 7GB of redo as it ran, and performing 350,000 “physical reads for flashback new”. This strategy does not take advantage of the nologging nature of the objects – and as a side effect of the delete/insert cycle you’re likely to see the indexes grow to roughly twice their optimal size and you may see the statistic “recursive aborts on index block reclamation” climbing as the indexes are maintained.

The second option (atomic_refresh => false) is quick and efficient – but may result in wrong results showing up in any code that references the materialized view (whether explicitly or by rewrite). The session truncates the underlying table, sets any indexes on it unusable, then reloads the table with an insert /*+ append */. The append means you get virtually no undo generated, and if the table is declared nologging you get virtually no redo. In my case, the session then dispatched two jobs to rebuild the two indexes – and since the indexes were declared nologging the rebuilds generated virtually no redo. (I could have declared them with pctfree 0, which would also have made them as small as possible).

The final option is the 12c variant – the setting atomic_refresh => false is mandatory if we want  out_of_place => true. With these settings the session will create a new table with a name of the form RV$xxxxxx where xxxxxx is the hexadecimal version of the new object id, insert the new data into that table (though not using the /*+ append */ hint), create the indexes on that table (again with names like RV$xxxxxx – where xxxxxx is the index’s object_id). Once the new data has been indexed Oracle will do some name-switching in the data dictionary (shades of exchange partition) to make the new version of the materialized view visible. A quirky detail of the process is that the initial create of the new table and the final drop of the old table don’t show up in the trace file  [Ed: wrong, see comment #1] although the commands to drop and create indexes do appear. (The original table, though it’s dropped after the name switching, is not purged from the recyclebin.) The impact on undo and redo generation is significant – because the table is empty and has no indexes when the insert takes place the insert creates a lot less undo and redo than it would if the table had been emptied by a bulk delete – even though the insert is a normal insert and not an append; then the index creation honours my nologging definition, so produces very little redo. At the client site above, the redo generated dropped from 7GB to 200MB, and the time dropped to 200 seconds which was 99% CPU time.
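If you want to see that old copy hanging around, a query along these lines (assuming the recyclebin is enabled for your schema) should show the dropped RV$ container table after the refresh:

```sql
-- the dropped RV$ container table should be visible here after the refresh
select object_name, original_name, droptime
from   user_recyclebin
order by droptime;
```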

Limitations, traps, and opportunities

The manuals say that the out of place refresh can only be used for materialized views that are joins or aggregates and, surprisingly, you actually can’t use the method on a view that simply extracts a subset of rows and columns from a single table.  There’s a simple workaround, though – join the table to DUAL (or some other single row table if you want to enable query rewrite).
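A minimal sketch of that workaround, applied to my earlier example (the join to dual adds nothing to the result set, it just stops the view being a simple single-table subset):

```sql
create materialized view mv_objects nologging
build immediate
refresh on demand
as
select  o.*
from    all_objects o,
        dual        d
;
```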

Because the out of place refresh does an ordinary insert into a new table, the resulting table will have no statistics – you’ll have to add a call to gather them. (If you’ve previously been using non-atomic refreshes this won’t be a new problem, of course.) The indexes will have up to date statistics, because they will have been created after the table insert.
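So a complete refresh routine might look something like this sketch – the refresh parameters are the ones from the third example above, and the stats call mirrors the one I used when creating the view:

```sql
begin
	dbms_mview.refresh(
		list			=> 'MV_OBJECTS',
		method			=> 'C',
		atomic_refresh		=> false,
		out_of_place		=> true
	);
	-- the freshly created RV$ table has no statistics, so gather them here
	dbms_stats.gather_table_stats(
		ownname		=> user,
		tabname		=> 'MV_OBJECTS',
		method_opt	=> 'for all columns size 1'
	);
end;
/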

The big opportunity, of course, is to change a very expensive atomic refresh into a much cheaper out of place refresh – in some special cases. My client had to use the atomic_refresh=>true option in 11g because they couldn’t afford to leave the table truncated (empty) for the few minutes it took to rebuild; but they might be okay using the out_of_place => true with atomic_refresh=>false in 12c because:

  • the period when something might break is brief
  • if something does go wrong the users won’t get wrong (silently missing) results, they’ll get an Oracle error (probably ORA-08103: object no longer exists)
  • the application uses this particular materialized view directly (i.e. not through query rewrite), and the query plans are all quick, light-weight indexed access paths
  • most queries will probably run correctly even if they run through the moment of exchange

I don’t think we could guarantee that last statement – and Oracle Corp. may not officially confirm it, and no matter how many times I show queries succeeding that’s not a proof – but it is true. Thanks to “cross-DDL read-consistency”, as it was called in 8i when partition exchange appeared, and because the old objects still exist in the data files, a running query can keep on using the old location for an object after it has been replaced by a newer version – provided the query doesn’t hit a block that has been overwritten by a new object, or request a space management block that was zeroed out on the “drop”. If you want to make the mechanism as safe as possible you can help: put each relevant materialized view (along with its indexes) into its own tablespace, so that the only thing that is going to overwrite an earlier version of the view is the stuff you create on the next refresh.
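A sketch of that isolation, using my earlier example – the names and sizes here are mine, and the datafile clause assumes OMF (db_create_file_dest set); name the file explicitly otherwise:

```sql
-- dedicated tablespace: only the next refresh should overwrite old copies of the view
create tablespace mv_ts datafile size 512m autoextend on next 64m;

create materialized view mv_objects nologging
tablespace mv_ts
build immediate
refresh on demand
as
select * from all_objects;

create index mv_obj_i1 on mv_objects(object_name)
	tablespace mv_ts nologging compress;
```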

 

Try Oracle 12c VM with Delphix download


photo by Jose Maria Cuellar (CC 2.0)

Thanks to Leighton Nelson, who pointed out that:

Oracle has a pre-installed Linux VM with 12c

Delphix also has a pre-installed trial version

Download both of these, start them up in VirtualBox, and you can start virtualizing your PDBs.
The Oracle pre-installed VM has a few eccentricities that have to be addressed before using it: there is no IP address, there is no history of the install, and it’s missing a glibc library. After we address these in the instructions below, you can link to the PDB and then provision that PDB back as a virtual PDB (vPDB), or provision that PDB to another 12c instance on another machine as a vPDB.

Here is a video of linking a PDB to Delphix and provisioning a vPDB.

Before doing the above with the Oracle pre-installed VM, follow these steps:

After importing and starting the VM, change the network to “Bridged Adapter” and restart networking services. Get the IP address with “ifconfig -a” and edit /etc/hosts to add the hostname “oraclevm” with that IP address. Also increase /dev/shm, otherwise you will get “MEMORY_TARGET not supported on this system”.

su -
service network restart
# note the IP of the machine
ifconfig -a

# add hostname "oraclevm" and current IP to hosts
vi /etc/hosts
hostname oraclevm
echo oraclevm > /proc/sys/kernel/hostname

# umount tmpfs might give errors, but seems to do the trick anyway
umount tmpfs
mount -t tmpfs shmfs -o size=1500m /dev/shm
mkdir /mnt/provision
chmod 755 /mnt
chmod 755 /mnt/provision


yum install glibc.i686

yum might give the error:

Existing lock /var/run/yum.pid: another copy is running as PID ….

I waited a few minutes and was then able to run it; the Oracle VM must run some yum jobs just after starting up. On one attempt the running yum process never seemed to exit, so I rebooted the VM and upon reboot was able to run yum.

 

 


Back as the oracle user, unset TWO_TASK as it blocks “/ as sysdba” connections, and get rid of the .bashrc output as it messes up scp and Delphix trying to put the toolkit onto the box:

unset TWO_TASK
cp .bashrc .bashrc.orig
cat .bashrc.orig | grep -v cat > .bashrc

Set up the Delphix c##delphix container user and the delphix PDB user:

sqlplus / as sysdba << EOF
create user c##delphix identified by delphix;
grant create session to c##delphix;
alter user c##delphix set container_data=all container=current;
grant select any dictionary to c##delphix;
create or replace view v_x\$kccfe as select * from x\$kccfe;
grant select on v_x\$kccfe to c##delphix;
create synonym c##delphix.x\$kccfe for v_x\$kccfe;
alter session set container=orcl;
create user delphix identified by delphix;
grant create session to delphix;
grant select any dictionary to delphix;
create or replace view v_x\$kccfe as select * from x\$kccfe;
grant select on v_x\$kccfe to delphix;
create synonym delphix.x\$kccfe for v_x\$kccfe;
EOF

Make a toolkit directory:

mkdir ~/toolkit

Add the inventory directory and the inventory file with the version info that Delphix uses:

mkdir $ORACLE_HOME/inventory
mkdir $ORACLE_HOME/inventory/ContentsXML
cat << EOF > $ORACLE_HOME/inventory/ContentsXML/comps.xml
<?xml version="1.0" standalone="yes" ?>

Sep.19 15:31:29 EDT" INST_LOC="/home/oracle/base/oracle12102/oracle.server">

EOF

Put the database in archive log mode:

sqlplus / as sysdba << EOF
shutdown immediate;
startup mount;
alter database archivelog;
alter database open;
alter system switch logfile;
alter system switch logfile;
EOF

Use the machine name in listener.ora and tnsnames.ora instead of the 0.0.0.0 IP:

LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = oraclevm)(PORT = 1521))
      (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC1521))
    )
  )

vi $ORACLE_HOME/network/admin/listener.ora
lsnrctl stop
lsnrctl start
sqlplus / as sysdba << EOF
alter system register;
EOF

Then in the Delphix UI:

Manage -> Environments
click the “+” sign
add the host
after the host is added, click on the database tab (far-right disk icon)
on the database tab for the environment, click “discover”
for login/password give c##delphix / delphix (the CDB login we created)
this will discover the PDBs
now click on Delphix, or Manage -> Databases -> My Databases
click the “+” sign
select the orcl PDB on cdb1
for login/password give delphix / delphix (the PDB login we created)

 

After creating the vPDB it shows up in the listener:

lsnrctl status

Service “vorc_fd7” has 1 instance(s).
Instance “cdb1”, status READY, has 1 handler(s) for this service…

I add an entry to $ORACLE_HOME/network/admin/tnsnames.ora

DELPHIX =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = oraclevm)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = vorc_fd7)
    )
  )

and then connect with SQL*Plus

sqlplus delphix/delphix@DELPHIX

 

photo by Yannis (CC 2.0)

Try Oracle 12c VM with Delphix download


9104210308_a63b5ae5c4_z
 photo by #212124;" title="Go to Jose Maria Cuellar's photostream" href="https://www.flickr.com/photos/cuellar/" data-track="attributionNameClick" data-rapid_p="67">Jose Maria Cuellar (CC 2.0)

Thanks to Leighton Nelson who pointed out  that :

Oracle has a pre-installed Linux VM with 12c

Delphix as well has a pre-installed   trial version 

#000000;">Download both of these and start them up in Virtualbox and you can start virtualizing your PDBs.
#000000;">The Oracle pre-installed VM has a few eccentricities that have to be addressed before using it. There is no IP address and there is no history of the install and it’s missing a glibc library. After we address these in the instructions below you can link to the PDB and then provision that PDB back as a virtual PDB (vPDB) or provision that PDB to another 12c instance on another machine as a vPDB.

Here is a video of linking a PDB to Delphix and provisioning  a vPDB

before doing the above with the Oracle pre-installed VM, follow these steps:

#000000;">After importing and starting, change network to “Bridge Adaptor” and restarted services
#000000;">get the ip address with “ifconfig -a” and vi  /etc/hosts to add hostname “oraclevm” with the  ip address. Also increase /dev/shm or otherwise will get “MEMORY_TARGET not supported on this system”
#000000;">
#0000ff;">su -
#0000ff;">service network restart
#0000ff;"># note the IP of the machine
#0000ff;">ifconfig -a

#0000ff;"># add hostname “oraclevm” and current IP to hosts
#0000ff;">vi /etc/hosts
#0000ff;">hostname oraclevm
#0000ff;">echo oraclevm  >  /proc/sys/kernel/hostname

#0000ff;">#umount tmpfs might give errors, but seems to do the trick anway
#0000ff;">umount tmpfs
#0000ff;">mount -t tmpfs shmfs -o size=1500m /dev/shm
#000000;">

#0000ff;">mkdir /mnt/provision
chmod 755 /mnt
chmod 755 /mnt/provision

#000000;">

#0000ff;">yum install glibc.i686

#000000;">yum might get error
#000000;">

Existing lock /var/run/yum.pid: another copy is running as PID ….

#000000;">Waited a few minutes and was able to run. Oracle VM must run some yum stuff just after starting up. After one try, the running yum process never seemed to exit, so rebooted VM and upon reboot was able to run yum
#000000;">

 

 

#000000;">

Back as Oracle unset TWO_TASK as it blocks “/ as sysdba” connections and get rid of the .bash output as it messes up scp and Delphix trying to put the toolkit onto the box

#0000ff;">unset TWO_TASK
#0000ff;">cp .bashrc .bashrc.orig
#0000ff;">cat .bashrc.orig | grep -v cat > .bashrc

Set up  Delphix c##delphix container user and delphix PDB user

#0000ff;">sqlplus / as sysdba << EOF
#0000ff;">create#0000ff;"> #0000ff;">user#0000ff;"> c##delphix #0000ff;">identified#0000ff;"> #0000ff;">by#0000ff;"> delphix;
#0000ff;">grant#0000ff;"> #0000ff;">create#0000ff;"> #0000ff;">session#0000ff;"> #0000ff;">to#0000ff;"> c##delphix;
#0000ff;">alter#0000ff;"> #0000ff;">user#0000ff;"> c##delphix #0000ff;">set#0000ff;"> container_data=#0000ff;">all#0000ff;"> container=#0000ff;">current#0000ff;">;
#0000ff;">grant#0000ff;"> #0000ff;">select#0000ff;"> #0000ff;">any#0000ff;"> dictionary #0000ff;">to#0000ff;"> c##delphix;
#0000ff;">create#0000ff;"> #0000ff;">or#0000ff;"> replace #0000ff;">view#0000ff;"> v_x\$kccfe #0000ff;">as#0000ff;"> #0000ff;">select#0000ff;"> * #0000ff;">from#0000ff;"> x\$kccfe;
#0000ff;">grant#0000ff;"> #0000ff;">select#0000ff;"> #0000ff;">on#0000ff;"> v_x\$kccfe #0000ff;">to#0000ff;"> c##delphix;
#0000ff;">create#0000ff;"> #0000ff;">synonym#0000ff;"> c##delphix.x\$kccfe #0000ff;">for#0000ff;"> v_x\$kccfe;
#0000ff;">alter#0000ff;"> #0000ff;">session#0000ff;"> #0000ff;">set#0000ff;"> container=orcl;
#0000ff;">create#0000ff;"> #0000ff;">user#0000ff;"> delphix #0000ff;">identified#0000ff;"> #0000ff;">by#0000ff;"> delphix;
#0000ff;">grant#0000ff;"> #0000ff;">create#0000ff;"> #0000ff;">session#0000ff;"> #0000ff;">to#0000ff;"> delphix;
#0000ff;">grant#0000ff;"> #0000ff;">select#0000ff;"> #0000ff;">any#0000ff;"> dictionary #0000ff;">to#0000ff;"> delphix;
#0000ff;">create#0000ff;"> #0000ff;">or#0000ff;"> replace #0000ff;">view#0000ff;"> v_x\$kccfe #0000ff;">as#0000ff;"> #0000ff;">select#0000ff;"> * #0000ff;">from#0000ff;"> x\$kccfe;
#0000ff;">grant#0000ff;"> #0000ff;">select#0000ff;"> #0000ff;">on#0000ff;"> v_x\$kccfe #0000ff;">to#0000ff;"> delphix;
#0000ff;">create#0000ff;"> #0000ff;">synonym#0000ff;"> delphix.x\$kccfe #0000ff;">for#0000ff;"> v_x\$kccfe;
#0000ff;">EOF

make a toolkit directory

#0000ff;">mkdir ~/toolkit

Add the inventory directory and the inventory file with version info that Delphix uses

#0000ff;">mkdir $ORACLE_HOME/inventory
mkdir $ORACLE_HOME/inventory/ContentsXML
cat << EOF >  $ORACLE_HOME/inventory/ContentsXML/comps.xml
<?xml version=”1.0″ standalone=”yes” ?>


Sep.19 15:31:29 EDT” INST_LOC=”/home/oracle/base/oracle12102/oracle.server”>



EOF 

Put database in archive log mode

#0000ff;">sqlplus / as sysdba << EOF
shutdown immediate;
startup mount;
alter database archivelog;
alter database open;
alter system switch logfile;
alter system switch logfile;
#0000ff;">EOF

Use the machine name in listener.ora and tnsnames.ora instead of IP from 0.0.0.0
LISTENER =  (DESCRIPTION_LIST =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = oraclevm )(PORT = 1521))
(ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC1521))
)

#0000ff;">vi $ORACLE_HOME/network/admin/listener.ora
lsnrctl stop
lsnrctl start
sqlplus / as sysdba << EOF
alter system register;
EOF

Then, in the Delphix UI, go to:

Manage -> Environments
click the "+" sign
add the host
after the host is added, click on the database tab (far right disk icon)
on the database tab for the environment, click on "discover"
for login/password give c##delphix / delphix (the CDB login we created)
this will discover the PDBs
now click on Delphix or Manage -> Databases -> My Databases
click the "+" sign
select the orcl PDB on cdb1
for login/password give delphix / delphix (the PDB login we created)

 

After creating the vPDB, it shows up in the listener:

lsnrctl status

Service "vorc_fd7" has 1 instance(s).
Instance "cdb1", status READY, has 1 handler(s) for this service...

I add an entry to $ORACLE_HOME/network/admin/tnsnames.ora

DELPHIX =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = oraclevm)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = vorc_fd7)
    )
  )

and then connect with SQL*Plus

sqlplus delphix/delphix@DELPHIX

 

Photo by Jose Maria Cuellar (CC 2.0)

 

IOUG Collaborate 2015

I will be presenting on two topics at IOUG Collaborate 2015 in Las Vegas. Use the show planner to add my presentations to your schedule :)

Session #189: Monday, April 13, 9:15-10:15 AM. Topic: Oracle Database 12c In-Memory Internals. Room: Palm B

Session #145: Monday, April 13, 12:45-1:45 PM. Topic: Tools and Techniques for Advanced Debugging in Solaris & Linux (mostly live demo). Room: Palm B

Oracle Exadata Performance: Latest Improvements and Less Known Features

Here are the slides of a presentation I did at the IOUG Virtual Exadata conference in February. I explain the basics of some new Oracle 12c features related to Exadata, plus the latest cellsrv improvements such as Columnar Flash Cache and I/O skipping for min/max retrieval using storage indexes:

Note that Christian Antognini and Roger MacNicol have written separate articles about some new features:

Enjoy!

 

Exadata Storage Index Min/Max Optimization

Before discussing the Exadata-specific feature, let’s review what the database engine can do independently of whether Exadata is used. To execute queries containing the min or max functions efficiently, two specific operations are available with B-tree indexes defined on the column referenced in the min or max function. The first, INDEX FULL SCAN (MIN/MAX), is used when a query doesn’t specify a range condition. In spite of its name, however, it performs no full index scan. It simply gets either the rightmost or the leftmost index key:

SQL> SELECT /*+ index(t t_pk) */ min(id) FROM t;

   MIN(ID)
----------
         1

SQL> SELECT * FROM table(dbms_xplan.display_cursor(NULL,NULL,'iostats last'));

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------
SQL_ID  gy59myptq5gac, child number 0
-------------------------------------
SELECT /*+ index(t t_pk) */ min(id) FROM t

Plan hash value: 2094033419

---------------------------------------------------------------------------------------------
| Id  | Operation                  | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |      |      1 |        |      1 |00:00:00.01 |       2 |
|   1 |  SORT AGGREGATE            |      |      1 |      1 |      1 |00:00:00.01 |       2 |
|   2 |   INDEX FULL SCAN (MIN/MAX)| T_PK |      1 |      1 |      1 |00:00:00.01 |       2 |
---------------------------------------------------------------------------------------------

The second, INDEX RANGE SCAN (MIN/MAX), is used when the query specifies a condition on the same column used in the function:

SQL> SELECT /*+ index(t t_pk) */ min(id) FROM t WHERE id > 42;

   MIN(ID)
----------
        43

SQL> SELECT * FROM table(dbms_xplan.display_cursor(NULL,NULL,'iostats last'));

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------
SQL_ID  3nxqnh5y5z29b, child number 0
-------------------------------------
SELECT /*+ index(t t_pk) */ min(id) FROM t WHERE id > 42

Plan hash value: 4039034112

-----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |      |      1 |        |      1 |00:00:00.01 |       2 |
|   1 |  SORT AGGREGATE              |      |      1 |      1 |      1 |00:00:00.01 |       2 |
|   2 |   FIRST ROW                  |      |      1 |      1 |      1 |00:00:00.01 |       2 |
|*  3 |    INDEX RANGE SCAN (MIN/MAX)| T_PK |      1 |      1 |      1 |00:00:00.01 |       2 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("ID">42)

Unfortunately, this optimization technique can’t be applied when both functions (min and max) are used in the same query. In this type of situation, an index full scan is performed. The following query is an example:

SQL> SELECT /*+ index(t t_pk) */ min(id), max(id) FROM t;

   MIN(ID)    MAX(ID)
---------- ----------
         1      10000

SQL> SELECT * FROM table(dbms_xplan.display_cursor(NULL,NULL,'iostats last'));

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------
SQL_ID  dn4290uqx6s34, child number 0
-------------------------------------
SELECT /*+ index(t t_pk) */ min(id), max(id) FROM t

Plan hash value: 56794325

-----------------------------------------------------------------------------------
| Id  | Operation        | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |      |      1 |        |      1 |00:00:00.06 |      32 |
|   1 |  SORT AGGREGATE  |      |      1 |      1 |      1 |00:00:00.06 |      32 |
|   2 |   INDEX FULL SCAN| T_PK |      1 |  10000 |  10000 |00:00:00.03 |      32 |
-----------------------------------------------------------------------------------

In Exadata, as of Oracle Database version 12.1.0.2, there is an optimization technique that uses storage indexes instead of B-tree indexes. To take advantage of it, not only is a smart scan required, but a storage index defined on the column referenced in the min or max function must also exist (and that, unfortunately, is something we can't guarantee). Let's have a look at an example:

  • The table used for the test contains 16GB of data:
SQL> SELECT blocks*block_size AS value
  2  FROM dba_tables JOIN dba_tablespaces USING (tablespace_name)
  3  WHERE table_name = 'ORDERS'
  4  AND owner = sys_context('userenv','current_schema');

           VALUE
----------------
  16,725,196,800
  • If the optimization technique is disabled (it is enabled by default, but you can control it with the undocumented initialization parameter _cell_storidx_minmax_enabled), the whole table is read, as the statistics show (cell physical IO bytes saved by storage index = 0), and the min and max functions are evaluated by the database instance:
SQL> SELECT /*+ no_parallel */ min(order_total), max(order_total) FROM orders;

MIN(ORDER_TOTAL) MAX(ORDER_TOTAL)
---------------- ----------------
               0            21347

Elapsed: 00:00:06.24

SQL> SELECT name, value
  2  FROM v$mystat NATURAL JOIN v$statname
  3  WHERE name IN ('physical read total bytes',
  4                 'cell physical IO bytes eligible for predicate offload',
  5                 'cell physical IO interconnect bytes returned by smart scan',
  6                 'cell physical IO bytes saved by storage index');

NAME                                                                       VALUE
--------------------------------------------------------------- ----------------
physical read total bytes                                         16,725,196,800
cell physical IO bytes eligible for predicate offload             16,725,196,800
cell physical IO bytes saved by storage index                                  0
cell physical IO interconnect bytes returned by smart scan         1,779,074,848
  • If the optimization technique is enabled (which is the default), thanks to a storage index, the cells avoid reading the whole table and much less data is sent back to the database instance (cell physical IO interconnect bytes returned by smart scan is much lower):
SQL> SELECT /*+ no_parallel */ min(order_total), max(order_total) FROM orders;

MIN(ORDER_TOTAL) MAX(ORDER_TOTAL)
---------------- ----------------
               0            21347

Elapsed: 00:00:00.58

SQL> SELECT name, value
  2  FROM v$mystat NATURAL JOIN v$statname
  3  WHERE name IN ('physical read total bytes',
  4                 'cell physical IO bytes eligible for predicate offload',
  5                 'cell physical IO interconnect bytes returned by smart scan',
  6                 'cell physical IO bytes saved by storage index');

NAME                                                                       VALUE
--------------------------------------------------------------- ----------------
physical read total bytes                                         16,725,196,800
cell physical IO bytes eligible for predicate offload             16,725,196,800
cell physical IO bytes saved by storage index                     15,565,668,352
cell physical IO interconnect bytes returned by smart scan           123,454,448

Note that even though the query I used for the test contains both the min and the max function, the optimization technique still takes place. In other words, there is no limitation similar to the one just discussed for B-tree indexes.

Oracle system V shared memory indicated deleted

This article is written with examples taken from a (virtualised) Oracle Linux 6u6 x86_64 operating system and Oracle database version 12.1.0.2.1. However, I think the same behaviour holds for Oracle 11, 10 and earlier versions.

Probably most readers of this blog are aware that a "map" of mapped memory exists for every process in /proc, in a pseudo file called "maps". If I want to look at my current process' mappings, I can simply issue:

$ cat /proc/self/maps
00400000-0040b000 r-xp 00000000 fc:00 786125                             /bin/cat
0060a000-0060b000 rw-p 0000a000 fc:00 786125                             /bin/cat
0060b000-0060c000 rw-p 00000000 00:00 0
0080a000-0080b000 rw-p 0000a000 fc:00 786125                             /bin/cat
01243000-01264000 rw-p 00000000 00:00 0                                  [heap]
345b000000-345b020000 r-xp 00000000 fc:00 276143                         /lib64/ld-2.12.so
345b21f000-345b220000 r--p 0001f000 fc:00 276143                         /lib64/ld-2.12.so
345b220000-345b221000 rw-p 00020000 fc:00 276143                         /lib64/ld-2.12.so
345b221000-345b222000 rw-p 00000000 00:00 0
345b800000-345b98a000 r-xp 00000000 fc:00 276144                         /lib64/libc-2.12.so
345b98a000-345bb8a000 ---p 0018a000 fc:00 276144                         /lib64/libc-2.12.so
345bb8a000-345bb8e000 r--p 0018a000 fc:00 276144                         /lib64/libc-2.12.so
345bb8e000-345bb8f000 rw-p 0018e000 fc:00 276144                         /lib64/libc-2.12.so
345bb8f000-345bb94000 rw-p 00000000 00:00 0
7f8f69686000-7f8f6f517000 r--p 00000000 fc:00 396081                     /usr/lib/locale/locale-archive
7f8f6f517000-7f8f6f51a000 rw-p 00000000 00:00 0
7f8f6f524000-7f8f6f525000 rw-p 00000000 00:00 0
7fff2b5a5000-7fff2b5c6000 rw-p 00000000 00:00 0                          [stack]
7fff2b5fe000-7fff2b600000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

What we see is the start and end address, the permissions (rwx, with an absent permission shown as '-'), and an indication of whether the mapped memory region is (p)rivate or (s)hared. In this example, there are no shared memory regions. Then comes the offset into the mapped file, and then the device (major and minor device number). In our case this is sometimes 'fc:00'. If you wonder what device this might be:

$ echo "ibase=16; FC" | bc
252
$ ls -l /dev | egrep 252,\ *0
brw-rw---- 1 root disk    252,   0 Mar 23 14:19 dm-0
$ sudo dmsetup info /dev/dm-0
Name:              vg_oggdest-lv_root
State:             ACTIVE
Read Ahead:        256
Tables present:    LIVE
Open count:        1
Event number:      0
Major, minor:      252, 0
Number of targets: 2
UUID: LVM-q4nr4HQXgotaaJFaGF1nzd4eZPPTohndgz553dw6O5pTlvM0SQGLFsdp170pgHuw

So, this is a logical volume lv_root (in the volume group vg_oggdest).

Then comes the inode number (0 if anonymous memory was mapped rather than a file), and finally the path of the mapped file. The path is empty for anonymous mapped memory (memory added to a process using the mmap() call). Please mind there are also special regions like [heap], [stack], [vdso] and [vsyscall].

Okay, so far I’ve shown there is a pseudo file called ‘maps’ which shows mapped memory and told a bit about the fields in the file. Now let’s move on to the actual topic of this blog: the Oracle database SGA memory, and the indicator this is deleted!

In this example I pick the maps file of the PMON process of an Oracle database. Of course the database must use system V shared memory, not shared memory in /dev/shm (which is what you typically see when Oracle's automatic memory management (AMM) feature is used). This is a snippet from the maps file of the pmon process on my server:

 cat /proc/2895/maps
00400000-1093f000 r-xp 00000000 fc:00 1326518                            /u01/app/oracle/product/12.1.0.2/dbhome_1/bin/oracle
10b3e000-10dbf000 rw-p 1053e000 fc:00 1326518                            /u01/app/oracle/product/12.1.0.2/dbhome_1/bin/oracle
10dbf000-10df0000 rw-p 00000000 00:00 0
12844000-1289d000 rw-p 00000000 00:00 0                                  [heap]
60000000-60001000 r--s 00000000 00:04 111902723                          /SYSV00000000 (deleted)
60001000-602cc000 rw-s 00001000 00:04 111902723                          /SYSV00000000 (deleted)
60400000-96400000 rw-s 00000000 00:04 111935492                          /SYSV00000000 (deleted)
96400000-9e934000 rw-s 00000000 00:04 111968261                          /SYSV00000000 (deleted)
9ec00000-9ec05000 rw-s 00000000 00:04 112001030                          /SYSV6ce0e164 (deleted)
345b000000-345b020000 r-xp 00000000 fc:00 276143                         /lib64/ld-2.12.so
345b21f000-345b220000 r--p 0001f000 fc:00 276143                         /lib64/ld-2.12.so
...

If you look closely, you see the oracle executable first, with two entries: one read-only (r-xp), the other read-write (rw-p). The first entry is read-only because it is shared with other processes; there is no need for every process to load its own copy of the Oracle database executable into memory. There's much to say about that too, but that should be done in another blogpost.

After the executable there are two anonymous memory mappings, of which one is the process’ heap memory.

Then we see what this blogpost is about: there are 5 mappings which are shared (r--s and rw-s). These are the shared memory regions of the Oracle database SGA. What is very odd is that at the end of the lines it says "(deleted)".

Of course we all know what "deleted" means. But what does it mean in this context? Did somebody delete the memory segments? That actually can be done with the 'ipcrm' command…

If you go look at the maps of other Oracle processes and other databases, you will see that every database's shared memory segments are indicated as '(deleted)'.

Word of warning: only execute the steps below on a test environment, do NOT do this in a production situation.

In order to understand this, the best way to see what actually is happening, is starting up the Oracle database with a process which is traced with the ‘strace’ utility with the ‘-f’ option set (follow). Together with the ‘-o’ option this will produce a (long) file with all the system calls and the arguments of the calls which happened during startup:

$ strace -f -o /tmp/oracle_startup.txt sqlplus / as sysdba

Now start up the database. Depending on your system, you will notice the instance startup takes longer. This is because strace needs to write a line to the file /tmp/oracle_startup.txt for every system call. Because of this, stop the database as soon as it has started, in order to stop the tracing from crippling database performance.

Now open the resulting trace file (/tmp/oracle_startup.txt) and filter it for the system calls that are relevant (calls with ‘shm’ in their name):

$ grep shm /tmp/oracle_startup.txt | less

Scroll through the output until you see a line like 'shmget(IPC_PRIVATE, 4096, 0600) = 130777091':

...
4545  shmget(IPC_PRIVATE, 4096, 0600)   = 130777091
4545  shmat(130777091, 0, 0)            = ?
4545  shmctl(130777091, IPC_STAT, 0x7fff9eb9da30) = 0
4545  shmdt(0x7f406f2ba000)             = 0
4545  shmctl(130777091, IPC_RMID, 0)    = 0
4545  shmget(IPC_PRIVATE, 4096, 0600)   = 130809859
4545  shmat(130809859, 0, 0)            = ?
4545  shmctl(130809859, IPC_STAT, 0x7fff9eb9da30) = 0
4545  shmdt(0x7f406f2ba000)             = 0
4545  shmctl(130809859, IPC_RMID, 0)    = 0
...

What we see here is a (filtered) sequence of system calls that could explain the deleted status of the shared memory segments. If you look up which process id is in front of these shm system calls, you will see it's the foreground process starting up the instance. If you look closely, you'll see that there is a sequence which is repeated often:

1. shmget(IPC_PRIVATE, 4096, 0600) = 130777091
The system call shmget allocates a shared memory segment of 4 kilobyte, rights set to 600. The return value is the shared memory identifier of the requested shared memory segment.

2. shmat(130777091, 0, 0) = ?
The system call shmat attaches a shared memory segment to the process' address space. The first argument is the shared memory identifier, the second argument is the address to attach the segment to. If the argument is zero, like in the call above, it means the operating system is tasked with finding a suitable (unused) address. The third argument is for flags; the value zero here means no flags are used. The return value (here indicated with a question mark) is the address at which the segment is attached. This being a question mark means strace is not able to read the address, which is a shame, because we can't be 100% certain at which memory address this shared memory segment is mapped.

3. shmctl(130777091, IPC_STAT, 0x7fff9eb9da30) = 0
The system call shmctl with the argument IPC_STAT has the function to read the (kernel) shared memory information of the shared memory identifier indicated by the first argument, and write it at the memory location in the third argument in a struct called shmid_ds.

4. shmdt(0x7f406f2ba000) = 0
With this system call, the shared memory segment is detached from the process’ address space. For the sake of the investigation, I assumed that the address in this call is the address which is returned by the shmat() call earlier.

5. shmctl(130777091, IPC_RMID, 0) = 0
This is another shared memory control system call, concerning our just created shared memory segment (shared memory identifier 130777091), with the command ‘IPC_RMID’. This is what the manpage says about IPC_RMID:

       IPC_RMID  Mark the segment to be destroyed.  The segment will only  actually  be  destroyed
                 after the last process detaches it (i.e., when the shm_nattch member of the asso-
                 ciated structure shmid_ds is zero).  The caller must be the owner or creator,  or
                 be privileged.  If a segment has been marked for destruction, then the (non-stan-
                 dard) SHM_DEST flag of the shm_perm.mode field in the associated  data  structure
                 retrieved by IPC_STAT will be set.

What I thought this meant was:
It looked to me like the database instance builds up its shared memory segments in 4096-byte pieces. Because IPC_RMID only marks a segment to be destroyed, and because the segment is only truly destroyed once no processes are attached to it, it looked like the background processes were pointed to the shared memory segment which was marked destroyed (in some way I hadn't discovered yet), which would mean the shared memory segment survives and all database processes can use it. If ALL the database processes were killed for any reason, for example with a shutdown abort, they would stop being attached to the shared memory segment, and the segment would vanish automatically, because it was marked for destruction.
Sounds compelling, right?

Well… I was wrong! The sequence of creating and destroying small shared memory segments does happen, but it turns out these segments are truly destroyed by the shmctl(…, IPC_RMID, …) call. I don't know why this sequence of creating and destroying small shared memory segments happens.

I started looking for the actual calls that create the final, usable shared memory segments in the /tmp/oracle_startup.txt file. This is actually quite easy to do; first look up the shared memory segment identifiers using the sysresv utility (make sure the database’s ORACLE_HOME and ORACLE_SID are set):

$ sysresv
...a lot of other output...
Shared Memory:
ID		KEY
197394436	0x00000000
197427205	0x00000000
197361667	0x00000000
197459974	0x6ce0e164
Semaphores:
ID		KEY
1015811 	0xd5cdbca4
Oracle Instance alive for sid "dest"

Actually, the 'sysresv' utility (the name means something like 'remove system V resources', I think) has the task of removing memory segments if there is no instance left to use them. It will not remove the memory segments if it finds the instance alive. As a bonus, it prints out a lot of information.

Now that we got the shared memory identifiers, simply search in the trace file generated by strace, and search for the creation of the memory segment with the identifiers: (please mind searching with ‘less’ is done with the forward slash)

$ less /tmp/oracle_startup.txt
9492  shmget(IPC_PRIVATE, 905969664, IPC_CREAT|IPC_EXCL|0640) = 197394436
9492  shmat(197394436, 0x60400000, 0)   = ?
9492  times(NULL)                       = 430497743
9492  write(4, " Shared memory segment allocated"..., 109) = 109
9492  write(4, "\n", 1)                 = 1

Aha! Here we see shmget() again, but now with a size (905969664) that looks much more like a real shared memory segment size used by the database! After the shared memory identifier is created, the process attaches it to its address space with shmat() at a specific memory address: 0x60400000.

The next thing to do is to look for any shmctl() call on this identifier. Oracle could still do the trick of marking the segment for destruction…
…But… there are no shmctl() calls for this identifier, nor for any of the other identifiers shown by the sysresv utility. This is rather odd, because Linux shows them as "(deleted)". There ARE dozens of shmat() calls from the other (background) processes forked from the starting process when they attach to the shared memory segments.

So, conclusion at this point is Linux shows the shared memory segments as deleted in ‘maps’, but the Oracle database does not mark the segments for destruction after creation. This means that either Linux is lying, or something mysterious is happening in the Oracle executable which I didn’t discover yet.

I could only think of one way to verify what is truly happening here. That is to create a program myself that uses shared memory, so I have 100% full control over what is happening, and can control every distinct step.

This is what I came up with:

#include <stdio.h>
#include <sys/shm.h>
#include <sys/stat.h>

int main ()
{
  int segment_id;
  char* shared_memory;
  struct shmid_ds shmbuffer;
  int segment_size;
  const int shared_segment_size = 0x6400;

  /* Allocate a shared memory segment.  */
  segment_id = shmget (IPC_PRIVATE, shared_segment_size,
                     IPC_CREAT | IPC_EXCL | S_IRUSR | S_IWUSR);
  printf ("1.shmget done\n");
  getchar();
  /* Attach the shared memory segment.  */
  shared_memory = (char*) shmat (segment_id, 0, 0);
  printf ("shared memory attached at address %p\n", shared_memory);
  printf ("2.shmat done\n");
  getchar();
  /* Determine the segment's size. */
  shmctl (segment_id, IPC_STAT, &shmbuffer);
  segment_size = shmbuffer.shm_segsz;
  printf ("segment size: %d\n", segment_size);
  printf ("3.shmctl done\n");
  getchar();
  /* Write a string to the shared memory segment.  */
  sprintf (shared_memory, "Hello, world.");
  /* Detach the shared memory segment.  */
  shmdt (shared_memory);
  printf ("4.shmdt done\n");
  getchar();

  /* Deallocate the shared memory segment.  */
  shmctl (segment_id, IPC_RMID, 0);
  printf ("5.shmctl ipc_rmid done\n");
  getchar();

  return 0;
}

(I took the code from this site, and modified it a bit for my purposes)
If you've got a Linux system which is set up with the preinstall rpm, you should be able to copy this into a file on your (TEST!) Linux database server, say 'shm.c', and compile it using 'cc shm.c -o shm'. This will create an executable 'shm' from the C file.

This program does more or less the same sequence we saw earlier:
1. Create a shared memory identifier.
2. Attach to the shared memory identifier.
3. Get information on the shared memory segment in a shmid_ds struct.
4. Detach the shared memory segment.
5. Destroy it using shmctl(IPC_RMID).

What I did was have two terminals open, one to run the shm program, and one to look for the results of the steps.

Step 1. (shmget)

$ ./shm
1. shmget done

When looking with ipcs, you can see the shared memory segment which is created because of the shmget() call:

$ ipcs -m

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x00000000 451608583  oracle     600        25600      0

When looking in the address space of the process running the shm program, the shared memory segment is not found. This is exactly what I expect, because it's only created, not attached yet.

Step 2. (shmat)

shared memory attached at address 0x7f3c4aa6e000
2.shmat done

Of course the shared memory segment is still visible with ipcs:

0x00000000 451608583  oracle     600        25600      1

And we can see from ipcs in the last column ('1') that one process is attached to the segment: exactly what we expected.
But now that we have attached the shared memory to the address space, it should be visible in maps:

...
7f3c4aa6e000-7f3c4aa75000 rw-s 00000000 00:04 451608583                  /SYSV00000000 (deleted)
...

Bingo! The shared memory segment is visible, as it should be, because we just attached it with shmat(). But look: it’s deleted already according to Linux!

However I am pretty sure, as in 100% sure, that I did not make any attempt to mark the shared memory segment destroyed, or do anything else to make it appear deleted. So this means maps is lying to us.

So, the conclusion is that the shared memory Oracle uses is not deleted; the "deleted" flag Linux shows is simply wrong. Looking at the maps output again, we can see the shared memory identifier is put in the place of the inode number. This is handy, because it allows you to take the identifier, look with ipcs for shared memory segments, and understand which specific shared memory segment a process is using. It probably means that maps tries to look up the identifier as an inode number, fails to find it, and concludes that the file has been deleted.

However, this is speculation. Anyone with more or better insight is welcome to react on this article.

Tagged: linux, memory, memory mapping, oracle

NYOUG Spring General Meeting

The New York Oracle User Group held their Spring General Meeting recently, where I presented about the Data Guard Broker and also about the Recovery Area.

Many thanks to the board for organizing this event, I really enjoyed being there! Actually, the Broker demonstration did not go so smoothly (always dangerous to do things live), but I managed to get out of the mess in time and without losing too much of the message I wanted to get across. At least that's what I hope ;-)

I took the opportunity to do some sightseeing in New York as well:

Me on Liberty Island

Tagged: NYOUG