Let me start by saying I’m a big fan of all the Alien films, even Alien 3, which seems to be the least liked. Having said that, if someone said I had to pick just one, it would always be Alien. I watch that film several times a month. Ridley Scott is renowned for being uncompromising, so I went into Prometheus with some seriously high expectations – which always worries me, as it kinda sets you up for a fall…
The opening sequence of Prometheus is quite possibly the most beautiful thing I’ve ever seen on film. As I watched it I thought it was probably worth the £6 just to see that on a big screen…
If you’ve seen the trailer, you know what’s going to happen. A star map found on ancient ruins all over the earth is seen as an invitation to visit some alien world, whose inhabitants may be the key to the origins of life on earth. We go to visit and things may not be as they seem…
One of the really annoying things about the first Alien film is that we are introduced to the Space Jockey – the huge alien on the bridge of the derelict ship the crew find – but we never learn who he is or why he is there. Prometheus is a prequel to the Alien films, but it focuses almost exclusively on the Space Jockey aliens, so at last we know something about them…
I won’t say too much about what happens, because I think it would spoil the ride, but suffice to say it answers a lot of questions about the Space Jockey aliens and the origin of the aliens from the Alien and AvP films. I think it is a fantastic addition to the collection and as soon as it is released on DVD I will buy it. Hopefully I will see it again next week when I go to the cinema with some mates.
Noomi Rapace is unusual, in that her face looks completely different from every angle you view it from. She was great in the original Swedish versions of The Girl with the Dragon Tattoo, The Girl Who Played with Fire and The Girl Who Kicked the Hornets’ Nest, and I think she works well here. She is in fantastic shape in this film – so strong and athletic. She is very much the Ellen Ripley style character of this film.
Michael Fassbender is incredible as the android/synthetic/artificial person on this mission. He gives me the creeps. There is a great introduction video to the “David 8” model here. It’s not a spoiler, but it gives you a feel for the spookiness of the David 8 character.
The rest of the characters are fine, but as far as I’m concerned, the film is really about these two characters.
So in summary, Prometheus more than lived up to my expectations. If you are a fan of the Alien films I would be surprised if you don’t love it.
This is a quick note about reverse path filtering and the impact of that feature on RAC. I encountered an interesting problem with a client recently, and it is worth blogging about, in the strong hope that it might help one of you in the future.
The environment is 188.8.131.52 GI on Linux 5.6. In a 3-node cluster, Grid Infrastructure (GI) comes up cleanly on just one node, but never comes up on the other nodes. If we shut down GI on the first node, we can start GI on the second node with no issues. In other words, GI can be up on only one node at any time.
The system admins indicated that there had been no major changes, only a few bug fixes, and seemingly the problem started after those bug fixes. But there had been a few other changes to the environment as well (init.ora parameter changes etc.), so the problem was not immediately attributable to the OS changes alone.
Reviewing the GI alert log file, it was evident that the CSSD daemon was not joining the cluster. The CSSD log files showed the error message “Other_node has Disk HB, but no Network HB”, implying that the problem was in the network layer. Normal checks such as ping and traceroute to all the other nodes were successful (and the network admin/sysadmin simply said this was an Oracle issue, since ping/traceroute were working fine).
Update 1: An important note: after reading Brian’s comment below, I decided to clarify my blog entry. The “Other_node has Disk HB, but no Network HB” error can happen for many reasons, but almost all of those reasons distill down to some type of network configuration issue. Essentially, this error means that network packets or multicast packets are not flowing properly between the nodes. In this entry, I am discussing JUST ONE of those reasons. If you encounter the “Other_node has Disk HB, but no Network HB” error, you should review your network configuration carefully and review note 1054902.1 “How to Validate Network and Name Resolution Setup for the Clusterware and RAC”. [Multicast issues are less prevalent (almost non-existent) in the 184.108.40.206 version though, as the software now handles multicast issues beautifully (essentially, it tries the 230.x.x.x IP range and then the 224.x.x.x IP range automatically).]
Time for advanced tools! With tcpdump and wireshark, I was able to see that packets were leaving the surviving node but were not received on the other node (and vice versa). I also checked the packets at the switch (using port mirroring) and could see that packets were flowing through the switch with no issues.
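For reference, the captures I describe are along these lines – a sketch only: eth3 as a private interconnect interface and the remote private IP are placeholders, and the exact filter will depend on your interconnect configuration:

```shell
# On the sending node: confirm interconnect traffic leaves the private interface.
# (eth3 and the remote-IP placeholder are assumptions for illustration.)
tcpdump -i eth3 -nn udp and host <remote_private_ip>

# On the receiving node: run the same capture, writing to a file so it can
# be opened in wireshark later. If port mirroring at the switch shows the
# packets arriving but this capture stays empty, the frames are being lost
# on the receiving host itself.
tcpdump -i eth3 -nn udp and host <remote_private_ip> -w /tmp/interconnect.pcap
```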
Why would packets received on the interface not show up in the wireshark output? The kernel must somehow be filtering the packets.
At this point, we needed to prove that the packets were being thrown away by the kernel. The interestingly named log_martians kernel parameter came in handy: after changing the parameters net.ipv4.conf.eth3.log_martians and net.ipv4.conf.eth4.log_martians to 1, the system admins confirmed that packets were indeed being disregarded by the kernel.
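As a sketch, that diagnostic step looks something like this (eth3/eth4 are this cluster’s private interfaces; the names will differ on other systems, and these commands need root):

```shell
# Enable logging of "martian" packets (packets the kernel considers to have
# an impossible source/route) on the private interfaces. This is a runtime
# change only; it does not persist across reboots.
sysctl -w net.ipv4.conf.eth3.log_martians=1
sysctl -w net.ipv4.conf.eth4.log_martians=1

# Discarded packets are then reported in the kernel log:
dmesg | grep -i martian
# or, depending on syslog configuration:
grep -i martian /var/log/messages
```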
I started reviewing sysctl.conf and comparing it with an old copy of the file; there were no notable differences between the two for the past few weeks. So, no kernel parameter change either.
I was expecting to see some kernel parameter change that would tell the kernel to filter the packets – a firewall rule or something similar. Not seeing any change, I was baffled by the mystery.
Finally, I decided to review all the OS changes. One notable change stuck out: the kernel had been upgraded from 2.6.18 to 2.6.32. While that doesn’t look like a major change, it was relevant, since we knew the kernel was throwing away packets for some reason.
Then I recollected seeing a note about 2.6.32 in MOS and searched for the string 2.6.32. Note 1286796.1 was exactly what I was remembering: “rp_filter for multiple private interconnects and Linux Kernel 2.6.32”.
Reverse Path Filtering
Reverse Path Filtering (RPF) is a security feature: if the reply to a packet would not go out through the same interface the packet was received on, the kernel can throw the packet away. Ironically, this is not a new feature; it’s just that the 2.6.32 kernel fixed a bug, and so RPF started to work. This bug fix in the 2.6.32 kernel affects private interconnect traffic.
The solution was simple: disable RPF for the private interfaces. Modify /etc/sysctl.conf, add the following two kernel parameters, and then run sysctl -p (read that MOS note (1286796.1) for the complete description).
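As a sketch, the relevant lines look like this – eth3 and eth4 are this cluster’s private interconnect interfaces, and you should check note 1286796.1 for the value recommended for your version (2 puts the interface in “loose” reverse-path-filtering mode, 0 turns the filter off entirely):

```shell
# /etc/sysctl.conf – relax reverse path filtering on the private interfaces
net.ipv4.conf.eth3.rp_filter = 2
net.ipv4.conf.eth4.rp_filter = 2
```

After adding the lines, reload with sysctl -p and re-check the CSSD logs to confirm the network heartbeat is flowing.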
I wish this were documented better so that this weird problem could be avoided. I also want to make it clear that not all CSSD heartbeat issues can be attributed to RPF; it’s just that this client was unfortunate enough to encounter this particular issue.
Update 1: Fixing the link for training.
Here’s another one of those little changes that sidles in when you’re not looking.
When locally managed tablespaces first appeared, there were a number of posts on Usenet (comp.databases.oracle.server) and the Metalink lists (the OTN database forum didn’t exist at the time) about “missing” space in data files. The commonest sort of comment was along the lines of:
“I’ve created a data file of 100 MB, how come I can only create 99 extents of 1 MB each?”
“I’ve created a data file of 10 GB, how come I can only create 9 extents of 1 GB each?”
The answer was that Oracle reserved the first few blocks of the file (typically totalling 64KB of space, but 128KB if you were on a platform that supported a 32KB block size) for space management blocks holding a bitmap identifying which (notional) extents in the file were free and which were allocated to data segments. For uniform extents each bit in the map represented the given unit size of an extent, for system-managed extent allocation each bit represented 64KB (the “lowest common denominator” for an extent).
Here’s a little demo script, with some results, cut from a session using 10.2.0.3 to demonstrate the point:
create tablespace tiny
datafile 'C:\ORACLE\ORADATA\D10g\tiny.dbf' SIZE 1114112 reuse
extent management local
autoallocate
segment space management auto
;

select
        file_id, block_id, bytes, blocks
from
        dba_free_space
where
        tablespace_name = 'TINY'
;

SQL> @demo

Tablespace created.

   FILE_ID   BLOCK_ID      BYTES     BLOCKS
---------- ---------- ---------- ----------
         8          9    1048576        128

1 row selected.

SQL>
If you’re wondering about my choice of initial file size, it’s the equivalent of 1MB + 64KB. As you can see this results in 1MB of “user” space available, starting in block 9 of the file (leaving 8 blocks – 64KB given my block size – for any file-based overheads).
Let’s check the results (after changing the filename) for an 11g instance – 220.127.116.11 in this case:
SQL> @temp

Tablespace created.

   FILE_ID   BLOCK_ID      BYTES     BLOCKS
---------- ---------- ---------- ----------
         7        128      65536          8

1 row selected.

SQL>
When we shift to 11g we see 1MB of overhead at the start of the file, and only 64KB left for user space.
I haven’t done a lot of testing to see how many different special cases there are but based on a couple of quick checks it looks as if Oracle will allocate 1MB of space management overhead at the start of file if the initial file size is at least 1MB + 64KB. (For smaller files the overhead is still 64KB.)
I suspect the reason behind this is that if a file were to become very large Oracle would eventually have to allocate a secondary space management bitmap somewhere in the middle of the file (which might make it hard to shrink the file at a later date), so it pre-emptively allocates a space management area large enough for a very large file when you create the initial file.
As a little throwaway test (to see if I could get a secondary map allocated) I created a single file of 2,047 MB, in a tablespace using locally managed extents of uniform size 16KB (yes, deliberately silly sizing), and freelist management. Here’s the slightly surprising result I got from querying dba_free_space for that tablespace:
select
        block_id, blocks, block_id + blocks - 1 end_block,
        bytes/1048576 MB
from
        dba_free_space
where
        tablespace_name = 'TINY'
order by
        block_id
;

  BLOCK_ID     BLOCKS  END_BLOCK         MB
---------- ---------- ---------- ----------
       128     126976     127103        992
    127104     126976     254079        992
    254080       7936     262015         62
                                 ----------
sum                                    2046

3 rows selected.
I was expecting a single free space chunk – but that’s not the way it seems to work. For an 8KB block size (in this version of Oracle), the bitmap space management blocks start at block 3 of the file, and each bitmap block can map 63,488 extents – which in my case amounted to 126,976 blocks – and the code that reports dba_free_space seems to look at one bitmap block at a time to generate the report.
It’s probably the case that none of this matters to anyone – but it’s just another couple of details that go into my catalogue of implementation details just in case I find the boundary condition one day, perhaps relating to LOBs or SecureFiles, where something strange happens because of some side effect of this implementation.
Footnote: If you want to create a file on an 8KB block size with a secondary bitmap, you should start with a small file (1MB or less), 16KB uniform extents and freelist management; this will give you 6 blocks of bitmaps, which can each map 63,488 extents, or 126,976 blocks, for a total of 5,952MB. Add 64KB for the initial file header information and then a bit more – call it 6GB to make it easy – and you’ve got a target to resize the file to the point where you get a secondary map. But I think that means you’ll need a 64-bit O/S.
Enkitec recently completed its 80th Exadata implementation, so this is a good time to announce that we are organizing the first Enkitec Extreme Exadata Expo (E4) on August 13-14, 2012, in Dallas, Texas!
This is the global Exadata event to attend if you want to learn from the best Oracle/Exadata experts in the world. In addition to Enkitec’s Exadata team (like Kerry Osborne, Karen Morton, Andy Colvin, Karl Arao and a dude named Tanel), you’ll have a chance to learn from Jonathan Lewis, Maria Colgan, Doug Burns, Frits Hoogland, Arup Nanda and other Exadata gurus out there ;-)
We also have some customers presenting their Exadata case studies and customer experience – and wait, there’s even a chance for you to speak as well, if you already have serious hands-on Exadata experience or a real-life case study to present. We have very limited speaking slots, so you would have to submit your abstract at the Enkitec E4 Call-for-Papers page by 5pm CST on May 31, and we’ll let you know by the 8th of June whether your abstract made it. Sorry for letting you know so late – you have around a day before the CFP closes – but it shouldn’t take that long to write an abstract anyway ;-)
Additionally, you’ll have a chance to attend one of Enkitec’s 3-day training classes right after the conference, to get the most out of your trip. Considering this – and the speaker lineup – the Enkitec E4 will likely be the best Exadata event to attend! Note that if you can’t attend physically, you can attend virtually too!
So, check out http://extremeexadata.com/ now – it will absolutely rock!
P.S. This event will be a good chance to get all three co-authors of the Expert Oracle Exadata book to sign your copy as well! :-)
May 30, 2012 A question appeared on the OTN Database forums yesterday that has yet to receive an answer. The question was essentially, why did the execution plan change when the OPTIMIZER_FEATURES_ENABLED parameter was adjusted from 18.104.22.168 to 22.214.171.124, and why did the execution performance improve as a result of making the change? A DBMS_XPLAN [...]
Fedora 17 was released yesterday. I mentioned in a previous post I had run through the installation of Oracle 11gR2 on Fedora 17 alpha. With the arrival of the final Fedora 17 release I ran through the articles again last night to make sure everything was OK. You can see the finished versions here:
As always, installing Oracle on Fedora 17 is just for fun and totally not supported. For anything proper you should be using Oracle Linux or RHEL.
I’ve previously discussed Virtual Indexes and how they can be used to do basic “what if” analysis if such an index really existed. However, a recent comment on the OTN forums regarding using them to compare index costs made me think a follow-up post regarding the dangers of Virtual Indexes might be warranted. The big advantage of [...]
Why oh why can’t the My Oracle Support (MOS) website actually work like it is meant to?
I have been trying to get set up to use the new company’s CSIs and it is driving me crazy. I’m on the “Support IDs and Privileges” page and I either get:
I get the same random (mis)behavior on the HTML and Flash versions. I checked with another guy in the office and his account is doing the same thing, so it’s not an issue specific to my account. I’ve also tried on IE, Chrome, Firefox and Opera. No luck.
Can we please ditch these versions and have the old APEX version back? It wasn’t as pretty, but at least it worked!
I rang Oracle Support to try and figure it out. The lady talked me through the whole process, asking me to click links or buttons that didn’t exist on my screen. It’s more than a little ironic that you need Oracle Support to actually use the MOS website…
So I am in MOS limbo. I can use my personal account and CSI, but who knows if I will ever get the company account working…
For those not familiar with Richard Foote’s extensive blog about indexes (and if you’re not you should be) – the title of this note is a blatant hi-jacking of his preferred naming mechanism.
It’s just a short note to remind myself (and my readers) that anything you know about Oracle, and anything published on the Internet – even by Oracle Corp. and its employees – is subject to change without notice (and sometimes without being noticed). I came across one such change today while reading the Expert Oracle Exadata book by Kerry Osborne, Randy Johnson and Tanel Poder. It was just a little throwaway comment to the effect that:
In NOARCHIVELOG mode all bulk operations (such as INSERT, APPEND, index REBUILD and ALTER TABLE MOVE) are automatically nologging.
The obvious error there is the reference to “index REBUILD”. Although create table as select and alter table move default to nologging (when running in noarchivelog mode) the equivalent commands for indexes have always been logged. On the other hand, pausing for thought here, I wouldn’t expect such an obvious error to slip past all three authors and the technical reviewers so, before opening my mouth and putting my foot firmly into it, I decided to run a quick test and, almost inevitably, I have a handy test script that I’ve been running intermittently for years for exactly this test case.
execute snap_redo.start_snap

create table t1
as
with generator as (
        select  --+ materialize
                rownum  id
        from    dual
        connect by level <= 10000
)
select
        rownum                  id,
        lpad(rownum,10,'0')     small_vc,
        rpad('x',100)           padding
from
        generator       v1,
        generator       v2
where
        rownum <= 100000
;

execute snap_redo.end_snap

execute snap_redo.start_snap

create index t1_i1 on t1(padding);

execute snap_redo.end_snap

execute snap_redo.start_snap

alter index t1_i1 rebuild;

execute snap_redo.end_snap
The snap_redo package is a simple bit of PL/SQL I wrote to report the changes in a few of the current session’s statistics (view v$mystat) over time. Specifically it looks at the statistics containing the word “redo” (and, in a more sophisticated form, a few others related to transaction management). Here are the key results for a test run on 10.2.0.3 and 126.96.36.199 – first from 10g:
============
Create Table
============
redo synch writes                        2
redo entries                           410
redo size                           71,792
redo wastage                         1,836
redo writes                              6
redo blocks written                    133
redo ordering marks                      3
redo subscn max counts                  10

============
Create Index
============
redo synch writes                        3
redo synch time                          2
redo entries                         1,887
redo size                       13,129,888
redo buffer allocation retries           4
redo wastage                         2,144
redo writes                             11
redo blocks written                 26,428
redo write time                         67
redo log space requests                  3
redo log space wait time                31
redo ordering marks                      3
redo subscn max counts                   3

=============
Rebuild index
=============
redo synch writes                        2
redo entries                         1,978
redo size                       13,132,052
redo buffer allocation retries           2
redo wastage                         4,084
redo writes                             16
redo blocks written                 26,453
redo write time                         66
redo ordering marks                      3
redo subscn max counts                   4
Now from 11g
============
Create Table
============
redo synch writes                            1
redo synch time                              1
redo entries                               428
redo size                               67,228
redo size for direct writes              2,912
redo wastage                             1,072
redo writes                                  5
redo blocks written                        138
redo write time                              1
redo ordering marks                          3
redo subscn max counts                       5

============
Create Index
============
redo synch writes                            1
redo entries                               601
redo size                               71,092
redo size for direct writes             14,916
redo wastage                               972
redo writes                                  4
redo blocks written                        146
redo blocks checksummed by FG (exclusive)    6
redo ordering marks                          3
redo subscn max counts                       4

=============
Rebuild index
=============
redo synch writes                            1
redo entries                               684
redo size                               80,400
redo size for direct writes             14,916
redo wastage                             1,988
redo writes                                  7
redo blocks written                        167
redo write time                              5
redo blocks checksummed by FG (exclusive)    6
redo ordering marks                          4
redo subscn max counts                       5
As you can see, somewhere between 10.2.0.3 and 188.8.131.52 index creation and rebuild finally became consistent with table creation and move when running in noarchivelog mode. The book was right – it was, after all, talking about Exadata which means it’s implicitly talking about 11g and doesn’t really have to qualify the comment with references to earlier versions.
There’s a secondary moral to this story: instead of saying: “You’re wrong”, you might look a little wiser if you start with “Are you sure about that?” or even “Which version are you thinking of?”
There’s another corollary, of course – if you decide to test out the time and impact of rebuilding a very large index by using a backup copy of your production system, make sure that you are running in archivelog mode or you won’t be doing a test that is anything like valid for the production system (assuming your production systems are running in archivelog mode, of course).
In part 1 we performed a series of experiments to explore the relation between CPU utilization and Linux load average. We concluded that the load average is influenced by processes running on or waiting for the CPU. Based on experiments in part 2 we came to the conclusion that processes that are performing disk I/O […]