Search

Top 60 Oracle Blogs

Recent comments

Oakies Blog Aggregator

Subquery Factoring (3)

From time to time I’ve warned people that subquery factoring should be used with a little care if all you’re trying to do is make a query more readable by extracting parts of the SQL into “factored subqueries” (or Common Table Expressions – CTEs – if you want to use the ANSI term for them). [...]

V3rity has released a redo log mining tool to extract DDL from redo logs

V3rity is the new company founded by David Litchfield in March 2010 since he left NGS and until recently his site had little on it. I suspected that his new company would focus on Database forensics and I am glad....[Read More]

Posted by Pete On 29/06/10 At 01:18 PM

Configuring Large PeopleSoft Application Servers

Occasionally, I see very large PeopleSoft systems running on large proprietary Unix servers with many CPUs.  In an extreme case, I needed to configure application server domains with up to 14 PSAPPSRV processes per domain (each domain was on a virtual server with 8 CPU cores, co-resident with the Process Scheduler).

The first and most important point to make is don't have too many server processes.  If you run out of CPU or if you fully utilise all the physical memory and start to page memory from disk, then you have too many server processes.  It is better to queue on a Tuxedo queue rather than the CPU run queue, or disk queue during paging.

Multiple APPQ/PSAPPSRV Queues

A piece of advice that I originally got from BEA (prior to their acquisition by Oracle) was that you should not have more than 10 server processes on a single queue in Tuxedo.  Otherwise, you are likely to suffer from contention on the IPC queue structure because processes must acquire exclusive access to the queue in order to enqueue a service request to the queue or dequeue a request from it.  Instead multiple queues should be configured that are both serviced by the same server processes and so advertise the same services. 

If you look at the 'large' template delivered by PeopleSoft, you will see that it produces a domain that runs between 9 and 15 PSAPPSRV processes.  This does not conform to the advice I received from BEA.  I repeated this advice in PeopleSoft for the Oracle DBA.  Though I cannot now find the source for it, I stand by it.  I have recently been able to conduct some analysis to confirm it on a real production system.  Domains with two queues of 8 PSAPPPSRV server process each out performed domains with only a single queue.

Load Balancing Across Queues

If the same service is advertised on multiple queues, then Tuxedo recommended that you should specify realistic service loads and use Tuxedo load balancing to determine where to enqueue requests.  I want to emphasise that I am talking about load balancing across queues within a Tuxedo domain, and not about load balancing across Tuxedo domains in the web server.

This is what the Tuxedo documentation says about load balancing:

"Load balancing is a technique used by the BEA Tuxedo system for distributing service requests evenly among servers that offer the same service. Load balancing avoids overburdening some servers while leaving others idle or infrequently used. Before sending a request to a service routine, the BEA Tuxedo system identifies all servers capable of handling the request and selects the one most appropriate for maintaining a balanced load across all the servers in the configuration.


You can control whether a load-balancing algorithm is used on the system as a whole. Such as algorithm should be used only when necessary, that is, only when a service is offered by servers that use more than one queue. Services offered by only one server, or by multiple servers in a Multiple Server, Single Queue (MSSQ) do not need load balancing. The LDBAL parameter for these services should be set to N. In other cases, you may want to set LDBAL to Y."

It doesn't state that load balancing is mandatory for multi-queue domains, and only hints that it might improve performance.  If load balancing is not used, the listener process puts the messages on the first empty queue (one where no requests are queued).  If all queues have requests the listener round-robins between the queues.

You could consider giving ICScript, GetCertificate and other services with small service times a higher Tuxedo Service priority.  This means they jump the queue 9 times out of 10.  ICScript is generally used during navigation, GetCertificate is used at log on.  Giving these services higher priority will mean they perform well even when the system is busy.  Users often need to do several mouse clicks to navigate around the system, but these services are usually quick.  This will improve the user experience without changing the overall performance of the system.

Data

I have recently been able to test the performance of a domains with up to 14 PSAPPSRVs on a single IPC queue, versus domains with two queues with up to 7 PSAPPSRVs each, both with and without Tuxedo queue balancing.   These results come from a real production system where the multiple queue configuration was implemented on 2 of the 4 application servers.  The system has a short-lived weekly peak period of on-line processing.  During that time Tuxedo spawns additional PSAPPSRV processes, and so I get different sets of times for different numbers of process. 

The timings are produced from transactions sampled by PeopleSoft Performance Monitor.  I capture the number of spawned processes using the Tuxmon scripts on my website that use tmadmin to collect Tuxedo metrics.

1 Queue 2 Queue
Server Processes per Queue Number of Services Mean ICPanel Service Time Server Processes per Queue Number of Services Mean ICPanel Service Time
6 2,616 1.33 3 6945 1.05
7 1,949 0.97
8 1,774 1.06 4 7595 1.16
9 1,713 1.02
10 1,553 1.25 5 4629 1.17
11 1,250 1.30
12 969 1.32 6 3397 1.16
13 427 1.21
14 1,057 1.10 7 3445 1.13
 Total
13,308 1.17 26011 1.13

The first thing to acknowledge is that this data is quite noisy because it comes from a real production system, and the effects we are looking for are quite small.

I am satisfied that the domains with two PSAPPSRV queues generally perform better under high load, than those under 1.  Not only does the queue time increase on the single queue domain, the service time also increases.

However, I cannot demonstrate that Tuxedo Load Balancing makes a significant difference in either direction.

My results suggest that domains with multiple queues for requests handled by PSAPPSRV process perform slightly better without load balancing if there is no queue of requests, but perform slightly better if there is a queue of pending requests.  However, the difference is small.  It is not large enough to be statistically significant in my test data.

Conclusion

If you have a busy system with lots of on-line users, and sufficient hardware to resource it, then you might reach a point when you need more than 10 PSAPPSRVs.  In which case, I recommend that you configure multiple Tuxedo queues.

On the whole, I would recommend that Tuxedo Load Balancing should be configured.  I would not expect it to improve performance, but it will not degrade it either.

Little Things Doth Crabby Make – Part XIV. Verbose Linux Command Output Should Be Very Trite. Shouldn’t It?

Not all topics I blog about in my Little Things Doth Crabby Make series make me crabby. Often times I’ll blog something that I presume would make at least one individual somewhere, sometime crabby. This one actually did make me crabby. Huh? Was That Verbose? I’m blogging about the –verbose option to the Linux mdadm(8) [...]

Will Oracle Ever Release Sun Servers Based On Westmere EP and Nehalem EX Processors? Yes.

Oracle has announced the release of several new x86 servers based on the Westmere EP and Nehalem EX processors. This is a really short blog entry, because the website is so loaded with information I haven’t much to add: Main Oracle Page for x86 Family of Systems Impressive List of x86 Benchmark Results with Oracle [...]

Build a Simple Firewall for Databases Using SQL Net

This article was initially published in 2003 in DBAZine.com, which has since been folded.

So, you want to set up a secured database infrastructure?

You are not alone. With the proliferation of threats from all sources — identity thefts to corporate espionage cases — and with increased legislative pressures designed to protect and serve consumer privacy, security has a taken on a new meaning and purpose. Part of the security infrastructure of an organization falls right into your lap as a DBA, since it’s your responsibility to secure the database servers from malicious entities and curious insiders.

What are your options? Firewalls are first to come to mind. Using a firewall to protect a server, and not just a database server, is not a new concept and has been around for a while. However, a firewall may be overkill in some cases. Even if a firewall is desirable, it may still have to be configured and deployed properly. The complexity in administering a firewall, not to mention the cost to acquire one, may be prohibitive. If the threat level can be reduced by proper positioning of existing firewalls, the functionality of additional ones can be created by a tool available free with Oracle Net, Node Validation. In this article, you will learn how to build a rudimentary, but effective, firewall-like setup with just Oracle Net, and nothing else.

Background

Let’s see a typical setup. Acme, Inc. has several departments — two of which are Payroll and Benefits. Each department’s database resides on a separate server. Similarly, each department’s applications run on separate servers. There are several application servers and database servers for each department. To protect the servers from unauthorized access, each database server is placed behind a firewall with ports open to communicate SQL*Net traffic only. This can be depicted in figure 1 as follows:


Figure 1: Protecting departmental database and application servers using multiple firewalls.

This scheme works. But notice how many firewalls are required and the complexity that having this number adds to the administration process. What can we do to simplify the setup? How about removing all the firewalls and having one master firewall around the servers, as in Figure 2?

Figure 2: One master firewall around all the servers.

This method protects the servers from outside attacks; however, it still leaves inside doors open. For instance, the application server PAYROLL1 can easily connect to the database server of the Benefits Department BENEFITDB1, which is certainly inappropriate. In some organizations, there could be legal requirements to prevent this type of access.

Rather than creating a maze of firewalls as in the case we noted previously, we can take advantage of the SQL*Net Node Validation to create our own firewall. We will do this using only Oracle Net, which is already installed as a part of the database install. The rest of this article will explain in detail how to accomplish this.

Objective

Our objective is to design a setup as shown in figure 3. In this network, the application servers benefits1 and benefits2 access the database on server benefitsdb1. Similarly, application servers payroll1 and payroll2 access the database on server payrolldb1. Clients should be given free access to the intended machines. Client machines shouldn’t be allowed to access the database on other departments (e.g., benefits1 and benefits2 shouldn’t be able to access the database on payrolldb1). Likewise, application servers payroll1 and payroll2 should not be allowed to access benefitsdb1.


Figure 3: One master firewall and restricting access from non-departmental clients.

Note the key difference in requirements here — we are not interested in disallowing any type of access from client machines to servers of another department. Rather, it’s enough to disable access at the Oracle level only. This type of restriction is enforced by the listener. A listener can check the IP address of the client machine and, based on certain rules, decide to allow or deny the request. This can be enabled by a facility called Valid Node Checking, available as a part of Oracle Net installation. Let’s see how this can be done.

To set up valid node checking, simply place a set of lines on a specific file on the server. In our example, the following lines are placed in the parameter file on the server payrolldb1, allowing access to servers payroll1 and payroll2.

tcp.validnode_checking = yes

tcp.invited_nodes = (payroll1, payroll2)

Where this parameter file is located depends on the Oracle version. In Oracle 8i, it’s a file named protocol.ora; in Oracle 9i, it’s called sqlnet.ora. Both these files are located in the directory specified by the environmental variable TNS_ADMIN, which defaults to $ORACLE_HOME/network/admin in UNIX or %ORACLE_HOME%\network\admin in Windows.

These parameters are self-explanatory. The first line, tcp.validnode_checking = yes, specifies that the nodes are to be validated before accepting the connection.

The second line specifies that only the clients payroll1 and payroll2 are allowed to connect to the listener. The clients are indicated by either IP address (e.g., 192.168.1.1) or the node name as shown above. The list of node names is specified by a single line separated by commas. It is important to have only one line — you shouldn’t break it up.

The values take effect only during the startup of the listener. After making the change in protocol.ora (in Oracle 8i) or sqlnet.ora (in Oracle 9i and later), stop and restart the listener. After you’ve done so, if a user, regardless of the authentication in the database or authority level, attempts to connect to the database on benefits1 from the node payroll1, he receives the error as shown below.

$ sqlplus scott/tiger@payrolldb1

SQL*Plus: Release 9.2.0.4.0 - Production on Tue Jan 2o 9:03:33 2004

Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.

ERROR: ORA-12537: TNS:connection closed

Enter user-name:

The error message is not very informative; it does not explicitly state the nature of the error. This error occured, however, because the connection request came from a client that is not listed as accepted. In this case, the listener simply rejected the connection originating from the node benefits1, regardless of the user. Yet the same user trying to connect from node payroll1 would succeed.

Excluded Nodes

In the previous example, we saw how to allow only a certain set of clients, and disallow all others. Similarly, you can specify the other type of rule — exclude some clients and allow all others. Say the lines in the parameter file are as follows:

tcp.validnode_checking = yes

tcp.excluded_nodes = (payroll3)

All clients but those connecting from payroll3 would be able to connect to all nodes. So, in this case, clients benefits1 and benefits2 would be able to connect to payrolldb1 in addition to clients payroll1 and payroll2. Isn’t that counter to what we wanted to achieve? Where can this exclusion be used?

In real life cases, networks are subdivided into subnetworks, and they offer adequate protection. In a particular subnet, there may be a greater number of clients needing access than the number being restricted. In such a case, it might be easier to specifically refuse access from a set of named clients, conveniently named in the tcp.excluded_nodes parameter. You can also use of this parameter to refuse access from certain machines that had been used to launch attacks in the past.

You can also mix excluded and included nodes, in which case, the invited nodes are given precedence over excluded ones. But there are three very big drawbacks to this approach.
1. There is no way to specify a wild card character in the node list. You must specify a node explicitly by its name or its IP address.
2. All excluded or invited nodes are to be specified in only one line, severely limiting your ability to specify a large number of nodes.
3. Since the validation is based on IP address or client names only and it’s relatively easy to spoof these two key pieces of identification, the system is not inherently secure.

For these reasons, mixing excluded and included nodes is not quite suitable for excluding a large list of servers from a network or subnetwork. This method can be used when the list of machines accessing the network is relatively small and the machines are in a subnetwork, behind a firewall. In such a configuration, the risk of external attacks is very slight, and the risk of unauthorized access by spoofing key identification is negligible.

Oracle Net also provides another means to develop a rudimentary firewall using a lesser known and even lesser used tool called Connection Manager. This tool is far more flexible in the setup; you can specify wildcards n-node names without restrictions such as the need to have only a single line for naming the nodes. A detailed discussion of Connection Manager with real-life examples can be found in the book Oracle Privacy Security Auditing.

Troubleshooting

Of course, things may not always proceed as smoothly as in the examples we’ve cited so far. One of the common problems you can encounter is that the exclusion may not work even though the files may be present and the parameters seem to be defined properly.

To diagnose a node checking issue you may encounter, you need to turn on tracing during the connection process. Tracing the process can be done in several levels of detail, and in this case, you should enable it for the level called support, or "16." Place the following line in the file sqlnet.ora:

trace_level_server = support

Doing this causes the connection process to write detailed information in a trace file under the directory $ORACLE_HOME/network/trace. The directory can be specified to a different value by a parameter in the file sqlnet.ora, as
trace_directory_server = /tmp

By doing this, the trace information to be written to the directory /tmp instead of the default. After setting the parameters as shown above, you should attempt the connection again. There is no need to bounce the listener. The connection attempt will create trace files named similar to svr_0.trc to be written in the proper directory. You should open this file in an editor (parts of the file are shown below).

[20-JAN-2004 12:00:01:234] Attempted load of system pfile
source /u02/oracle/product/9.2/network/admin/sqlnet.ora

[20-JAN-2004 12:00:01:234] Parameter source loaded successfully

[20-JAN-2004 12:00:01:234]

[20-JAN-2004 12:00:01:234] -> PARAMETER TABLE LOAD RESULTS FOLLOW <-

[20-JAN-2004 12:00:01:234] Successful parameter table load

[20-JAN-2004 12:00:01:234] -> PARAMETER TABLE HAS THE FOLLOWING CONTENTS <-

[20-JAN-2004 12:00:01:234] tcp.validnode_checking = yes

[20-JAN-2004 12:00:01:234] trace_level_server = support

[20-JAN-2004 12:00:01:234] tcp.invited_nodes = (192.168.1.1, 192.168.1.2)

[20-JAN-2004 18:27:04:484] NAMES.DIRECTORY_PATH = (TNSNAMES)

[20-JAN-2004 18:27:04:484] tcp.excluded_nodes = (192.168.1.3)

[20-JAN-2004 18:27:04:484] --- PARAMETER SOURCE INFORMATION ENDS ---

These lines indicate that

1. The parameter file /u02/oracle/product/9.2/network/admin/sqlnet.ora was read by the listener.

2. The parameters were loaded successfully.

3. The contents of the parameter were read as they were mentioned.

4. The names of the excluded and invited nodes are displayed.

If the information is not as shown here, the problem be caused by the way the parameter file is written; most likely a typographical error such as a missing parenthesis. This type of error should be fixed before proceeding further along the trace file.

If the parameters are indeed loaded properly, you should next check the section of the file in which the node validity checking is done. This section looks like this:

[20-JAN-2004 12:30:45:321] ntvllt: Found tcp.invited_nodes. Now loading...

[20-JAN-2004 12:30:45:321] ntvllhs: entry

[20-JAN-2004 12:30:45:321] ntvllhs: Adding Node 192.168.1.1

[20-JAN-2004 12:30:45:321] ntvllhs: Adding Node 192.168.1.2

[20-JAN-2004 12:30:45:321] ntvllhs: exit

[20-JAN-2004 12:30:45:321] ntvllt: exit

[20-JAN-2004 12:30:45:321] ntvlin: exit

[20-JAN-2004 12:30:45:321] nttcnp: Validnode Table IN use; err 0x0

The first line indicates that the parameter tcp.invited_nodes was found. Next, the entries in that list are read and displayed one after the other. This is the most important clue. If the addresses were written incorrectly, or the syntax were wrong, the trace files would have indicated this by not specifying the node names checked. The last line in this section shows that the ValidNode table was read and used with error code of 0x0 (in hexadecimal, equating to zero) — the table has no errors. If there were a problem in the way the valid node parameters were written in the parameter file, the trace file would have shown something different. For instance, say the parameters were written as

tcp.excluded_nodes = (192.168.1.3

Note how a parenthesis is left out, indicating a syntax problem. However, this does not affect the connection; the listener simply ignores the error and allows the connection without doing a valid node checking. Upon investigation, we would find the root of the problem in the trace file. The trace file shows the following information.

--- PARAMETER SOURCE INFORMATION FOLLOWS ---

[20-JAN-2004 12:45:03:214] Attempted load of system pfile

source /u201/oracle/product/9.2/network/admin/sqlnet.ora

[20-JAN-2004 12:45:03:214] Load contained errors 14] Error stack follows: NL-00422: premature end of file NL-00427: bad list

[20-JAN-2004 12:45:03:214]

[20-JAN-2004 12:45:03:214] -> PARAMETER TABLE LOAD RESULTS FOLLOW <-

[20-JAN-2004 12:45:03:214] Some parameters may not have been loaded

[20-JAN-2004 12:45:03:214]

See dump for parameters which loaded OK This clearly shows that the parameter file had errors that prevented the parameters from loading. Because of this, the valid node checking is turned on and in use, but there is nothing in the list of the excluded nodes as shown in the following line from the trace file:

[20-JAN-2004 12:45:03:214] nttcnp: Validnode Table IN use; err 0x0

Since the error is 0x0, no error is reported by the validity checking routine. The subsequent lines on the trace file show other valuable information. For instance this line,

[20-JAN-2004 12:45:13:211] nttbnd2addr: using host IP address: 192.168.1.1

shows that the IP address of the server to which the listener was supposed to route the connection was 192.168.1.1. If all goes well, the listener allows the client to open a connection. This is confirmed by the following line:

[20-JAN-2004 12:45:14:320] nttcon: NT layer TCP/IP connection has been established.

As the line says, the TCP/IP connection has been established. If any other problems exist, the trace file will show enough helpful information for a diagnosis.

Summary

To summarize:

  1. Node Validation can be used to instruct listeners to accept or reject a connection from a specific client.
  2. The parameter file is sqlnet.ora in Oracle 9i and protocol.ora in Oracle8i.
  3. The nodes must be explicitly specified by name or by IP Address; no wildcards are supported.

What can I say…?

There are (probably) more important things out there, for example, my daughter has twisted here knee-cap and me being here in Washington DC makes me a bit worried if all is well in Holland (be assured, I have called multiple times, but its not the same as being there) .

ACED Also doing this dedicated Oracle XMLDB stuff since 2003/2004, being an “evangelist”…colleagues within AMIS probably have no idea how fast they will have to run again when Marco is starting about:

“Oh, I could do that with Oracle XML DB much more easier via…”

…but in all, I can’t help myself and its worth it. Its all about passion, no title really needed, I will do it anyway.

So during ODTUG, after signing up for some extra ambassador duties, I could add an extra ribbon which I really appreciate because I know that the people who nominated me, don’t nominate someone very easily. I’m honored and really appreciate it. It will help with getting people more into Oracle XML DB.

My thanks go out towards the Oracle XML DB Development team who kept supporting me despite I can be very “intensive”, as a Dutch friend described it lately, and those colleagues that still want to know how this all works in Oracle XML DB (they haven’t given up on me yet, me being a “mental case”).

There is so much I still learn from them and others and they really help me focus. Said that, of course this also counts for the other OakTable “intensive” passionate people out there…

Hoped to be of help and more to come…

8-)

SQL Server 4

SimpleTalk have just published another of my SQL Server articles – which talks about storage methods and block dumping in SQL Server, and ends with the suggestion that the reason that SQL Server DBAs seem to be very keen on “clustered indexes” may be related to the fact that SQL Server doesn’t seem to handle [...]

Stress Test

Here’s something you probably don’t want to say at the start of a public presentation: “Does anyone have a laptop I could borrow for the next 90 minutes.” I’ve just done my first ODTUG presentation – and everything I do runs under VMWare, which simply refused to start. I’m in my room now trying to [...]

The “Not a Problem” Problem and other related stuff

It’s been a while since I visited the Sun Video RSS feed and I found some interesting videos related to performance that are worth sharing and something you could watch over a big cup of coffee..

First is titled Performance: The “Not a Problem” Problem which I could also relate when doing performance analysis for example…

  • When a client instantly jump into conclusion that the performance degradation is caused by the database link well in reality when everything is measured/profiled it turned out that it’s because of the slow IO subsystem..
  • Or when the Data Guard environment was being blamed because of the slow batch processing but when everything is measured/profiled it turned out that it’s the bad SQL..

there are lots of interesting customer scenarios we could imagine.. but if there would be a systematic approach, a clear problem definition, and measuring the right stuff.. then things would be better..

[blip.tv ?posts_id=3742672&dest=-1]

Second is titled Performance: Experimentation which highlights the importance of the test cases and isolating the problem.. and how this could make a good learning experience while also solving the problem..

[blip.tv ?posts_id=3740794&dest=-1]

Third is the series of videos titled “Little Shop of Performance Horrors” posted on this blog post and tackled about these horrific topics:

  • The worst perf issues I’ve ever seen!
  • Common misconfigurations
  • The encyclopedia of poor assumptions
  • Unbelievably bad perf analysis
  • Death by complexity
  • Bad benchmarking
  • Misleading analysis tools
  • Insane performance tuning
  • The curse of the unexpected

I hope I’ve shared you some good stuff </p />
</p></div>

    	  	<div class=