Apache Jserv Tuning Doc


Proactive Monitoring on an Apps 11i Web Tier Server

------------------------------------------------------------

Here are just three examples where you can quickly answer important questions about your system with

very little effort from access_log:

1. How long does it take 9iAS to serve requests?

The main performance information is the time taken to serve a request.  This is not included in the access_log by default, but can easily be added by adding the following line to the httpd.conf file via an AutoConfig customization (see Metalink Note 270519.1):

      LogFormat "%h %l %u %t \"%r\" %>s %b %T"  

This line can be added anywhere in the file, but would sensibly be placed after the existing LogFormat

entries. This directive adds an extra column into the access_log to show the time in seconds it took from

receiving a request to sending back a response.  Valid log formats are described in the Apache

documentation. 
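As a rough illustration of what the extra column makes possible, the sketch below averages the time taken across a set of entries. It assumes the %T value is the last field on each line, as in the LogFormat shown above; the function name avg_time and the sample entries are made up for the example.

```shell
# avg_time: read access_log entries on stdin and report the request count,
# mean and maximum of the %T column (assumed here to be the last field).
avg_time() {
  awk '$NF ~ /^[0-9]+$/ { n++; sum += $NF; if ($NF > max) max = $NF }
       END { if (n) printf "%d requests, mean %.1fs, max %ds\n", n, sum / n, max }'
}

# Illustrative sample entries (not taken from a real system)
printf '%s\n' \
  '192.168.1.10 - - [21/Jun/2006:13:25:30 +0100] "POST /oa_servlet/actions/processApplicantSearch HTTP/1.1" 200 5120 2' \
  '192.168.1.10 - - [21/Jun/2006:13:25:40 +0100] "POST /oa_servlet/actions/processApplicantSearch HTTP/1.1" 200 1024 4' \
  | avg_time
```

Against a real log you would run something like "cat access_log* | avg_time". Entries written before the LogFormat change have no time column, so their last field is the byte count and they should be excluded first.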

2.  Am I getting any server errors?

The access_log includes the HTTP status code of the request.   They are listed in full in RFC 2616 but the

codes we would be immediately concerned about are any in the 400 or 500 range. 

For example:

Status 500 (internal server error) may typically be seen for a JServ request and often means the JVM has some kind of problem or has died.  

For example:  This entry may indicate that the JServ JVM is not responding to any requests:

192.168.1.10 - - [21/Jun/2006:13:25:30 +0100] "POST /oa_servlet/actions/processApplicantSearch HTTP/1.1" 500 0

Status 403 (forbidden) could typically be seen for oprocmgr requests and often means there is a misconfiguration that needs to be resolved. 

For example:  This entry in access_log may indicate a problem with system configuration (oprocmgr.conf):

myserver - - [21/Jun/2006:13:25:30 +0100] "GET /oprocmgr-service?cmd=Register&index=0&modName=JServ&grpName=OACoreGroup&port=16000 HTTP/1.1" 403 226
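To get a quick feel for how often each error status occurs, the entries can be counted per status code. This is only a sketch: it assumes the status is field 9, as in the two sample entries above, and status_summary is a name invented for the example.

```shell
# status_summary: count access_log entries per HTTP status code in the
# 400-599 range. Field 9 is assumed to hold the status code.
status_summary() {
  awk '$9 >= 400 && $9 <= 599 { count[$9]++ }
       END { for (s in count) print s, count[s] }' | sort
}

# The two sample entries quoted above
printf '%s\n' \
  '192.168.1.10 - - [21/Jun/2006:13:25:30 +0100] "POST /oa_servlet/actions/processApplicantSearch HTTP/1.1" 500 0' \
  'myserver - - [21/Jun/2006:13:25:30 +0100] "GET /oprocmgr-service?cmd=Register&index=0&modName=JServ&grpName=OACoreGroup&port=16000 HTTP/1.1" 403 226' \
  | status_summary
```

Each output line is a status code followed by the number of entries with that status.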


3. Are users having problems accessing pages?

A status code of 200 means the request was successful.  However, a dash ( - ) in the "Bytes sent" column normally means that the request hit the Apache timeout.  You would also see a time taken of more than 300 seconds for the request, as the default Apache timeout is 5 minutes.

If you see this situation occurring regularly then your users are either navigating away from the browser

page before it has rendered, or are likely to be getting a "white screen of death" in their browser window

where it will appear to hang.

In this situation you need to identify why the requests are not being processed in good time, which is a large

subject in itself.
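A filter for this pattern might look like the sketch below. It assumes the status is field 9 and "Bytes sent" is field 10, as in the log format used earlier; the function name and the sample entries are illustrative only.

```shell
# suspected_timeouts: print entries with status 200 but "-" for bytes sent,
# the pattern described above for requests that hit the Apache timeout.
suspected_timeouts() {
  awk '$9 == 200 && $10 == "-"'
}

# Illustrative sample: only the first entry should be printed
printf '%s\n' \
  '192.168.1.10 - - [21/Jun/2006:13:25:30 +0100] "POST /oa_servlet/actions/processApplicantSearch HTTP/1.1" 200 - 301' \
  '192.168.1.10 - - [21/Jun/2006:13:31:02 +0100] "POST /oa_servlet/actions/processApplicantSearch HTTP/1.1" 200 5120 2' \
  | suspected_timeouts
```

Counting the output with "wc -l" per day or per hour gives a simple way to see whether the situation is occurring regularly.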

Identifying these issues

Hopefully these examples will inspire you to want to analyse your access_log, but I hear you ask, "Where

will I get the time?"   

Luckily the access_log is a simple text file, so if you do not have commercial monitoring software you can use some "quick and dirty" scripts to report just the exceptions you are interested in seeing.  For example, I would often use the following to scan access_logs for problems:

## Start of script
##
## Check for HTTP statuses in 400 or 500 range for JServ
## or PLSQL requests only
##
awk ' $9>=400 && $9<=599 { print $0 }' access_log* | grep -e "servlet" -e "\/pls\/" | grep -v .gif
##
## Check for requests taking more than 30 seconds to be returned
##
awk ' $11>30 {print $0} ' access_log*
##
## This one is not an exception report, you need to manually check
## Look for when the JVMs are restarting
##
grep "GET /oprocmgr-service?cmd=Register" access_log*
##
## End of script


Conclusion

Proactive monitoring of the access_log will help you to:

- Baseline your system performance
- Identify user usage patterns
- Highlight possible system or user problems
- Identify areas with possible performance issues
- Verify user reported problems

Configuring Middle-Tier JVMs for Applications 11i

------------------------------------------------------------

When you call Oracle Support with a problem like apj12 errors in your mod_jserv.log, middle-tier Java

Virtual Machines (JVMs) crashing, or poor middle-tier performance, then it will often be suggested to

increase the number of JVM processes.  So, the key question that likely occurs to you is, "How many JVMs

are required for my system?"

Processing Java Traffic in Groups

First, some quick background:  web requests received by Oracle HTTP Server (Apache) that need to run Java code are sent to one of four different types of JVM groups to be processed.  You can see this in the jserv.conf file:

 ApJServGroup OACoreGroup 2 1 /usr/.../jserv.properties

 ApJServGroup DiscoGroup  1 1 /usr/.../viewer4i.properties

 ApJServGroup FormsGroup  1 1 /usr/.../forms.properties

 ApJServGroup XmlSvcsGrp  1 1 /usr/.../xmlsvcs.properties

The number of JVMs for each group is signified by the first number on each line.

- OACoreGroup is the default group.  This is where most Java requests will be serviced
- DiscoGroup is only used for Discoverer 4i requests
- FormsGroup is only used for Forms Servlet requests
- XmlSvcsGrp is for XML Gateway, Web Services, and SOAP requests

In the example above, I have two JVMs configured for OACoreGroup and one JVM configured for each of

the other groups.
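If you want to check the current settings without reading the file by eye, a sketch along these lines lists each group with its JVM count. It relies only on the ApJServGroup layout shown above; the function name jvms_per_group is invented for the example, and the sample input simply repeats the lines from this article (elided paths included).

```shell
# jvms_per_group: print each ApJServGroup name with its configured JVM count
# (the first number after the group name).
jvms_per_group() {
  awk '$1 == "ApJServGroup" { print $2, $3 }'
}

# Sample input copied from the jserv.conf extract above
printf '%s\n' \
  ' ApJServGroup OACoreGroup 2 1 /usr/.../jserv.properties' \
  ' ApJServGroup DiscoGroup  1 1 /usr/.../viewer4i.properties' \
  ' ApJServGroup FormsGroup  1 1 /usr/.../forms.properties' \
  ' ApJServGroup XmlSvcsGrp  1 1 /usr/.../xmlsvcs.properties' \
  | jvms_per_group
```

On a real system you would redirect your own jserv.conf into the function instead of the sample lines.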

Factors Affecting Number of JVMs Required


Determining how many JVMs to configure is a complex approximation, as many factors need to be taken into account.  These include:

- Hardware specification and current utilization levels
- Operating system patches and kernel settings
- JDK version and tuning
- Applications code versions, especially JDBC and oJSP
- JServ configuration file tuning (jserv.properties and zone.properties)
- Applications modules being used
- How many active users
- User behaviour

Rough Guidelines for JVMs

Luckily, Oracle Development have undertaken various performance tests to establish some rules of thumb

that can be used to configure the initial number of JVMs for your system.

- OACoreGroup
    o 1 JVM per 100 active users

- DiscoGroup
    o Use the capacity planning guide from Note 236124.1 "Oracle 9iAS 1.0.2.2 Discoverer 4i: A Capacity Planning Guide"

- FormsGroup
    o 1 JVM per 50 active forms users

- XmlSvcsGrp
    o 1 JVM is generally sufficient

In addition to this, Oracle generally recommends no more than 2 JVMs per CPU.    You also need to confirm

there are enough operating system resources (e.g. physical memory) to cope with any additional JVMs.
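The OACoreGroup arithmetic can be sketched as below. This is only a starting-point calculation under the rules of thumb above: it rounds up to whole JVMs and applies the 2-JVMs-per-CPU limit to OACoreGroup alone, whereas on a real server that limit has to be shared with the other groups. The function name oacore_jvms is made up for the example.

```shell
# oacore_jvms USERS CPUS: suggest a starting OACoreGroup JVM count.
# Rule of thumb from the text: 1 JVM per 100 active users (rounded up),
# and no more than 2 JVMs per CPU.
oacore_jvms() {
  users=$1
  cpus=$2
  want=$(( (users + 99) / 100 ))
  if [ "$want" -lt 1 ]; then want=1; fi
  cap=$(( cpus * 2 ))
  if [ "$want" -gt "$cap" ]; then want=$cap; fi
  echo "$want"
}

oacore_jvms 350 4     # 350 active users on a 4-CPU server: prints 4
oacore_jvms 1000 2    # user count suggests 10, but 2 CPUs cap it: prints 4
```

When the CPU limit is the one that applies, as in the second call, the hardware is probably undersized for the user count rather than the JVM count being "right".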

Your Mileage Will Vary

The general guidelines above are just that -- they're very broad estimates, and your mileage will vary.  The

Applications Technology Group is working on a JVM Sizing whitepaper that will provide guidelines based on

whether your E-Business Suite deployment is small, medium, or large. 

Until then, it's critical that you test your environment under load, using transactional tests that closely mirror

what your users will be doing. 

Here are a couple of quick-and-dirty tools that might be useful in sizing your JVMs.

Script to determine "active users" for OACoreGroup


REM

REM SQL to count number of Apps 11i users

REM Run as APPS user

REM

select 'Number of user sessions : ' || count( distinct session_id) How_many_user_sessions

from icx_sessions icx

where disabled_flag != 'Y'

and PSEUDO_FLAG = 'N'

and (last_connect + decode(FND_PROFILE.VALUE('ICX_SESSION_TIMEOUT'), NULL,limit_time,

0,limit_time,FND_PROFILE.VALUE('ICX_SESSION_TIMEOUT')/60)/24) > sysdate   

and counter < limit_connects;

REM

REM END OF SQL

REM

How to determine "active forms users" for FormsGroup

Check the number of f60webmx processes on the Middle Tier server, excluding the grep command itself from the count.  For example:

ps -ef | grep f60webmx | grep -v grep | wc -l

Conclusion

- The number of required JVMs is extremely site-specific and can be complex to predict
- Use the rules of thumb as a starting point, but benchmark your environment carefully to see if they're adequate
- Proactively monitor your environment to determine the efficiency of the existing settings and reevaluate if required

Investigating java.lang.OutOfMemoryError with Apps 11i Middle Tier JVM

------------------------------------------------------------

If you had to guess what the error "java.lang.OutOfMemoryError" indicates when seen in your Apache

log files, you would probably be right...


In most cases this message is telling you that the Java process has exhausted available memory and

cannot process some requests successfully.   In many cases additional symptoms will be:

- Poor web page performance
- User requests being timed out
- Java processes taking 100% CPU on your server

The Java Virtual Machine (JVM) will sometimes be restarted by oprocmgr as performance is so poor it presumes the JVM has died.

There can be other causes.  For example, this message can sometimes be given when some other resource is exhausted, such as threads per process (kernel parameter "max_thread_proc" on HP-UX), but this case should be easily identifiable from the exact message written out.

Where has all the memory gone?

Every object created in Java takes some memory.  Once all java code that references the java object has completed, then the Garbage Collector (GC) automatically removes the object and claims back the memory.

You can run out of memory due to:

- Insufficient memory allocated to the JVM to cope with the amount of work
- Suboptimal GC processing
- A memory leak -- Java code is not releasing an object when it should
- A Java code or operating system defect

How are JVMs configured with Apps 11i?

Two important configuration files for Java in an Apps environment are:

- jserv.conf: this file defines how many JVMs are configured, as discussed in the JVM sizing article above.
- jserv.properties: this file controls the JVM parameters in the "wrapper.bin.parameters=" entry.  When you install Apps originally or update the JDK version, the default settings are updated in your CONTEXT.xml file ("s_jvm_options" for the OACoreGroup JVM) and controlled by AutoConfig thereafter.

What can I do about OutOfMemoryError?

Are you using Oracle Configurator?

Configurator can sometimes require huge chunks of memory for a relatively long period of time, so Oracle now recommends setting up separate JVMs specifically for Configurator.  For more details see "Setting Up a Dedicated JServ for Running Only Oracle Configurator" (Metalink Note 343677.1).


Increase available memory

An obvious choice, you may think, and this will often at least delay the effect of the issue, but may not always solve it.   You can achieve this by either increasing the number of JVMs or the memory settings ("-Xmx512M" for example).  In general, I would not recommend going much over 640MB per JVM, in order to keep the Garbage Collection time to a reasonable level.

Tuning GC

You can influence how Java performs Garbage Collection.  I would normally recommend reviewing and possibly backing out any non-default JVM parameters first, as these are sometimes just carried forward because they were used with previous JDK versions.  As always, any tuning should be undertaken in a TEST environment to ensure the changes have a positive impact.

Using Concurrent Mark Sweep collector

With multi-CPU servers using JDK 1.4 and higher, the following parameters have been seen to

improve performance and GC throughput on some environments:

-XX:+UseConcMarkSweepGC -XX:+UseParNewGC  

Identify the components that are using memory

Not always as easy as it sounds.  Generating class histograms or java profiling may be the only

options. 

What data should I gather to investigate further?

The initial investigation should focus on trying to identify the pattern of memory usage building up and how frequently GC is occurring.  The tools and information described below may help to achieve this.

Review stdout files


Basic Garbage Collection information is normally written to the

$IAS_ORACLE_HOME/Apache/Jserv/logs/jvm/*.stdout file for each JVM.  Additional

Garbage Collection information can be logged, for example:

-XX:+PrintGCTimeStamps -XX:+PrintGCDetails

Add more verbose GC logging

-Xverbosegc

This allows you to see how much of each memory pool is being used.  For example you may find it is

the "Permanent Generation" space that is running out of memory, rather than the "Tenured

Generation".  If this turns out to be the case, the "-XX:MaxPermSize" parameter could be

increased to provide more memory.

Collect class histograms

-XX:+PrintClassHistogram

Implemented in JDK 1.4 and higher by some vendors.  Can be very useful if available.  Beware of

using this on HP-UX platforms running JDK 1.4 as it seems to crash the JVM rather than output the

data!

Use JConsole

Discussed in the article below, "Using JConsole to monitor Apps 11i JVMs".

Java Profiling

-Xrunhprof

This should tell you exactly where the memory is being used, but it has a huge impact on performance, so it is unlikely to be practical to gather this data anywhere except a lightly used test environment.

Other clues

Does the DMS output for the JVM show increasing active connections or threads?

Do JDBC connections to the database keep increasing? What module names do they relate to?

Does temporarily disabling "AM Pool" reduce memory consumption (Profile option "FND:

Application Module Pool Enabled")?


Are you seeing PL/SQL "cursor leaks"?  See "Understanding and Tuning the Shared Pool" (Metalink Note 62143.1), section "Checking hash chain lengths", for SQL to check this.

Any Java Thread Deadlocks or Database Deadlocks?

Are Java processes spinning to 100% CPU or is CPU use very low when the error occurs?

Look for platform references

Review your operating system vendor's web site to look for known issues or required platform

patches for your Java version.

My Production system is failing every few hours, what can I do right now?

A possible quick fix would be to increase memory and/or number of JVMs.  Make sure that this won't cause the operating system to start swapping!

You could also consider a scheduled bounce of Apache every 12 or 24 hours if possible. 

These steps may allow you to minimise the immediate impact on the business so you can investigate and

implement changes in a methodical, safe, controlled and tested manner.

Where can I read more about Java memory and tuning?

There are many resources on the Internet to describe how Java uses memory and what steps to take for performance tuning.  The best place to start is your hardware vendor's web site, as the options available depend on their Java implementation and also the specific Java version you are using.

For example:

 HP-UX Performance Tuning, Tutorials, & Training  http://www.hp.com/products1/unix/java/infolibrary/tutorials/index.html

 Maximizing Java performance on AIX  http://www-128.ibm.com/developerworks/eserver/library/es-Javaperf1.html

 J2SE 5.0 Performance White Paper  http://java.sun.com/performance/reference/whitepapers/5.0_performance.html

Conclusion

Dealing with Java processes running out of memory can sometimes be as simple as increasing the memory

available, or may require detailed investigation to identify the root cause.     


Although each case will have its own unique considerations, I hope that the information in this article has

given you some ideas as to the general approach to take should the need arise.

Additional References

- Exception:java.lang.OutOfMemoryError when using JDK 1.3.1 (Metalink Note 344132.1)
- URI:/Oa_html/Oa.Jsp Java.Lang.Outofmemoryerror In Sales - Product Catalog (Metalink Note 387434.1)

Using JConsole to monitor Apps 11i JVMs

------------------------------------------------------------

Whenever you are planning to upgrade components in your E-biz 11i environment, you shouldn't forget to

review the Technology Stack components as well as the Applications 11i code.

There are some compelling technology benefits justifying the upgrade of your Apps Middle Tiers to JDK version 1.5, but the best reason for me is the JConsole GUI monitoring tool, which I think is really cool.


For a better description of JConsole and its capabilities you can review the Sun article on JConsole, but in summary it provides a lightweight, always-on monitoring capability for your JVMs.  Using JConsole, you can graphically view the Heap usage, Threads, classes and other statistics in real time, which gives you a great idea of activity inside the JVM process, delivered in an easily digestible form.  I can happily sit and watch my JVMs chug away for hours using JConsole, and it is more fun than watching TV!


JConsole with Apps Release 11i

You can run JConsole either locally (from the server that you are monitoring) or remotely via a network connection.  With Apps 11i, the safest and easiest option is to run locally.

Whether running local or remote,  you need to add the following lines to your jserv.properties file.  You

should make this change following "Customizing an AutoConfig Environment" (Metalink note 270519.1) to

ensure AutoConfig does not drop these custom settings.

wrapper.bin.parameters=-Dcom.sun.management.jmxremote

wrapper.bin.parameters=-Dcom.sun.management.jmxremote.ssl=false 

** NOTE - you may wish to use SSL for better security


Connecting Locally

Once you have started your JVMs using the new parameters described above, you simply launch JConsole

as the same operating system user, which presents you with a list of all the JVM Processes for you to

connect to and monitor.

Connecting Remotely

Running JConsole remotely is a little more complex and certainly not so secure, meaning it may introduce unacceptable risk for a production environment.  Firstly, authentication is enabled by default, so you will either need to configure your usernames and passwords or choose to disable authentication entirely by using the jserv.properties value below.  The following change allows ANYONE to attach to your JVM processes, so it should not be used lightly!

wrapper.bin.parameters=-Dcom.sun.management.jmxremote.authenticate=false

The second issue to overcome is that each JVM needs its own TCP port to accept connections.   One

possible approach is to customize the $IAS_ORACLE_HOME/Apache/Apache/bin/java.sh JVM startup

script and provide a port number range from which you allocate a tcp port number.   The sample script

modification below is not perfect (as it doesn't resolve port number clashes) but may give you some ideas as

to how you could overcome this issue in your own environment. 

Add this modification just before the existing line

 exec $JSERVJAVA $JAVA_ADDITIONAL_ARGS $ARGV 1>> $STDOUTLOG 2>> $STDERRLOG

####  start of modification for JConsole remote monitoring
####  Not supported or recommended by Oracle Corporation
####
####  If there is a clash of port numbers calculated, then
####  the JVM will not startup.  Script therefore needs to be
####  improved to check port not already in use
####
####  Use port number range 9600 - 9650 as an illustration
if [ "$JDK_VERSION" = "1.5" ] ; then
 mJMXPORTMIN=9600
 mJMXPORTMAX=9650
 ####  number of ports in the inclusive range
 mNUMPORTS=$(( mJMXPORTMAX - mJMXPORTMIN + 1 ))
 ####  $RANDOM (ksh/bash) is 0 - 32767; the modulo keeps the port within the range
 mJMXPORT=$(( mJMXPORTMIN + RANDOM % mNUMPORTS ))
 JAVA_ADDITIONAL_ARGS=" -DCLIENT_PROCESSID=$$  -Dcom.sun.management.jmxremote.port=${mJMXPORT}"
fi
####  end of modification

After starting your JVMs you then need to identify the TCP port they are listening on.   You can do this using

the following command:

netstat -a | grep tcp | grep 96
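The same kind of netstat output could also be used before the JVM starts, to address the port clash the script comments warn about. The sketch below is one possible approach, not something tested with java.sh: it reads a "netstat -an" style listing on stdin and walks forward from a candidate port until it finds one with no listener. Both function names are invented, the address pattern is an assumption (netstat output differs between platforms), and a port can still be taken between the check and the JVM starting.

```shell
# port_in_use PORT: succeed if the netstat-style listing on stdin shows a
# listener on PORT. Matches both "host:9600" and "host.9600" address forms.
port_in_use() {
  grep -i 'LISTEN' | grep -q "[.:]$1[[:space:]]"
}

# next_free_port START MIN MAX: starting at START, print the first port in
# MIN..MAX (wrapping round) that has no listener in the listing on stdin.
# Fails if every port in the range is in use.
next_free_port() {
  snapshot=$(cat)
  p=$1
  tried=0
  size=$(( $3 - $2 + 1 ))
  while [ "$tried" -lt "$size" ]; do
    if ! printf '%s\n' "$snapshot" | port_in_use "$p"; then
      echo "$p"
      return 0
    fi
    p=$(( p + 1 ))
    if [ "$p" -gt "$3" ]; then p=$2; fi
    tried=$(( tried + 1 ))
  done
  return 1
}
```

A possible use would be "mJMXPORT=`netstat -an | next_free_port ${mJMXPORT} 9600 9650`", keeping the random candidate from the script above as the starting point so that JVMs started at the same moment are less likely to pick the same port.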

You can then startup JConsole from your PC (or any other machine) and using the "Remote" tab you enter

the machine name and port you wish to connect to and also the username/password if appropriate.  You

then get the same monitoring capabilities as if you had connected locally.

Conclusion

The real time graphical monitoring provided by JConsole, along with its ease of implementation and use,

provides a useful tool to help you understand the way your JVMs are being utilised.  This knowledge can

then be used to tune and troubleshoot your Apps 11i environments.

In Depth: Load-Balancing E-Business Suite Environments

------------------------------------------------------------

Increasing Fault Tolerance at Lower Cost

You can use load-balancing routers (LBRs) to protect your E-Business Suite from system failures.  Load-balancers increase your environment's fault-tolerance and scalability by distributing load across a pool of application servers.


Besides fault-tolerance and scalability, another appealing benefit is that you can use load-balancing to substitute expensive SMP boxes with clusters of inexpensive Linux-based commodity servers. 

Supported Load-Balancing Methods

The E-Business Suite supports the following types of load-balancing:

- HTTP Layer
- DNS-based
- Apache Jserv Layer
- Forms Metric Server
- Concurrent Processing Layer
- Database Layer via Real Application Clusters

I'll cover only the first two methods in this article.

HTTP Layer Load-Balancing

HTTP Layer load-balancing is the most common method used in E-Business Suite environments. 


In this configuration, end-users navigate to a specific Web Entry Point that represents your E-Business Suite's domain name.  An HTTP Layer load-balancer routes all subsequent traffic for a specific user to a specific Web Node.

HTTP Layer load-balancers may use heartbeat checks for node death detection and restart, and sophisticated algorithms for load-balancing.

DNS-Based Load-Balancing

When an end-user's browser attempts to access your E-Business Suite environment, your local Domain Name Server (DNS) can direct that user to a specific application server in a pool based on available capacity.

For example, if the DNS directs a user to the application server at 10.10.10.10, traffic for that user's session will be handled by that server, while other users' traffic may be directed to other application servers in the pool.  Like HTTP layer load-balancers, many DNS-based load-balancers use heartbeat checks against nodes and sophisticated algorithms for load-balancing.

Business Continuity ("Disaster Recovery")

Our larger enterprise-class customers combine DNS-based and HTTP layer load-balancers to support their business continuity plans.  In the event of a disaster, end-users are directed via a DNS-based load-balancer from the primary E-Business Suite environment to an offsite standby site.

Minimum Requirement:  Session Persistence

The minimum requirement is that a load-balancer support session persistence, where a client's initial HTTP connection is directed to a particular application server, then subsequent HTTP requests from that client are directed to the same server.  As long as a load-balancer is able to handle session persistence (also referred to as "stickiness"), it's likely to work with the E-Business Suite.
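To make the idea of "stickiness" concrete, the toy sketch below maps a client identifier to one node in a pool by hashing it, so repeated requests from the same client land on the same node. This is an illustration of the concept only, not how any particular load-balancer implements persistence (real products may use cookies or connection tables instead); the function name and the node names web1 to web3 are hypothetical.

```shell
# pick_node CLIENT NODE...: map a client identifier to one node in the pool.
# The same CLIENT always maps to the same node while the pool is unchanged.
pick_node() {
  client=$1
  shift
  # cksum prints "<checksum> <byte count>" for stdin; keep the checksum
  hash=$(printf '%s' "$client" | cksum | cut -d' ' -f1)
  # skip past (hash mod pool size) nodes, then take the next one
  shift $(( hash % $# ))
  echo "$1"
}

pick_node 192.168.1.10 web1 web2 web3
pick_node 192.168.1.10 web1 web2 web3   # same client, so the same node again
```

Note that a simple hash like this reshuffles clients whenever a node joins or leaves the pool, which is one reason real load-balancers combine persistence with the heartbeat checks described earlier.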
