Your bulletproof jacket or Some tips for Oracle EBS High Availability

Usually an ERP system is business critical, and even a short outage can have a significant negative impact on enterprise-level processes. I would like to review the fully redundant Oracle EBS environment below.
Let’s look at it tier by tier.

[Diagram: fully redundant Oracle EBS environment]

1. Database Tier.
Of course, Oracle Real Application Clusters (RAC) is the most popular solution for a real “zero-downtime” configuration. But RAC itself has to be configured properly to avoid stability issues. A few simple rules help keep CRS (Cluster Ready Services) redundant and stable:
– Make at least 3 voting disks (NORMAL redundancy)
– Make at least 2 OCR devices
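
The reason for "at least 3" voting disks comes down to majority arithmetic: a node stays in the cluster only while it can access more than half of the configured voting disks. A minimal sketch of that arithmetic follows; the function name is my own and purely illustrative.

```python
def tolerated_voting_disk_failures(total_disks: int) -> int:
    """Number of voting disks that can be lost while a strict majority survives."""
    if total_disks < 1:
        raise ValueError("at least one voting disk is required")
    majority = total_disks // 2 + 1      # strict majority a node must still see
    return total_disks - majority


for n in (1, 2, 3, 5):
    print(n, "voting disk(s) ->", tolerated_voting_disk_failures(n), "failure(s) tolerated")
```

Two voting disks tolerate no more failures than one, which is why the count should be odd and start at three.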

A few recommendations for regular database files on Oracle ASM (note that I’m not reviewing 3rd party cluster solutions like IBM HACMP or Veritas/Symantec CFS):
– Use external redundancy if your storage already provides RAID protection (RAID 10/5/6/7)
– Place your datafiles on RAID 10 (if possible)
– Set up multipath access from the OS to the SAN LUNs used for ASM disks

For E-Business Suite, the following database parameters need to be set on a RAC database (for real-time concurrent processing):
– _lm_global_posts=TRUE
– _immediate_commit_propagation=TRUE
– instance_groups = appsN (N is inst_id if using RAC)
– parallel_instance_group = appsN (N is inst_id if using RAC)
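
To make the per-instance part of this list concrete, here is a rough sketch that generates the corresponding ALTER SYSTEM statements. It assumes that the position of each instance in the list matches its inst_id, and the SIDs PROD1 and PROD2 are hypothetical. Treat the output as a draft to review, not as something to run blindly.

```python
def ebs_rac_parameter_statements(instance_sids):
    """Build ALTER SYSTEM statements for the parameters listed above.

    instance_sids: ordered list of RAC instance names (hypothetical here);
    position + 1 is assumed to be the inst_id used as N in appsN.
    """
    statements = [
        # Hidden (underscore) parameters must be double-quoted in ALTER SYSTEM.
        "ALTER SYSTEM SET \"_lm_global_posts\"=TRUE SCOPE=SPFILE SID='*';",
        "ALTER SYSTEM SET \"_immediate_commit_propagation\"=TRUE SCOPE=SPFILE SID='*';",
    ]
    for n, sid in enumerate(instance_sids, start=1):
        group = f"apps{n}"
        statements.append(
            f"ALTER SYSTEM SET instance_groups='{group}' SCOPE=SPFILE SID='{sid}';"
        )
        statements.append(
            f"ALTER SYSTEM SET parallel_instance_group='{group}' SCOPE=SPFILE SID='{sid}';"
        )
    return statements


for line in ebs_rac_parameter_statements(["PROD1", "PROD2"]):
    print(line)
```

The two underscore parameters are applied to all instances (SID='*'), while the two group parameters get a different value on each instance.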

By the way, RAC is not the only option for DB High Availability. Depending on the allowed downtime, you can choose other (more cost-effective) variants:*
– Oracle RAC1 (RAC One Node) – some reports will not be even interrupted during failover
– Oracle Restart – the database is restarted automatically on the same server after a failure (it is meant for standalone servers, so it does not fail over to a 2nd node)
– Oracle Clusterware Cold Failover – the database has to be restarted manually (or the DBA should create and register a restart script)
– Oracle Data Guard Fast-Start Failover (FSFO)
* Hmmm… looks like all those RAC-alternatives require another blog post – stay tuned!

2. Concurrent Management Tier.
Parallel Concurrent Processing (PCP) needs to be set up to make sure a crash of the CM box is not a show stopper for batch processing. With PCP, Concurrent Manager queues are automatically restarted on a secondary CM server.

After migrating the EBS database from a single instance to RAC, a bunch of your CM jobs will be (Surprise! Surprise!) significantly slower. You need to do some tuning on the CM tier:
– All multi-threaded jobs like Payroll Workers, ASCP/MRP jobs and FSG reports need to be adjusted to the RAC environment. Why is this happening? Because of the RAC global SGA. Multi-threaded jobs connect to different RAC nodes at the same time, the same data then has to be shared between instances, and this significantly slows down processing.
– For 11i EBS versions, the workaround was to move all multi-threaded jobs to a separate (2nd) CM node (CMNode2) and connect this server to one specific RAC node only, by editing its TNS entry to keep just one RAC node (for example, DBNode2) there. All processes from CMNode2 then connect to DBNode2. It is not really a flexible solution: if CMNode1 crashes, all concurrent managers are started on CMNode2 and (of course) connect to just that one RAC node. The opposite scenario is even worse: all multi-threaded jobs are moved (by PCP) to CMNode1 and connect to both RAC nodes.
– For R12 (12.1+), Oracle provides a more flexible configuration. You can simply select a “Target Instance” in Concurrent -> Program -> Define -> Search -> Session Control and assign it to the Concurrent Program (see MOS 1129203.1).
– For the ASCP/MRP modules, Oracle provides an option to move them out into their own environment, and (from a performance perspective) this split will definitely be useful**. If you cannot split now, see the options above.
** I’ll focus on ASCP separation in my next posts – stay tuned!
In addition to multi-threaded jobs, your system processes a lot of XML reports, which go through the Output Post Processor (OPP). Do not forget to set up another OPP manager on CMNode2.
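
To see why the 11i workaround described above is fragile, it helps to model it. The sketch below is a toy model, not anything EBS ships: the dictionaries and the "standard" and "multi_threaded" group names are my own, while the node names come from the example above. It reports which RAC nodes each manager group can reach depending on which CM nodes are alive.

```python
# TNS view of each CM node, as in the 11i workaround: CMNode2 sees only DBNode2.
TNS_TARGETS = {
    "CMNode1": ["DBNode1", "DBNode2"],
    "CMNode2": ["DBNode2"],
}

# (primary, secondary) CM node per manager group under PCP (illustrative names).
MANAGERS = {
    "standard": ("CMNode1", "CMNode2"),
    "multi_threaded": ("CMNode2", "CMNode1"),
}


def db_nodes_per_manager(alive_cm_nodes):
    """Return, per manager group, the DB nodes its workers can connect to."""
    result = {}
    for name, (primary, secondary) in MANAGERS.items():
        host = primary if primary in alive_cm_nodes else secondary
        if host not in alive_cm_nodes:
            result[name] = []            # no CM node left to run this group
        else:
            result[name] = TNS_TARGETS[host]
    return result


print(db_nodes_per_manager({"CMNode1", "CMNode2"}))  # normal operation
print(db_nodes_per_manager({"CMNode2"}))             # CMNode1 crashed
print(db_nodes_per_manager({"CMNode1"}))             # CMNode2 crashed
```

In the last case the multi-threaded group ends up with both DBNode1 and DBNode2 as targets, which is exactly the slow scenario the workaround was meant to avoid.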

3. Internal Web/Forms Tier.
For the Application (Web/Forms) Tier you can simply set up a Hardware Load Balancer (any of the LB brands: F5 BIG-IP, Citrix NetScaler, Cisco ACE) and add multiple Forms/Web tiers to the LB pool. If one of the pool members crashes, users who worked through this server just need to reconnect, and the LB will forward the connection to a live pool member.
– Load balancer setup should be done according to Oracle recommendations (Metalink has a bunch of notes related to each particular brand; make sure you check all of them, especially those related to “Session Persistence”)
– Set up multiple JVMs on the application tier for the services running there, and size the Java memory and Garbage Collection parameters according to the number of parallel user connections. And do not forget to set up monitoring of your Java heap usage to stay ahead of the “java.lang.OutOfMemoryError: Java heap space” error.
– Terminate SSL on the Load Balancer to take advantage of its SSL accelerator features
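
Session persistence and the "just reconnect" behaviour are easier to picture with a toy model. The class below is only a sketch of the idea, not how any particular LB brand implements it, and the member names web1 and web2 are hypothetical.

```python
import itertools


class StickyPool:
    """Toy model of an LB pool with session persistence."""

    def __init__(self, members):
        self.members = list(members)
        self.healthy = set(self.members)
        self.sessions = {}                       # session id -> pool member
        self._rr = itertools.cycle(self.members)

    def mark_down(self, member):
        """Simulate a failed health check for one pool member."""
        self.healthy.discard(member)

    def route(self, session_id):
        member = self.sessions.get(session_id)
        if member in self.healthy:
            return member                        # persistence: same tier as before
        # New session, or its member crashed: pick the next healthy member.
        if not self.healthy:
            raise RuntimeError("no healthy pool members")
        for candidate in self._rr:
            if candidate in self.healthy:
                self.sessions[session_id] = candidate
                return candidate


pool = StickyPool(["web1", "web2"])
print(pool.route("alice"))   # first request picks a member
print(pool.route("alice"))   # later requests stick to it
pool.mark_down("web1")
print(pool.route("alice"))   # after a crash the session is re-pinned to a live member
```

The important part for EBS is the first branch of route(): as long as the member is healthy, a user keeps landing on the same Forms/Web tier.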

4. External (DMZ) Web Tier.
DMZ servers have to be redundant in the same way as the Internal Web servers, using a Hardware Load Balancer. But the network infrastructure may vary:
– A Reverse Proxy controls access to the Web Tiers
– All External Web Tiers are located between the internal and external firewalls
– Combinations of Reverse Proxy/Firewalls/Load Balancers
Oracle does not recommend enabling Forms on the DMZ tiers, so do not forget to disable it in the CONTEXT_FILE.
If you configure multiple external application modules (like iSupplier, iSupport, iRecruitment, iStore) on shared DMZ tiers, please remember that multiple web entry points have to be directed to different nodes. 11i/R12 EBS does not support different web entry points (URLs) on the same node, so for shared nodes you need to select a neutral URL (like ebsdmz.yourdomain.com) to avoid confusion.
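
A quick way to sanity-check a planned DMZ layout against this rule is to look for nodes that would have to serve more than one entry URL. The helper below is a hypothetical sketch: the function name, the tuple layout, the node names and the isupplier/istore hostnames are all my own.

```python
def nodes_with_conflicting_entry_points(entry_points):
    """Return {node: sorted URLs} for every node assigned more than one entry URL.

    entry_points: iterable of (module, url, node) tuples describing the plan.
    """
    urls_by_node = {}
    for _module, url, node in entry_points:
        urls_by_node.setdefault(node, set()).add(url)
    return {
        node: sorted(urls)
        for node, urls in urls_by_node.items()
        if len(urls) > 1
    }


# Two modules with their own URLs on one shared node: flagged.
plan_a = [
    ("iSupplier", "isupplier.yourdomain.com", "dmz1"),
    ("iStore", "istore.yourdomain.com", "dmz1"),
]
print(nodes_with_conflicting_entry_points(plan_a))

# Same modules behind one neutral URL: nothing to report.
plan_b = [
    ("iSupplier", "ebsdmz.yourdomain.com", "dmz1"),
    ("iStore", "ebsdmz.yourdomain.com", "dmz1"),
]
print(nodes_with_conflicting_entry_points(plan_b))
```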

5. DISCLAIMER.
This post does not cover any aspects of Disaster Recovery (DR) or Business Continuity Plans (BCP). All information above relates to a single (but redundant) E-Business Suite environment within one datacenter.

Stay tuned!


