RAC Deep Dive for Developers building efficient and scalable RAC-aware applications



Transparent Application Failover Technology

Oracle RAC DD4D
Dmitry Volkov / Igor Melnikov

[Title slide word cloud: LMS, ASM, Real Application Clusters, Service, 3-way messaging, GRD, Cache Fusion, TAF, FAN, FCF, DBMS_PIPE, OCR, Interconnect, Clusterware, CRS_STAT, CRSCTL, Voting Disk, VIP, ONS, GC current block]


Agenda

• Current node failure
• Failure handling
• Application's response to the loss of session
• Demonstrations


Current node failure


Session Failure

• The connection to a cluster node can be lost
  • The cluster node crashes
  • A network problem occurs
• The application receives an error, for example: "ORA-03114: not connected to ORACLE"
• Even if the problem is resolved quickly, the user still has to exit and restart the application
  • Some work is lost
  • This is unacceptable in many cases (business-critical applications)

Transparent Application Failover: Connection Loss Handling

• During a SQL call (OCI call), the client detects that the connection was lost and reconnects to another node
• RAC and Transparent Application Failover (TAF) protect the application:
  • The client reconnects to another node automatically and transparently for the application
  • The application and its queries can continue working

[Diagram: Node A fails; users reconnect to Node B]


Transparent Application Failover (TAF)

A normal situation


Transparent Application Failover (TAF)

A failure situation

Transparent Application Failover: Connection Loss Handling

• Two methods
  • BASIC: a new connection is established at failover time
  • PRECONNECT: a backup connection to another node is established in advance
• Two types
  • SESSION: the session is recovered
  • SELECT: the session is recovered and open cursors are preserved


TAF - BASIC Method


TAF – PRECONNECT Method


Transparent Application Failover: TAF Setup on a Client

CRM =
  (DESCRIPTION=
    (FAILOVER=ON)
    (ADDRESS=(PROTOCOL=tcp)(HOST=sales1-server)(PORT=1521))
    (CONNECT_DATA=
      (SERVICE_NAME=CRM)
      (FAILOVER_MODE=
        (TYPE=select)(METHOD=basic)(RETRIES=20)(DELAY=15))))

• FAILOVER accepts ON, TRUE, or YES.
• DELAY=15: Oracle Net waits 15 seconds before trying to reconnect again.
• RETRIES=20: Oracle Net attempts to reconnect up to 20 times.

Presenter
Presentation Notes
TAF also provides the ability to automatically retry connecting if the first connection attempt fails with the RETRIES and DELAY parameters. In this example, Oracle Net tries to reconnect to the listener on sales1-server. If the failover connection fails, Oracle Net waits 15 seconds before trying to reconnect again. Oracle Net attempts to reconnect up to 20 times.
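
The RETRIES and DELAY behaviour described in these notes can be pictured as a plain retry loop. The following is only a minimal sketch in Python, not Oracle Net's implementation: the function name `reconnect`, the `connect` callable, and the choice to count RETRIES as the total number of attempts are assumptions made for illustration.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def reconnect(connect: Callable[[], T],
              retries: int = 20,
              delay: float = 15.0,
              sleep: Callable[[float], None] = time.sleep) -> Optional[T]:
    """Call `connect` up to `retries` times, pausing `delay` seconds between attempts.

    Returns the new connection, or None if every attempt failed.
    `connect` is a hypothetical callable that raises ConnectionError on failure.
    """
    for attempt in range(1, retries + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt < retries:
                sleep(delay)  # plays the role of DELAY in FAILOVER_MODE
    return None


if __name__ == "__main__":
    # Hypothetical listener that starts accepting connections on the 3rd attempt.
    state = {"calls": 0}

    def flaky_connect() -> str:
        state["calls"] += 1
        if state["calls"] < 3:
            raise ConnectionError("listener not reachable")
        return "session on surviving node"

    waits = []  # record the pauses instead of really sleeping
    print(reconnect(flaky_connect, retries=20, delay=15, sleep=waits.append))
    print("seconds spent waiting:", sum(waits))
```

Under this model, RETRIES=20 with DELAY=15 bounds the time spent waiting between attempts at roughly five minutes, which is worth keeping in mind when choosing values for interactive applications.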

Transparent Application Failover (type SESSION)

DEMONSTRATION

TAF Facility

• TAF restores or re-establishes:
  • Client-server connections
  • Special SQL statements
  • Active cursors (SELECT statements) that have started fetching rows
• TAF does not save and does not protect:
  • Active transactions (ORA-25402: transaction must roll back)
  • Server-side program variables of PL/SQL packages
  • Applications not using OCI8
  • ALTER SESSION settings (all of them are lost)

Presenter
Presentation Notes
What TAF Restores

TAF automatically restores some or all of the following elements associated with active database connections. Other elements, however, may need to be embedded in the application code to enable TAF to recover the connection.

Client-Server Database Connections: TAF automatically reestablishes the connection using the same connect string or an alternate connect string that you specify when configuring failover.

Users' Database Sessions: TAF automatically logs a user in with the same user ID as was used prior to failure. If multiple users were using the connection, then TAF automatically logs them in as they attempt to process database commands. Unfortunately, TAF cannot automatically restore other session properties. These properties can, however, be restored by invoking a callback function.

Executed Commands: If a command was completely executed upon connection failure, and it changed the state of the database, TAF does not resend the command. If TAF reconnects in response to a command that may have changed the database, TAF issues an error message to the application.

Open Cursors Used for Fetching: TAF allows applications that began fetching rows from a cursor before failover to continue fetching rows after failover. This is called "select" failover. It is accomplished by re-executing a SELECT statement using the same snapshot, discarding those rows already fetched and retrieving those rows that were not fetched initially. TAF verifies that the discarded rows are those that were returned initially, or it returns an error message.

Active Transactions: Any active transactions are rolled back at the time of failure because TAF cannot preserve active transactions after failover. The application instead receives an error message until a ROLLBACK is submitted.

Server-side Program Variables: Server-side program variables, such as PL/SQL package states, are lost during failures; TAF cannot recover them. They can be initialized by making a call from the failover callback.

TAF Abilities (continued)

• TAF automatically:
  • Opens a new session
  • Rolls back the active transactions
• You have to rerun manually:
  • ALTER SESSION ... commands
  • Transactions (UPDATE, INSERT, ...)
• Exception handling
  • ORA-254xx

What Happens after the Failover

• For all active transactions (INSERT, UPDATE, and DELETE commands):
  • The ORA-25402 exception is raised: transaction must roll back
  • The application must roll back
  • The application must repeat the transaction
• FAILOVER_TYPE = SESSION
  • If only SELECT statements are in use, the user does not have to reconnect
• FAILOVER_TYPE = SELECT
  • Even with large queries, the user will not notice a failover
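
The "roll back, then repeat the transaction" rule above can be sketched as follows. This is an illustration in Python with stand-in classes, not a real driver API: `DatabaseError`, `FakeConnection`, `run_transaction`, and the `accounts` statements are all hypothetical names; only the error number 25402 comes from the slides.

```python
ORA_TRANSACTION_MUST_ROLL_BACK = 25402


class DatabaseError(Exception):
    """Stand-in for a driver error; `code` holds the ORA- number (hypothetical)."""

    def __init__(self, code, message):
        super().__init__(f"ORA-{code:05d}: {message}")
        self.code = code


class FakeConnection:
    """Simulates a session that fails over once, on the Nth statement."""

    def __init__(self, fail_on_call):
        self._fail_on_call = fail_on_call
        self._calls = 0
        self.pending = []      # statements of the open transaction
        self.committed = []    # statements that reached the database
        self.rollbacks = 0

    def execute(self, statement):
        self._calls += 1
        if self._calls == self._fail_on_call:
            raise DatabaseError(ORA_TRANSACTION_MUST_ROLL_BACK,
                                "transaction must roll back")
        self.pending.append(statement)

    def commit(self):
        self.committed.extend(self.pending)
        self.pending.clear()

    def rollback(self):
        self.pending.clear()
        self.rollbacks += 1


def run_transaction(conn, work, max_attempts=3):
    """Run work(conn) and commit; on ORA-25402 roll back and replay everything.

    Returns the number of the attempt that succeeded.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            work(conn)
            conn.commit()
            return attempt
        except DatabaseError as exc:
            if exc.code != ORA_TRANSACTION_MUST_ROLL_BACK:
                raise
            conn.rollback()  # the application must roll back first
            if attempt == max_attempts:
                raise


def transfer(conn):
    conn.execute("UPDATE accounts SET balance = balance - 10 WHERE id = 1")
    conn.execute("UPDATE accounts SET balance = balance + 10 WHERE id = 2")


if __name__ == "__main__":
    conn = FakeConnection(fail_on_call=2)   # failover hits the second UPDATE
    print("succeeded on attempt", run_transaction(conn, transfer))
    print("committed statements:", len(conn.committed))
```

The design point is that the whole unit of work is passed in as a callable, so that after the rollback every statement of the transaction is replayed, not just the one that received the error.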

Transparent Application Failover (type SELECT)

DEMONSTRATION

TAF - type SELECT

SQL> SELECT * FROM emp WHERE dep_no = :x;

empno  name
-----  ------
7369   Smith
7499   Allen
7521   Ward
7566   Jones
7654   Martin
7698   Blake

[Diagram: the connection to Instance 1 breaks during the fetch; the client continues on Instance 2]

• Before the failure, Oracle Client stores:
  • Bind variables
  • Count of fetched rows
  • CRC of fetched rows
  • SCN at the time the query began
• After the failure, Oracle Client:
  • Re-executes the query as of that SCN (the SCN at which the lost query started)
  • Invisibly fetches the 3 rows already returned
  • Calculates a new CRC
  • Compares the old CRC with the new CRC

TAF in TYPE=SELECT mode

• The client stores the status of the query and the results so far
  • Bind variables
  • Count of fetched rows
  • CRC of fetched rows
  • SCN at the time when the query began
• The query is replayed when the connection is established again
  • The query is re-executed as of the old SCN
• Oracle Client
  • Calculates a new CRC of the fetched rows
  • Compares it with the old CRC
  • If they are not equal, an exception is raised (the order of the records has changed!)
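
The bookkeeping described above (bind variables, row count, CRC, SCN) can be simulated in a few lines. This is a sketch of the idea only, assuming a hypothetical `execute(binds, scn)` callable that returns rows; the class name `ResumableQuery` and the use of `zlib.crc32` over `repr(row)` are illustrative choices, and the slides do not say how the real client computes its checksum.

```python
import zlib
from typing import Callable, Dict, Iterable, Iterator, Tuple

Row = Tuple[int, str]
Executor = Callable[[Dict[str, int], int], Iterable[Row]]


def _update_crc(crc: int, row: Row) -> int:
    """Fold one row into a running CRC-32."""
    return zlib.crc32(repr(row).encode("utf-8"), crc)


class ResumableQuery:
    """Client-side state that lets a query resume on another node."""

    def __init__(self, execute: Executor, binds: Dict[str, int], scn: int) -> None:
        self.binds = binds      # bind variables
        self.scn = scn          # SCN at the time the query began
        self.fetched = 0        # count of fetched rows
        self.crc = 0            # CRC of fetched rows
        self._rows: Iterator[Row] = iter(execute(binds, scn))

    def fetch(self) -> Row:
        row = next(self._rows)
        self.fetched += 1
        self.crc = _update_crc(self.crc, row)
        return row

    def fail_over(self, execute: Executor) -> None:
        """Re-execute as of the saved SCN, silently skip the rows already
        delivered, and verify that they match what was delivered before."""
        rows = iter(execute(self.binds, self.scn))
        crc = 0
        for _ in range(self.fetched):
            try:
                crc = _update_crc(crc, next(rows))
            except StopIteration:
                raise RuntimeError("replay returned fewer rows than already fetched")
        if crc != self.crc:
            raise RuntimeError("replayed rows differ from the rows already fetched")
        self._rows = rows


if __name__ == "__main__":
    emp = [(7369, "Smith"), (7499, "Allen"), (7521, "Ward"),
           (7566, "Jones"), (7654, "Martin"), (7698, "Blake")]

    def instance(binds, scn):
        # Both simulated instances return the same consistent snapshot.
        return list(emp)

    query = ResumableQuery(instance, {"x": 10}, scn=1000)
    for _ in range(3):
        print(query.fetch())
    query.fail_over(instance)       # connection breaks, client switches node
    for _ in range(3):
        print(query.fetch())        # continues with the remaining rows
```

If the replayed query returned the same rows in a different order, the CRC comparison would fail and the sketch raises an error, which mirrors the "order of the records has changed" case on the slide.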

Types and Methods combined

Horizontal axis (Method): server resources and reconnect speed
Vertical axis (Type): client resources and resume functionality

                 Method = BASIC                          Method = PRECONNECT
Type = SESSION   Client is automatically logged into     Session activated on surviving node
                 surviving node of cluster
Type = SELECT    Query replayed on surviving node and    Session activated on surviving node, query
                 remaining records returned              replayed and remaining records returned

Transparent Application Failover

• Implemented in Oracle Client
  • Starting from version 8.0.6
  • Generally does not depend on RAC and may be used for:
    • A non-RAC (single-instance) database
    • High-availability clusters
    • A replicated database
    • A standby database
    • RAC
• Failover continues as long as the service is available

Use services when using TAF

• Issue with the default service
  • The default service is registered with the listener while the database is still in mount mode
  • Connections can be redirected to a database in mount mode and fail
• Additional services are registered only when the database is open
  • Any new service is controlled by the clusterware
  • The clusterware will not enable the service until the database instance is really available for connections
• Do not use the default service for automatic reconnect to a standby DB

Transparent Application Failover - API for Developers

TAF Callback

• If necessary, the application may define a function (callback) that is called at the time of failover
• The handler may be used:
  • To show a message to the user, for example: "Please wait"
  • To restore the session state
  • To repeat the work

Presenter
Presentation Notes
For more details on the failover callback, please refer to the Application Developers Guide to the Oracle Call Interface manual.
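
One common use of the callback, restoring session state, can be sketched as below. This is a language-neutral illustration written in Python with stand-in objects; `FailoverEvent`, `FakeSession`, and `SessionStateCallback` are hypothetical names, and the numeric values merely mirror the FO_* constants listed in the Java sample notes of this deck. A real application would register an equivalent handler through OCI, JDBC OCI, or ODP.NET.

```python
from enum import IntEnum
from typing import Callable, List


class FailoverEvent(IntEnum):
    """Event codes, mirroring the FO_* values quoted in the Java sample notes."""
    BEGIN = 1
    END = 2
    ABORT = 3
    REAUTH = 4
    ERROR = 5


FO_RETRY = 6  # return value asking the driver to retry the failover


class FakeSession:
    """Stand-in for a database session; it only records executed statements."""

    def __init__(self) -> None:
        self.executed: List[str] = []

    def execute(self, statement: str) -> None:
        self.executed.append(statement)


class SessionStateCallback:
    """Remembers ALTER SESSION statements and replays them after failover."""

    def __init__(self, notify: Callable[[str], None] = print) -> None:
        self._notify = notify
        self._session_setup: List[str] = []

    def alter_session(self, session: FakeSession, statement: str) -> None:
        session.execute(statement)
        self._session_setup.append(statement)   # keep for replay

    def __call__(self, session: FakeSession, event: FailoverEvent) -> int:
        if event is FailoverEvent.BEGIN:
            self._notify("Failing over, please wait...")
        elif event is FailoverEvent.END:
            for statement in self._session_setup:
                session.execute(statement)       # restore lost session state
            self._notify("Failover complete")
        elif event is FailoverEvent.ABORT:
            self._notify("Failover failed")
        elif event is FailoverEvent.ERROR:
            return FO_RETRY                      # ask for another attempt
        return 0


if __name__ == "__main__":
    callback = SessionStateCallback()
    old_session = FakeSession()
    callback.alter_session(old_session,
                           "ALTER SESSION SET NLS_DATE_FORMAT = 'YYYY-MM-DD'")

    new_session = FakeSession()                  # session on the surviving node
    callback(new_session, FailoverEvent.BEGIN)
    callback(new_session, FailoverEvent.END)
    print(new_session.executed)
```

Recording the statements at the point where they are first issued keeps the replay list in one place, so the callback does not need to know which settings the rest of the application relies on.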

"C" TAF Callback (OCI ≥ 8.0.6): "C" Example

Implement the callback procedure and take action:

  sb4 callback_fn(svchp, envhp, fo_ctx, fo_type, fo_event)
  ...
  {
    switch (fo_event) {
    case OCI_FO_BEGIN: {
      printf(" Failing Over ... Please stand by \n");
      ...

Register the TAF callback procedure:

  OCIFocbkStruct failover;
  ...
  failover.callback_function = &callback_fn;
  if (OCIAttrSet(srvh, OCI_HTYPE_SERVER,
                 &failover, 0, OCI_ATTR_FOCBK, errh) != OCI_SUCCESS)
  ...

Presenter
Presentation Notes
The basic structure of a user-defined application failover callback function is as follows:

  sb4 appfocallback_fn(dvoid *svchp, dvoid *envhp, dvoid *fo_ctx, ub4 fo_type, ub4 fo_event);

An example is provided in the later "Failover Callback Example" for the following parameters:

• svchp: the first parameter is the service context handle. It is of type dvoid *.
• envhp: the second parameter is the OCI environment handle. It is of type dvoid *.
• fo_ctx: the third parameter is a client context. It is a pointer to memory specified by the client. In this area the client can keep any necessary state or context. It is passed as a dvoid *.
• fo_type: the fourth parameter is the failover type. This lets the callback know what type of failover the client has requested. The usual values are:
  • OCI_FO_SESSION, which indicates that the user has requested only session failover.
  • OCI_FO_SELECT, which indicates that the user has requested select failover as well.
• fo_event: the last parameter is the failover event. This indicates to the callback why it is being called. It has several possible values:
  • OCI_FO_BEGIN indicates that failover has detected a lost connection and failover is starting.
  • OCI_FO_END indicates successful completion of failover.
  • OCI_FO_ABORT indicates that failover was unsuccessful, and there is no option of retrying.
  • OCI_FO_ERROR also indicates that failover was unsuccessful, but it gives the application the opportunity to handle the error and retry failover.
  • OCI_FO_REAUTH indicates that a user handle has been reauthenticated. To find out which, the application should check the OCI_ATTR_SESSION attribute of the service context handle (which is the first parameter).

Java TAF Callback (OCI ≥ 9.0.1): Java Example

Instantiate the callback class:

  import oracle.jdbc.OracleConnection;
  import oracle.jdbc.OracleOCIFailover;
  ...
  CallBack fcbk = new CallBack();
  ...

Register the TAF callback function:

  ((OracleConnection) conn).registerTAFCallback(fcbk, msg);
  ...

Implement the callback class and react:

  class CallBack implements OracleOCIFailover {
    public int callbackFn(Connection conn, Object ctxt,
                          int type, int event) {
      ...
      switch (event) {
      case FO_BEGIN:

Presenter
Presentation Notes
JDBC OCI Application Failover Callbacks: OCIFailOver.java

This sample demonstrates the registration and operation of JDBC OCI application failover callbacks. For information on Transparent Application Failover (TAF) and failover events, see "OCI Driver Transparent Application Failover".

/*
 * This sample demonstrates the registration and operation of
 * JDBC OCI application failover callbacks.
 *
 * Note: Before you run this sample, set up the following
 * service in tnsnames.ora:
 *   inst_primary=(DESCRIPTION=
 *     (ADDRESS=(PROTOCOL=tcp)(Host=hostname)(Port=1521))
 *     (CONNECT_DATA=(SERVICE_NAME=ORCL)
 *       (FAILOVER_MODE=(TYPE=SELECT)(METHOD=BASIC))
 *     )
 *   )
 * Please see the Oracle Net Administrator's Guide for more detail about
 * failover_mode.
 *
 * To demonstrate the functionality, first compile and start up the sample,
 * then log into sqlplus and connect /as sysdba. While the sample is still
 * running, shut down the database with "shutdown abort;". At this moment,
 * the failover callback functions should be invoked. Now the database can
 * be restarted, and the interrupted query will be continued.
 */

// You need to import java.sql and oracle.jdbc packages to use
// JDBC OCI failover callback
import java.sql.*;
import java.net.*;
import java.io.*;
import java.util.*;
import oracle.jdbc.OracleConnection;
import oracle.jdbc.OracleOCIFailover;

public class OCIFailOver {
  static final String user = "scott";
  static final String password = "tiger";
  static final String driver_class = "oracle.jdbc.OracleDriver";
  static final String URL = "jdbc:oracle:oci8:@inst_primary";

  public static void main (String[] args) throws Exception {
    Connection conn = null;
    CallBack fcbk = new CallBack();
    String msg = null;
    Statement stmt = null;
    ResultSet rset = null;

    // Load JDBC driver
    try {
      Class.forName(driver_class);
    } catch (Exception e) {
      System.out.println(e);
    }

    // Connect to the database
    conn = DriverManager.getConnection(URL, user, password);

    // Register TAF callback function
    ((OracleConnection) conn).registerTAFCallback(fcbk, msg);

    // Create a Statement
    stmt = conn.createStatement();

    for (int i = 0; i < 30; i++) {
      // Select the ENAME column from the EMP table
      rset = stmt.executeQuery("select ENAME from EMP");

      // Iterate through the result and print the employee names
      while (rset.next())
        System.out.println(rset.getString(1));

      // Sleep one second to make it possible to shut down the DB.
      Thread.sleep(1000);
    } // End for

    // Close the ResultSet
    rset.close();

    // Close the Statement
    stmt.close();

    // Close the connection
    conn.close();
  } // End main()
} // End class OCIFailOver

/*
 * Define class CallBack
 */
class CallBack implements OracleOCIFailover {

  // TAF callback function
  public int callbackFn (Connection conn, Object ctxt, int type, int event) {
    /*********************************************************************
     * There are 7 possible failover events:
     *   FO_BEGIN = 1          indicates that failover has detected a
     *                         lost connection and failover is starting.
     *   FO_END = 2            indicates successful completion of failover.
     *   FO_ABORT = 3          indicates that failover was unsuccessful,
     *                         and there is no option of retrying.
     *   FO_REAUTH = 4         indicates that a user handle has been
     *                         re-authenticated.
     *   FO_ERROR = 5          indicates that failover was temporarily
     *                         unsuccessful, but it gives the application the
     *                         opportunity to handle the error and retry
     *                         failover. The usual method of error handling is
     *                         to issue sleep() and retry by returning the
     *                         value FO_RETRY.
     *   FO_RETRY = 6
     *   FO_EVENT_UNKNOWN = 7  It is a bad failover event.
     *********************************************************************/
    String failover_type = null;

    switch (type) {
      case FO_SESSION:
        failover_type = "SESSION";
        break;
      case FO_SELECT:
        failover_type = "SELECT";
        break;
      default:
        failover_type = "NONE";
    }

    switch (event) {
      case FO_BEGIN:
        System.out.println(ctxt + ": " + failover_type + " failing over...");
        break;
      case FO_END:
        System.out.println(ctxt + ": failover ended");
        break;
      case FO_ABORT:
        System.out.println(ctxt + ": failover aborted.");
        break;
      case FO_REAUTH:
        System.out.println(ctxt + ": failover.");
        break;
      case FO_ERROR:
        System.out.println(ctxt + ": failover error gotten. Sleeping...");
        // Sleep for a while
        try {
          Thread.sleep(100);
        } catch (InterruptedException e) {
          System.out.println("Thread.sleep has problem: " + e.toString());
        }
        return FO_RETRY;
      default:
        System.out.println(ctxt + ": bad failover event.");
        break;
    }
    return 0;
  }
}

ODP.NET Callback: C# Example (for .NET applications)

Implement the callback:

  public static FailoverReturnCode OnFailover(object sender,
                                              OracleFailoverEventArgs eventArgs)
  {
    switch (eventArgs.FailoverEvent)
    {
      case FailoverEvent.Begin:
        Console.WriteLine(" \nFailover begin - Failing Over");
        break;
      case FailoverEvent.End:
        Console.WriteLine("Failover ended ... ");
        break;
      ...

Register the TAF callback function:

  con.Failover += new OracleFailoverEventHandler(OnFailover);

Delphi Win32/.NET Callback: TAF support in Devart ODAC (not supported in DOA!)

  TTAFDemo = class
  public
    ...
    procedure OraSessionFailover(Sender: TObject;
                                 FailoverState: TFailoverState;
                                 FailoverType: TFailoverType;
                                 var Retry: Boolean);
  end;

  var
    v_xSession: TOraSession;
  begin
    ...
    FSession.OnFailover := OraSessionFailover;

Using TAF Callback

DEMONSTRATION

Server-Side TAF: TAF Setup for the Service

  begin
    dbms_service.modify_service(
      service_name        => 'oltp',
      aq_ha_notifications => true,
      failover_method     => dbms_service.failover_method_basic,
      failover_type       => dbms_service.failover_type_session,
      failover_retries    => 60,
      failover_delay      => 3);
  end;

• Centralized on the server:
  • Solves the problem of maintaining tnsnames.ora on clients
  • The server-side TAF settings take priority over the client settings
  • The PRECONNECT method is not supported

Setup of the TCP/IP Stack for TAF: For Linux Platforms

• Add to /etc/sysctl.conf:

  net.ipv4.tcp_keepalive_time=10     (idle time, in seconds, before keepalive messages are sent)
  net.ipv4.tcp_keepalive_intvl=5     (interval, in seconds, between probe packets)
  net.ipv4.tcp_keepalive_probes=5    (number of keepalive probe packets)
  net.ipv4.tcp_syn_retries=1         (number of attempts to retransmit SYN packets)
  net.ipv4.tcp_retries2=3            (number of failed retransmission attempts before giving up)

• Run the sysctl -p command

• Add the ENABLE=BROKEN attribute to tnsnames.ora:

  TAF =
    (DESCRIPTION =
      (ENABLE = BROKEN)
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = TCP)(HOST = rac1-vip)(PORT = 1521))
        (ADDRESS = (PROTOCOL = TCP)(HOST = rac2-vip)(PORT = 1521))
        (FAILOVER = true))
      (CONNECT_DATA =
        (FAILOVER_MODE = (TYPE = session)(METHOD = basic)(RETRIES = 2))
        (SERVICE_NAME = racdb.ru.oracle.com)))
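
To see what the keepalive values buy, the time needed to notice a dead peer on an idle connection can be estimated as tcp_keepalive_time + tcp_keepalive_intvl × tcp_keepalive_probes. The short Python sketch below applies that approximation to the settings from the slide; the helper names `parse_sysctl` and `keepalive_detection_seconds` are made up for this example, and the comparison values are the usual Linux defaults (7200, 75, 9).

```python
from typing import Dict


def parse_sysctl(text: str) -> Dict[str, int]:
    """Parse 'key=value' lines in sysctl.conf style into a dictionary."""
    settings: Dict[str, int] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blanks
        if "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = int(value.strip())
    return settings


def keepalive_detection_seconds(settings: Dict[str, int]) -> int:
    """Approximate time until an idle, dead connection is detected."""
    return (settings["net.ipv4.tcp_keepalive_time"]
            + settings["net.ipv4.tcp_keepalive_intvl"]
            * settings["net.ipv4.tcp_keepalive_probes"])


SLIDE_SETTINGS = """
net.ipv4.tcp_keepalive_time=10
net.ipv4.tcp_keepalive_intvl=5
net.ipv4.tcp_keepalive_probes=5
net.ipv4.tcp_syn_retries=1
net.ipv4.tcp_retries2=3
"""

LINUX_DEFAULTS = """
net.ipv4.tcp_keepalive_time=7200
net.ipv4.tcp_keepalive_intvl=75
net.ipv4.tcp_keepalive_probes=9
"""

if __name__ == "__main__":
    tuned = keepalive_detection_seconds(parse_sysctl(SLIDE_SETTINGS))
    default = keepalive_detection_seconds(parse_sysctl(LINUX_DEFAULTS))
    print(f"slide settings: about {tuned} s")      # 10 + 5 * 5
    print(f"Linux defaults: about {default} s")    # 7200 + 75 * 9
```

With the slide's values a broken idle connection is detected in about 35 seconds instead of more than two hours, which is what allows TAF to start the failover promptly. Keepalive probes are only sent on sockets that enable them, which is the role of ENABLE=BROKEN in the connect descriptor.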

TAF - Summary

• Transparent Application Failover is a powerful technology for increasing the availability of applications
• It enables users to work continuously
• It automatically continues to run queries after failures
• Developers may extend TAF's functionality with callback functions
• It may also be used for other purposes, for example planned hardware shutdowns or software updates

[Diagram: Computer A, Computer B, Computer C, Computer D]

Igor Melnikov
Senior Consultant, Oracle CIS

Email: [email protected]
Phone: +7 (495) 641 14 00
Direct: +7 (495) 641 14 42
Mobile: +7 (915) 205 26 27
