High Availability in Hurricane Alley
Multi-site multi-node CAS Deep in the Heart of Texas
Srinivas Varadaraj & Bill Thompson
Jasig Sakai Conference




Agenda

1. Strategy
2. Technical requirements
3. Constraints
4. Stuff at hand
5. Architectural decisions
6. Cluster & production architecture
7. Challenges and solutions
8. Multi-site routing
9. Production experiences
10. Questions & Comments


Strategic requirements

Single Identity

Single Sign On/ Single Sign Off

Maximize self service tools

Improved user experience


Technical requirements

• Application compatibility
• High availability
• Rolling maintenance
• Transparency
• Scalability
• AD integration
• Customization (branding)


Constraints

• Limited budget, use existing resources
  – Power in the datacenters
  – Single internet
  – High latency connectivity
• Limited in-house development & experience
  – Stay close to release code
• Aggressive timeframe


Stuff we had at hand

• SAN infrastructure with replication to DR
• VM clusters
• Site-to-site VPN based connectivity to DR
• F5 load balancers
• Dedicated firewalls
• Opportunity


Decisions! Decisions! Decisions!

• Virtual machines
• SAN based storage
• The great ticket registry debate
• To replicate tickets or NOT!
• Building by cloning
• "Appliance" like
• SSL: local vs offloading
• Cluster vs standalone application servers
• Timeout!


Cluster components


Final Architecture


“Holy troubles, Batman!”

• SSL offloading
  – Tomcat offloading workaround

• Authentication and validation persistence
  – User and application can go to either site
  – Enter site identifiers

• Multi-site ticket replication
  – Latency in WAN

• Algorithm usage in phpCAS clients and Java CAS clients

• Slow performance of mod_auth_cas on VMs


Routing logic

• HTTP_REQUEST
• HTTP_REQUEST_DATA
• HTTP_RESPONSE


HTTP_REQUEST (Request from the client)

HTTP_REQUEST {
  1) Grab header length to determine payload size
  2) If both sites are down, redirect to a branded service unavailable page
  3) If URI has siteID of other site and other site is up, route to other site
  4) Otherwise default route to local site
}
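The decision order in steps 2 to 4 can be sketched outside the load balancer. This is a minimal Python illustration, not the presenters' iRule: the site identifiers `siteA` and `siteB`, the function name `route_request`, and the idea that the siteID appears as a plain substring of the URI are all assumptions made for the example. Step 1 is omitted because it only prepares the payload inspection done in the next event.

```python
# Hypothetical site identifiers; the slides do not give the real values.
LOCAL_SITE = "siteA"
OTHER_SITE = "siteB"


def route_request(uri: str, local_up: bool, other_up: bool) -> str:
    """Return 'unavailable', 'other', or 'local' for an incoming request."""
    # Step 2: both sites down -> branded service-unavailable page.
    if not local_up and not other_up:
        return "unavailable"
    # Step 3: the URI carries the other site's ID and that site is up.
    # Assumption: the siteID shows up as a substring of the URI,
    # for example as a suffix on the ticket parameter.
    if OTHER_SITE in uri and other_up:
        return "other"
    # Step 4: everything else goes to the local site, as the slide states.
    return "local"
```

For example, `route_request("/cas/serviceValidate?ticket=ST-1-abc-siteB", True, True)` is routed to the other site, while the same request with the other site down falls through to the local default.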


HTTP_REQUEST_DATA (Payload manipulation)

HTTP_REQUEST_DATA {
  1) Grab <samlp:AssertionArtifact> from payload; this may contain siteID
  2) If we have a siteID of the other side {
       If the siteID is loadbalancer introduced {
         Blank the loadbalancer extension
       }
       Route to other side
     } else {
       If we have a siteID of the local side {
         If the siteID is loadbalancer introduced {
           Blank the loadbalancer extension
         }
         Route to local side
       }
     }
}
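The payload handling above can be illustrated with a short Python sketch. It is written under stated assumptions rather than taken from the presenters' configuration: the site identifiers, the `-lb-` marker used to recognise a loadbalancer-introduced extension, the function name `route_payload`, and the fallback to the local site when no siteID is found are all hypothetical.

```python
import re
from typing import Tuple

# Hypothetical values; the slides do not show the real identifiers.
LOCAL_SITE = "siteA"
OTHER_SITE = "siteB"
LB_MARKER = "-lb-"  # assumed prefix of a loadbalancer-introduced siteID

ARTIFACT_RE = re.compile(
    r"<samlp:AssertionArtifact>(.*?)</samlp:AssertionArtifact>", re.S
)


def route_payload(payload: str) -> Tuple[str, str]:
    """Return (destination, payload to forward) for a validation request."""
    match = ARTIFACT_RE.search(payload)
    if match is None:
        # Assumption: with no artifact, keep the request on the local site.
        return "local", payload
    artifact = match.group(1).strip()
    # Check the other site first, then the local site, as on the slide.
    for site, dest in ((OTHER_SITE, "other"), (LOCAL_SITE, "local")):
        lb_suffix = LB_MARKER + site
        if artifact.endswith(lb_suffix):
            # Loadbalancer-introduced ID: blank it before forwarding so the
            # CAS server sees the ticket it originally issued.
            cleaned = artifact[: -len(lb_suffix)]
            return dest, payload.replace(artifact, cleaned, 1)
        if artifact.endswith("-" + site):
            # ID issued by the CAS server itself: forward unchanged.
            return dest, payload
    # Assumption: an artifact without any siteID stays local.
    return "local", payload
```

The loadbalancer suffix is tested before the plain suffix because a ticket ending in `-lb-siteB` also ends in `-siteB`; checking in the other order would forward the extension to the CAS server intact.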


HTTP_RESPONSE (Response from the server)

HTTP_RESPONSE {
  1) Grab server's response headers
  2) If siteID is not in the response header {
       Introduce a loadbalancer siteID to compensate for Java CAS client
     }
  Release HTTP to client
}
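A rough Python sketch of the response step follows. The slide does not say which response header is inspected, so this example assumes it is a redirect `Location` header carrying a `ticket` query parameter; the site identifiers, the `-lb-` marker, and the function name `tag_location` are likewise hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical values; the slides do not show the real identifiers.
LOCAL_SITE = "siteA"
KNOWN_SITES = ("siteA", "siteB")
LB_MARKER = "-lb-"  # assumed prefix of a loadbalancer-introduced siteID


def tag_location(location: str) -> str:
    """Append a loadbalancer siteID to the ticket if it carries none."""
    parts = urlsplit(location)
    tagged = []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        has_site = any(value.endswith("-" + site) for site in KNOWN_SITES)
        if key == "ticket" and not has_site:
            # No siteID present: add one that marks the issuing (local)
            # site so later validation requests can be routed back to it.
            value = value + LB_MARKER + LOCAL_SITE
        tagged.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(tagged)))
```

A ticket that already ends in a known siteID passes through untouched, which keeps the step idempotent if a response crosses the load balancer more than once.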


Experiences in Production

• Approx. 8 months in production
• 7 applications in production, 10 in development
• Survived two power outages at DR
• Survived multiple internet outages
• Successful rolling upgrades to MySQL & CAS
• Flow based redesign
• LPPE
• Re-visit ticket registry


Questions/Comments

• Credits:
  – CAS developers and community
  – F5 & F5 DevCentral
  – Unicon
  – LU & Txstate
• Thank you for your time!
• Contacts:
  – Sri: [email protected]
  – Bill: [email protected]