Migrating Jive To The Cloud
Sept 17, 2009
Migrating Jive to the Cloud: Practical Tips and Tricks
Matt Tucker, CTO, Jive Software
Jive Overview
• Founded in 2001 – Series A with Sequoia in Fall ’07
• Growing revenue 100% Y/Y and generating cash
• Industry analyst recognition as leader in our space
• Only vendor to bridge external and internal communities
• More than 2,500 customers
• Operations in 5 countries, new office in Palo Alto
• Over 150 employees (currently hiring +20)
Jive’s Cloud Evolution
Summer of 2008
• All deployments were installed software or ASP managed hosting
• Customers lacked easy, cheap way to start with Social Business Software
• Evaluated and selected Amazon EC2 environment
October 2008 – February 2009
• Dedicated, skunk-works team devoted to migrating Jive’s offering to the cloud
• First instances up and running in January 2009 – 3 months total development
Summer 2009
• Over 250 customers on Jive Express
• All customer sandbox sites migrated to the EC2 cloud
• Costs to run a Jive Express environment are 1/10 the cost of ASP
• In process of launching additional products via the cloud
Advantages for Jive’s Cloud Offering
ASP takes too long and costs too much
• 6 weeks to procure and install new servers
• Approximately $20K per installation, all up-front and no linear ramp
• No ability to turn off or manage capacity
Easier to manage
• Minutes, not days or months, to get up and running
• Instances spin up and down automatically
• Fail-over happens with admin tools in the background
• Vastly lower operational expense due to automation
Easy deployment for customers
• Tools to manage and track adoption
• Customized wizards guide through typical use cases
• Customers can migrate from EC2 to ASP or their own SW deployments
The Cloud & Enterprise
Enterprise Readiness Issues
• No SAS70 Type II certification for AWS
• Need to improve SLA for high-end customers, on the Jive side as well as AWS
• Enterprise security reviews have not caught up with the cloud yet. Standard evaluation criteria still focus on things like hardware vs. virtualization, data center tours, etc.
Ramifications
• Cloud is generally for smaller or “starter” Jive implementations; low percentage of revenue but gets us into larger deals
• We’ve built an easy migration path to on-prem or ASP
• At least 2 years away from widespread enterprise cloud readiness, but trend is happening
PaaS or IaaS?
PaaS | IaaS
Good choice for new code base | Good choice for existing code base
Innovative pricing models | Straightforward metering pricing
Not mature | Mature
Automatically scale | Scale yourself; not always easy
Multi-tenant apps | Multi-tenant or single-tenant apps
Key Technical Challenges
Bring multi-tenant cost efficiency to a single-tenant app
• Jive is a “fat” application. How do we fit in a small EC2 instance?
  – Cut down app startup time from 10 mins to 2 mins, use small Java heap
• Any customizations break easy/automated upgrading
  – Built new simplified admin console and did other simplifications via product overlay
• Must eliminate per-instance manual labor
  – Invested in radical level of automation that maintains the environment with very little manual intervention
Architecture Overview
[Architecture diagram: Provisioning Site, Controller Service, Redirect Service, EC2 Instances, EBS, S3, SQS, XMPP]
Trick: Scripting Java Install
• Basic tip: fully script the creation of your AMI!
• Ran into a problem: the Java install shows a license prompt, so it can’t be automated out of the box
# Install Sun JDK (messing with whiptail to avoid license prompt)
mv /usr/bin/whiptail /usr/bin/whiptail.orig
cat > /usr/bin/whiptail <<EOM
exit 0
EOM
chmod +x /usr/bin/whiptail
apt-get install -y sun-java6-jdk
rm /usr/bin/whiptail
mv /usr/bin/whiptail.orig /usr/bin/whiptail

export JAVA_HOME=/usr/lib/jvm/java-6-sun
rm /usr/bin/java
ln -s $JAVA_HOME/bin/java /usr/bin/java
Trick: Hibernation
• Further cut costs by automatically turning off instances that don’t get active use
• Trick is to use DNS redirect so that they can be turned back on within minutes via self-service
Hibernate: Stale EC2 Instance → Redirect Service (redirect DNS, set TTL to 60s)
Re-Awaken: Redirect Service → New EC2 Instance (redirect DNS, set normal TTL)
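The DNS switch behind hibernation can be sketched in shell as below. This is an illustration under stated assumptions, not the code Jive runs. The host names, the idle threshold, the use of CNAME records, and the "normal" TTL value are all hypothetical. Only the 60-second TTL comes from the slide. The functions print the record that would be published rather than calling a DNS provider.

```shell
#!/bin/sh
# Sketch of the hibernate / re-awaken DNS switch. Host names, the idle
# threshold, the record type, and the "normal" TTL are assumptions for
# illustration; only the 60s TTL comes from the slide.

REDIRECT_HOST="redirect.example.com"   # hypothetical redirect service
NORMAL_TTL=3600                        # assumed "normal" TTL
HIBERNATE_TTL=60                       # short TTL so a wake-up propagates fast

# is_stale LAST_ACTIVE_EPOCH NOW_EPOCH MAX_IDLE_SECONDS
# Succeeds when the site has been idle for longer than MAX_IDLE_SECONDS.
is_stale() {
    [ $(( $2 - $1 )) -gt "$3" ]
}

# dns_record SITE STATE [INSTANCE_HOST]
# Prints the record that should be published for the site.
dns_record() {
    case $2 in
        hibernated) echo "$1 CNAME $REDIRECT_HOST TTL $HIBERNATE_TTL" ;;
        awake)      echo "$1 CNAME $3 TTL $NORMAL_TTL" ;;
        *)          echo "unknown state: $2" >&2; return 1 ;;
    esac
}

# Example: a site idle for 40 days with a 30-day threshold is hibernated,
# then later pointed at a freshly started instance.
now=1253145600
last=$(( now - 40 * 86400 ))
if is_stale "$last" "$now" $(( 30 * 86400 )); then
    dns_record acme.example.com hibernated
fi
dns_record acme.example.com awake ec2-new.example.com
```

Keeping the TTL short while a site is hibernated means resolvers drop the redirect entry quickly once the site is re-awakened and pointed at its new instance.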
Trick: Upgrades
The trick: use elastic compute to do things you hadn’t imagined previously
• In ASP environment upgrades are run in-place and manually; requires multiple hours of scheduled downtime in case something goes wrong
• At EC2 we upgrade “alongside” rather than in-place
• Upgrades at EC2 are fully automated and performed en-masse
• Have achieved low 2% failure rate (fix generally only requires minor intervention)
Trick: Upgrades
How upgrades are done:
1) Make an instance read-only by putting up an upgrade message
2) Take an EBS snapshot of instance data
3) Create a NEW instance with NEW EBS volume from snapshot
4) Run upgrade on new instance using scripts
5) Run tests to ensure upgrade worked
6) Change elastic IP from old to new instance
7) Delete old EC2 instance and EBS volume
If any step fails, remove the maintenance message on the existing instance and log an error message. Failed attempts only cost $0.10.
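The seven steps and the failure rule can be sketched as a small shell driver. Every step function below is a hypothetical stub that only prints what it would do. A real version would call AWS and the product's own upgrade and test scripts. `FAIL_TESTS` is an assumed switch used here only to simulate a failing step.

```shell
#!/bin/sh
# Sketch of the "upgrade alongside" flow. Every step function is a
# hypothetical stub; none of them talks to AWS.

set_read_only()        { echo "1 read-only: $1"; }
snapshot_volume()      { echo "2 snapshot: $1"; }
launch_from_snapshot() { echo "3 new instance from snapshot of $1"; }
run_upgrade()          { echo "4 upgrade scripts"; }
run_tests()            { echo "5 tests"; [ "${FAIL_TESTS:-0}" -eq 0 ]; }
move_elastic_ip()      { echo "6 elastic IP moved from $1"; }
delete_old()           { echo "7 deleted $1 and its volume"; }
clear_read_only()      { echo "rollback: maintenance message removed on $1"; }

# upgrade_instance INSTANCE_ID
# Runs the steps in order. If any step fails, the maintenance message is
# removed from the existing instance and the error is logged.
upgrade_instance() {
    old=$1
    for step in set_read_only snapshot_volume launch_from_snapshot \
                run_upgrade run_tests move_elastic_ip delete_old; do
        if ! "$step" "$old"; then
            clear_read_only "$old"
            echo "upgrade of $old failed at step: $step" >&2
            return 1
        fi
    done
    return 0
}

upgrade_instance i-old-example
```

Because the upgrade runs on a copy, a failure before the elastic IP moves leaves the old instance untouched, which is why putting it back into service is the whole rollback.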
Trick: XMPP
• SQS is fantastic for asynchronous message processing; we use it to deliver things like hourly stats. But it doesn’t solve all problems
• Use XMPP for real-time controller to instance communication
• Enables multi-step synchronous actions like creating a downloadable data backup
• Simpler and faster development than complicated web services
Tip: Reserved Instances
• Lower costs by >= 30%: purchase reserved instances
• Updated provisioning code to ensure that we always use an availability zone that has reserved instances first
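A zone-selection rule of this kind might look like the sketch below. The `zone:free` input format and the `pick_zone` name are hypothetical. A real provisioning system would derive the free reserved counts from its AWS account data, which is not shown here.

```shell
#!/bin/sh
# Sketch: prefer an availability zone that still has unused reserved
# capacity. The "zone:free" input format and the function name are
# hypothetical.

# pick_zone "zone:free_reserved" ...
# Prints the first zone with free reserved instances, otherwise the first
# zone listed (which would run at the regular rate).
pick_zone() {
    fallback=""
    for entry in "$@"; do
        zone=${entry%%:*}
        free=${entry##*:}
        if [ -z "$fallback" ]; then
            fallback=$zone
        fi
        if [ "$free" -gt 0 ]; then
            echo "$zone"
            return 0
        fi
    done
    echo "$fallback"
}

pick_zone us-east-1a:0 us-east-1b:2 us-east-1c:5
```

With the sample input the function prints `us-east-1b`, the first zone that still has reserved capacity free.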
Tip: Retry AWS Calls
• We’ve found that 2-5% of AWS web services calls fail
• Work around this by adding retry logic to critical code paths; retrying major functional actions has been easier than retrying individual AWS calls (e.g., retry everything that goes into creating a new instance and include robust cleanup code)
• Added reporting to track all “orphaned” resources for edge cases where cleanup isn’t perfect
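Retrying a whole action with cleanup between attempts can be sketched as follows. The names `retry_action`, `create_instance`, and `cleanup_instance` are hypothetical. The create stub fails twice on purpose to simulate flaky web service calls and makes no AWS calls.

```shell
#!/bin/sh
# Sketch: retry a whole functional action (not each AWS call), cleaning up
# between attempts. All function names here are hypothetical.

# retry_action MAX_ATTEMPTS ACTION CLEANUP
# Runs ACTION; on failure runs CLEANUP and tries again, up to MAX_ATTEMPTS.
retry_action() {
    max_attempts=$1
    action=$2
    cleanup=$3
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if "$action"; then
            return 0
        fi
        # Cleanup may itself fail; note it so the leftover resource can be
        # picked up by orphan reporting.
        "$cleanup" || echo "orphan check needed after attempt $attempt" >&2
        attempt=$((attempt + 1))
    done
    return 1
}

# Stub standing in for "everything that goes into creating a new instance".
# It fails on the first two calls to simulate flaky web service calls.
CALLS=0
create_instance() {
    CALLS=$((CALLS + 1))
    [ "$CALLS" -ge 3 ]
}

# Stub cleanup; a real one would release volumes, IPs, and so on.
cleanup_instance() {
    return 0
}

if retry_action 5 create_instance cleanup_instance; then
    echo "created after $CALLS attempts"
else
    echo "gave up after $CALLS attempts"
fi
```

Retrying at the level of the whole action keeps the retry logic in one place, at the price of needing cleanup that is safe to run after a partial failure.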
Tip: Use Userdata
• Possible to pass in dynamic data to instance when booting as userdata
• Userdata has small size limit so we securely download full startup script from S3 then execute it
$ export INSTANCESTARTUP_VERSION=instanceStartup-1.0.1.sh
$ /usr/local/jive/bin/s3-curl/s3curl.pl --id $AWS_ACCESS_ID --key $AWS_SECRET_KEY -- -f --retry 5 --connect-timeout 10 -y 10 http://xxx.s3.amazonaws.com/$INSTANCESTARTUP_VERSION > $INSTANCESTARTUP_VERSION
$ chmod +x $INSTANCESTARTUP_VERSION
$ ./$INSTANCESTARTUP_VERSION
Tip: Handling Email
• Sending email from EC2 doesn’t work: reverse DNS won’t resolve the way it needs to, and big providers simply mark all of EC2 as spam
• Solution: relay mail to an external server at a trusted IP address. We use the same infrastructure that the ASP environment does. Large amount of email being sent = high sender score
Also, check out Sendgrid (http://www.sendgrid.com)
Demo: Backend Tools