Hybrid Web Cluster - Technical Whitepaper 2.2 for GoGrid


    Hybrid Web Cluster Whitepaper Version 2.0

    Hybrid Web Cluster Whitepaper: High Availability, Scalable 'Cloud Sites' Deployments with No Single Point of Failure

    Version 2.0

    Luke Marsden, CTO

    Hybrid Logic Ltd.

    +1-415-449-1165 (US) / +44-203-384-6649 (UK)

    [email protected]

    http://www.hybrid-cluster.com/

    Wednesday 2 November 2011


    Table of Contents

    Hybrid Web Cluster
        Abstract
    Keeping Your Websites Online
        Understanding Business Continuity Planning
        Scaling Websites When There Are Spikes In Traffic
        Protect Against User Error with Continuous Data Protection
        Choosing Between Shared and Direct Attached Storage
    Technical Details
        Hybrid Web Cluster: A Paradigm Shift in Web Hosting
        More Than Just LAMP Web Hosting
        Infrastructure-as-a-Service Integration
        Analysis of a Typical Web Request
        Keeping you online in a disaster: Practical choices
        Tunable Parameters
    Integration Options
        Control Panel
        API
        WHMCS & Parallels APS
    Conclusion
        Derive Competitive Advantage with Hybrid Web Cluster

    Abstract

    Providers of hosting solutions today are faced with a myriad of challenges in transitioning to the cloud. On this journey, many issues remain constant. In this paper we offer solutions to four key problems:

    1. Business continuity planning and disaster recovery.

    2. Scaling websites when there are spikes in traffic.

    3. Protecting against user error with continuous data protection.

    4. Choosing between shared and direct-attached storage.

    The revolutionary technology in Hybrid Web Cluster solves these problems for you, enabling you to deliver the next-generation cloud web hosting that your customers are demanding.


    Keeping Your Websites Online

    Hybrid Web Cluster allows previously unseen resilience in the face of hardware, network and system failures, completely eliminating single points of failure.

    Understanding Business Continuity Planning

    There are two key metrics used by industry to evaluate available Disaster Recovery (DR) solutions. These are called Recovery Point Objective (RPO) and Recovery Time Objective (RTO). Typically one has a primary site and a DR site, where data is replicated from the primary to the DR site at a certain interval:

    As you can see from the diagram above, RPO is the amount of data lost in a disaster (such as the failure of a server or data center). This depends on the backup or replication frequency, since the worst case is that the disaster occurs just before the next scheduled replication.

    RPO = data loss measured in time

    RTO defines the amount of time it takes an organization to react to a disaster (whether automatically or manually; typically there will be at least some manual element such as changing IP addresses), performing the reconfiguration necessary to recreate the primary site at the DR site. For example, if there is a fire at the primary site you would need to order new hardware and re-provision your servers from backups. For most web hosting companies, retaining an exact replica of every server at the primary site at the DR site is not economically viable. For example, a hosting company recently interviewed described how the purchase of an additional NetApp storage appliance at the DR site was financially infeasible, and therefore if the primary were to fail permanently then the RTO could be measured in weeks.

    RTO = downtime

    Hybrid Web Cluster makes enterprise standards of RPO and RTO available to all web hosts without the additional costs of any expensive shared storage hardware. We guarantee RPO = 5 minutes and RTO = 2 minutes. This can be adjusted according to your requirements, as a trade-off between disk and network I/O and acceptable amounts of data loss in a disaster scenario (see the Tunable Parameters section).

                                 RPO = data loss    RTO = downtime

    Conventional backup cycle    24 hours           48 hours

    Hybrid Web Cluster           5 minutes          2 minutes

    Compare our RPO and RTO to your current web hosting solution. If you have nightly backups to an off-site storage server, your RPO is 24 hours and your RTO is however long it would take your technicians to provision and reconfigure all the new hardware at your DR site. By automating continuous data protection and failover with Hybrid Web Cluster, you can significantly improve the guarantees you offer to your customers even in the worst-case scenario. In the context of cloud infrastructure, our solution can cope with the failure of an entire region [1].

    [1] http://aws.amazon.com/message/65648/
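    The relationship between replication interval and data loss can be made concrete with a little arithmetic. The following is a minimal sketch, assuming replication runs at a fixed interval and completes instantly; the function names are illustrative and are not part of Hybrid Web Cluster.

```python
from datetime import timedelta


def data_lost(replication_interval: timedelta, time_of_disaster: timedelta) -> timedelta:
    """Span of data lost if a disaster strikes at the given offset.

    Offsets are measured from the first replication. Everything written
    since the most recent replication is lost.
    """
    return time_of_disaster % replication_interval


def worst_case_rpo(replication_interval: timedelta) -> timedelta:
    """Worst case: the disaster hits just before the next replication."""
    return replication_interval


nightly = timedelta(hours=24)      # conventional nightly backup
continuous = timedelta(minutes=5)  # the 5 minute RPO quoted above

# A disaster 23 hours 59 minutes after the last nightly backup.
disaster_at = timedelta(hours=23, minutes=59)
print(data_lost(nightly, disaster_at))     # 23:59:00
print(data_lost(continuous, disaster_at))  # 0:04:00
```

    With a nightly cycle almost a full day of changes is lost, whereas a five minute interval caps the loss at five minutes regardless of when the disaster occurs.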


    Scaling Websites When There Are Spikes In Traffic

    In this section we compare the Hybrid Web Cluster scalability model to both the common approach of installing many websites on a single server and the CloudLinux model, the current industry leader.

    In the common shared hosting model, a web hosting company will simply install a lot of websites on a single server without any High Availability (HA) or redundancy, and set up a nightly backup via rsync.

    In this model when a website gets very popular, the server which is hosting it is also busy serving requests for a lot of other websites and becomes over-loaded. Typical consequences of this are that the server will start to respond very slowly as the required number of I/O operations per second exceeds the capacity of the server. The server will soon run out of memory as the web requests stack up, start swapping to disk, and thrash itself to death. This results in everyone's websites going offline.

    The technique advocated by CloudLinux is to contain the spike of traffic by imposing OS-level restrictions on the site which is experiencing heavy traffic. This is clearly an improvement because the other sites on the server stay online. However, the disadvantage of this approach is that the website which is gaining the traffic is necessarily slowed down or stopped completely. If the server were to try to fully service all incoming requests for that site, it would crash, as above.

    This is where the Hybrid Web Cluster model really wins. The moment that a big spike in traffic happens is not when your users want to be worrying about migrating to a dedicated server!

    Rather than strangling the site which is experiencing the spike in traffic, Hybrid Web Cluster dynamically live-migrates the other websites on that server to other servers in the cluster, with no downtime for any sites, so your users get the full benefit of automatic scalability.

    Site Juggler Live Migration delivers three orders of magnitude [2] greater scalability than shared hosting solutions, allowing websites to scale by intelligently and transparently migrating them between hosts.

    [2] Assuming just 500 websites per server, you can burst to 2 dedicated servers, or 1,000x scalability.
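    The footnote's arithmetic, and the idea of moving the neighbours away from a busy site rather than throttling it, can be sketched as follows. This is a minimal illustration under stated assumptions: loads are expressed as a percentage of one server, placement is a simple greedy choice, and the names `burst_factor` and `plan_migrations` are hypothetical. It is not the Site Juggler algorithm.

```python
def burst_factor(sites_per_server: int, dedicated_servers: int) -> int:
    """Capacity after bursting, relative to an equal share of one server."""
    # One of N co-hosted sites gets roughly 1/N of a server, so moving to
    # `dedicated_servers` whole servers multiplies its capacity by N * servers.
    return sites_per_server * dedicated_servers


def plan_migrations(loads: dict, hot_site: str, other_servers: dict,
                    capacity: int = 100) -> dict:
    """Move every site except `hot_site` to the least-loaded other server.

    `loads` maps site -> load on the busy server (percent of one server);
    `other_servers` maps server -> current load. Returns site -> destination.
    """
    projected = dict(other_servers)
    plan = {}
    # Place the heaviest neighbours first so they land on the emptiest servers.
    for site, load in sorted(loads.items(), key=lambda item: -item[1]):
        if site == hot_site:
            continue  # the busy site stays put and inherits the freed capacity
        target = min(projected, key=projected.get)
        if projected[target] + load > capacity:
            raise RuntimeError(f"no server has room for {site}")
        projected[target] += load
        plan[site] = target
    return plan


print(burst_factor(500, 2))  # 1000

loads = {"popular.example": 90, "blog.example": 10, "shop.example": 20}
others = {"node-b": 30, "node-c": 45}
print(plan_migrations(loads, "popular.example", others))
```

    In this example the two quiet sites are moved to `node-b` and `node-c`, leaving the whole original server to the site receiving the spike.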


    Protect Against User Error with Continuous Data Protection

    When considering data protection systems, it's important to understand what systems such as RAID or synchronous replication do and do not cover. They protect you against the failure of hardware, but if a user accidentally deletes some data, such systems will replicate the deletion to the other device and the data will be permanently lost.

    A better solution is Continuous Data Protection, or as we call it, our Point-In-Time Restore feature, which takes continual point-in-time snapshots of all the data stored on the system, and exposes them to the end user via a friendly web user interface so that they can undo their mistakes without administrator intervention.
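    The difference between mirroring and point-in-time snapshots can be shown with a toy in-memory model. This is a minimal sketch for illustration only: the class and function names are hypothetical, and real snapshots are taken at the filesystem level rather than by copying dictionaries.

```python
import copy


class SnapshottedStore:
    """A toy file store that keeps point-in-time snapshots of its contents."""

    def __init__(self):
        self.files = {}
        self.snapshots = []  # list of (label, frozen copy of files)

    def snapshot(self, label):
        """Record the current contents under the given label."""
        self.snapshots.append((label, copy.deepcopy(self.files)))

    def restore(self, label):
        """Roll the store back to the snapshot with the given label."""
        for name, frozen in self.snapshots:
            if name == label:
                self.files = copy.deepcopy(frozen)
                return
        raise KeyError(label)


def mirror(primary: dict) -> dict:
    """Synchronous replication: the mirror always equals the primary."""
    return dict(primary)


store = SnapshottedStore()
store.files["index.php"] = "<?php echo 'hello'; ?>"
store.snapshot("10:00")

del store.files["index.php"]   # the user deletes a file by mistake
replica = mirror(store.files)  # the mirror faithfully replicates the deletion
print(replica)                 # {}

store.restore("10:00")         # the snapshot still holds the file
print(sorted(store.files))     # ['index.php']
```

    The mirror protects against a failed disk but not against the deletion; only the snapshot lets the user undo the mistake.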

    Choosing Between Shared and Direct Attached Storage

    Our HCFS Data Replication allows you to take advantage of the performance and cost savings of direct-attached storage. At his presentation at HostingCon, Siena Fath-Azam of Storm on Demand described the dichotomy between shared storage (Storage Area Network, or SAN) and Direct Attached Storage (DAS).

    A SAN is a storage device, typically from a vendor such as NetApp or EMC, which provides a central location for your servers to keep their data. DAS just means connecting disks directly to your servers. The following table summarises his talk.

          Storage Area Network (SAN)                    Direct Attached Storage (DAS)

    Pros  Ease of movement of applications between      Performance is always better.
          servers (data is never stored on a specific
          server).

          Reliability, typically because of more        Cost is always lower.
          expensive hardware.

          Easier to do traditional High-Availability:   Easier to customize.
          if an application fails on one server, you
          can start it on another server.

    Cons  Performance: network-based storage is         Difficult to deploy traditional HA, because
          always slower than direct attached storage    if a server fails then it had the data
          because the data has to travel further.       stored on it. You need something else.

          Cost: always more expensive, by a factor of   It's difficult to move applications between
          2-3x.                                         instances.

          Failures are horrifying (everything fails!).
          Examples: VPS.net, Amazon EBS, MediaTemple.

    Hybrid Web Cluster's HCFS data replication is the missing piece of the puzzle, making it trivial to migrate instances between servers with direct-attached storage (there's a button in the Control Panel for it), thereby solving the data management issues normally associated with direct-attached storage. It simultaneously adds fault tolerance to otherwise vulnerable servers without relying on a similarly fallible central system, resulting in a more reliable, higher performance, and 2-3x less expensive solution.

    Furthermore, our replication system works across Wide Area Networks such as the Internet, allowing you to migrate websites, databases and mailboxes between data centers, and fail over even if entire data centers fail.


    Each circle represents a snapshot the user can roll back to.


    Technical Details

    Hybrid Web Cluster: A Paradigm Shift in Web Hosting

    Hybrid Web Cluster represents a fundamental shift in the way you are able to provision and deploy web hosting accounts across globally distributed physical infrastructure. In a nutshell, we keep your lights on in the face of hardware and network failures and automatically scale your websites in the face of quickly-changing and sometimes significant traffic levels. The following key innovations provide our licensees with an unparalleled feature set:

    1. Our pure-software HCFS data replication allows web clusters to run across geographically diverse regions, on inexpensive commodity hardware with high performance directly attached storage, or on public cloud infrastructure. Our replication system provides continuous data protection and automatic disaster recovery even if an entire region fails.

    2. Our distributed protocol handler AwesomeProxy provides distributed and highly-available implementations of all the protocols you need: HTTP, HTTPS, FTP, MySQL, POP, IMAP, SMTP & SSH.

    3. Our live migration technology controls the HCFS and AwesomeProxy systems in tandem to provide two orders of magnitude of scalability beyond shared hosting via seamless and near-instant migration of websites and databases between servers.

    4. Our platform is compatible with every LAMP website and web application. Within each protected cluster instance, we run standard installations of Apache, MySQL, Exim and Dovecot, and applications do not need to be modified to run in this context.

    5. We provide a feature-complete, white-label brandable and reseller-compatible Control Panel which can fully replace CPanel or Plesk, which integrates with industry standard domain and SSL certificate providers and billing systems such as WHMCS, and has a complete API to allow you to integrate your existing billing & provisioning systems with your own cluster deployment.

    6. By leveraging OS-level multi-tenancy technology we offer customer densities orders of magnitude greater than IaaS-based solutions (up to 2,000 customers per server, rather than 30) while offering revolutionary levels of dynamic scalability (bursting from multi-tenancy to dedicated hardware).

    In total, the technology provides never-before-seen resilience in the face of hardware, network and system failures, completely eliminating single points of failure, while allowing you to use your existing investment in commodity hardware to compete with a feature set usually reserved for enterprise-cost and highly complex SAN-based solutions.

    More Than Just LAMP Web Hosting

    We will shortly be adding support for Python, Ruby, Node.js (via the emerging PaaS standard CloudFoundry) and NoSQL data stores CouchDB, MongoDB, Redis and Memcache.


    Analysis of a Typical Web Request

    The following diagram gives a high-level overview of the Hybrid Web Cluster system in terms of a typical scenario where a user uploads a new photo to their Wordpress blog.

    Note that the blue and orange boxes refer to logical, not physical entities. The only physical hardware required is the cluster nodes themselves. Note, therefore, the absence of expensive specialized hardware: in particular no hardware load balancer and no centralized shared storage.

    AwesomeProxy replaces load balancers and HCFS replaces SANs.

    This approach delivers a cost saving of 60-70% compared to classical clusters and cloud infrastructure solutions based on shared storage.

    This is what happens when the user uploads a new photo to their blog:

    1. The user's browser looks up the website address in DNS and is returned a list of live nodes which are geographically local to the current master for that site. The browser connects to one of them.

    2. AwesomeProxy discovers which website is being requested and passes on the request to the correct server. Apache writes the new photo to disk on the current master.

    3. Within a few seconds, HCFS detects the write to the filesystem and makes a consistent point-in-time snapshot of the new data. Moments later, the change has been replicated to the slaves for that filesystem: typical cluster configurations (see Tunable Parameters section) mean that the data is replicated to another machine in the same data center and one machine in a remote data center.

    The direct consequences of these two additional layers (the distributed protocol handler above, and the replication system below) are that any server or entire data center can fail and the cluster will automatically reconfigure itself so that your websites stay online. The distributed protocol handler ensures that requests are always routed to a server which is online and able to serve your site, and the HCFS replication engine ensures that your data is always safe.
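    The routing and fail-over behaviour described above can be modelled in a few lines. This is a toy sketch under loud assumptions: the class `ClusterRouter`, its methods, and the node names are hypothetical, requests are identified by their HTTP Host header, and the first live slave is promoted on failure. It does not describe how AwesomeProxy or HCFS are actually implemented.

```python
class ClusterRouter:
    """Toy model of routing requests to the current master for each site."""

    def __init__(self, masters, slaves):
        self.masters = dict(masters)  # site -> node holding the master copy
        self.slaves = {site: list(nodes) for site, nodes in slaves.items()}
        self.failed = set()

    def route(self, host_header):
        """Return the node that should serve a request for this Host header."""
        return self.masters[host_header.lower()]

    def mark_failed(self, node):
        """Record a node failure and promote a live slave for affected sites."""
        self.failed.add(node)
        for site, master in list(self.masters.items()):
            if master != node:
                continue
            live = [n for n in self.slaves[site] if n not in self.failed]
            if not live:
                raise RuntimeError(f"no live replica left for {site}")
            self.masters[site] = live[0]
            self.slaves[site].remove(live[0])


# One slave in the local data center and one in a remote data center,
# matching the typical configuration described in step 3.
router = ClusterRouter(
    masters={"blog.example": "dc1-node1"},
    slaves={"blog.example": ["dc1-node2", "dc2-node1"]},
)
before = router.route("Blog.Example")
router.mark_failed("dc1-node1")
after = router.route("blog.example")
print(before, "->", after)  # dc1-node1 -> dc1-node2
```

    Because every request is resolved through the routing table, promoting a slave is enough to redirect traffic; no client-side change is needed.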


    Keeping you online in a disaster: Practical choices

    When building distributed (cloud) systems you can pick at most two of the following features:

    1. Consistency: if a system is consistent, then queries to different nodes for the same data will always result in the same answer.

    2. Availability: the system always responds to requests with a valid response.

    3. Partition tolerance: if the parts of a distributed system become disconnected from each other, they can continue to operate.

    Hybrid Web Cluster chooses Availability and Partition tolerance over Consistency, because website owners, given the choice, would rather their website stay online in a disaster scenario.

    When a cluster becomes partitioned, for example if an under-sea cable gets cut and the European and US components of a cluster can no longer communicate with each other, the cluster elects new masters for all the sites on both sides of the partition in order to keep the websites online on both sides of the Atlantic.

    When traffic is re-routed or the under-sea cable is repaired, the cluster rejoins and the masters negotiate which version of the website is more valuable based on how many changes have been made on both sides of the partition. This keeps your websites online all the time, everywhere in the world.
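    The rejoin step can be illustrated with a small sketch. It assumes that each side of the partition simply counts the changes made since the split and that the side with more changes wins; the tie-break in favour of the first argument, and the names `SiteVersion` and `reconcile`, are assumptions made for this example rather than documented behaviour.

```python
from dataclasses import dataclass


@dataclass
class SiteVersion:
    master: str               # node that acted as master on this side
    changes_since_split: int  # changes made while the cluster was partitioned


def reconcile(a: SiteVersion, b: SiteVersion):
    """Return (winner, loser) once the two sides can talk again.

    The version with more changes since the split is treated as more
    valuable. On a tie the first argument wins (an assumption).
    """
    if b.changes_since_split > a.changes_since_split:
        return b, a
    return a, b


europe = SiteVersion(master="eu-node1", changes_since_split=12)
us = SiteVersion(master="us-node1", changes_since_split=3)

winner, loser = reconcile(europe, us)
print(winner.master)  # eu-node1
```

    This is the price of choosing availability over consistency: both sides accept writes during the partition, so one side's version has to be chosen when they rejoin.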

    Tunable Parameters

    Hybrid Web Cluster allows customization of the topology and timing values for the replication and fail-over aspects of the software. The following variables can be adjusted on a per-cluster basis, giving you the benefit of being able to customise the cluster to your specific requirements:

    Variable
        Explanation

    Check Timeout
        How many seconds a server may appear to be offline before recovery actions are taken. This, plus the length of time needed to effect an automatic recovery (about a minute), corresponds directly to RTO.
        Default: 60 seconds

    Snapshot Quick Timer
        How many seconds to wait before snapshotting and replicating a filesystem which has been modified once. A lower number results in faster replication and a smaller RPO, but uses proportionally more disk and network I/O.
        Default: 120 seconds

    Snapshot Interval Timer
        As above, but applying to filesystems which are being constantly modified.
        Default: 300 seconds

    Local Redundancy
        The number of slaves to set up for each filesystem in the data center local to the current master. Determines the failure-tolerance of the local data center in terms of how many servers may fail locally before any data is lost.
        Default: 1

    Global Redundancy
        The number of other remote (relative to the current master) data centers in which to set up slaves for each filesystem. Determines the failure-tolerance in terms of how many data centers may fail before any data is lost.
        Default: 1

    Slaves per Remote Locality
        The number of servers to add as slaves in each remote data center.
        Default: 1

    Replication Concurrency
        How many concurrent replication events to allow. This should be adjusted to match approximately the number of spindles (disks) you have in each cluster node, due to disks performing better when operations are serialized.
        Default: 2
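    The defaults can be checked against the guarantee stated earlier (RPO = 5 minutes, RTO = 2 minutes). The following is a minimal sketch, assuming that downtime is the detection time plus the recovery time of about a minute, that worst-case data loss is bounded by the longer snapshot timer, and that replication transfer time is negligible. The `Tunables` class and helper functions are illustrative names, not part of the product.

```python
from dataclasses import dataclass

RECOVERY_SECONDS = 60  # "about a minute" to effect an automatic recovery


@dataclass
class Tunables:
    check_timeout: int = 60             # seconds
    snapshot_quick_timer: int = 120     # seconds
    snapshot_interval_timer: int = 300  # seconds
    local_redundancy: int = 1
    global_redundancy: int = 1
    slaves_per_remote_locality: int = 1
    replication_concurrency: int = 2


def worst_case_downtime(t: Tunables) -> int:
    """Seconds of downtime: time to detect the failure plus time to recover."""
    return t.check_timeout + RECOVERY_SECONDS


def worst_case_data_loss(t: Tunables) -> int:
    """Seconds of data at risk: the longest wait before a snapshot is taken."""
    return max(t.snapshot_quick_timer, t.snapshot_interval_timer)


def slaves_per_filesystem(t: Tunables) -> int:
    """Total replicas kept in addition to the master copy."""
    return t.local_redundancy + t.global_redundancy * t.slaves_per_remote_locality


defaults = Tunables()
print(worst_case_downtime(defaults) // 60, "minutes")   # 2 minutes
print(worst_case_data_loss(defaults) // 60, "minutes")  # 5 minutes
print(slaves_per_filesystem(defaults), "slaves")        # 2 slaves
```

    With the defaults, each filesystem has one slave in the local data center and one in a remote data center, which matches the typical configuration described in the web request walkthrough.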


    Integration Options

    Control Panel

    We have developed an advanced, easy-to-use, AJAX-based and white-label brandable Control Panel. Here is a brief tour of the Control Panel. It has multiple levels of users: Cluster Administrators, Resellers and Web Hosting Users. This is the dashboard which shows a Cluster Administrator's view:

    This shows a user adding a Wordpress blog on an external domain:

    Adding the blog takes just a few seconds, after which the blog is safely replicated across multiple data centers, immediately ready for automated fail-over and scalability if a server or data center fails or the website gets a spike in traffic.


    The Control Panel supports advanced features such as a full custom DNS editor with user-friendly wizards, and also allows complete white-label branding on a per-reseller level, as the following screenshots show. This allows you to completely change the look and feel of the entire Control Panel with just a few clicks. Here we have used a popular cloud infrastructure brand just as an example:

    This is just a small taste of what our Control Panel can do. Sign up for a trial to explore it for yourself!

    API

    In addition to the powerful, flexible and modern Control Panel we also provide a comprehensive JSON/XML API with over 100 commands, which allows you to perform a full and complete integration with any billing and provisioning system, to add a Cloud Sites option to your existing systems. The API documentation can be found at:

    http://www.hybrid-cluster.com/api

    The API allows you to control every aspect of scalable, redundant website, database and mailbox deployments.

    WHMCS & Parallels APS

    We are also working on plugins for the above systems. Please contact us if you wish to test them out.


    Conclusion

    Derive Competitive Advantage with Hybrid Web Cluster

    Clearly, the hosting industry is changing. Hybrid Web Cluster gives you the tools your business needs to survive in a highly competitive landscape, and with 60-70% cost savings over SAN-based products, and 40-60x better densities than virtualisation-only solutions, can deliver the margins you need to thrive.

    Start your 30 day free trial today and derive competitive advantage. To obtain access to a Hybrid Web Cluster installation, contact:

    Luke Marsden, CTO

    Hybrid Logic Ltd.

    +44-203-384-6649 (UK) / +1-415-449-1165 (US)

    [email protected]

    http://www.hybrid-cluster.com/
