Download - High Performance with Distributed Cachinginfo.couchbase.com/rs/302-GJY-034/images/High...High Performance with Distributed Caching: Key Requirements for Choosing the Right Solution

Transcript

High Performance with Distributed Caching

Key Requirements For Choosing The Right Solution

High Performance with Distributed Caching:Key Requirements for Choosing the Right Solution

Executive Summary

For many web, mobile, and Internet of Things (IoT) applications that run in clustered or cloud environments, distributed caching is a key requirement, for reasons of both performance and cost. By caching frequently accessed data in memory – rather than making round trips to the backend database – applications can deliver highly responsive experiences that today’s users expect. And by reducing workloads on backend resources and network calls to the backend, caching can significantly lower capital and operating costs.

Oracle Coherence and Memcached – two popular caching solutions

Two of the most common caching solutions in use today are Oracle Coherence (a proprietary product) and Memcached, an open source project. For businesses that run on Oracle databases, Coherence is often the default caching product. But many companies find that Oracle Coherence isn’t the best caching solution for every use case, so they seek out other products. The typical complaints with Oracle Coherence are its high cost, complexity, and lack of key features like cross data center replication.

At the other end of the spectrum, Memcached is a free open source product that’s used in thousands of web, mobile, and IoT applications around the world. It’s simple to install and deploy, and it delivers reliable high performance. However, Memcached has no enterprise support available, nor does it include a management console for easy monitoring. And many companies that deploy Memcached find they want additional capabilities not included in Memcached, such as more sophisticated availability features.

Why are companies replacing Oracle Coherence and Memcached with Couchbase Server?

A growing number of companies are replacing Oracle Coherence and Memcached with Couchbase Server. Couchbase Server is an open source (Apache 2.0 license), high-performance NoSQL document database, with strong roots in caching – in fact, Couchbase Server is a drop-in replacement for Memcached: It incorporates the Memcached protocol as a core component and extends it with additional capabilities. Couchbase Server is a great fit for many caching scenarios. More than 400 enterprises – including eBay, PayPal, LinkedIn, Intuit, Viber, and many others – have deployed Couchbase Server for mission critical applications.

Caching has become a de facto technology to boost application performance as well as reduce costs.

3

By caching frequently used data in memory – rather than making database round trips and incurring disk io overhead – application response times can be dramatically improved, typically by orders of magnitude.

About this guide

Based on Couchbase’s experience with leading enterprises, this guide:

• Explains the value of caching and common caching use cases

• Details the key requirements for an effective, highly available distributed cache

• Explains differences in capabilities across Oracle Coherence, Memcached, and Couchbase Server

• Describes how Couchbase Server provides a high performance, low cost, and easy-to-manage caching solution

Why Cache? Better Performance, Lower Costs

Today’s web, mobile and IoT applications need to operate at Internet scale: Thousands to millions of users. Terabytes (sometimes petabytes) of data. Sub-millisecond response times. Multiple device types. Global reach. To meet these Internet-scale requirements, modern applications are built to run on clusters of commodity servers in distributed computing environments, whether on enterprise data center or public clouds such as Amazon or Google Cloud.

Caching has become de facto technology to boost application performance as well as reduce costs. By caching frequently used data in memory – rather than making database round trips and incurring disk IO overhead – application response times can be dramatically improved, typically by orders of magnitude.

In addition, caching can substantially lower capital and operating costs by reducing workloads on backend systems and reducing network usage. In particular, if the application runs on a relational database like Oracle – which requires high end, costly hardware in order to scale – a distributed, scale-out caching solution that runs on low-cost commodity servers can reduce the need to add expensive resources.

APP APP APP

APPLICATION TIER

...

CACHE

...

DATABASE

...

DATA ACCESS LAYER

1

1 Write to cache and database.

RAM

APPLICATION TIER

...

CACHE

...

DATABASE

...

2 Read from cache.

3 If no longer cached, read from database.

2 3

DATA ACCESS LAYER

APP APP APP

RAM RAM RAM RAM RAM

Figure 1: High-level database cache architecture

4

Examples of common caching use cases

Due to clear performance and cost benefits, caching is used across numerous applications and use cases. Common use cases include:

• Speeding up RDBMS: Many web and mobile applications need to access data from a backend relational database management system – for example, inventory data for an online product catalog. However, relational systems were not designed to operate at Internet scale, and can be easily overwhelmed by the volume of requests from web and mobile applications, particularly as usage grows over time. Caching data from the RDBMS in memory is a widely used, cost-effective technique to speed up the backend RDBMS.

• Managing usage spikes: Web and mobile applications often experience spikes in usage (for example, seasonal surges like Black Friday and Cyber Monday, or when a video goes viral, etc.). In these and others cases, caching can prevent the application from being overwhelmed and can help avoid the need to add expensive backend resources.

• Mainframe offloading: Mainframes are still widely used in many industries, including financial services, government, retail, airlines, and heavy manufacturing, among others. A cache is used to offload workloads from a backend mainframe, thereby reducing MIPS costs (i.e., mainframe usage fees charged on a “Millions of Instructions Per Second” basis), as well as enabling completely new services otherwise not possible or cost prohibitive utilizing just the mainframe.

• Token caching: In this use case, tokens are cached in memory in order to deliver high performance user authentication and validation. eBay, for example, deploys Couchbase Server to cache token data for its buyers and sellers (over 100 million active users globally, who are served more than 2 billion page views a day).

• Web session store: Session data and web history are kept in memory – for example, as inputs to a shopping cart or real-time recommendation engine on an ecommerce site, or player history in a game.

Key Requirements for an Effective Distributed Caching Solution

The requirements for an effective caching solution are fairly straightforward. Enterprises generally factor six key criteria into their evaluation. How you weight them depends on your specific situation.

• Performance: Performance is the #1 requirement. Specific performance requirements are driven by the underlying application. For a given workload, the cache must meet and sustain the application’s required steady-state performance targets for latency and throughput. Efficiency of performance is a related factor that impacts cost, complexity, and manageability. How much hardware (RAM, servers) is needed to meet the required level of performance? Obviously less is better.

• Scalability: As the workload increases (e.g., more users, more data requests, more operations), the cache must continue delivering the same steady-state performance. The cache must be able to scale linearly, easily, affordably, and without adversely impacting application performance and availability.

• Availability: Data needs to be always available during both unplanned and planned interruptions, whether due to hardware failure or scheduled system maintenance, so the cache must ensure availability of data 24x365.

The use of a cache should not place undue burden on the operations team. It should be reasonably quick to deploy and easy to monitor and manage: set it and forget it.

5

The trend in enterprise it architecture is toward adopting open source software wherever possible, to help reduce risk, avoid vendor lock in, lower costs, and benefit from community-driven contributions and innovation.

• Manageability: The use of a cache should not place undue burden on the operations team. It should be reasonably quick to deploy and easy to monitor and manage: Set it and forget it.

• Simplicity: All other things equal, simplicity is always better. Adding a cache to your deployment should not introduce unnecessary complexity or make more work for developers.

• Affordability: Cost is always a consideration with any IT decision, both upfront implementation as well as ongoing costs. Your evaluation should consider total cost of ownership, including license fees as well as hardware, services, maintenance, and support.

Problems with Oracle Coherence: Cost, Complexity, and Missing Capabilities

Oracle Coherence is recognized as a product that works reasonably well when properly installed, configured, tuned, and maintained. Companies that run Oracle databases typically default to Coherence for caching, but many are not entirely satisfied with it.

The main complaints with Oracle Coherence include:

• Complexity: An Oracle Coherence deployment consists of application nodes, data nodes, a management node, an MBean server, a Management Server, an Oracle Database, Enterprise Manager and Enterprise Manager agent, Extend proxies, and more. Significant expertise is needed to configure Oracle Coherence, often requiring expensive consulting services. In contrast to less complex solutions like Couchbase Server – which a developer can easily install with no sys admin help, and a Couchbase cluster can be up and running in minutes – deploying Oracle Coherence is slow and complicated.

• Difficult to scale: Oracle Coherence requires substantial manual effort and downtime to scale. Key tasks, such as adding a node and rebalancing, are not fully automated.

Figure 2: The complexity of Oracle Coherence architecture, as illustrated here, is a common complaint of users who are not satisfied

END USER

ENTERPRISEMANAGER

ORACLERDBMS

ORACLE MGMTSERVER (OMS)

REPORTINGDEFINITIONS

XML

SQL

SQL

XML

HTTP

EM AGENTTARGET

METADATA AND COLLECTION FILES

JMX FETCHLETCOHERENCE HARDWARE

COHERENCEMANAGEMENT

NODE

J2EE, J2SEMBEANSERVER

APPLICATIONNODES DATA NODES PROXY NODE

TCMP

6

Customers such as Amadeus, the largest global distribution system for the travel industry, run over 1 million operations per second on Couchbase server.

• Lacks built-in features like cross data center replication: Despite its complexity, Oracle Coherence lacks some important built-in capabilities. For example, Oracle Coherence does not have integrated cross data center replication. Instead, it’s either up to the application to implement it using various hooks, or else customers need to install, configure, and manage an additional Oracle product (Golden Gate) if they want replication across data centers – resulting in additional cost and complexity.

• Memory limitations: Oracle Coherence can store at most 64GB of data per node even with journal storage (RAM + SSD). For many enterprise applications, that’s a tiny amount. As a result, you may need to use far more nodes to handle your data requirements, which adds more licensing costs as well as more complexity and management overhead.

• Proprietary: The trend in enterprise IT architecture is toward adopting open source software wherever possible, to help reduce risk, avoid vendor lock in, lower costs, and benefit from community-driven contributions and innovation. As a closed source proprietary product, Oracle Coherence runs counter to today’s preference for open source solutions.

• High cost: Oracle Coherence, like most other proprietary caching products, is expensive. Customers pay substantial fees for Oracle Coherence – which is licensed on a per-core basis, rather than a per-node basis – on top of their already costly Oracle database licenses. In addition, Oracle Coherence is designed to run on high-end, expensive hardware (e.g. Oracle Exalogic), whereas a number of other solutions (including Couchbase Server) are designed to run on inexpensive, commodity servers.

One big issue that Oracle Coherence customers frequently encounter stems from the Unlimited License Agreement. While the ULA at first blush looks attractive – once it’s signed, the customer can “freely” use the software everywhere – the result is that Oracle Coherence gets used in projects where it’s not the right fit. When it’s time to renew the ULA, many of these projects cannot justify the cost, so customers start looking for alternative solutions.

Memcached: A Simple, Powerful Open Source Cache

Memcached is a simple yet powerful, open source cache used by many companies, including YouTube, Reddit, Craigslist, Zynga, Facebook, Twitter, Tumblr and Wikipedia. It’s an in-memory, key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering. Key advantages of Memcached include:

• High performance: Memcached was engineered as a high performance, scalable cache. It is capable of delivering the throughput required by very large, Internet-scale applications.

• Simplicity: Intentionally designed as a pure, bare bones cache, Memcached is very simple to install and deploy. Users can set up Memcached quickly and easily.

• Low cost: Licensed under the Revised BSD license, Memcached is free open source software.

7

Lack of enterprise support, built-in management and advanced features

No commercial entity provides technical support for Memcached. So if you encounter an issue with Memcached, you need to rely on your own resources and the Memcached community to resolve those issues. In addition, Memcached does not include a built-in management console, so companies need to build their own tools to monitor its performance.

Because Memcached was intentionally engineered to be a high performance bare bones cache, it does not include some advanced features that many companies find useful, such as auto failover and load rebalancing, cross data center replication, and automated push-button scaling to easily add or remove nodes in a cluster. As described below, Couchbase Server builds on and extends the strengths of Memcached – including high performance and simplicity – to deliver a more powerful drop-in replacement for Memcached, and an affordable, less complex, open source alternative to Oracle Coherence.

Couchbase Server as a High-Performance Distributed Cache

Couchbase Server has become an attractive alternative to Oracle Coherence and Memcached, as well as other caching products. It’s the only solution that fully meets the requirements of modern web, mobile, and IoT applications that need to support thousands to millions of users, handle large amounts of data, and provide highly responsive experiences on any device.

For many enterprises, Couchbase Server hits the sweet spot by delivering performance, scalability, and availability, while being simple and easy to deploy and manage. And because it’s open source, Couchbase Server is an affordable choice, with enterprise support available from Couchbase.

General purpose NoSQL database with Memcached roots

Couchbase Server is a general purpose, document-oriented NoSQL database and has a strong caching heritage: Couchbase founders include the engineers who designed and built Memcached in 2004 at LiveJournal. LiveJournal was one of the Internet’s first social networks, before mySpace and Facebook. LiveJournal faced frequent usage spikes as well as continuously growing workloads that overwhelmed backend resources.

To solve those issues, LiveJournal engineers (i.e., the future founders of Couchbase) built Memcached as a high performance cache that’s “dead simple” to use. While it squarely met the goals for high performance and simplicity, Memcached was not designed as a high availability caching solution, so features like auto failover and cross data center replication were not built into the product.

In designing Couchbase Server, the engineers who created Memcached took its best aspects – high performance and simplicity – and extended the technology to incorporate additional capabilities, including high availability and easy manageability.

8

Meets the key requirements for distributed caching

Many of the world’s largest enterprises – including leaders in industries such as retail (Tesco), communications (Verizon, AT&T, Viber), media (DirecTV), ecommerce (eBay, PayPal), social networking (LinkedIn), among others – have deployed Couchbase Server for multiple use cases, including distributed caching. At these and other enterprises, Couchbase Server is proven in production to support some of the most demanding applications. Customers such as Amadeus, the largest global distribution system for the travel industry, run over 1 million operations per second on Couchbase Server, with consistent sub-millisecond latency.

Couchbase Server delivers against all key requirements for an effective distributed cache solution:

• High performance: Couchbase Server was purpose built to deliver sustained high throughput rates in distributed computing environments, providing consistent sub-millisecond latency and exceptional resource efficiency. In a recent benchmark run on the Google Cloud Platform, for instance, Couchbase Server sustained 1.1 million operations per second with median latency of 15 milliseconds, requiring just 50 nodes to achieve that level of throughput. (By comparison, the popular open source product Cassandra required 300 nodes to deliver comparable performance.)

• Elastic scalability: Couchbase Server is designed to provide linear, elastic scalability for web, mobile, and IoT applications at any scale. Customers such as eBay, PayPal, LinkedIn, AT&T, Viber, and many others serve millions of concurrent users globally. Adding or removing nodes can be done in minutes with push-button simplicity, without any downtime or code changes.

• Always-on availability: Couchbase Server is designed to be fault tolerant and highly resilient at any scale, delivering always on availability in case of hardware failures, network outages, or planned maintenance windows. Couchbase’s single node architecture in a distributed system is the foundation for providing high availability features, including: auto sharding, data replication (across nodes, clusters, and data centers), and auto failover. A number of major enterprises have selected Couchbase Server specifically because of its uniquely simple and powerful Cross Data Center Replication (XDCR) capabilities.

• Simplicity and ease of development: Couchbase Server is streamlined, compact, and highly efficient code with a small footprint. It can be installed and configured in minutes and requires very little time and training to learn. It’s also easy for developers to work with. SDKs are available for many languages and frameworks (including Java, C, .NET, Python, PHP, node. js, Go, Spring) and are rich with features to support developer productivity.

• Push-button manageability: Couchbase Server comes with a built-in, full-featured management console. It gives operations personnel complete visibility into the Couchbase deployment through dashboards that track dozens of metrics, as well as the tools to configure and manage the Couchbase deployment. These include push-button controls for tasks such as add or remove nodes, tune memory, or pause and resume. Many Couchbase customers rave about the level of ease and time saving they get from using the management console.

As an open source product, the total cost of ownership of Couchbase is typically a fraction of proprietary solutions like Oracle Coherence.

9

• Enterprise support: Couchbase is the commercial entity that provides development, support, and services for Couchbase Server. Customers who want an enterprise support subscription can choose from three options (Silver, Gold and Platinum).

• Affordability: As an open source product, the total cost of ownership of Couchbase is typically a fraction of proprietary solutions like Oracle Coherence, often as much as 80% less. Couchbase Server can be freely downloaded without any license fees, allowing you to prototype and experiment with zero cost or risk. Also, Couchbase Server is designed to run efficiently on low-cost, commodity servers rather than expensive hardware like Oracle Exalogic. And because Couchbase Server is far less complex to deploy and manage, it takes significantly fewer of your resources to support it.

Migrating from Oracle Coherence or Memcached to Couchbase is a Simple Process

If you’re looking to replace your existing caching solution, whether Oracle Coherence, Memcached, or another product, the migration to Couchbase Server is a simple process that takes minimal time and effort.

In general, deploying Couchbase Server as a cache is straightforward:

1. Download Couchbase Server and set up your Couchbase cluster

2. Make your application aware of the Couchbase cluster

Drop-in replacement for Memcached: No code changes required

Because Couchbase Server is fully compatible with the Memcached protocol, replacing Memcached with Couchbase is very simple. No changes to your application code are required at all. The Couchbase cluster behaves just like a Memcached server: it will be automatically discovered by your application and data will be automatically populated to the Couchbase nodes.

Migrating to Couchbase Server from Oracle Coherence

Replacing Oracle Coherence with Couchbase Server involves minimal code changes to your application at the cache interaction layer. Because of the limited development effort required, the process may take anywhere from several days

Figure 3: Architecture of deploying a Couchbase Server cluster as a caching layer

APP APP APP

APPLICATION TIER

...

User Requests

RDBMS

Cache Misses and WriteRequests

Read-WriteRequests

Active Active Active

Replica Replica Replica

COUCHBASE SERVER CLUSTER

Replication

to a couple of weeks. However, once these code changes have been made, the benefits of replacing Coherence with Couchbase – including less complexity, better manageability, easier scalability, and lower costs – are well worth the small effort required.

Beyond Caching: Expand Couchbase to Simplify your IT Infrastructure and Reduce Costs

Couchbase Server can be a great fit as an affordable, easy to manage, high performance cache. A next step is to consider using Couchbase to replace the primary data store. Since Couchbase Server is a general purpose NoSQL document database that includes an integrated distributed cache, enterprises can extend their use of Couchbase beyond caching, and deploy Couchbase as the primary underlying database.

In scenarios where a NoSQL database like Couchbase is a good fit for your application requirements, you can significantly reduce costs and simplify your IT infrastructure by consolidating on Couchbase as both cache and primary data store. In fact, many leading enterprises have extended their Couchbase deployment to be the primary data store, across a growing number of use cases including:

• Real time big data

• Profile management

• Mobile apps

• Content management

• Customer 360-degree view

Once you have installed Couchbase Server as a cache and realized the cost, manageability, and performance benefits from your deployment, it may be time to think about how to gain additional performance and cost advantages from Couchbase.

2440 West El Camino Real | Ste 600

Mountain View, California 94040

1-650-417-7500

www.couchbase.com

• Fraud detection

• Internet of Things

• Catalogs

• Digital communications

• Personalization

About CouchbaseCouchbase’s mission is to be the data platform that revolutionizes digital innovation. To make this possible,

Couchbase created the world’s first Engagement Database to help deliver ever-richer and ever more

personalized customer and employee experiences. Built with the most powerful NoSQL technology,

the Couchbase Data Platform was architected on top of an open source foundation for the massively

interactive enterprise. Our geo-distributed Engagement Database provides unmatched developer agility

and manageability, as well as unparalleled performance at any scale, from any cloud to the edge.