
A Dynamic Caching Mechanism for Hadoop using Memcached

Gurmeet Singh, Puneet Chandra, Rashid Tahir
University of Illinois at Urbana-Champaign
Presenter: Chang Dong

Outline

1. Memcached
2. Hadoop-Memcached

1 Memcached

What is memcached briefly?

• memcached is a high-performance, distributed memory object caching system, generic in nature

• It is a key-based cache daemon that stores data and objects wherever dedicated or spare RAM is available for very quick access

• It is a dumb distributed hash table: it does not provide redundancy, failover, or authentication. If any of these are needed, the client has to handle them.
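Because the servers know nothing about each other, it is the client that decides which server holds each key, typically by hashing the key over the pool. A minimal sketch of that client-side routing (the server addresses and function name are illustrative, not from memcached itself):

```python
import hashlib

def pick_server(key: str, servers: list) -> str:
    """Map a key to one server in the pool; memcached itself does no routing."""
    # A stable hash means every client sends the same key to the same server.
    digest = hashlib.md5(key.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

pool = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"]
# The same key always lands on the same server, from any client.
assert pick_server("user:42", pool) == pick_server("user:42", pool)
```

Real clients usually use consistent hashing instead of a plain modulus, so that adding or removing a server remaps only a fraction of the keys.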

Why was memcached made?

• It was originally developed by Danga Interactive to enhance the speed of LiveJournal.com

• It dropped the database load to almost nothing, yielding faster page load times for users, better resource utilization, and faster access to the databases on a memcache miss

• http://www.danga.com/memcached/


Memcached

Where does memcached reside?

• Memcached is not part of the database but sits outside it, distributed over a pool of servers

Architecture

Why use memcached?

• To reduce the load on the database by caching data BEFORE it hits the database

• Can be used for more than just holding database results (objects), improving the entire application's response time

• Feel the need for speed: memcache lives in RAM, which is much faster than hitting the disk or the database

Why not use memcached?

• Memcache is held in RAM. This is a finite resource.

• Adding complexity to a system just for complexity's sake is a waste. If the system can respond within the requirements without it, leave it alone.

What are the limits of memcached?

• Keys can be no more than 250 characters
• Stored data cannot exceed 1 MB (the largest typical slab size)
• There are generally no limits on the number of nodes running memcached
• There are generally no limits on the total amount of RAM used by memcached across all nodes
– 32-bit machines do have a limit of 4 GB per node, though
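Clients often pre-validate items against these limits before issuing a `set`, since an oversized key or value is rejected by the server anyway. A minimal sketch (the constant and function names are illustrative):

```python
MAX_KEY_LEN = 250          # memcached's key length limit
MAX_VALUE_BYTES = 1 << 20  # 1 MB default item/slab size limit

def check_item(key: str, value: bytes) -> bool:
    """Return True if the item fits memcached's default limits."""
    return len(key) <= MAX_KEY_LEN and len(value) <= MAX_VALUE_BYTES
```

For example, `check_item("user:42", b"x" * 100)` passes, while a 251-character key or a value over 1 MB fails the check.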

Memcached

Memcached Distributed Architecture

2 Hadoop-Memcached

MapReduce

• Disk access latency is addressed in two ways:
1. Jobs are scheduled on the same node that houses the associated data
2. Data is replicated and placed in numerous ways to improve throughput and job completion times
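Point 1, locality-aware scheduling, can be sketched as follows. This is a simplified illustration of the idea, not Hadoop's actual scheduler; the data structures and names are assumptions:

```python
def schedule_task(block_id, replicas, free_nodes):
    """Prefer a free node that already holds a replica of the block (data locality)."""
    for node in replicas.get(block_id, []):
        if node in free_nodes:
            return node            # local read: the block never crosses the network
    # No replica holder is free: fall back to any node and fetch the block remotely.
    return next(iter(free_nodes))

replicas = {"blk_1": ["node1", "node2"]}   # hypothetical replica placement
print(schedule_task("blk_1", replicas, {"node2", "node3"}))  # prefers node2
```

Replication (point 2) is what makes this fallback rare: with several replica holders, the chance that at least one of them has a free slot is high.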

RAMClouds

• RAMClouds are based solely on main memory, keeping the entire dataset in RAM

Contribution

• Propose a caching mechanism that strikes a balance between the aforementioned approaches (disk-based MapReduce and memory-only RAMClouds)

• Combine data replication and placement algorithms with a proactive fetching and caching mechanism based on Memcached.


Design

• A. Two-Level Greedy Caching
– Receiver-Only greedy caching policy: cache an object locally whenever a node needs it and it is unavailable in its local cache
– Sender-Only greedy caching policy: cache an object whenever some other node requests it and the object is in the filesystem but not in the cache
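The two greedy policies differ only in which side of a request does the caching. A minimal sketch of both decisions (the function names and set-based caches are illustrative, not from the paper):

```python
def receiver_only(local_cache, obj):
    """Receiver-Only: the node that *needs* the object caches it on a local miss."""
    if obj in local_cache:
        return False               # local hit: nothing to do
    local_cache.add(obj)           # miss: fetch (not shown) and cache locally
    return True

def sender_only(cache, filesystem, obj):
    """Sender-Only: the node *serving* a request caches what it reads from disk."""
    if obj in cache or obj not in filesystem:
        return False               # already cached, or not stored here at all
    cache.add(obj)                 # on disk but uncached: cache it while serving
    return True
```

The "two-level" combination applies both rules, so a hot object ends up cached at the requester and at the node that owns it.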

Design

• B. Fetching a Cached Block
– Simultaneous-Requesting
– Memcached-First
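The two fetch strategies can be sketched as below: Memcached-First serializes the lookups (cache, then disk on a miss), while Simultaneous-Requesting issues both at once and takes whichever non-miss answer arrives first. This is an illustrative reading of the slide's two names, with hypothetical `cache_get`/`disk_get` callables standing in for the real Memcached and HDFS reads:

```python
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait

def memcached_first(block, cache_get, disk_get):
    """Memcached-First: try the cache; only go to the filesystem on a miss."""
    data = cache_get(block)
    return data if data is not None else disk_get(block)

def simultaneous_requesting(block, cache_get, disk_get):
    """Simultaneous-Requesting: ask cache and disk at once, use the first answer."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        pending = {pool.submit(cache_get, block), pool.submit(disk_get, block)}
        while pending:
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for fut in done:
                if fut.result() is not None:
                    return fut.result()    # first non-miss answer wins
    return None
```

Memcached-First saves disk work on a hit but pays the full cache round-trip before falling back; Simultaneous-Requesting hides the miss latency at the cost of issuing redundant requests.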

Design

• C. Replacement at the Memcached Servers
– Replaces the LRU entry in the hash table and informs the node that has cached the block
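The server-side behavior described above, LRU eviction plus a notification to the node whose block was dropped, can be sketched with an ordered hash table. The class and callback names are illustrative, not from the paper:

```python
from collections import OrderedDict

class MemcachedServerSketch:
    """Hash table with LRU eviction that informs the node whose block it drops."""

    def __init__(self, capacity, notify):
        self.capacity = capacity
        self.table = OrderedDict()     # key -> (value, caching_node)
        self.notify = notify           # callback: notify(caching_node, key)

    def put(self, key, value, caching_node):
        self.table[key] = (value, caching_node)
        self.table.move_to_end(key)    # newest entry is most recently used
        if len(self.table) > self.capacity:
            old_key, (_, owner) = self.table.popitem(last=False)  # evict LRU
            self.notify(owner, old_key)  # tell the node its cached block is gone

    def get(self, key):
        if key not in self.table:
            return None
        self.table.move_to_end(key)    # refresh recency on a hit
        return self.table[key][0]
```

The notification matters because, without it, the caching node would keep advertising a block that the Memcached server no longer holds.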

Design

• D. Global Cache Replacement Policy
– Example: N1: 110, 115, 120, 125; N2: 50, 55, 60, 65
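If the numbers in the example are read as per-node access timestamps (an assumption; the transcript does not label them), a global policy would evict from the node whose activity is the stalest overall, here N2, even if N2's own local LRU entry is newer than nothing locally. A minimal sketch of that reading:

```python
def global_lru_victim(node_access_times):
    """Pick the node whose most recent access is the oldest across the cluster.

    node_access_times: dict mapping node name -> list of access timestamps
    (assumed interpretation of the slide's example numbers).
    """
    return min(node_access_times, key=lambda n: max(node_access_times[n]))

# The slide's example: N1 was touched at 110..125, N2 only at 50..65,
# so a global policy targets N2's entries first.
assert global_lru_victim({"N1": [110, 115, 120, 125],
                          "N2": [50, 55, 60, 65]}) == "N2"
```

A purely local LRU at each node cannot make this comparison, which is the point of coordinating replacement globally.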

Design

• Prefetching
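The transcript does not spell out the prefetching policy, but a common scheme for sequential MapReduce reads is to pull the next block into the cache while the current one is being processed. A hedged sketch of that idea (the sequential-next assumption and all names are illustrative):

```python
def prefetch_next(current_block, cache, fetch):
    """After serving block i, proactively pull block i+1 into the cache."""
    nxt = current_block + 1
    if nxt not in cache:
        fetch(nxt)         # in a real system this would run in the background
        cache.add(nxt)
```

Prefetching converts what would have been a future cache miss into a hit, at the cost of possibly fetching a block that is never used.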

Experiments and Results
