Redis – Memory is the new Disk
Deepak Mittal, IntelliGrape Software


  • Redis – Memory is the new Disk
    Deepak Mittal
    IntelliGrape Software

  • Agenda
    What is Redis
    Available clients
    Data types
    Operations on data types
    Performance
    Persistence
    Sweet spots
    Design considerations / Best practices
    Adopters

    *There will be demos and recaps in between. The aim is to inspire you to dig deeper into Redis and use it to solve your hard problems and use-cases elegantly. Please ask questions as we go.

  • About me
    Deepak Mittal
    CEO of IntelliGrape Software
    13 years of software development experience
    Interests: performance tuning of web applications and teams

  • Poll
    Familiarity with NoSQL? Memcache? Redis?

  • What is Redis

    Key/value store, like Memcached on steroids

  • What is Redis

    In-memory NoSQL database backed by disk

  • What is Redis
    Collection of data structures exposed over the network

    *Not only strings (not just parathas). Redis is a way to share memory over the network.

  • What is Redis
    Not just Strings!

    A key can hold a String, Hash, List, Set or Sorted Set

    *TCP / text-based API. Keys are binary-safe strings.
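
To make "text-based API" and "binary-safe" concrete, here is a minimal sketch of how one command could be framed on the wire. It assumes the unified request protocol (RESP) described in the Redis documentation; `encode_command` is a hypothetical helper name for illustration, not part of any client library.

```python
def encode_command(*args):
    """Frame one command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        # Every argument is prefixed with its length in bytes. That prefix is
        # what makes keys and values binary safe: they may contain \r\n or NULs.
        parts.append(b"$%d\r\n" % len(data))
        parts.append(data + b"\r\n")
    return b"".join(parts)


print(encode_command("SET", "greeting", "hello"))
# b'*3\r\n$3\r\nSET\r\n$8\r\ngreeting\r\n$5\r\nhello\r\n'
```

In practice a client library does this framing for you; the sketch only shows why the protocol stays readable while still carrying arbitrary bytes.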

  • What is Redis

    *It scales up and down: it is friendly to a small VPS and can also scale up to hundreds of GB of RAM. Real-time access to analytics data is a problem it solves very elegantly. The size of a value is limited to 512 MB. Redis provides a few low-level primitives that you can combine for your use-cases, just as you use the constructs provided by your favorite programming language.

  • A Brief History of Redis
    Started in 2009 by Salvatore Sanfilippo
    VMware hires Salvatore & Pieter Noordhuis
    Financially supported and promoted by VMware

    *Salvatore started Redis to support his product start-up LLOOGG. VMware hired Salvatore in March 2010. Shortly thereafter, VMware hired Pieter Noordhuis, one of the main committers.

  • Redis Characteristics
    Is single-threaded
    Has a queryable key namespace
    Is extremely fast (~100 K operations/second)
    Is easy to install (no dependencies)
    Is simple and flexible
    Stores everything in memory
    Written in ANSI C and BSD licensed (free, open)

    *Commands do not race with each other: everything runs on a single thread, so no locking is required.

  • Language Support

    Ruby, Python, PHP, Erlang, Tcl, Perl, Lua, Java, Scala, Clojure, C#, C/C++, JavaScript/Node.js, Haskell, Io, Go

  • Data Types
    Strings (Integers)
    Lists
    Hashes
    Sets
    Sorted Sets

    *Redis data types resemble the data types of programming languages, so they feel natural to use. Simple data structures make Redis flexible and powerful, and there is no schema to set up ahead of time, which is otherwise a real pain.
    String / numeric values are good for auto-increment sequences.
    A List is good for implementing a queue. BLPOP avoids excessive polling or latency, and can be used to implement priority queues.
    A Hash is good for keeping complex objects and is small and very efficient. Its limitation is that field values can only be strings.
    A Set is very good for keeping a circle of friends. Set intersection finds the common contacts of two circles, the friends who are online, or the friends who have purchased "Redis in Action".
    A Sorted Set is very good for keeping an index of the words in a document, or for implementing a simple chat system.
    Lists and Hashes are stand-alone, while Sets and Sorted Sets are comparable. Sorted Sets and Lists are ordered; Sets and Hashes are not.
    Some people also use Redis as the primary data store. Essentially, persistence is a side effect, and we should not worry too much about it while designing or developing the application.
    Redis supports several built-in data types, allowing developers to structure their data in meaningful semantic ways, with the added benefit of being able to perform data-type-specific operations inside Redis.

  • Thinking in Redis
    This is not an RDBMS
    It's all about the KEYS, and NOT the values
    There is no querying on values
    There are no indexes
    There are no schemas

    *It is not possible to run a query on the values. In an RDBMS you can fire a query to find all employees whose name is Deepak; in Redis you cannot do that unless you have stored your data in a way that supports that lookup.
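
One way to "store your data like that" is to maintain the index yourself, as an extra key. The sketch below is only an illustration under assumptions: `client` is taken to be a redis-py `redis.Redis` instance (or any object with the same `hset`, `sadd` and `smembers` methods), and the key names `employee:<id>` and `employees:by_name:<name>` are hypothetical.

```python
def save_employee(client, emp_id, name):
    """Store the record and maintain a name -> ids index by hand."""
    client.hset(f"employee:{emp_id}", "name", name)
    # The "index" is just another key: a Set of ids for each name.
    client.sadd(f"employees:by_name:{name}", emp_id)


def find_ids_by_name(client, name):
    """Look ids up through the index key instead of scanning values."""
    return client.smembers(f"employees:by_name:{name}")


# Usage against a real server (assumes redis-py is installed):
#   import redis
#   client = redis.Redis(decode_responses=True)
#   save_employee(client, 1, "Deepak")
#   find_ids_by_name(client, "Deepak")
```

The trade-off is that every write path has to keep the index key in step with the record, which is why the access patterns need to be known up front.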

  • Operations on Strings/Integers
    SET key value
    GET key
    MGET / MSET
    INCR / DECR
    INCRBY / DECRBY
    APPEND key value
    SETNX key value
    GETSET key value

  • Installation
    $ wget http://redis.googlecode.com/files/redis-2.4.2.tar.gz
    $ tar xzf redis-2.4.2.tar.gz
    $ cd redis-2.4.2
    $ make
    $ ./src/redis-server

  • Atomicity of Commands
    All of the built-in commands of Redis are atomic
    Commands can be chained together (queued with MULTI / EXEC) to create complex atomic transactions
    Redis is single-threaded and, therefore, no locking is necessary
    In other words, commands like INCR coming from multiple clients simultaneously won't tread on each other's toes!

  • Key Expiration
    When caching, we don't want things to live forever
    Any item in Redis can be made to expire after or at a certain time

    EXPIRE my_key 60
    EXPIREAT my_key 1293840000
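
EXPIRE takes a relative number of seconds, while EXPIREAT takes an absolute Unix timestamp. The sketch below shows how the two arguments relate; the helper names are hypothetical and nothing here talks to a server.

```python
from datetime import datetime, timedelta, timezone


def expireat_timestamp(when):
    """Convert an aware datetime into the Unix timestamp EXPIREAT expects."""
    return int(when.timestamp())


def seconds_until(when, now):
    """Relative TTL (for EXPIRE) that reaches the same instant, never negative."""
    return max(0, int((when - now).total_seconds()))


new_year = datetime(2011, 1, 1, tzinfo=timezone.utc)
print(expireat_timestamp(new_year))  # 1293840000, the value used on the slide
print(seconds_until(new_year, new_year - timedelta(minutes=1)))  # 60
```

So the EXPIREAT example on the slide makes `my_key` expire at midnight UTC on 1 January 2011.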

  • Operations on Lists
    LPUSH/RPUSH key value [value ...]
    LPOP/RPOP key
    LRANGE key start stop
    LSET key index value
    LLEN key
    LTRIM key start stop
    LINSERT key before|after pivot value
    BLPOP key [key ...] timeout
    LREM key count value

    *-1 stands for the last index in the list: LRANGE key 0 -1 gets all elements in the list.
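
A List becomes a first-in, first-out queue when producers push on one end and consumers pop from the other. This is a minimal sketch, assuming `client` is a redis-py `redis.Redis` instance (or anything with the same `lpush` and `rpop` methods); the queue name and the JSON job format are illustrative choices, not something the slides prescribe.

```python
import json


def enqueue(client, queue, job):
    """Producer: push a JSON-encoded job onto the head of the list."""
    return client.lpush(queue, json.dumps(job))


def dequeue(client, queue):
    """Consumer: pop the oldest job from the tail, or None if the list is empty."""
    raw = client.rpop(queue)
    return None if raw is None else json.loads(raw)


# Usage against a real server (assumes redis-py is installed):
#   import redis
#   client = redis.Redis()
#   enqueue(client, "jobs", {"id": 1})
#   dequeue(client, "jobs")
```

A worker that should wait for work instead of polling could use a blocking pop: BRPOP pairs with LPUSH in the same way that BLPOP pairs with RPUSH.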

  • Operations on Sets
    SADD key member
    SPOP key
    SMEMBERS key
    SINTER key [key ...]
    SUNION key [key ...]
    SISMEMBER key member
    SCARD key

    *A Set contains unique entries.
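
The "common contacts in two circles" idea maps directly onto SADD and SINTER. A minimal sketch, assuming `client` is a redis-py `redis.Redis` instance (or any object with the same `sadd` and `sinter` methods); the `friends:<user>` key layout is hypothetical.

```python
def add_friends(client, user, *friends):
    """SADD ignores duplicates, so each friend is stored once."""
    return client.sadd(f"friends:{user}", *friends)


def common_friends(client, user_a, user_b):
    """SINTER computes the intersection on the server."""
    return client.sinter(f"friends:{user_a}", f"friends:{user_b}")


# Usage against a real server (assumes redis-py is installed):
#   import redis
#   client = redis.Redis(decode_responses=True)
#   add_friends(client, "alice", "bob", "carol")
#   add_friends(client, "dave", "carol", "erin")
#   common_friends(client, "alice", "dave")
```

Because the intersection is computed inside Redis, only the result crosses the network, not the two full friend lists.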

  • Operations on Sorted Sets
    ZADD key score member [score member ...]
    ZCARD key
    ZCOUNT key min max
    ZINCRBY key increment member
    ZRANK key member
    ZRANGE key start stop [WITHSCORES]
    ZSCORE key member

    *ZRANGE and ZREVRANGE become expensive when they return large ranges. ZUNIONSTORE and ZINTERSTORE are quite commonly used to store temporary results. With sorted sets, "top N" operations are much more efficient because the data is already ordered, but the memory usage is higher.
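
A "top N" query such as "most read articles" can be sketched with ZINCRBY and ZREVRANGE. The assumptions here: `client` is a redis-py `redis.Redis` instance from version 3.0 or later (where the call is `zincrby(name, amount, value)`), and the key name `articles:reads` is made up for the example.

```python
def record_read(client, article_id):
    """ZINCRBY bumps the article's score, creating the member if needed."""
    return client.zincrby("articles:reads", 1, article_id)


def top_articles(client, n):
    """ZREVRANGE 0 n-1 returns the n highest-scored members, best first."""
    return client.zrevrange("articles:reads", 0, n - 1, withscores=True)


# Usage against a real server (assumes redis-py >= 3.0 is installed):
#   import redis
#   client = redis.Redis(decode_responses=True)
#   record_read(client, "article:42")
#   top_articles(client, 10)
```

The sorted set keeps members ordered by score as they are updated, so reading the top of the ranking does not require sorting at query time.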

  • Operations on Hashes
    HSET key field value
    HGET key field
    HKEYS key
    HVALS key
    HLEN key
    HEXISTS key field
    HDEL key field [field ...]

    *With hashes, no de-serialization is necessary to get one field from the hash map.
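
The point about de-serialization can be sketched as follows: the object is stored field by field, so a single attribute is read with HGET rather than by fetching and parsing a serialized blob. `client` is assumed to be a redis-py `redis.Redis` instance (or anything with the same `hset` and `hget` methods), and the `user:<id>` key layout is hypothetical.

```python
def save_user(client, user_id, fields):
    """Write each attribute as its own hash field (values are stored as strings)."""
    key = f"user:{user_id}"
    for field, value in fields.items():
        client.hset(key, field, str(value))
    return key


def get_email(client, user_id):
    """HGET fetches one field; the rest of the object is never transferred."""
    return client.hget(f"user:{user_id}", "email")


# Usage against a real server (assumes redis-py is installed):
#   import redis
#   client = redis.Redis(decode_responses=True)
#   save_user(client, 7, {"name": "Deepak", "email": "deepak@example.com"})
#   get_email(client, 7)
```

Note that the numeric value in such an object comes back as a string, which matches the limitation that hash values can only be strings.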

  • Operations on Keys
    TYPE key
    TTL key
    KEYS pattern
    EXPIRE key seconds
    EXPIREAT key timestamp
    EXISTS key
    DEL key

    *Querying keys is expensive. Example patterns:
    KEYS *
    KEYS h*llo
    KEYS h[ae]llo
    If your use-case needs all keys, or a sub-set of keys, consider storing those key names in a Set.
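
To see what the example patterns select, the sketch below runs them over a handful of sample key names. Python's `fnmatchcase` is used only as a stand-in: it agrees with Redis glob matching for these three patterns, but the two syntaxes are not identical in general, and the sample keys are invented.

```python
from fnmatch import fnmatchcase

keys = ["hello", "hallo", "hillo", "heeello", "world"]


def matching(pattern):
    """Return the sample keys that a glob-style pattern would select."""
    return [k for k in keys if fnmatchcase(k, pattern)]


print(matching("h*llo"))     # ['hello', 'hallo', 'hillo', 'heeello']
print(matching("h[ae]llo"))  # ['hello', 'hallo']
print(matching("*"))         # every key
```

Like this loop, KEYS has to test every key name against the pattern, which is why the note recommends keeping the names you will need in a Set instead.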

  • Redis Administration Commands
    MONITOR
    SAVE / BGSAVE
    INFO
    DBSIZE
    FLUSHALL
    CONFIG SET parameter value

    *Configuration can be changed on the fly with CONFIG SET from redis-cli, or by editing the redis.conf file (which is read when the server starts).

  • Performance
    "We added two million items into a list in 1.28 seconds, with a networking layer between us and the server"

    Salvatore Sanfilippo, Creator

  • Performance
    Almost the same for reads and writes; handles writes faster than reads
    100 K ops/second
    Performance depends a lot on:
      Hardware
      Persistence configuration
      Operation complexity
      Payload size
      Batch size (pipelining)

    *Compare that with ~6 K transactions/sec on MySQL.

  • Redis-benchmark

  • Demo
    Redis CLI
    Getting help
    Operations on Strings
    Key expiry
    Operations on Lists, Sets
    Groovy program operating on a Sorted Set and a Hash
    MONITOR / INFO / DBSIZE / TYPE
    Redis benchmark

  • Data Durability
    Dump data to disk after certain conditions are met

    OR

    Manually

    AND/OR

    Append Only File (AOF)

    *Choose the persistence strategy based on the usage scenario; you can trade performance for improved durability. At the time of snapshotting, memory consumption increases considerably.

  • Advantages of Persistence
    On server restart, Redis re-populates its cache from files stored on disk
    No issue of cache warm-up
    It is possible to back up and replicate the files/cache

    *Even if you are using Memcache and you don't need any of the additional functionality offered by Redis, data durability is a good enough reason to switch to Redis.

  • Memory Requirements?
    Depends upon the payload size and the data type used
    Instagram reports that they stored 300 million key-value pairs in about 5 GB of memory => about 16 MB per million pairs (they also report that Memcache requires 52 MB for the same million keys)

    *The amount of RAM that Redis needs is proportional to the size of the dataset. Large datasets in Redis are going to be fast, but expensive.
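
The arithmetic behind the Instagram figures can be checked in a few lines. This is a rough sketch using only the numbers quoted on the slide and decimal units (1 GB = 1000 MB); the function names are hypothetical.

```python
def mb_per_million(total_gb, pairs_millions):
    """Average footprint in MB per million pairs (decimal units)."""
    return total_gb * 1000 / pairs_millions


def projected_gb(mb_per_million_pairs, pairs_millions):
    """Total memory in GB for a given per-million footprint."""
    return mb_per_million_pairs * pairs_millions / 1000


print(round(mb_per_million(5, 300), 1))  # 16.7
print(projected_gb(52, 300))             # 15.6
```

Since "about 5 GB" is itself approximate, 16.7 MB per million is in line with the quoted 16 MB, and at the quoted Memcache rate the same 300 million pairs would need roughly 15.6 GB.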

  • Sweet Spots / Use-cases
    Cache server
    Stats collection
    Tag cloud
    Most read / shared articles
    Share state between processes / session tokens
    Auto-completion
    API rate limiting
    Activity feeds
    Job queue (LPUSH & RPOP)
    URL shortener mapping

  • Case Studies
    Video marketing platform
      Real-time analytics
      Stats for usage reports to clients
    Content publishing application
      Tag cloud
      Top read articles
      URL shortening information
    Social media marketing application
      Daily, monthly and overall stats collection
      List of online users

    *Instagram stores images as keys. Instagram redesigned the way its keys were stored, made use of Hashes and got some performance improvement.

  • Design Considerations / Best Practices
    Avoid excessively long keys
    Use a consistent key namespace nomenclature
    Have human-readable keys
    Since there is no querying on values, keys should be thought out beforehand
    Store data based on use-cases; ad-hoc queries can be very expensive
    Multiple databases are useful for segmenting key namespaces, especially where key queries are needed

    *If you have a big data-set or a data-set that doesn't change often, use AOF. AOF can be used together with snapshotting. Use pipelining to improve performance when executing multiple commands (and where the output of a previous command is not required). Network latency might be the biggest bottleneck when using Redis.
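
A consistent, human-readable key nomenclature is easier to keep when every key is built by one small helper. The sketch below assumes the common `type:id:field` convention with a colon separator; that convention and the name `make_key` are illustrative choices, not something the slides prescribe.

```python
def make_key(*parts, sep=":"):
    """Join the parts of a key with one separator, rejecting unusable parts."""
    cleaned = [str(p) for p in parts]
    for part in cleaned:
        # Empty parts, embedded separators and whitespace make keys ambiguous.
        if not part or sep in part or any(ch.isspace() for ch in part):
            raise ValueError(f"invalid key part: {part!r}")
    return sep.join(cleaned)


print(make_key("user", 1000, "followers"))  # user:1000:followers
```

Routing all key construction through one function also gives a single place to change the scheme later.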

  • Design Considerations / Best Practices
    All data must fit in memory
    Data duplication/redundancy is OK
    Persistence strategy should be chosen based on the kind of data that you are storing, and the performance trade-off must be understood
    Good for write-heavy, frequently changing and non-relational data
    Polyglot persistence is a smart choice
    Start by using Redis to augment, and then replace

    *Going with the default configuration can make you lose data. Often, the persistence strategy depends upon the usage patterns and your own data type. The fact that Redis has no indexes makes you think about the usage pattern from the very beginning, unlike the RDBMS world, where the usage pattern is usually secondary.

  • Some Adopters
    Stack Overflow
    Craigslist
    GitHub
    Digg
    Instagram
    Blizzard Entertainment

  • Learning Resources
    http://redis.io/commands
    http://try.redis-db.org
    Redis CLI

  • Advanced Features
    Master-slave replication
    Pub / Sub
    Transactions
    Using multiple databases
    Command pipelining
    Clustering / Sharding
    Lua scripting

    *The MULTI and EXEC commands allow transactional behavior in Redis. Using DISCARD inside a transaction aborts it, discarding all the queued commands and returning to the state before the transaction began. Use the WATCH command when a transaction depends on a value read beforehand: if a watched key changes before EXEC, the transaction is not executed.
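
The MULTI / EXEC behavior can be sketched through a client-side pipeline. This assumes `client` is a redis-py `redis.Redis` instance, where `pipeline(transaction=True)` wraps the queued commands in MULTI and EXEC; the key names and the function name are hypothetical.

```python
def record_page_view(client, page, user):
    """Queue two commands and run them as one MULTI/EXEC block."""
    pipe = client.pipeline(transaction=True)
    pipe.incr(f"views:{page}")          # queued, not executed yet
    pipe.sadd(f"viewers:{page}", user)  # queued, not executed yet
    # EXEC: both commands run together and the replies come back in order.
    total_views, newly_added = pipe.execute()
    return total_views, bool(newly_added)


# Usage against a real server (assumes redis-py is installed):
#   import redis
#   client = redis.Redis()
#   record_page_view(client, "home", "user:1")
```

Sending both commands in one batch also saves a network round trip, which is the pipelining benefit mentioned under best practices.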

  • What is Redis
    Simple & easy to use (key / value, data types map to programming languages)
    Flexible (supports commonly used data types)
    Fast (100 K ops/second)
    Powerful (data types, operations on data types)
    Safe (persistence)
    Easy to use (various client libraries)
    Widely adopted (Digg, Craigslist, GitHub)

    *Redis is ridiculously fast. It is built for speed; developers talk about speed in microseconds. It can also be used as the primary data store. It is memory hungry (you should determine what kind of data you want to store). It doesn't have a schema, but the way you store your data depends a lot upon your use-case. Redis is very flexible: it is VPS friendly (512 MB of RAM) and can also scale to hundreds of GBs of RAM.

  • Discussion
    Now that you understand Redis and its capabilities, what use-cases come to mind?

    What are your concerns around the usage of Redis?
