SHC Israel: GigaSpaces Case Study

Introducing Social Networking Into an e-commerce Platform
Tomer Gabel | SHC Israel 03.02.2011

A case study on how we at Sears Holdings Corporation, Israel (a.k.a. Delver) use GigaSpaces as a key part of our social commerce infrastructure. I presented this at the GigaSpaces Roadshow 2011 in Paris, France.

Transcript of SHC Israel: GigaSpaces Case Study

Page 1: SHC Israel: GigaSpaces Case Study

Introducing Social Networking Into an e-commerce Platform

Tomer Gabel | SHC Israel 03.02.2011

Page 2: SHC Israel: GigaSpaces Case Study

® Copyright 2011 SHC Israel Ltd. All Rights Reserved

Social Commerce: An Introduction

• The last few years have seen tremendous growth in social networks
  – Some estimates place Facebook above Google
  – Even if not, we’re talking millions of daily unique visitors
• So the obvious question is… where’s the money?

Page 3: SHC Israel: GigaSpaces Case Study

Social Commerce: An Introduction

“So, clearly, whomever figures [it] out… is going to make an astounding amount of money and have a huge impact on net culture.” – Gordon Gould, ThisNext

“It’s a matter of time—within the next five or so years—before more business will be done on Facebook than Amazon” – Sumeet Jain, CMEA Capital

“Social networks seem like a marketer's dream” – Andy Leaver, Bazaarvoice

Page 4: SHC Israel: GigaSpaces Case Study

Social Commerce: Business Case

• What’s wrong with traditional e-commerce?
  – Discovery/recommendation features are extremely hard to get right
  – Overly broad market targeting means lost sales and disgruntled, ad-weary customers
  – The trust model is inherently broken
    • Impossible to gauge truth and accuracy in customer reviews
    • “Wisdom of the masses” does not always apply
  – Not fun!
    • Shopping is a social experience (going to the mall, holiday shopping sprees)
    • This does not translate to existing e-commerce sites!

Page 5: SHC Israel: GigaSpaces Case Study

Social Commerce: Business Case

• “Social commerce” aims to address these deficiencies
  – Correlating interests and products is more accurate and significantly easier when based on social context
    • Social circles are inherently constructed on shared interests and perspectives
    • Working from a customer’s social network is much smaller in scope than generating a global, statistical recommendation model
  – More accurate personalized data exposes new opportunities
    • Personalized discovery allows more opportunity to tap the long tail
    • Social interaction makes it easy to identify domain experts
  – A single opinion provided by a friend, family member or acquaintance is more trustworthy than dozens of unrelated product reviews/ratings

Page 6: SHC Israel: GigaSpaces Case Study

Social Commerce: Business Case

• Most crucially, social commerce is all about user engagement and collaboration:
  – Should I buy an iPhone, Blackberry or Android phone?
  – Which wedding dress looks best?
  – Which video games are suitable for a preschooler?

Page 7: SHC Israel: GigaSpaces Case Study

Social Commerce: The Axiom

Social features increase user engagement → Increased conversion → Profit!

Page 8: SHC Israel: GigaSpaces Case Study

The Technical Challenge

• sears.com is a full-blown commercial retail site
  – Over 1 million page-views daily
  – Over 270,000 visitors daily
  – Traffic can easily spike up to ten times in the holiday season!

Page 9: SHC Israel: GigaSpaces Case Study

The Technical Challenge

• Processing social networks is not an easy proposition
  – Massive amounts of branching data
  – No data locality
  – Very few assumptions can be made about the data
• Let’s address each of these in turn

Source: NetworkWeaver

Page 10: SHC Israel: GigaSpaces Case Study

The Technical Challenge

• Massive amounts of branching data:
  – Imagine every Facebook user (500 million)
  – Imagine each person is only connected to 100 others (conservative estimate)
  – How is user X connected with Y?
    • X has 100 friends
    • Each of them has 100 friends
    • 10,001 nodes visited!
    • 101 reads from the underlying storage system!
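The figures on this slide can be reproduced with a small sketch. It assumes a synthetic graph (not real data) in which friendships are mutual and friend lists do not overlap except for X, so each of X's 100 friends contributes 99 new people. The function names `build_two_level_graph` and `traverse` are illustrative and do not come from the talk.

```python
from collections import deque


def build_two_level_graph(fanout):
    """Synthetic undirected graph: node 0 (user X) has `fanout` friends, and
    each friend has `fanout` friends in total (X plus fanout - 1 unique others)."""
    adjacency = {0: []}
    next_id = 1
    for _ in range(fanout):
        friend = next_id
        next_id += 1
        adjacency[0].append(friend)
        adjacency[friend] = [0]
    for friend in list(adjacency[0]):
        for _ in range(fanout - 1):
            other = next_id
            next_id += 1
            adjacency[friend].append(other)
            adjacency[other] = [friend]
    return adjacency


def traverse(adjacency, start, max_depth):
    """Breadth-first traversal. Loading one friend list counts as one storage read."""
    visited = {start}
    queue = deque([(start, 0)])
    reads = 0
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # frontier node: seen, but its friend list is never loaded
        reads += 1  # one read from the underlying store per expanded node
        for neighbour in adjacency[node]:
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, depth + 1))
    return len(visited), reads


graph = build_two_level_graph(100)
nodes_visited, storage_reads = traverse(graph, start=0, max_depth=2)
print(nodes_visited, storage_reads)
```

Under these assumptions the traversal touches X, 100 friends and 9,900 friends-of-friends, and loads 101 friend lists. Real networks overlap more, so the node count is an upper bound for this fan-out; the read count is not affected by overlap.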

Page 11: SHC Israel: GigaSpaces Case Study

The Technical Challenge

• No data locality:
  – Any object may be connected to any other object in no particular order
  – How to split the data?
  – Some research is being done in the area (SPAR)
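To make the splitting problem concrete, here is a rough sketch that measures how many friendships cross partition boundaries when users are placed by a simple modulo of their ID. The random graph, the user and edge counts, and the choice of 8 partitions are all assumptions for illustration; none of them come from the talk.

```python
import random


def cross_partition_fraction(edges, partitions):
    """Fraction of edges whose two endpoints land on different partitions
    when a user is assigned to partition `user_id % partitions`."""
    crossing = sum(1 for a, b in edges if a % partitions != b % partitions)
    return crossing / len(edges)


rng = random.Random(42)  # fixed seed so the sketch is repeatable
USERS = 10_000           # hypothetical population
EDGE_COUNT = 50_000      # hypothetical number of friendships

edges = []
while len(edges) < EDGE_COUNT:
    a, b = rng.randrange(USERS), rng.randrange(USERS)
    if a != b:
        edges.append((a, b))

fraction = cross_partition_fraction(edges, partitions=8)
print(f"{fraction:.1%} of friendships span two partitions")
```

With no structure to exploit, roughly 7 out of 8 edges span two partitions, so almost every hop of a traversal leaves the local node. Real social graphs are clustered rather than uniformly random, and exploiting that clustering is the problem that partitioning research tries to solve.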

Page 12: SHC Israel: GigaSpaces Case Study

The Technical Challenge

• No easy assumptions:
  – No “typical user”
  – Not enough data to draw archetypes
  – Significant, unavoidable long tail
  – Difficult to pre-tune data structures

Page 13: SHC Israel: GigaSpaces Case Study

The Technical Challenge

• The crux of the problem:
  – High branch factor necessitates many loads to serve even a simple request
  – No data locality + high branch factor means very high random I/O
  – Traditional storage models (RDBMS, flat files etc.) are a poor fit
• Serious research into graph storage, social network composition etc. only dates back a few years
  – No best practices or “accepted truths” to build on

Page 14: SHC Israel: GigaSpaces Case Study

Use Case for GigaSpaces

• To solve the graph storage and traversal problem, we arrived at the following requirements:
  – Completely in-memory storage
    • No data locality means caching is inefficient
    • Massive amounts of random I/O cannot scale vertically, and hardware (basically, spindle count) cost quickly becomes prohibitive
    • If data access is sufficiently fast, data can be randomly partitioned
  – Horizontal scaling with a well-known scale-up strategy
    • Add more memory or more nodes to handle data growth
    • Add more CPUs or additional nodes to handle load growth
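A back-of-the-envelope sketch of the spindle-count argument, using the traffic and traversal figures quoted earlier in the deck. Three things are assumptions rather than facts from the talk: that every page view triggers one two-hop traversal, that no read is served from cache, and that a single disk sustains about 150 random reads per second (a rough ballpark for a spinning disk).

```python
import math

PAGE_VIEWS_PER_DAY = 1_000_000  # from the slides: sears.com daily page views
HOLIDAY_SPIKE_FACTOR = 10       # from the slides: traffic can spike up to ten times
READS_PER_REQUEST = 101         # from the slides: reads for one two-hop traversal
IOPS_PER_SPINDLE = 150          # assumption: random reads per second for one disk

SECONDS_PER_DAY = 24 * 60 * 60


def spindles_needed(page_views_per_day, spike_factor, reads_per_request, iops_per_spindle):
    """Disks required to absorb the random reads, assuming evenly spread traffic."""
    requests_per_second = page_views_per_day / SECONDS_PER_DAY * spike_factor
    random_reads_per_second = requests_per_second * reads_per_request
    return math.ceil(random_reads_per_second / iops_per_spindle)


baseline_spindles = spindles_needed(
    PAGE_VIEWS_PER_DAY, 1, READS_PER_REQUEST, IOPS_PER_SPINDLE)
peak_spindles = spindles_needed(
    PAGE_VIEWS_PER_DAY, HOLIDAY_SPIKE_FACTOR, READS_PER_REQUEST, IOPS_PER_SPINDLE)
print(baseline_spindles, peak_spindles)
```

Traffic is averaged over the whole day here, so real peaks would need more headroom. Even so, the estimate grows from 8 disks to 78 at holiday load for a single two-hop query per page, which gives a feel for why disk-bound random I/O gets expensive quickly.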

Page 15: SHC Israel: GigaSpaces Case Study

Use Case for GigaSpaces

• Additional requirements include:
  – Map/Reduce execution framework
    • Graph traversal and data analysis requirements lend themselves well to the map/reduce paradigm
  – Code execution on the data nodes
    • Because of the massive amounts of data involved, the network interface will be quickly saturated by retrievals
    • Memory retrieval is at least two orders of magnitude faster than network throughput (DDR2-800 on a dual-channel memory controller has a theoretical maximum throughput of 102.4 Gb/s: 800 MT/s × 64 bits × 2 channels)
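The two requirements above can be sketched together: a map step that runs next to the data and returns only a small partial result, and a reduce step that combines the partials. This is plain Python standing in for a data grid and does not use or imitate the GigaSpaces API. The partition count, the modulo placement, and all function names are assumptions made for the example.

```python
from functools import reduce

PARTITION_COUNT = 4  # hypothetical number of data nodes


def partition_of(user_id):
    """Assumed placement rule: users are spread across partitions by ID."""
    return user_id % PARTITION_COUNT


def build_partitions(adjacency):
    """Distribute friend lists across in-memory partitions."""
    partitions = [dict() for _ in range(PARTITION_COUNT)]
    for user_id, friends in adjacency.items():
        partitions[partition_of(user_id)][user_id] = set(friends)
    return partitions


def map_mutual_friends(local_data, candidate_ids, target_id):
    """Map step, meant to run on the data node: scan only locally stored
    friend lists and return a single integer instead of the lists themselves."""
    return sum(
        1 for uid in candidate_ids
        if uid in local_data and target_id in local_data[uid]
    )


def count_mutual_friends(partitions, x, y):
    """How many of x's friends are also friends with y?"""
    x_friends = partitions[partition_of(x)][x]  # one small read
    partials = [map_mutual_friends(p, x_friends, y) for p in partitions]  # map
    return reduce(lambda a, b: a + b, partials, 0)  # reduce


adjacency = {1: [2, 3, 4], 2: [1, 5], 3: [1, 5], 4: [1], 5: [2, 3]}
partitions = build_partitions(adjacency)
mutual = count_mutual_friends(partitions, 1, 5)
print(mutual)
```

Each partition sends back one number rather than every friend list it holds, which is the point of executing code on the data nodes: the network carries results, not raw data.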

Page 16: SHC Israel: GigaSpaces Case Study

Use Case for GigaSpaces

• As an operations tech I had a few things to add to the list, namely…
• Nonfunctional requirements:
  – Built-in fault tolerance and high availability
  – Zero-configuration (or as close to it as it gets) setup; in particular, component discovery and assignment must be automated
  – Well-documented deployment, configuration and tuning process
  – Monitoring API
  – Administrative client for diagnosis, trouble resolution and manual intervention

Page 17: SHC Israel: GigaSpaces Case Study

Use Case for GigaSpaces

• GigaSpaces features map well to our requirements
  – Data grid
  – Compute grid
  – High availability
  – Horizontal data and load scaling
  – Management API
• Very few viable alternatives:
  – Hadoop, neo4j are disk-based
  – Terracotta is overly simplistic and has no execution framework
  – Oracle Coherence is expensive and has a limited feature set

Page 18: SHC Israel: GigaSpaces Case Study

Delver Architecture

• We ended up with a hybrid platform:
  – GigaSpaces for graph storage, traversal and analysis
  – MySQL for traditional, “simple” data as well as a backing store for GigaSpaces
  – .NET-based front-end, Java-based back-end
• We had to factor our organization accordingly
  – Data access team provides abstracted interfaces on top of GigaSpaces and MySQL
  – Back-end “heavy lifting” services (e.g. recommendation engine) work directly against GigaSpaces
  – Most other components either use the abstracted DAL or are simple enough to work directly against MySQL using (N)Hibernate

Page 19: SHC Israel: GigaSpaces Case Study

Delver Architecture

Page 20: SHC Israel: GigaSpaces Case Study

Key Benefits

• Significantly reduced integration costs
  – GigaSpaces does a lot of what we need out of the box
  – An alternative solution would require integrating several products, incurring significant integration and development overhead
• Broad feature set
  – Social commerce is an emerging, dynamic market requiring rapid experimentation and adaptation
  – The large feature set allows us to introduce new features into the system at a furious pace
  – Although we adopted GigaSpaces primarily for graph storage, we also use it as a message queue, distributed lock server and distributed scheduler
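The slides do not say how the queue and lock use cases were built. As a rough illustration of why a space-style store can cover both, here is a toy in-process "space" with tuple-space style `write` and `take` operations. `ToySpace` is a hypothetical stand-in written for this example and is not the GigaSpaces API; the entry names are made up.

```python
import threading


class ToySpace:
    """In-process stand-in for a tuple space (hypothetical, for illustration)."""

    def __init__(self):
        self._entries = []
        self._condition = threading.Condition()

    def write(self, entry):
        """Add an entry; entries are tuples whose first element names their kind."""
        with self._condition:
            self._entries.append(entry)
            self._condition.notify_all()

    def take(self, kind, timeout=None):
        """Atomically remove and return the oldest entry of `kind`,
        or return None if none appears within `timeout` seconds."""
        def find():
            return next((e for e in self._entries if e[0] == kind), None)

        with self._condition:
            if not self._condition.wait_for(lambda: find() is not None, timeout):
                return None
            entry = find()
            self._entries.remove(entry)
            return entry


space = ToySpace()

# Message queue: producers write entries, consumers take them in arrival order.
space.write(("job", "resize-image"))
space.write(("job", "send-mail"))
first_job = space.take("job")

# Lock: a single token entry; whoever takes it holds the lock.
space.write(("lock", "catalog"))
held = space.take("lock", timeout=0)    # acquired
denied = space.take("lock", timeout=0)  # None: someone else holds it
space.write(held)                       # release by writing the token back
```

Because `take` removes an entry atomically, the same primitive gives exactly-once delivery for queued work and mutual exclusion for a lock token. A distributed product additionally has to handle replication, failover and lease expiry, none of which this sketch attempts.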

Page 21: SHC Israel: GigaSpaces Case Study

Now is a good time for…

QUESTIONS? COMMENTS?

Page 22: SHC Israel: GigaSpaces Case Study

Endgame

• Experience our work!
  – Visit Delver at http://www.delver.com/in?invite=friends-and-family
  – Visit Sears Social at http://catalog.sears.com
  – Read about our work at http://blog.delver.com
• Have anything to discuss?
  – Contact me at [email protected]
  – Visit my blog at http://www.tomergabel.com
  – Follow me on Twitter at http://www.twitter.com/tomerg

• Thank you for your time!