ElasticSearch and SOAnosqlroadshow.com/dl/NoSQL-Munich-2013/... · WHY SOA? Velocity Developer...

Post on 13-Jul-2020

1 views 0 download

Transcript of ElasticSearch and SOAnosqlroadshow.com/dl/NoSQL-Munich-2013/... · WHY SOA? Velocity Developer...

ElasticSearch and SOAPeter Bourgon

Evolution of searchCurrent architectureA bit about SOA

EVOLUTIONOF SEARCH

KATAMARI DAMACY, NAMCO

:(

✘ Solr✘ Batch replication✘ Very slow reindexing✘ Technical debt in app

DO WANT.

✔ Velocity✔ Reliability✔ Maintainability

HOW TO DO?

ElasticSearchGreenfield developmentParallel dark launchMigrate traffic slowlyScale out

WHY ELASTICSEARCH

✔ Good API✔ Clear path for growth✔ Batteries included✔ Works like you expect it to

OUR ELASTICSEARCH

External versioning No dynamic mappingMultiple indices (‘live’ alias)Custom scoring moduleIndex: 2N shards, 1 replicaCluster per datacenter

IMPORTANT SETTINGS

discovery:zen:

minimum_master_nodes: (N/2)+1

IMPORTANT SETTINGS

cluster:routing:

allocation:node_concurrent_recoveries: 1

IMPORTANT SETTINGS

bootstrap:mlockall: true

-Xmx = -Xms = (memory/2)

IMPORTANT LESSON

One cluster = single point of failure

OUR SEARCH COMPONENTS

DISCORANK

PageRank

DiscoRank

DISCORANK STRATEGY

Calculations done offlineSerialize to compact arrayLoad in ES custom scorer Recalculate/reload every N

INDEXING

Single-purpose applicationStateless and idempotentFull catalog ~= 1 hourBuild, iterate, iterate, iterate

SEARCHING

Single-purpose applicationStateless and idempotentSC/ES DSL translationOpen-source ES library

SUGGESTING

Single-purpose application“Stateless” and idempotentPrefix trie in memoryLucene Finite State Transducer

SERVICEORIENTEDARCHITECTURE

SOA AT SOUNDCLOUD

Bazooka platformAPI as firewall12 Factor applications

12 FACTOR APPS

Single codebase (repository)Config stored in environmentBacking services as resourcesStatelessHorizontal scaling with processes

WHY SOA?

VelocityDeveloper happinessDistributed systems are complex

SOA infrastructureElasticSearch core

QUICK STATS

1.5h

24h

TIME TO REINDEX

3s

900s

TIME TO VISIBILITY

OTHER STATS

Cleared (big!) feature backlog > 2x traffic since project start> 3x corpusFlat response curve

WIN

ElasticSearch works as expectedSOA = force multiplier

Thanks!Questions?

Peter Bourgonpeter.bourgon.org@peterbourgon