(Elastic)search in big data

(Elastic)search in Big Data Radu Gheorghe @radu0gheorghe @sematext


The place of Elasticsearch in the Big Data landscape.

Transcript of (Elastic)search in big data

Page 2: (Elastic)search in big data

What is “search in Big Data”? Challenges?

Some solutions?

How does Elasticsearch do it?

Agenda

Page 3: (Elastic)search in big data

Search Expectations

Page 4: (Elastic)search in big data

headphones for iPhone 4, iPhone 5, iPhone 6 and iPhone 7

iPhone 5

iPhone 4

Relevancy...

iphone
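One way to read this slide: a scorer that only counts how often the query term appears would put the accessory listing first, because its title repeats "iPhone" four times. The sketch below contrasts raw term frequency with a length-normalized variant. Both scorers are simplified illustrations rather than Lucene's actual similarity formula, and `tokenize`, `raw_tf`, `normalized_tf` and `rank` are hypothetical names.

```python
import re

# Titles taken from the slide.
TITLES = [
    "headphones for iPhone 4, iPhone 5, iPhone 6 and iPhone 7",
    "iPhone 5",
    "iPhone 4",
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def raw_tf(title, term):
    """Score = number of times the term occurs in the title."""
    return tokenize(title).count(term)

def normalized_tf(title, term):
    """Score = share of the title's tokens that are the term."""
    tokens = tokenize(title)
    return tokens.count(term) / len(tokens)

def rank(titles, term, score):
    """Sort titles by descending score; ties keep their input order."""
    return sorted(titles, key=lambda t: score(t, term), reverse=True)

by_count = rank(TITLES, "iphone", raw_tf)            # headphones come first
by_density = rank(TITLES, "iphone", normalized_tf)   # phones come first
```

Normalizing by length is only one of several signals a real engine combines, but it is enough to show why relevancy needs more than counting matches.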

Page 5: (Elastic)search in big data

iphone

iphone 5

Institute of Public Health

...and autocomplete...

iph
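A minimal sketch of how the three suggestions could arise from the prefix "iph". It assumes a suggestion matches either when its text starts with the prefix or when its acronym does, which is a guess at why "Institute of Public Health" appears on the slide. The stopword list and the names `acronym` and `suggest` are made up for this example.

```python
# Assumed stopwords that are skipped when building an acronym.
STOPWORDS = {"of", "the", "and", "for"}

# Suggestions shown on the slide.
CANDIDATES = ["iphone", "iphone 5", "Institute of Public Health"]

def acronym(phrase):
    """Initial letters of the non-stopword words, lowercased."""
    words = [w for w in phrase.lower().split() if w not in STOPWORDS]
    return "".join(w[0] for w in words)

def suggest(prefix, candidates=CANDIDATES):
    """Return candidates whose text or acronym starts with the prefix."""
    p = prefix.lower()
    return [
        c for c in candidates
        if c.lower().startswith(p) or acronym(c).startswith(p)
    ]

suggestions = suggest("iph")
```

Each keystroke would trigger one such lookup, which is why autocomplete multiplies the number of requests a search box generates.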

Page 6: (Elastic)search in big data

No results found for “iphnoe”

iPhone 5

iPhone 4

...and fuzziness...

iphnoe
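Fuzzy matching is usually defined through edit distance. The sketch below is a pure-Python stand-in, not Elasticsearch code: it computes the optimal string alignment distance, in which swapping two adjacent characters counts as a single edit, and accepts titles containing a token within one edit of the query. The names `osa_distance` and `fuzzy_search` and the `max_edits` default are assumptions for illustration.

```python
def osa_distance(a, b):
    """Edit distance where an adjacent transposition costs one edit."""
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[len(a)][len(b)]

def fuzzy_search(query, titles, max_edits=1):
    """Return titles that contain a token within max_edits of the query."""
    q = query.lower()
    return [
        t for t in titles
        if any(osa_distance(q, tok) <= max_edits for tok in t.lower().split())
    ]

results = fuzzy_search("iphnoe", ["iPhone 5", "iPhone 4", "Galaxy S4"])
```

Here "iphnoe" is one transposition away from "iphone", so both phones match while "Galaxy S4" does not.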

Page 7: (Elastic)search in big data

Did you mean “iPhone”?

iPhone 5

iPhone 4

...and corrections...

iphnoe

shows results anyway
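A "did you mean" correction can be approximated by looking up the closest known term. This sketch uses the standard library's `difflib.get_close_matches` as a stand-in for a real suggester. The vocabulary, the 0.6 cutoff and the function name `did_you_mean` are assumptions for illustration.

```python
import difflib

# Hypothetical vocabulary of terms known to the index.
VOCABULARY = ["iphone", "galaxy", "headphones"]

def did_you_mean(query, vocabulary=VOCABULARY):
    """Return the closest vocabulary term, or None if there is nothing to correct."""
    q = query.lower()
    matches = difflib.get_close_matches(q, vocabulary, n=1, cutoff=0.6)
    if matches and matches[0] != q:
        return matches[0]
    return None

correction = did_you_mean("iphnoe")
```

A page like the one on the slide would then run the search with the corrected term and show those results alongside the hint.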

Page 8: (Elastic)search in big data

iPhone 5

iPhone 4

iPhone 3

Galaxy S4

...and similar terms...

iphone

Page 9: (Elastic)search in big data

iPhone 5

iPhone 4

...and don’t forget the statistics!

iphone

☑ iOS ☐ other

☑ <100 RON ☐ 100-200 RON ☐ >200 RON
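The checkbox counts on this slide map naturally onto aggregations. Below is a hedged sketch of a request body, built as a Python dict, that asks for a `terms` breakdown and a `range` breakdown next to the hits. The field names `name`, `os` and `price` and the helper `facet_request` are hypothetical, since the slides do not show the actual mapping.

```python
import json

def facet_request(query_text):
    """Build a search body that returns hits plus two facet-style breakdowns."""
    return {
        "query": {"match": {"name": query_text}},
        "aggs": {
            # One bucket per operating system (iOS, other, ...).
            "by_os": {"terms": {"field": "os"}},
            # Three price buckets mirroring the slide: <100, 100-200, >200 RON.
            "by_price": {
                "range": {
                    "field": "price",
                    "ranges": [
                        {"to": 100},
                        {"from": 100, "to": 200},
                        {"from": 200},
                    ],
                }
            },
        },
    }

body = facet_request("iphone")
print(json.dumps(body, indent=2))
```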

Page 10: (Elastic)search in big data

Wait. Fancy search == Big Data?

Page 11: (Elastic)search in big data

Fancy stuff isn’t free

iphone

☑ iOS ☐ other

☑ <100 RON ☐ 100-200 RON ☐ >200 RON

N requests for autocomplete

Did you mean...

iPhone 5

iPhone 4

iPhone 3

Galaxy S4

1 request for each of the stats

1 request for synonyms, 1 for exact matches, etc

1 request for synonyms, 1 for exact matches, etc

1 request for corrections
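The cost listed on this slide can be added up with simple arithmetic. The sketch assumes one autocomplete request per keystroke and two statistics panels (OS and price); both figures and the function name `requests_per_search` are assumptions for illustration.

```python
def requests_per_search(keystrokes, stats, match_variants, corrections=1):
    """Total backend requests behind a single user search.

    keystrokes      -- autocomplete requests (assumed one per typed character)
    stats           -- one request for each statistics panel
    match_variants  -- one request per variant (synonyms, exact matches, ...)
    corrections     -- one request for the "did you mean" check
    """
    return keystrokes + stats + match_variants + corrections

# Typing "iph", two stats panels, synonyms plus exact matches, one correction.
total = requests_per_search(keystrokes=len("iph"), stats=2, match_variants=2)
```

Under these assumptions one visible search costs eight backend requests, which is how a modest number of users can turn into a large query load.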

Page 12: (Elastic)search in big data

Distributed search. When one server doesn’t cut it
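Distributed search generally follows a scatter-gather pattern: every shard computes its own top hits and a coordinator merges them. The sketch below is a toy model of that idea, not Elasticsearch's implementation. The scoring (term count) and the names `search_shard` and `distributed_search` are assumptions for illustration.

```python
import heapq

def search_shard(docs, term, k):
    """Return one shard's top-k (score, title) hits, best first."""
    hits = [(title.lower().split().count(term), title) for title in docs]
    hits = [hit for hit in hits if hit[0] > 0]
    return heapq.nlargest(k, hits)

def distributed_search(shards, term, k):
    """Scatter the query to every shard, then gather and merge the results."""
    partial = [search_shard(docs, term, k) for docs in shards]  # scatter
    merged = heapq.nlargest(k, (hit for hits in partial for hit in hits))  # gather
    return [title for _, title in merged]

# Two shards holding different documents (titles from the slides).
shards = [["iPhone 5", "Galaxy S4"], ["iPhone 4", "iPhone 3"]]
top = distributed_search(shards, "iphone", k=2)
```

Because scores tie in this toy data, the tuple comparison falls back to the title, which is an artifact of the sketch rather than a ranking rule.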

Page 13: (Elastic)search in big data

Log Search

web_server01

database01

backend01

search engine

10:01 - webapp - DB connect error

10:00 - DB - I/O error

error
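The point of the slide is that one query covers logs from every server. A minimal in-memory sketch follows. Which host produced which line is an assumption, the `backend01` line is invented as a non-matching example, and `search_logs` is a hypothetical helper.

```python
# Log lines keyed by host; the host assignment is assumed.
LOGS = {
    "web_server01": ["10:01 - webapp - DB connect error"],
    "database01": ["10:00 - DB - I/O error"],
    "backend01": ["09:59 - backend - job finished"],  # invented, should not match
}

def search_logs(logs_by_host, term):
    """Return matching lines from all hosts, newest first."""
    hits = [
        line
        for lines in logs_by_host.values()
        for line in lines
        if term.lower() in line.lower()
    ]
    # The HH:MM prefix sorts lexicographically, so reverse order is newest first.
    return sorted(hits, reverse=True)

errors = search_logs(LOGS, "error")
```

Reading the two hits together shows the web application error following the database I/O error by a minute, a correlation that is hard to spot when each log lives on its own server.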

Page 14: (Elastic)search in big data

Log Analytics

unique IPs: 7584

iPhone 5

iPhone 4

Galaxy S4

best sellers

Romania: 200

France: 150

Hungary: 120

users per country

revenue per day
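The four metrics on this slide are all group-and-count operations over log events. The sketch below computes them in plain Python over a handful of invented events, so the numbers do not match the slide. The event fields, the choice to count users as distinct IPs, and the function name `analyze` are assumptions.

```python
from collections import Counter, defaultdict

# Invented sample events; only the field names mirror the slide's metrics.
EVENTS = [
    {"day": "day-1", "ip": "10.0.0.1", "country": "Romania", "product": "iPhone 5", "revenue": 100},
    {"day": "day-1", "ip": "10.0.0.2", "country": "France", "product": "iPhone 4", "revenue": 80},
    {"day": "day-2", "ip": "10.0.0.1", "country": "Romania", "product": "iPhone 5", "revenue": 100},
    {"day": "day-2", "ip": "10.0.0.3", "country": "Hungary", "product": "Galaxy S4", "revenue": 90},
]

def analyze(events):
    """Compute unique IPs, best sellers, users per country and revenue per day."""
    ips_by_country = defaultdict(set)
    revenue_per_day = defaultdict(int)
    for event in events:
        ips_by_country[event["country"]].add(event["ip"])
        revenue_per_day[event["day"]] += event["revenue"]
    return {
        "unique_ips": len({event["ip"] for event in events}),
        "best_sellers": Counter(event["product"] for event in events).most_common(),
        "users_per_country": {c: len(ips) for c, ips in ips_by_country.items()},
        "revenue_per_day": dict(revenue_per_day),
    }

report = analyze(EVENTS)
```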

Page 15: (Elastic)search in big data

Distributed search solutions

Elasticsearch

Solr

Others: SenseiDB, Sphinx…

SaaS: CloudSearch, Logsene...

built on top of Lucene

Page 16: (Elastic)search in big data

Document-oriented

Lucene awesome: index & store data, relevancy, fuzzy, suggesters...

...all wrapped up in JSON over HTTP

Elasticsearch
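"JSON over HTTP" means a search is an ordinary HTTP request with a JSON body. The sketch below builds such a request with the standard library and does not send it. The host `http://localhost:9200`, the index name `products`, the field `name` and the helper `build_search_request` are assumptions for illustration.

```python
import json
import urllib.request

def build_search_request(index, query_text, host="http://localhost:9200"):
    """Prepare (but do not send) a search request for the given index."""
    body = {"query": {"match": {"name": query_text}}}
    return urllib.request.Request(
        url=f"{host}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

request = build_search_request("products", "iphone")
# urllib.request.urlopen(request) would send it to a running cluster.
```

Any language with an HTTP client and a JSON library can talk to the cluster the same way, which is a large part of the appeal.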

Page 17: (Elastic)search in big data

Aggregations

revenue per day

unique IPs: 7584

Page 18: (Elastic)search in big data

Aggregations

revenue per day

unique IPs: 7584

Romania: 200

France: 150

Hungary: 120

unique IPs per country

Page 19: (Elastic)search in big data

Aggregations

revenue per day

Romania: 200

France: 150

Hungary: 120

unique IPs per country

unique IPs per country per day

Romania

unique IPs: 7584
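The progression across these three slides (a metric, then a metric per country, then per country per day) corresponds to nesting aggregations inside one another. A hedged sketch of such a request body follows. The field names `@timestamp`, `country`, `ip` and `revenue` and the helper name are hypothetical, and older releases spell the histogram option `interval` rather than `calendar_interval`.

```python
import json

def unique_ips_per_country_per_day():
    """Build a body that nests day -> country -> unique IP count."""
    return {
        "size": 0,  # only the aggregation results are needed, not the hits
        "aggs": {
            "per_day": {
                "date_histogram": {
                    "field": "@timestamp",
                    "calendar_interval": "day",  # "interval" in older releases
                },
                "aggs": {
                    "revenue": {"sum": {"field": "revenue"}},
                    "per_country": {
                        "terms": {"field": "country"},
                        "aggs": {
                            "unique_ips": {"cardinality": {"field": "ip"}},
                        },
                    },
                },
            }
        },
    }

body = unique_ips_per_country_per_day()
print(json.dumps(body, indent=2))
```

Each level of nesting adds a dimension to the result without requiring a separate request.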

Page 20: (Elastic)search in big data

Node 1

Page 21: (Elastic)search in big data

Node 1

Page 22: (Elastic)search in big data

Node 1 Node 2

Page 23: (Elastic)search in big data

Node 1 Node 2

Page 24: (Elastic)search in big data

Node 1 Node 2 Node 3

Page 25: (Elastic)search in big data

Node 1 Node 2 Node 3

Page 26: (Elastic)search in big data

Node 1 Node 2 Node 3

Page 27: (Elastic)search in big data

Node 1 Node 2

Page 28: (Elastic)search in big data

Node 1 Node 2
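The node diagrams on these slides survive only as labels. Assuming they show an index's shards spreading out as nodes join and consolidating as a node leaves, the idea can be sketched as follows. The round-robin layout is a deliberate simplification of real shard allocation (it ignores replicas, for instance), `zlib.crc32` stands in for the hash function Elasticsearch actually uses for routing, and `shard_for` and `allocate` are hypothetical names.

```python
import zlib

def shard_for(doc_id, primary_shards):
    """Pick a shard as hash(routing value) modulo the number of primary shards."""
    return zlib.crc32(doc_id.encode("utf-8")) % primary_shards

def allocate(primary_shards, nodes):
    """Spread shard numbers over the available nodes round-robin."""
    layout = {node: [] for node in nodes}
    for shard in range(primary_shards):
        layout[nodes[shard % len(nodes)]].append(shard)
    return layout

one_node = allocate(6, ["Node 1"])
three_nodes = allocate(6, ["Node 1", "Node 2", "Node 3"])
two_nodes = allocate(6, ["Node 1", "Node 2"])  # after Node 3 leaves
```

The number of primary shards stays fixed while the layout changes, so a document's shard never changes even though the node hosting that shard might.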

Page 29: (Elastic)search in big data

Big Data distributed search

search and real-time analytics

Page 30: (Elastic)search in big data

Big Data distributed search

search and real-time analytics

more search features

Page 31: (Elastic)search in big data

Big Data distributed search

search and real-time analytics

more search features

clients

usage (logs)