Copyright Elasticsearch 2013. Copying, publishing and/or distributing without written permission is strictly prohibited

Search and Analytics

(using Elasticsearch)

Costin Leau

Why search?

Search – what’s the big deal?

Basic/Metadata retrieval

“Find banks with more than (x) accounts”

“Find banks near my location”
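The two example questions above map naturally onto structured queries. A minimal sketch, assuming a hypothetical `banks` index whose documents carry a numeric `accounts` field and a geo-point field named `location`. Those names are illustrative and not from the talk, and the bodies use the current query DSL rather than the 2013-era syntax. The code only builds the request bodies; it does not contact a cluster.

```python
import json


def banks_with_more_accounts_than(x):
    """Range query: documents whose (hypothetical) `accounts` field is > x."""
    return {"query": {"range": {"accounts": {"gt": x}}}}


def banks_near(lat, lon, distance="5km"):
    """Geo-distance filter on the (hypothetical) `location` geo-point field."""
    return {
        "query": {
            "bool": {
                "filter": {
                    "geo_distance": {
                        "distance": distance,
                        "location": {"lat": lat, "lon": lon},
                    }
                }
            }
        }
    }


# Print both bodies as they would be sent to a `_search` endpoint.
print(json.dumps(banks_with_more_accounts_than(1000), indent=2))
print(json.dumps(banks_near(44.43, 26.10), indent=2))
```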

Search – What we’re all about

Search categories

Basic/Metadata retrieval

Full-text search

Highlighting

Geolocation

Fuzzy search (“did-you-mean”)

Natural Language

data stores

search engines
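Several of the categories above can be combined in a single request. A minimal sketch, assuming a hypothetical text field called `description`: a full-text `match` query with typo tolerance (`fuzziness`) plus highlighting of the matched fragments. Fuzzy matching tolerates misspellings in the query; it is not the same thing as a "did-you-mean" suggestion feature, which is handled separately. The body is only built and printed here.

```python
import json


def fuzzy_highlighted_search(field, text):
    """Build a search body: fuzzy full-text match on `field`, with highlighting."""
    return {
        # "AUTO" lets the engine pick the allowed edit distance from term length.
        "query": {"match": {field: {"query": text, "fuzziness": "AUTO"}}},
        # An empty object requests highlighting with default settings.
        "highlight": {"fields": {field: {}}},
    }


# The misspelled query text is deliberate, to show what fuzziness is for.
body = fuzzy_highlighted_search("description", "elastcsearch")
print(json.dumps(body, indent=2))
```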

‘Players’ in the search market

Search engines

- Google/Bing/Yahoo!/Ask.com/Yandex/Baidu

Open-Source

- Sphinx

- Apache Lucene

- Elasticsearch

- Solr

- Sensei

Enterprise Search

- Oracle Endeca / MDEX

- HP Autonomy

- Exalead

- IBM Enterprise Search

Elasticsearch

Open-Source Search & Analytics engine

- Structured & Unstructured Data

- Real Time

- Analytics capabilities (facets)

- REST based

Distributed

- Designed for the Cloud

- Designed for Big Data

Lightweight

Popular: >200K downloads/month
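"REST based" means a search is an HTTP request with a JSON body. A minimal sketch using only the Python standard library, assuming a node at `http://localhost:9200` and a hypothetical `banks` index with a `city` field (none of these come from the talk). The facets named on the slide were superseded by aggregations in later Elasticsearch releases, so the body below uses a `terms` aggregation as the closest current equivalent. The request is constructed but deliberately not sent.

```python
import json
import urllib.request


def build_search_request(base_url, index, body):
    """Return an unsent POST request for the index's `_search` endpoint."""
    url = f"{base_url.rstrip('/')}/{index}/_search"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# size=0 asks for the aggregation only, without returning matching documents.
body = {"size": 0, "aggs": {"banks_per_city": {"terms": {"field": "city"}}}}
request = build_search_request("http://localhost:9200", "banks", body)
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would execute the search against a running node.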

Users

Platform Adoption

http://www.thoughtworks.com/radar#platforms 2013

Use Case - Geolocation

Searches 50,000,000 venues every day using Elasticsearch

Use Case – Support/Reporting

Use Case - Centralized Logging
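The slide does not say how log events reach the cluster. One common route is the bulk API, which accepts newline-delimited JSON: an action line followed by the document itself, with a trailing newline after every line. A minimal sketch, assuming a hypothetical `logs` index and made-up events; it only builds the payload.

```python
import json


def to_bulk_payload(index, events):
    """Serialize events as bulk-API NDJSON: one action line, then one source line."""
    lines = []
    for event in events:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(event))                         # document source
    # Every line, including the last, must end with a newline.
    return "".join(line + "\n" for line in lines)


# Illustrative log events, not taken from the talk.
events = [
    {"level": "INFO", "message": "service started"},
    {"level": "ERROR", "message": "connection refused"},
]
payload = to_bulk_payload("logs", events)
print(payload, end="")
```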

Use Case - Pure Analytics

Search and Big Data

A Holistic View of a Big Data System

Diagram components:

- ETL

- Real-Time Streams

- Unstructured Data (HDFS)

- RT Semi-structured Database (HBase, Cassandra, Mongo)

- Big SQL (Greenplum, AsterData, etc.)

- Batch Processing

- Real-Time Processing (S4, Storm)

- Analytics

Hadoop eco-system

Hadoop Distributed File System (HDFS)

Map Reduce Framework (MapRed)

Elasticsearch + Hadoop

[Chart, two panels: "Writing" and "Reading / Querying". Each panel compares "Raw" and "w/ ES" for M/R, Pig and Hive on an axis running from 0 to 60; units and bar values are not recoverable.]

Explore data through (Elastic)Search

Thank you! @costinl

http://www.elasticsearch.org/