SF ElasticSearch Meetup 2013.04.06 - Monitoring

Monitoring tools for ElasticSearch SF Meetup 2013.03.06 Sushant Shankar Shyam Kuttikkad

description

Using Zabbix for system-level monitoring of ElasticSearch and SPM (http://sematext.com/spm/elasticsearch-performance-monitoring/index.html) for ElasticSearch-specific monitoring. These tools were crucial for optimizing index building performance as well as query performance. Also includes some general tips for index building and query performance.

Transcript of SF ElasticSearch Meetup 2013.04.06 - Monitoring

Page 1: SF ElasticSearch Meetup 2013.04.06 - Monitoring

Monitoring tools for ElasticSearch

SF Meetup 2013.03.06

Sushant Shankar, Shyam Kuttikkad

Page 2: SF ElasticSearch Meetup 2013.04.06 - Monitoring

• Why and how we use ElasticSearch
• Monitoring
  – Tools
  – Index Building
  – Query Performance

Page 3: SF ElasticSearch Meetup 2013.04.06 - Monitoring

Who is 33Across

• Social Sharing and Content Discovery platform
  – We help >600,000 publishers with content distribution, user engagement, and advertising monetization
  – 450 Fortune 1000 brand marketers leverage our unique social signals to deliver impactful advertising
• We develop Machine Learning algorithms operating on Big Data to:
  – Provide content sharing insights to Publishers
  – Build customized audience segments for advertising campaigns
  – Extract actionable insights out of social and interest data

www.33Across.com
www.tynt.com

Page 4: SF ElasticSearch Meetup 2013.04.06 - Monitoring

Data firehose of 30B monthly events, 1.25B cookies

- Interaction with web content
- Shares – images, copies
- Searches

Social Audiences
Behavior
Context
Knowledge

Real-time view

Build, understand, analyze

ElasticSearch!

Page 5: SF ElasticSearch Meetup 2013.04.06 - Monitoring

Production ElasticSearch cluster

Build index using MR job and Bulk API

Hardware
• 6 nodes, 24GB RAM
• 16GB for ES service
• 4 cores
• 3x 1.5TB drive

Index
• >1TB/index (replicated)
• ~300M documents
• ~5KB / document
• ~3 hours
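
The figures on this slide imply a rough indexing throughput. The sketch below works it out, assuming that "~3 hours" is the time to build one index, that KB means 1000 bytes, and that load is spread evenly over the 6 nodes. None of these assumptions is stated on the slide, and the function name is made up for illustration.

```python
# Back-of-envelope throughput from the approximate figures on the slide.
DOCS = 300_000_000        # ~300M documents
DOC_BYTES = 5 * 1000      # ~5KB per document (decimal KB assumed)
BUILD_SECONDS = 3 * 3600  # ~3 hours, assumed to be the index build time
NODES = 6                 # cluster size from the Hardware list


def build_rates(docs, doc_bytes, seconds, nodes):
    """Return rough size and rate estimates for one index build."""
    docs_per_sec = docs / seconds
    return {
        "raw_tb": docs * doc_bytes / 1e12,           # unreplicated source size
        "docs_per_sec": docs_per_sec,                # cluster-wide
        "mb_per_sec": docs_per_sec * doc_bytes / 1e6,
        "docs_per_sec_per_node": docs_per_sec / nodes,
    }


rates = build_rates(DOCS, DOC_BYTES, BUILD_SECONDS, NODES)
for name, value in rates.items():
    print(f"{name}: {value:,.1f}")
```

Under these assumptions the build handles roughly 27,800 documents per second cluster-wide (about 139 MB/s of source data), or about 4,600 documents per second per node.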

Page 6: SF ElasticSearch Meetup 2013.04.06 - Monitoring

System monitoring using Zabbix

Index Build

Page 7: SF ElasticSearch Meetup 2013.04.06 - Monitoring

ElasticSearch specific monitoring using SPM

Scalable Performance Monitoring (http://sematext.com/spm/index.html)

• Index stats – Total/Refreshed/Merged documents
• Shards – Total/Active/Relocating/Initializing
• Search – Request rate and latency
• Cache – {Filter, field} cache {count, evictions, size}
• Machine – CPU, Memory, JVM, GC, Network, Disk
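
The shard counters in the list above are also available directly from ElasticSearch's cluster health endpoint (GET /_cluster/health). The sketch below is not how SPM collects them; it only shows how the same numbers could be read by hand. The base URL, the helper names, and the sample response are assumptions made for illustration.

```python
import json
from urllib.request import urlopen

SHARD_KEYS = (
    "active_shards",
    "relocating_shards",
    "initializing_shards",
    "unassigned_shards",
)


def fetch_cluster_health(base_url="http://localhost:9200"):
    """GET /_cluster/health and decode the JSON body (needs a live cluster)."""
    with urlopen(base_url + "/_cluster/health") as resp:
        return json.loads(resp.read().decode("utf-8"))


def shard_summary(health):
    """Pick out the shard counters; a missing counter is treated as 0."""
    counts = {key: health.get(key, 0) for key in SHARD_KEYS}
    # "settled" means nothing is moving, starting up, or waiting for a node.
    counts["settled"] = (
        counts["relocating_shards"] == 0
        and counts["initializing_shards"] == 0
        and counts["unassigned_shards"] == 0
    )
    return counts


# Made-up response, shaped like the health API output, for offline use.
sample = {
    "status": "yellow",
    "active_shards": 46,
    "relocating_shards": 0,
    "initializing_shards": 2,
    "unassigned_shards": 0,
}
print(shard_summary(sample))
```

Polling this during an index build is one way to cross-check what a monitoring dashboard reports.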

Page 8: SF ElasticSearch Meetup 2013.04.06 - Monitoring

Index Building Optimization using Zabbix and SPM

Amount bulk indexed

# Shards

Time taken
CPU util.
Mem util.
Disk I/O
Network

Page 9: SF ElasticSearch Meetup 2013.04.06 - Monitoring

in practice…

Page 10: SF ElasticSearch Meetup 2013.04.06 - Monitoring

Debugging and Validating using SPM

Page 11: SF ElasticSearch Meetup 2013.04.06 - Monitoring

Index Building: Learnings

• 2 shards / CPU
• 10,000 documents (users) per indexing request
• Bulk API for our use case
• No replicas
• Refresh off (index.refresh_interval = -1)
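
The learnings above can be sketched as an index-settings body plus a helper that cuts a document stream into Bulk API requests of 10,000 documents each. This is a minimal illustration, not the speakers' MR job: the index name, document shape, and helper names are hypothetical, and sending the bodies over HTTP is left out. ElasticSearch versions of that era also expect a `_type` in each action line, which is why `doc_type` is an optional argument.

```python
import json
from itertools import islice

# Settings for the build phase: no replicas, refresh disabled.
BUILD_SETTINGS = {"index": {"number_of_replicas": 0, "refresh_interval": "-1"}}


def batches(items, size=10_000):
    """Yield lists of at most `size` items from any iterable."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def bulk_body(index, docs, doc_type=None):
    """Build a newline-delimited Bulk API body from (id, source) pairs."""
    lines = []
    for doc_id, source in docs:
        meta = {"_index": index, "_id": doc_id}
        if doc_type is not None:
            meta["_type"] = doc_type  # required by older ElasticSearch versions
        lines.append(json.dumps({"index": meta}))  # action line
        lines.append(json.dumps(source))           # document line
    return "\n".join(lines) + "\n"  # the bulk format requires a final newline


# Hypothetical documents: one per user.
docs = [(str(i), {"user": i}) for i in range(25_000)]
bodies = [bulk_body("users", chunk) for chunk in batches(docs)]
print(len(bodies), "bulk requests")
```

Each element of `bodies` would be sent as one POST to the `_bulk` endpoint, with `BUILD_SETTINGS` applied to the index before the build starts.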

Page 12: SF ElasticSearch Meetup 2013.04.06 - Monitoring

Query Performance: Learnings

• 1-2 Replicas (also for reliability)
• Turn refresh on again (5s default)
• Warm up effect (Index Warm up API 0.20+)
• Optimize API
• Simulate multiple users
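
A rough sketch of two of the points above: switching the index back to query-time settings, and simulating multiple users by running queries concurrently and timing them. The `search` callable is a stand-in for whatever client call issues the query, and all names here are hypothetical. The replica count and refresh interval are the values quoted on the slide.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Settings for the query phase, using the values quoted on the slide.
QUERY_SETTINGS = {"index": {"number_of_replicas": 1, "refresh_interval": "5s"}}


def timed(search, query):
    """Run one query and return its wall-clock latency in seconds."""
    start = time.perf_counter()
    search(query)
    return time.perf_counter() - start


def simulate_users(search, queries, users=8):
    """Run `queries` across `users` worker threads and summarize latency."""
    queries = list(queries)
    if not queries:
        raise ValueError("need at least one query")
    with ThreadPoolExecutor(max_workers=users) as pool:
        latencies = sorted(pool.map(lambda q: timed(search, q), queries))
    return {
        "count": len(latencies),
        "median_s": latencies[len(latencies) // 2],
        "max_s": latencies[-1],
    }


def fake_search(query):
    """Stand-in for a real search call; sleeps briefly instead of querying."""
    time.sleep(0.001)


stats = simulate_users(fake_search, [f"user:{i}" for i in range(20)], users=4)
print(stats)
```

Running the same query set once on a cold index and again after warm-up gives a simple way to observe the warm-up effect listed above.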

Page 13: SF ElasticSearch Meetup 2013.04.06 - Monitoring

QUERIES?

Page 15: SF ElasticSearch Meetup 2013.04.06 - Monitoring

Why we really need a search engine

… …

Batch! Good for complicated tasks (Machine Learning, Graph Algorithms, etc.)

Page 16: SF ElasticSearch Meetup 2013.04.06 - Monitoring

Warm Up: load into memory and cache

Page 17: SF ElasticSearch Meetup 2013.04.06 - Monitoring

Other cool features

• Custom Scoring functions
• Scripts – MVEL, Python
• Facets
• Exploring:
  – Real-time indexing
  – Indexing images, files, etc.
  – Parent-child relationships