Elasticsearch in production
Alex Brasetvik
@alexbrasetvik
How marketing thinks our users feel
How we developers sometimes feel
Who?
Co-founder of Found AS
7+ years of search, 2+ Elasticsearch
We manage hundreds of Elasticsearch clusters
… on Amazon's cloud
Agenda
Memory (and stability)
Security (and multi-tenancy)
Networking (and reliability)
Client (and resiliency)
Memory
Search engines crave memory
Caches, caches, caches
Field- and filter caches
Page cache
Index building
PostgreSQL
Verifies resource usage
Safe >>> fast
Uses disk if necessary
Elasticsearch trusts you
Built for speed
It'll jump if you ask it to
What could possibly go wrong?
OutOfMemoryError
Woah there
I ate all the memories
Your cluster may or may not work any more
May or may not work?
What else was happening at the time?
Corrupt cluster state, crashed Netty, …
In short: Don't end up there
Warning signs?
Monitor cache sizes and heap space
Outgrowing page cache: gradual slowdown
Outgrowing heap space: sudden crash
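A minimal sketch of the heap half of that monitoring advice. It assumes the node stats have already been fetched and parsed into a dict, and that each node reports `jvm.mem.heap_used_in_bytes` and `jvm.mem.heap_max_in_bytes`. Those field names, the 85% threshold and the function name are assumptions for illustration, not something the slides specify.

```python
def nodes_near_heap_limit(node_stats, threshold=0.85):
    """Return (node name, heap ratio) pairs for nodes at or above the threshold.

    `node_stats` is assumed to look like a parsed node stats response:
    {"nodes": {"<id>": {"name": ..., "jvm": {"mem": {...}}}}}
    """
    flagged = []
    for node_id, node in node_stats.get("nodes", {}).items():
        mem = node["jvm"]["mem"]
        ratio = mem["heap_used_in_bytes"] / mem["heap_max_in_bytes"]
        if ratio >= threshold:
            flagged.append((node.get("name", node_id), round(ratio, 2)))
    # Worst offenders first, so an alert can lead with them.
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical sample data, shaped like the assumed response.
sample_stats = {
    "nodes": {
        "a1": {"name": "node-a", "jvm": {"mem": {
            "heap_used_in_bytes": 900, "heap_max_in_bytes": 1000}}},
        "b2": {"name": "node-b", "jvm": {"mem": {
            "heap_used_in_bytes": 500, "heap_max_in_bytes": 1000}}},
    }
}
print(nodes_near_heap_limit(sample_stats))
```

Alerting well before the heap is full matters here because, as the slide says, running out of heap is a sudden crash rather than a gradual slowdown.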
Understand the memory profile
Test realistically
Bound cache sizes and flush thresholds
v0.90+ takes you longer with field filters, etc.
Large heaps are expensive to garbage collect
Keep heap < 32GiB (But test!)
Lots of page cache is good, though!
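The two sizing rules above can be sketched as a small calculation. The 31 GiB ceiling (a margin under the 32 GiB limit) and the "half of RAM to the heap" split are assumptions in the spirit of a common rule of thumb; the slides only say to stay under 32 GiB and to test.

```python
GIB = 1024 ** 3


def suggest_heap_bytes(total_ram_bytes, ceiling_bytes=31 * GIB, heap_share=0.5):
    """Suggest a JVM heap size, leaving the rest of RAM to the OS page cache.

    heap_share and ceiling_bytes are illustrative defaults, not measured values.
    """
    return min(int(total_ram_bytes * heap_share), ceiling_bytes)


for ram_gib in (16, 64, 128):
    heap = suggest_heap_bytes(ram_gib * GIB)
    page_cache = ram_gib * GIB - heap
    print(f"{ram_gib} GiB RAM: heap {heap / GIB:.0f} GiB, "
          f"left for page cache {page_cache / GIB:.0f} GiB")
```

On large machines the heap stops growing at the ceiling, and every extra gigabyte goes to the page cache.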
Security
Elasticsearch trusts everyone
Not its job to do auth(z)
You're the gatekeeper
_search
Read only?
Limit indexes / wrap with filters?
Protect the field caches
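One way a gatekeeper in front of `_search` might limit indexes and wrap queries with a filter. The index allowlist, the `tenant_id` field and the function name are hypothetical. The `bool` query with a `filter` clause is the syntax of later Elasticsearch versions; on a 0.90-era cluster the equivalent wrapper would be a `filtered` query.

```python
ALLOWED_INDEXES = {"products", "articles"}  # hypothetical allowlist


def build_tenant_search(index, user_query, tenant_id):
    """Wrap a user-supplied query so it can only match one tenant's documents."""
    if index not in ALLOWED_INDEXES:
        raise PermissionError(f"index {index!r} is not searchable")
    return {
        "query": {
            "bool": {
                # The caller's query still decides relevance...
                "must": [user_query],
                # ...but the gatekeeper's filter always applies.
                "filter": [{"term": {"tenant_id": tenant_id}}],
            }
        }
    }


body = build_tenant_search("products", {"match": {"title": "shoes"}}, "tenant-42")
print(body)
```

The client never sends a raw request body to Elasticsearch; it only supplies the inner query, and the gatekeeper decides the index and the filter.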
Arbitrary code execution
Elasticsearch has powerful scripting
Not sandboxed
On by default
Any website can reach your machine
http://127.0.0.1:9200/_search?callback=capture&source=…
Run in a virtual machine
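If search requests pass through a proxy you control, one defensive option is to reject any request body that mentions scripts at all. The sketch below walks a parsed request and flags keys named `script` or starting with `script_`. That key-naming rule is an assumption and deliberately conservative, so a document field that happens to be called `script` would also be rejected.

```python
def contains_script(node):
    """Return True if any key in the parsed request looks like a script clause."""
    if isinstance(node, dict):
        for key, value in node.items():
            name = str(key).lower()
            if name == "script" or name.startswith("script_"):
                return True
            if contains_script(value):
                return True
        return False
    if isinstance(node, list):
        return any(contains_script(item) for item in node)
    return False


harmless = {"query": {"match": {"description": "wingsuit"}}}
risky = {"query": {"match_all": {}},
         "script_fields": {"x": {"script": "..."}}}
print(contains_script(harmless), contains_script(risky))
```

This complements, rather than replaces, turning off dynamic scripting on the cluster itself and keeping port 9200 unreachable from browsers.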
Networking
Elasticsearch is distributed
Easy (for a distributed system)
Supports many usage patterns.
Quite common topology
High availability, right?
Obey or risk split brains …
… and irrecoverable data-loss
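The rule being obeyed here is presumably the majority (quorum) rule: a master may only be elected when more than half of the master-eligible nodes can see each other. A minimal sketch of the arithmetic follows. In Elasticsearch versions of that era the result would typically go into the `discovery.zen.minimum_master_nodes` setting, which is an assumption here since the slides do not name it.

```python
def minimum_master_nodes(master_eligible_nodes):
    """Smallest group that is a strict majority of the master-eligible nodes."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


def can_elect_master(reachable_nodes, master_eligible_nodes):
    """True if this side of a network partition still holds a majority."""
    return reachable_nodes >= minimum_master_nodes(master_eligible_nodes)


for total in (1, 2, 3, 5):
    print(total, "nodes -> quorum of", minimum_master_nodes(total))
```

With two nodes the quorum is two, so losing either node stops elections. Without the rule, each node could elect itself and the cluster would split. Three nodes is the smallest size that tolerates one failure.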
Stormy clouds
Zone vs instance failure
Thundering herds
Optimizing MTTR is not HA
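A thundering herd forms when many clients retry at the same moment after a failure. A common mitigation, sketched here as an assumption rather than something the slides prescribe, is exponential backoff with random jitter, so retries spread out over time. The base delay and cap are illustrative values.

```python
import random


def backoff_delay(attempt, base=0.5, cap=30.0, rng=random):
    """Seconds to wait before retry number `attempt` (0-based), with full jitter."""
    ceiling = min(cap, base * (2 ** attempt))
    # Picking uniformly below the ceiling keeps clients from retrying in lockstep.
    return rng.uniform(0, ceiling)


for attempt in range(5):
    print(f"attempt {attempt}: wait up to {min(30.0, 0.5 * 2 ** attempt):.1f}s, "
          f"chose {backoff_delay(attempt):.2f}s")
```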
Client considerations
Idempotent/retry-able requests
Use a connection pool.
_bulk / _msearch
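A sketch of how these client points fit together: a `_bulk` body whose document IDs are derived from the content, so replaying it overwrites rather than duplicates, and a retry loop that rotates through a pool of nodes. The `send` callable, the host names and the hashing scheme are hypothetical. Very old Elasticsearch versions also expect a `_type` in each action line.

```python
import hashlib
import itertools
import json


def bulk_body(index, docs):
    """Build a newline-delimited _bulk body with deterministic document IDs."""
    lines = []
    for doc in docs:
        canonical = json.dumps(doc, sort_keys=True).encode("utf-8")
        doc_id = hashlib.sha1(canonical).hexdigest()
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # _bulk bodies end with a newline


def send_with_retry(send, hosts, body, max_attempts=3):
    """Try hosts in round-robin order; `send(host, body)` is a hypothetical transport."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    pool = itertools.cycle(hosts)
    last_error = None
    for _ in range(max_attempts):
        host = next(pool)
        try:
            return send(host, body)
        except ConnectionError as error:
            last_error = error  # safe to retry: the request is idempotent
    raise last_error


def flaky_send(host, body):
    # Stand-in transport: the first node is down, the others answer.
    if host == "es-1:9200":
        raise ConnectionError(f"{host} unreachable")
    return f"accepted by {host}"


payload = bulk_body("logs", [{"msg": "a"}, {"msg": "b"}])
print(send_with_retry(flaky_send, ["es-1:9200", "es-2:9200"], payload))
```

Because the IDs depend only on the documents, a request that succeeded on the server but timed out on the client can be replayed without creating duplicates.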
Have enough memory
Have a majority of nodes
Don't allow arbitrary search requests
Use retryable requests
Alex over Trondheim, Tore Helgedagsrud
Elephant, Roy Costello
Wingsuit, Richard Schneider
Lightning Storm and Stars, Justin Ennis
Wingsuit flock, Richard Schneider
Oh salad, you so funny, Eatliver