Scaling Splunk 101
Quick Overview of Scaling Splunk with Commodity Hardware
Erik Swan, Oct 09
** Slides intentionally ugly, no designers were harmed during construction
Single Server Install: Commodity Architecture
[Diagram: Users; Splunk (all in one); data from Splunk Forwarders, Syslog, Files, etc.]
The simplest Splunk install is a single server that functions as both indexer and search head.
A single box can easily index 100-200G per day, BUT for fast searching it's best to use more than one box.
Improving Search and Indexing Performance
Splunk scales search and indexing performance horizontally by adding more indexers and in some cases scaling out a search tier.
By spreading the incoming load across more indexers you index faster.
Perhaps more importantly, by spreading the indexed data across more indexers your search performance improves linearly as well.
As a rule of thumb, every doubling of hardware doubles your indexing and search performance, so don't be shy about adding tens of servers.
RULE #1 – If your searches are slow, add another box!
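The linear-scaling rule of thumb above can be written out as simple arithmetic. This is a minimal sketch that assumes perfectly linear scaling, as the slide states; real deployments may fall short of it. The function name and the 60-second baseline are hypothetical, chosen only for illustration.

```python
def estimated_search_seconds(baseline_seconds: float,
                             baseline_indexers: int,
                             indexers: int) -> float:
    """Estimate search time after changing the indexer count.

    Assumes search time is inversely proportional to the number of
    indexers (the deck's linear-scaling rule of thumb).
    """
    if baseline_indexers < 1 or indexers < 1:
        raise ValueError("indexer counts must be at least 1")
    return baseline_seconds * baseline_indexers / indexers


if __name__ == "__main__":
    # Hypothetical baseline: a search that takes 60 s on a single indexer.
    for n in (1, 2, 4, 8):
        print(n, "indexer(s):", estimated_search_seconds(60.0, 1, n), "s")
```

Under this assumption, each doubling of indexers halves the estimated search time.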
Adding a Search Head
[Diagram: Users; Splunk Search Head; Splunk Indexer; data from Splunk Forwarders, Syslog, Files, etc.]
By splitting out a Search Head, search performance is improved and load is taken off the indexer for faster indexing.
It is best to add one sooner rather than later.
Best for volumes between 5-100G p/day
1 Indexer, 1 Search Head
Adding a second Indexer
[Diagram: Users; Splunk Search Head; 2 Splunk Indexers; data from Splunk Forwarders, Syslog, Files, etc.]
As volume goes up beyond 100G OR you want to improve search performance, it's best to add a second Indexer.
**Remember, adding indexers improves search performance linearly as well.
Best for volumes 20-200G p/day
2 Indexers, 1 Search Head
Adding additional Indexers
[Diagram: Users; Splunk Search Head; (n) Splunk Indexers; TBs/day from Splunk Forwarders and Syslog]
For every new ~100G p/day, or again to improve search performance, add another indexer.
RULE #1: If searches are slow, add another indexer.
For volumes from 200G-1T p/day
Assume 100G p/day:
Use Case #1: Log archival and some periodic troubleshooting
  1 Commodity Server
Use Case #2: Archival, troubleshooting and summary reporting
  1 Index Server, 1 Search Server
Use Case #3: Archival, troubleshooting, and reporting
  2 Index Servers, 1 Search Server
Use Case #4: Many (>2) users doing constant use
  3+ Index Servers, 1 Search Server
Adding additional Search Heads
[Diagram: Users; Load Balancer; (n) Splunk Search Heads, 1~4T each p/day; (n) Splunk Indexers; TBs/day from Splunk Forwarders and Syslog]
Adding more Search Heads is a convenient way to improve search performance.
Add an additional Search Head when:
1. It makes sense to partition users.
2. You want to offload summary or scheduled searches.
For every new ~TB p/day, add another search head.
For volumes > 2T p/day
(n) Indexers, each <100G p/day
(m) Search Heads, one for every ~1T p/day
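The two rules of thumb above (roughly one indexer per 100G p/day and one search head per 1T p/day) can be turned into a small calculator. This is a rough sketch only: the function and constant names are hypothetical, 1T is taken as 1000G, and the result is a baseline. The use-case figures elsewhere in this deck differ from it depending on search load.

```python
import math

# Rules of thumb taken from the slides; 1T is treated as 1000G (an assumption).
GB_PER_INDEXER_PER_DAY = 100
GB_PER_SEARCH_HEAD_PER_DAY = 1000


def size_deployment(daily_gb: float) -> tuple[int, int]:
    """Return (indexers, search_heads) for a given daily volume in GB."""
    if daily_gb <= 0:
        raise ValueError("daily_gb must be positive")
    indexers = max(1, math.ceil(daily_gb / GB_PER_INDEXER_PER_DAY))
    search_heads = max(1, math.ceil(daily_gb / GB_PER_SEARCH_HEAD_PER_DAY))
    return indexers, search_heads


if __name__ == "__main__":
    for volume in (100, 250, 2500):
        indexers, search_heads = size_deployment(volume)
        print(f"{volume}G p/day: {indexers} indexer(s), {search_heads} search head(s)")
```

Rounding up keeps each indexer at or below the ~100G p/day guideline.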
Assuming a load of 1T p/day:
Use Case #1: Log archival and some periodic troubleshooting
  4 Index Servers, 1 Search Server
Use Case #2: Archival, troubleshooting and some summary reporting
  8+ Index Servers, 1 Search Server
Use Case #3: Archival, troubleshooting, and reporting
  16+ Index Servers, 1 Search Server
Use Case #4: Many (>2) users doing constant use
  20+ Index Servers, 1 Search Server
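For quick reference, the use-case recommendations for 100G and 1T p/day can be held in a small lookup table. This sketch simply restates the figures from the slides; the dictionary layout and the `recommend` helper are hypothetical conveniences, not part of Splunk.

```python
# Use-case descriptions, numbered as on the slides.
USE_CASES = {
    1: "Log archival and some periodic troubleshooting",
    2: "Archival, troubleshooting and summary reporting",
    3: "Archival, troubleshooting, and reporting",
    4: "Many (>2) users doing constant use",
}

# Recommended hardware per daily volume and use case, copied from the slides.
RECOMMENDED = {
    "100G": {
        1: "1 commodity server",
        2: "1 index server, 1 search server",
        3: "2 index servers, 1 search server",
        4: "3+ index servers, 1 search server",
    },
    "1T": {
        1: "4 index servers, 1 search server",
        2: "8+ index servers, 1 search server",
        3: "16+ index servers, 1 search server",
        4: "20+ index servers, 1 search server",
    },
}


def recommend(volume: str, use_case: int) -> str:
    """Look up the slide's recommendation for a volume ("100G" or "1T")."""
    return RECOMMENDED[volume][use_case]


if __name__ == "__main__":
    for volume in RECOMMENDED:
        for number, description in USE_CASES.items():
            print(f"{volume} p/day, #{number} ({description}): {recommend(volume, number)}")
```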
Long term storage, add a SAN
[Diagram: Users; Load Balancer; Splunk Search Heads; (n) Splunk Indexers; Tier 1 SAN; TBs/day from Splunk Forwarders and Syslog]
Long-term storage cannot be kept on local commodity disk.
If you want to keep more data than fits on local indexer disk, Splunk can be configured to use a SAN or another storage device.
Best for keeping data from more than 30 days up to multiple years.
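Whether local disk is enough can be estimated with back-of-the-envelope arithmetic. The sketch below is illustrative only: the function names are hypothetical, and `on_disk_ratio` (how much disk one GB of raw data occupies once indexed) is an assumed placeholder that is not given in the slides and should be measured on your own data.

```python
def retention_days(local_disk_gb_per_indexer: float,
                   indexers: int,
                   daily_gb: float,
                   on_disk_ratio: float = 0.5) -> float:
    """Estimate how many days of data fit on local indexer disk.

    on_disk_ratio is an ASSUMED placeholder (disk used per GB of raw
    data); it is not a figure from the slides.
    """
    if indexers < 1 or daily_gb <= 0 or on_disk_ratio <= 0:
        raise ValueError("indexers, daily_gb and on_disk_ratio must be positive")
    total_disk_gb = local_disk_gb_per_indexer * indexers
    return total_disk_gb / (daily_gb * on_disk_ratio)


def needs_external_storage(local_disk_gb_per_indexer: float,
                           indexers: int,
                           daily_gb: float,
                           target_days: float,
                           on_disk_ratio: float = 0.5) -> bool:
    """True when the retention target exceeds what local disk can hold."""
    days = retention_days(local_disk_gb_per_indexer, indexers, daily_gb, on_disk_ratio)
    return days < target_days


if __name__ == "__main__":
    # Hypothetical deployment: 2 indexers with 2000 GB each, 200 GB p/day.
    print(retention_days(2000, 2, 200))               # days that fit locally
    print(needs_external_storage(2000, 2, 200, 365))  # one-year target
```

With these hypothetical numbers, local disk covers a little over a month, so a multi-year target would point toward a SAN or other external storage.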
Multi-datacenter or deployment
If you have multiple data centers, it is often best to leave the data local and use distributed search between two deployments.
If you have data that naturally partitions such that users would rarely search across the data, partitioning entire deployments can help.
Separate deployments are also useful for disaster recovery (DR).
Additional Scaling Topics
Summary Indexing – If your searches are slow, consider using summary indexing:
1. video - http://www.splunk.com/view/SP-CAAACZW
2. docs - http://www.splunk.com/base/Documentation/4.0.5/User/UseSummaryIndexingForIncreasedReportingEfficiency
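The idea behind summary indexing can be illustrated outside Splunk: roll raw events up into a much smaller summary once, then run reports against the summary instead of the raw data. The Python below is a conceptual sketch only, not how Splunk implements or configures the feature (see the docs above for that); the function names and sample events are hypothetical.

```python
from collections import Counter


def build_summary(events):
    """Roll raw (epoch_seconds, key) events up into per-hour counts."""
    summary = Counter()
    for timestamp, key in events:
        hour_start = timestamp - timestamp % 3600  # truncate to the hour
        summary[(hour_start, key)] += 1
    return summary


def report_from_summary(summary, key):
    """Total count for a key, computed from the summary, not the raw events."""
    return sum(count for (_, k), count in summary.items() if k == key)


if __name__ == "__main__":
    # Hypothetical raw events: (epoch seconds, status).
    events = [(0, "error"), (10, "ok"), (3599, "error"), (3600, "error"), (7300, "ok")]
    summary = build_summary(events)
    print(len(events), "raw events reduced to", len(summary), "summary rows")
    print("errors:", report_from_summary(summary, "error"))
```

The report touches only the summary rows, which is why reporting over long time ranges gets faster as raw volume grows.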
Routing High Volume data to Separate Index – If you are searching or reporting on a source that is dwarfed by the volume of another source, you can partition data such that the high volume source is in its own index:
3. docs - http://www.splunk.com/base/Documentation/latest/Admin/Setupmultipleindexes#Why_have_multiple_indexes.3F