Splunk Dynamic lookup

Transcript of Splunk Dynamic lookup

Page 1: Splunk Dynamic lookup

Dynamic Lookups

Page 2: Splunk Dynamic lookup

Agenda

Lookups in General

Static Lookups

Dynamic Lookups
- Retrieve fields from a web site
- Retrieve fields from a database
- Retrieve fields from a persistent cache

Page 3: Splunk Dynamic lookup

Enrich Your Events with Fields from External Sources

Page 4: Splunk Dynamic lookup

Splunk: The Engine for Machine Data

Logfiles, Configs, Messages, Traps, Alerts, Metrics, Scripts, Tickets, Changes

Linux/Unix: Configurations, syslog, File system, ps, iostat, top
Windows: Registry, Event logs, File system, sysinternals
Networking: Configurations, syslog, SNMP, netflow
Databases: Configurations, Audit/query logs, Tables, Schemas
Applications: Web logs, Log4J, JMS, JMX, .NET events, Code and scripts
Virtualization & Cloud: Hypervisor, Guest OS, Apps, Cloud

Customer Facing Data: Click-stream data, Shopping cart data, Online transaction data
Outside the Datacenter: Manufacturing, logistics…, CDRs & IPDRs, Power consumption, RFID data, GPS data

Page 9: Splunk Dynamic lookup

Interesting Things to Lookup

• User’s Mailing Address
• Error Code Descriptions
• Product Names
• Stock Symbol (from CUSIP)
• External Host Address
• Database Query
• Web Service Call for Status
• Geo Location

Page 10: Splunk Dynamic lookup

Other Reasons For Lookup

• Bypass static developer or vendor that does not enrich logs
• Imaginative correlations
• Example: A website URL with “Like” or “Dislike” count stored in external source
• Make your data more interesting
• Better to see textual descriptions than arcane codes

Page 11: Splunk Dynamic lookup

Agenda

Lookups in General

Static Lookups

Dynamic Lookups
- Retrieve fields from a web site
- Retrieve fields from a database
- Retrieve fields from a persistent cache

Page 12: Splunk Dynamic lookup

Static vs. Dynamic Lookup

Static: External data comes from a CSV file

Dynamic: External data comes from the output of an external script, which resembles a CSV file

Page 13: Splunk Dynamic lookup

Static Lookup Review

• Pick the input fields that will be used to get output fields
• Create or locate a CSV file that has all the fields you need in the proper order
• Tell Splunk via the Manager about your CSV file and your lookup
• You can also define lookups manually via props.conf and transforms.conf
• If you use automatic lookups, they will run every time the source, sourcetype or associated host stanza is used in a search
• Non-automatic lookups run only when the lookup command is invoked in the search

Page 14: Splunk Dynamic lookup

Example Static Lookup Conf Files

props.conf

[access_combined]
lookup_http = http_status status OUTPUT status_description, status_type

transforms.conf

[http_status]
filename = http_status.csv
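To make the mechanics of a static lookup concrete, the Python sketch below mimics what the configuration above asks Splunk to do: index a CSV table by the input field (status) and copy the matching row's output fields onto an event. The CSV rows, the sample event, and the helper names load_lookup and enrich are illustrative assumptions. They are not the real contents of http_status.csv or anything Splunk ships.

```python
import csv
import io

# Hypothetical excerpt of a lookup table; the real http_status.csv may differ.
HTTP_STATUS_CSV = """status,status_description,status_type
200,OK,Successful
404,Not Found,Client Error
500,Internal Server Error,Server Error
"""

def load_lookup(csv_text, key_field):
    """Index the CSV rows by the input (key) field."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key_field]: row for row in reader}

def enrich(event, table, key_field):
    """Return a copy of the event with the matching row's fields added."""
    enriched = dict(event)
    row = table.get(event.get(key_field))
    if row:
        for name, value in row.items():
            # Keep fields the event already has; add only the missing ones.
            enriched.setdefault(name, value)
    return enriched

table = load_lookup(HTTP_STATUS_CSV, "status")
event = {"clientip": "10.1.1.1", "status": "404"}
print(enrich(event, table, "status"))
```

An event whose status is not in the table comes back unchanged, which is also how an unmatched lookup behaves: no output fields are added.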

Page 15: Splunk Dynamic lookup

Permissions

Define lookups via Splunk Manager and set permissions there

local.meta

[lookups/http_status.csv]
export = system

[transforms/http_status]
export = system

Page 16: Splunk Dynamic lookup

Example Automatic Static Lookup

Page 17: Splunk Dynamic lookup

Agenda

Lookups in General

Static Lookups

Dynamic Lookups
- Retrieve fields from a web site
- Retrieve fields from a database
- Retrieve fields from a persistent cache

Page 18: Splunk Dynamic lookup

Dynamic Lookups

• Write the script to simulate access to external source

• Test the script with one set of inputs

• Create the Splunk Version of the lookup script

• Register the script with Splunk via Manager or conf files

• Test the script explicitly before using automatic lookups

Page 19: Splunk Dynamic lookup

Lookups vs Custom Command

• Use dynamic lookups when returning fields given input fields

• Standard use case for users who already are familiar with lookups

• Use a custom command when doing MORE than a lookup

• Not all use cases involve just returning fields

• Decrypt event data

• Translate event data from one format to another with new fields (e.g. FIX)

Page 20: Splunk Dynamic lookup

Write/Test External Field Gathering Script

External Data in Cloud

Your Python Script

Send: Input Fields

Return: Output Fields

Page 21: Splunk Dynamic lookup

Example Script to Test External Lookup

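# Given a host, find the corresponding IP addresses
import socket

def mylookup(host):
    try:
        # gethostbyname_ex returns (hostname, aliaslist, ipaddrlist)
        hostname, aliaslist, ipaddrlist = socket.gethostbyname_ex(host)
        return ipaddrlist
    except socket.error:
        # Resolution failed: return an empty list
        return []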

Page 22: Splunk Dynamic lookup

External Field Gathering Script with Splunk

External Data in Cloud

Your Python Script

Return: Output Fields

Page 23: Splunk Dynamic lookup

Script for Splunk Simulates Reading Input CSV

hostname, ip

a.b.c.com

zorrosty.com

seemanny.com

Page 24: Splunk Dynamic lookup

Output of Script Returns Logically Complete CSV

hostname, ip

a.b.c.com, 1.2.3.4

zorrosty.com, 192.168.1.10

seemanny.com, 10.10.2.10
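The two CSV slides above describe a simple contract: the script receives rows in which the output column is empty and writes back the same rows with that column filled in. A minimal sketch of that round trip follows. The FAKE_DNS table and the resolve and run_lookup helpers are hypothetical stand-ins for a real external source, and the sample uses an unpadded "hostname,ip" header.

```python
import csv
import io

# Hypothetical stand-in for an external source (DNS, web service, database).
FAKE_DNS = {"a.b.c.com": "1.2.3.4", "zorrosty.com": "192.168.1.10"}

def resolve(hostname):
    """Return an IP for the hostname, or '' when nothing is known."""
    return FAKE_DNS.get(hostname, "")

def run_lookup(infile, outfile, in_field="hostname", out_field="ip"):
    """Read a partially filled CSV and write it back with out_field completed."""
    reader = csv.DictReader(infile)
    writer = csv.DictWriter(outfile, fieldnames=reader.fieldnames,
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Only look up rows that have an input value and no output value yet.
        if row[in_field] and not row[out_field]:
            row[out_field] = resolve(row[in_field])
        writer.writerow(row)

# Simulate the CSV the script would receive on standard input.
sample = io.StringIO("hostname,ip\na.b.c.com,\nzorrosty.com,\n")
out = io.StringIO()
run_lookup(sample, out)
print(out.getvalue())
```

In a script run as an external lookup, sys.stdin and sys.stdout would take the place of the two StringIO objects.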

Page 25: Splunk Dynamic lookup

transforms.conf for Dynamic Lookup

[NameofLookup]

external_cmd = <name>.py field1….fieldN

external_type = python

fields_list = field1, …, fieldN

Page 26: Splunk Dynamic lookup

Example Dynamic Lookup conf files

transforms.conf

# Note: this is an explicit lookup
[whoisLookup]
external_cmd = whois_lookup.py ip whois
external_type = python
fields_list = ip, whois

Page 27: Splunk Dynamic lookup

Dynamic Lookup Python Flow

def lookup(input):
    Perform external lookup based on input.
    Return result

main()
    Check standard input for CSV headers.
    Write headers to standard output.
    For each line in standard input (input fields):
        Gather input fields into a dictionary (key-value structure)
        ret = lookup(input fields)
        If ret:
            Send to standard output input values and return values from lookup

Page 28: Splunk Dynamic lookup

Whois Lookup

import csv
import sys

def main():
    if len(sys.argv) != 3:
        print("Usage: python whois_lookup.py [ip field] [whois field]")
        sys.exit(0)
    ipf = sys.argv[1]       # name of the IP field
    whoisf = sys.argv[2]    # name of the whois field
    r = csv.reader(sys.stdin)
    w = None
    header = []
    first = True
    …

Page 29: Splunk Dynamic lookup

Whois Lookup (cont.) to Read CSV Header

    # First read the "CSV header" and output the field names
    for line in r:
        if first:
            header = line
            if whoisf not in header or ipf not in header:
                print("IP and whois fields must exist in CSV data")
                sys.exit(0)
            csv.writer(sys.stdout).writerow(header)
            w = csv.DictWriter(sys.stdout, header)
            first = False
            continue
        …

Page 30: Splunk Dynamic lookup

Whois Lookup (cont.) to Populate Input Fields

        # Read the line and populate the values for the
        # input fields (IP address in our case)
        result = {}
        i = 0
        while i < len(header):
            if i < len(line):
                result[header[i]] = line[i]
            else:
                result[header[i]] = ''
            i += 1

Page 31: Splunk Dynamic lookup

Whois Lookup (cont.) to Populate Input Fields

        # Perform the whois lookup if necessary
        if len(result[ipf]) and len(result[whoisf]):
            w.writerow(result)
        # Else call external website to get whois field from
        # the ip address as the key
        elif len(result[ipf]):
            result[whoisf] = lookup(result[ipf])
            if len(result[whoisf]):
                w.writerow(result)

Page 32: Splunk Dynamic lookup

Whois Lookup Function

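import urllib

LOCATION_URL = "http://some.url.com?query="

# Given an ip, return the whois response
def lookup(ip):
    try:
        # urllib.urlopen is the Python 2 API; Python 3 moved it to
        # urllib.request.urlopen
        whois_ret = urllib.urlopen(LOCATION_URL + ip)
        lines = whois_ret.readlines()
        return lines
    except IOError:
        return ''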

Page 33: Splunk Dynamic lookup

Database Lookup

• Acquire proper modules to connect to the database

• Connect and authenticate to database

• Use a connection pool if possible

• Have lookup function query the database

• Return a list([]) of results

Page 34: Splunk Dynamic lookup

Database Lookup vs. Database Sent To Index

• Well, it depends…
• Use a lookup when:
  • Using needle-in-the-haystack searches with a few users
  • Using form searches returning few results
• Index the database table or view when:
  • Having LOTS of users and ad hoc reporting is needed
  • It’s OK to have “stale” data (N minutes old) for a dynamic database

Page 35: Splunk Dynamic lookup

Example Database Lookup using MySQL

# First connect to DB outside of the for loop
conn = MySQLdb.connect(host="localhost", user="name of user",
                       passwd="password", db="Name of DB")
cursor = conn.cursor()

Page 36: Splunk Dynamic lookup

Example Database Lookup (cont.) using MySQL

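import MySQLdb
…

# Given a city, find its country
def lookup(city, cur):
    try:
        selString = "SELECT country FROM city_country WHERE city = %s"
        # Pass the city as a query parameter rather than concatenating it,
        # so quotes in the value cannot break the SQL statement
        cur.execute(selString, (city,))
        row = cur.fetchone()
        return row[0]
    except Exception:
        # No matching row or a database error: return an empty result
        return []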
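The same lookup function can be tried out without a MySQL server. The sketch below uses Python's built-in sqlite3 module as a stand-in database, so the driver and its ? placeholder style differ from MySQLdb. The table contents are made up for the demonstration, and this variant returns an empty string rather than a list when no row matches.

```python
import sqlite3

def lookup(city, cur):
    """Return the country for a city, or '' when the city is unknown."""
    # The ? placeholder lets the driver quote the value safely.
    cur.execute("SELECT country FROM city_country WHERE city = ?", (city,))
    row = cur.fetchone()
    return row[0] if row else ""

# Build a throwaway in-memory table; the rows are invented for this demo.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE city_country (city TEXT PRIMARY KEY, country TEXT)")
cur.executemany("INSERT INTO city_country VALUES (?, ?)",
                [("Paris", "France"), ("Lima", "Peru")])
conn.commit()

print(lookup("Paris", cur))
print(repr(lookup("O'Fallon", cur)))  # a quote in the key does not break the query
```

Opening the connection and cursor once, outside the per-row loop, keeps the cost of each lookup down to a single query.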

Page 37: Splunk Dynamic lookup

Lookup Using Key Value Persistent Cache

• Download and install Redis
• Download and install Redis Python module
• Import Redis module in Python and populate key value DB
• Import Redis module in lookup function given to Splunk to lookup a value given a key

Redis is an open source, advanced key-value store.

Page 38: Splunk Dynamic lookup

Redis Lookup

import sys

### CHANGE PATH according to your Redis install ###
sys.path.append("/Library/Python/2.6/…/redis-2.4.5-py.egg")

import redis

def main():
    # Connect to Redis; change for your distribution
    pool = redis.ConnectionPool(host='localhost', port=6379, db=0)
    redp = redis.Redis(connection_pool=pool)

Page 39: Splunk Dynamic lookup

Redis Lookup (cont.)

def lookup(redp, mykey):
    try:
        return redp.get(mykey)
    except Exception:
        # Treat any Redis error as "no value found"
        return ""

Page 40: Splunk Dynamic lookup

Combine Persistent Cache with External Lookup

• For data that is “relatively static”:
  • First see if the data is in the persistent cache
  • If not, look it up in the external source such as a database or web service
  • If results come back, add results to the persistent cache and return results
• For data that changes often, you will need to create your own cache retention policies
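The cache-first flow and a simple retention policy can be sketched in a few lines of Python. Everything here is an illustrative assumption: TTLCache is a tiny in-memory stand-in for a persistent cache such as Redis, cached_lookup is a hypothetical helper, and fake_whois replaces the real external call so the example runs offline.

```python
import time

class TTLCache:
    """Tiny in-memory stand-in for a persistent cache such as Redis."""

    def __init__(self, ttl_seconds, clock=time.time):
        self.ttl = ttl_seconds
        self.clock = clock
        self.store = {}  # key -> (value, expiry timestamp)

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self.store[key]  # retention policy: drop stale entries
            return None
        return value

    def set(self, key, value):
        self.store[key] = (value, self.clock() + self.ttl)

def cached_lookup(cache, key, fetch):
    """Check the cache first; on a miss call fetch() and remember the result."""
    value = cache.get(key)
    if value is not None:
        return value
    value = fetch(key)
    if value:
        cache.set(key, value)
    return value

calls = []

def fake_whois(ip):
    """Pretend external source that records how often it is called."""
    calls.append(ip)
    return "owner-of-" + ip

cache = TTLCache(ttl_seconds=300)
first = cached_lookup(cache, "1.2.3.4", fake_whois)
second = cached_lookup(cache, "1.2.3.4", fake_whois)
print(first, second, len(calls))
```

The second call is served from the cache, so the external source is contacted only once. With Redis itself, key expiry (for example the EXPIRE command) can play the role of the hand-written time-to-live check.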

Page 41: Splunk Dynamic lookup

Combining Redis with Whois Lookup

def lookup(redp, ip):
    try:
        ret = redp.get(ip)
        if ret is not None and ret != '':
            # Cache hit: return the stored whois data
            return ret
        else:
            # Cache miss: ask the external source, then remember the answer
            whois_ret = urllib.urlopen(LOCATION_URL + ip)
            lines = whois_ret.readlines()
            if lines:
                redp.set(ip, lines)
            return lines
    …
    except:

Page 42: Splunk Dynamic lookup

Where do I get the add-ons from today? Splunkbase!

Add-On | Download Location | Release
Whois | http://splunk-base.splunk.com/apps/22381/whois-add-on | 4.x
DBLookup | http://splunk-base.splunk.com/apps/22394/example-lookup-using-a-database | 4.x
Redis Lookup | http://splunk-base.splunk.com/apps/27106/redis-lookup | 4.x
Geo IP Lookup (not in these slides) | http://splunk-base.splunk.com/apps/22282/geo-location-lookup-script-powered-by-maxmind | 4.x

Page 43: Splunk Dynamic lookup

Conclusion

Lookups are a powerful way to enhance your search experience beyond indexing the data.

Page 44: Splunk Dynamic lookup

Thank You