Splunk DBX 1.0.9 DeployDBX

Splunk DB Connect 1.0.9

Deploy and Use Splunk DB Connect

Generated: 5/16/2013 5:58 pm

Copyright © 2013 Splunk, Inc. All Rights Reserved

Table of Contents

Introduction
    About Splunk DB Connect
    How Splunk DB Connect fits into the Splunk picture
    How to get support and find out more information about Splunk

Before you deploy
    Deployment requirements
    Release notes
    Security and access controls
    Architecture and performance considerations

Install Splunk DB Connect
    Install and configure Splunk DB Connect
    Install required database JDBC drivers for MySQL and Oracle if needed
    Define and add a new/custom database to DB Connect

Configure and use Splunk DB Connect
    Add or manage a database connection
    Configure database inputs
    Set up a database lookup table
    Use the Splunk DB Connect search commands
    Troubleshoot Splunk DB Connect

Configuration file reference
    Configuration file reference
    database.conf.spec
    database_types.conf.spec
    dblookup.conf.spec
    dboutput.conf.spec
    java.conf.spec
    inputs.conf.spec

Introduction

About Splunk DB Connect

This app enables you to enrich and combine machine data with database data. Splunk DB Connect enables you to easily configure database queries and lookups in minutes via the Splunk user interface.

Quickly deploy Splunk for real-time collection, indexing, analysis and visualizations of machine data and then import and index data already stored in your database for additional analytic insight. Furthermore, database lookups enable you to reference fields in an external database that match fields in your event data. Using this match, you can enrich your event data by adding more meaningful information and searchable fields to it.

How Splunk DB Connect fits into the Splunk picture

Splunk DB Connect is one of a variety of apps and add-ons available within the Splunk ecosystem. All Splunk apps and add-ons run on top of a core Splunk installation, so you'll be installing Splunk first, and then installing Splunk DB Connect.

• For specifics about what you'll install where, see "Install Splunk DB Connect" in this manual.
• For details about apps and add-ons, refer to "What are apps and add-ons?" in the Splunk Admin Manual.
• To download Splunk, visit the download page on splunk.com.
• To get more apps and add-ons, visit Splunkbase.

How to get support and find out more information about Splunk

Splunk DB Connect version 1.0.8 and later is officially Splunk Supported.


To file a support case about Splunk DB Connect, send email to [email protected] or use the Support Portal.

Find more information about Splunk

You have a variety of options for finding more information about Splunk:

• The core Splunk platform documentation
• Splunk Answers
• The #splunk IRC channel on EFNET


Before you deploy

Deployment requirements

What databases are supported?

Splunk tests and supports connecting to the following databases:

• Oracle Database
• Microsoft SQL Server
• MySQL

Additionally, though unsupported, the following databases can be connected to out of the box:

• Sybase
• PostgreSQL
• SQLite
• H2
• HyperSQL
• Generic ODBC support

You can also add your own database types by providing JDBC drivers.

What versions of Splunk are supported?

Splunk 4.3 or later.

What operating systems are supported?

Splunk DB Connect runs on the Splunk-supported versions of the following:

• Linux
• Mac OS X
• Windows Server 2003/2008R2
• Windows XP/7 (for development/testing purposes)


Additional software requirements

The following are required before deploying Splunk DB Connect:

• A Java Runtime Environment (JRE) Version 1.6 or above
    ♦ Do not use a hotspot (client mode only) JVM.
    ♦ http://www.oracle.com/technetwork/java/javase/downloads/index.html

Deployment Checklist

For a quick install, you'll want to have:

• The Splunk DB Connect bits
• Connectivity to the database (machine to query, path through network, etc)
• Credentials to access the data in the database
• Required queries or schema
• JDBC drivers for MySQL and Oracle and anything else listed on the Splunk DB Connect's Splunkbase details page

Release notes

This topic contains listings of known issues and resolved issues for this release.

Known issues

The following issues have been reported in this release of Splunk DB Connect:

• Manual use of local = 1 is required for lookup in distributed environment. Refer to this topic for details. (DBX-92)
• Lookup failed due to spaces in table column name. Workaround is to use an advanced database lookup with custom sql like: "SELECT [the field] as the_field" (DBX-100)
• DB Connect does not work with Splunk search head pooling, the JavaBridge doesn't start (DBX-14)
• In dbquery view, syntax highlighted editor doesn't work on IE7 and 8. Workaround is to disable syntax highlighting in the UI. (DBX-23)
• CSS for autocomplete in Splunk Manager displays incorrectly in IE 7 and 8. (DBX-23)

Changelog

The following issues have been resolved in this release of Splunk DB Connect:

• Can't modify permissions from the UI for dbquery and dbinfo commands. Workaround by modifying [commands] stanza from default.meta into local.meta (DBX-105)
• Race condition in Fill All Columns button in lookups manager UI can result in %s/%s popup message which goes away after a few moments. (DBX-90)
• Lookup only grabs the first match in the DB, even though the lookup should be returning multiple rows (DBX-102). This is now configurable in dblookup.conf.
• Conflicting dblookup.conf stanzas cause "Script for lookup table <lookup name> returned error code 1. Results may be incorrect" error (DBX-119)
• Postgres issue with OOM in Java heap space (DBX-127)
• Passwords are not automatically encrypted if they are not entered in the UI (DBX-120)
• Fetch Database Names button logs password in web access logs (DBX-125)
• Can't connect to Access MDB via ODBC. Validation fails (DBX-117)
• dbquery did not return the columns in the correct order (DBX-130)

Security and access controls

This topic describes how Splunk DB Connect handles credentials for database access.

Access to Database Connections

In Splunk DB Connect versions 1.0.8 and earlier, database connection objects cannot be restricted to a particular role. When creating a database connection, the credentials you use will be implicitly used by every user that has access to dbquery, dblookup, or any other commands that use the connection. For instance, dbquery myConnection "SELECT * FROM Audit_Table" will not check whether the executing user has rights to the myConnection object. You can, however, limit which roles have access to the dbquery command. By default, only admins have access to dbquery, dblookup, and dboutput commands.

Make sure you use a database account with appropriately limited permissions. The recommended approach to database security (for both read-only and read-write connections) is to limit the permissions of the database user specified in the database connection to the minimum necessary to fulfil its tasks. That is, the user should only have read access (SELECT) to the required tables and views. For dboutput, the user should also be granted limited write access (INSERT, UPDATE). This configuration must be done on the DBMS side, so describing the necessary steps for each DBMS type is out of scope for these docs.

An additional mitigation is to configure the database connection as read-only. Finally, you can also use separate search heads to give different users access to different database connections.

The config for default permissions is found in $SPLUNK_HOME/etc/apps/dbx/metadata/default.meta:

[]
access = read : [ admin ], write : [ admin ]

### Manager ###

[manager]
access = read : [ * ], write : [ admin ]
export = system

[manager/databases]
access = read : [ admin ], write : [ admin ]
export = system

[manager/dbmon]
access = read : [ admin ], write : [ admin ]
export = system

[manager/dblookups]
access = read : [ admin ], write : [ admin ]
export = system

### Commands ###

[commands]
access = read : [ admin ], write : [ admin ]
export = system

[commands/dbquery]
access = read : [ admin ], write : [ admin ]
export = system

[commands/dbinput]
access = read : [ admin ], write : [ admin ]
export = system

[commands/dbinfo]
access = read : [ admin ], write : [ admin ]
export = system

[commands/dbmonpreview]
access = read : [ admin ], write : [ admin ]
export = none

### Other settings ###

[inputs/dbmon-*]
access = read : [ admin ], write : [ admin ]

[savedsearches]
access = read : [ admin ], write : [ admin ]
export = none

[props]
access = read : [ * ], write : [ admin, power ]
export = system

[transforms]
access = read : [ * ], write : [ admin, power ]
export = system

[eventtypes]
access = read : [ * ], write : [ admin, power ]
export = system

[lookups]
export = system

[searchscripts]
access = read : [ * ], write : [ admin ]
export = system

Read-Only Connections

When a database connection is configured to be read-only (this can be done through the manager UI), DB Connect will set the JDBC connection flag "read-only" to true. This is done with the following Java method: http://docs.oracle.com/javase/6/docs/api/java/sql/Connection.html#setReadOnly(boolean)

It is up to each JDBC driver to implement this setting when establishing the connection. In addition, dbquery uses the JDBC method executeQuery(String), which is designed to only execute queries (as opposed to updates), instead of execute(String). With this method, most JDBC drivers do the appropriate thing to validate statements before they are actually sent to the database server. Finally, the dboutput command does an additional check to see if the database connection is configured as read-only before submitting the statement to the JDBC driver for execution.

Granting non-admin user access to dbquery

As an admin, you may build some dashboards that use dbquery and want to share those dashboards with non-admin users. Because of a known issue, DB Connect version 1.0.8 and earlier prevents you from enabling access to dbquery from the UI. For a work-around, please see: http://splunk-base.splunk.com/answers/76006/dbquery-command-permissions

Keep in mind that enabling this access allows users to do ad-hoc queries; moreover, if they know the name of a DB Connection they can issue ad-hoc queries to that database. Please see the suggested mitigations in the Access to Database Connections section above.

Architecture and performance considerations

If you have a trial or personal Splunk deployment running on a single host (indexer and Splunk Web both running on the same system), you can install Splunk DB Connect on this system.

To use Splunk DB Connect for reporting or database lookups in a distributed search environment, you must install it on a search head.

Note: In a distributed search environment, you must force the lookup to be performed on the search head where Splunk DB Connect is installed. To force the lookup to be performed locally, add local=1 after the lookup command.

Example:

index=test | lookup local=1 mysql_table ip_address as clientip OUTPUT host | table clientip, host

This is not currently possible when using automatic lookups. For more information about automatic lookups, refer to this topic in the core Splunk platform documentation.

For database inputs, depending on the anticipated volume of your deployment, there are 3 options:

• Small scale: install Splunk DB Connect on a search head for monitoring and configure it to forward events to the indexer(s).
• Medium scale: use a dedicated Splunk heavy forwarder to perform monitoring and forward events to the indexer(s).
• Large scale: use multiple dedicated Splunk forwarders and partition the monitors among them.

Search head pooling and Splunk DB Connect

Splunk DB Connect does not currently support Splunk's search head pooling functionality. To run DB Connect, you must install it on a standalone Splunk instance or on a separate search head that is not a member of a pool.

Performance considerations

Because Splunk DB Connect queries your database, there is a possibility that your queries may impact that database's performance. In particular, if the initial run of your query to the database retrieves a lot of data, this may affect the performance of your database. Subsequent runs of the query should generally be less impactful, as they are only retrieving data that is new since the previous run of the query. To mitigate this, you can set the tail.follow.only directive, which is only exposed in inputs.conf.

Lookups generate multiple selects that should be within the expected workload for a database and should not affect performance. Splunk DB Connect will execute a separate SELECT statement for each unique combination of input fields. This may happen more than once per search, because the search preview function in Splunk may invoke the lookup multiple times during execution of a search for parts of the results. Splunk will not cache the results between invocations of the lookup.
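To make the "one SELECT per unique combination of input fields" behavior concrete, the following Python sketch counts how many statements a single lookup invocation would need for a batch of events. It is an illustration only, not DB Connect's implementation; the function name, the sample events, and the field names are hypothetical.

```python
def unique_lookup_keys(events, input_fields):
    """Return the distinct combinations of input-field values, in first-seen order."""
    seen = set()
    keys = []
    for event in events:
        key = tuple(event.get(field) for field in input_fields)
        if key not in seen:
            seen.add(key)
            keys.append(key)
    return keys


# Hypothetical events; only clientip is used as a lookup input field.
events = [
    {"clientip": "10.0.0.1", "status": 200},
    {"clientip": "10.0.0.2", "status": 404},
    {"clientip": "10.0.0.1", "status": 500},
]

keys = unique_lookup_keys(events, ["clientip"])
print(len(keys), "SELECT statements for", len(events), "events")
```

Because results are not cached between invocations, each preview pass over partial results would repeat the statements for the keys it sees.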


Install Splunk DB Connect

Install and configure Splunk DB Connect

This procedure describes how to install and configure Splunk DB Connect. It assumes that you have an existing Splunk instance to use as the underlying platform. For information on installing Splunk, refer to "Before you install" in the core Splunk platform documentation.

Install Splunk DB Connect

The easiest way to install Splunk DB Connect is to use Splunk Manager. To do this:

1. Download Splunk DB Connect from Splunkbase and save it to a location accessible from your Splunk instance.

2. Log into your Splunk instance, navigate to Manager > Apps and click Install app from file.

3. Select the app package dbx-....tar.gz and upload it.

4. When the upload is complete, follow the instructions to restart Splunk.

Upgrade from a previous version

Upgrading from an earlier version of Splunk DB Connect is similar to installing it from scratch:

1. Download the latest Splunk DB Connect from Splunkbase.

2. Log into your Splunk instance, navigate to Manager > Apps and click Install app from file.

3. Select the app package and check the box to upgrade it.

4. When the upgrade is complete, follow the instructions to restart Splunk.


Configure Splunk DB Connect

UI Setup

To complete the app setup from the UI, navigate to the app homepage. You will be presented with a setup page with some pre-populated values. These default values should be appropriate for most use cases.

Note: You must click the Save button at least once because this will enable the JBridge server (you can verify that this is the case by checking that the scripted input jbridge_server.py is enabled).

Command Line Setup

As an alternative, you can set up DB Connect by hand without the setup page. Here are the steps to do so:

1. Create $SPLUNK_HOME/etc/apps/dbx/local/app.conf

[install]
is_configured = 1

2. Create $SPLUNK_HOME/etc/apps/dbx/local/java.conf

[java]
home = <JAVA_HOME path here>

Note: JAVA_HOME is usually the path up to but not including the bin directory where the java binary lives.

3. Enable the Java Bridge server (scripted input) in $SPLUNK_HOME/etc/apps/dbx/local/inputs.conf

[script://$SPLUNK_HOME/etc/apps/dbx/bin/jbridge_server.py]
disabled = 0

4. Create the sink for database inputs in $SPLUNK_HOME/etc/apps/dbx/local/inputs.conf

[batch://$SPLUNK_HOME/var/spool/dbmon/*.dbmonevt]
crcSalt = <SOURCE>
disabled = 0
move_policy = sinkhole
sourcetype = dbmon:spool

5. Restart Splunk

Advanced Setup Options

The following information can be used in conjunction with the config files to make advanced configuration changes.

Java

• Java Installation directory (JAVA_HOME): This must be the directory of your Java JRE (Runtime Environment). The default value is retrieved from your JAVA_HOME environment variable (if set properly).
• Java command line options: These command line parameters will be used when starting your Java instance. You can specify maximum memory usage, localization, and default file encoding.

Important: An incorrectly formatted value in this field can prevent the extension from starting correctly.

Java Bridge Server

• Address: The IP address of your Java Bridge Server (this will typically be 127.0.0.1 (localhost)).
• Port: The port of your Java Bridge Server. Default is 17865.

Important: There must not be any firewall rules activated for this port.

• Threads: Number of listening threads to work on Java Bridge commands.

Note: Too many or too few threads can slow down the Java Bridge service.

• Turn on debugging: When enabled, the Java Bridge will log debug information to jbridge_client.log.

Important: Enabling debugging will have a negative impact on Splunk DB Connect's performance; do not use this in a production environment.

Logging configuration

• Log level: Logging severity for Splunk DB Connect.
• Logfile: The name of the Splunk DB Connect log file, which is located in $SPLUNK_HOME/var/log/splunk and contains all logs of the Java Bridge server.

Configuration adapter

• Adapter type
• Enable configuration caching

Database Connection Handling

• Factory Type
• Enable connection pooling
• Cache database and table metadata
• Preload database configuration

Database Inputs

• Scheduler Threads
• Output Type
• Default timestamp output format

Database Lookups

• Enable caching of database lookup definitions
• Cache invalidation timeout

Persistence

• Global Store type

Install required database JDBC drivers for MySQL and Oracle if needed

Most of the databases in the list in "About Splunk DB Connect" are preconfigured in DB Connect and only require that you add a database connection and define inputs for that database.

However, if you're planning to connect a MySQL or Oracle database to Splunk using Splunk DB Connect, you must download and install the relevant JDBC drivers:

• MySQL
    ♦ http://dev.mysql.com/downloads/connector/j/
    ♦ The archive contains the JDBC driver: mysql-connector-java-*-bin.jar
    ♦ Note: Use version 5.1.7 or newer.
• Oracle JDBC
    ♦ ojdbc6.jar from http://www.oracle.com/technetwork/database/enterprise-edition/jdbc-111060-084321.html

Install the drivers

Once you've downloaded the drivers, copy them into $SPLUNK_HOME/etc/apps/dbx/bin/lib and then restart Splunk.

Adding other databases that aren't in the list?

If you want to add a custom database that is not in the list in "About Splunk DB Connect", refer to "Define and add a new/custom database" for instructions.

Define and add a new/custom database to DB Connect

In addition to the databases predefined and supported in the shipping Splunk DB Connect package (see the list in "About Splunk DB Connect"), you can add connection support for any custom-defined database that has JDBC drivers.

Note: At a minimum, Splunk DB Connect supports querying custom-defined database connections. For some custom database connections, features related to generation of queries may not work. Additionally, depending on the JDBC driver's implementation of database-metadata, the custom dbinfo search command may not work.


Download and install the relevant JDBC driver

Before you can complete the process of adding a new database connection that isn't already included in the shipping Splunk DB Connect package, you must first download that database's Java Database Connectivity (JDBC) driver and copy the .jar file to $SPLUNK_HOME/etc/apps/dbx/bin/lib.

Add the custom database to database_types.conf

If you're adding a new (not in the list in "About Splunk DB Connect") database connection, you must define it by creating a stanza for it in a copy of database_types.conf.

Important: Do not edit this file in $SPLUNK_HOME/etc/apps/dbx/default, but instead create and then edit a copy of it in $SPLUNK_HOME/etc/apps/dbx/local. For more information about precedence and Splunk configuration files, refer to "About configuration files" in the core Splunk platform documentation.

Example new database stanza in database_types.conf

[postgresql]
displayName = PostgreSQL
jdbcDriverClass = org.postgresql.Driver
defaultPort = 5432
connectionUrlFormat = jdbc:postgresql://{0}:{1}/{2}
testQuery = SELECT 1 AS test
defaultCatalogName = postgres
defaultSchema = public
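The connectionUrlFormat value uses positional placeholders. As a rough illustration, the Python sketch below fills them in, assuming that {0} is the host, {1} is the port, and {2} is the database name (an inference from the shape of a PostgreSQL JDBC URL, not something this manual states). Python's str.format is used here only as a stand-in for whatever formatting DB Connect performs internally; the function name and the host are hypothetical.

```python
def build_connection_url(url_format, host, port, database):
    """Fill the positional placeholders {0}, {1}, {2} of a connectionUrlFormat value."""
    # The port is converted to a string so it is inserted verbatim.
    return url_format.format(host, str(port), database)


url = build_connection_url(
    "jdbc:postgresql://{0}:{1}/{2}",  # connectionUrlFormat from the example stanza
    "db.example.com",                 # hypothetical host
    5432,                             # defaultPort from the example stanza
    "postgres",                       # defaultCatalogName from the example stanza
)
print(url)
```

Building the URL by hand like this can be a quick way to sanity-check a new stanza before adding a connection that uses it.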

Connection Validation

Every time a connection in the pool is re-used, DB Connect will try to validate that the database connection is actually working. If validation fails, you will probably see an error message like "ValidateObject failed".

There are two ways DB Connect tries to validate a connection:

• If a testQuery is specified in database_types.conf, DB Connect will execute that query, and a response will validate that the database connection is working.
• If testQuery is not specified, DB Connect will try to use the Java method connection.isValid(), and rely on the JDBC driver to answer. Some JDBC drivers do not implement this API call (it seems that Derby is built against Java 1.5 source, where JDBC doesn't have the method isValid). The workaround is to specify a manual testQuery. The simplest one is SELECT 1.

Note: As of 1.0.9, you can disable connection validation by setting validationDisabled=true in database_types.conf.
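The validation order described in this section can be sketched in Python as follows. This is only a model of the decision logic, using the standard sqlite3 module in place of a JDBC connection; the function name and the fallback_check parameter (standing in for the driver's isValid() call) are hypothetical.

```python
import sqlite3


def validate_connection(conn, test_query=None, validation_disabled=False,
                        fallback_check=None):
    """Return True if the connection should be considered usable."""
    if validation_disabled:
        # Models validationDisabled=true: skip validation entirely.
        return True
    if test_query:
        try:
            # Any successful response to the test query counts as valid.
            conn.execute(test_query).fetchall()
            return True
        except sqlite3.Error:
            return False
    # No testQuery: defer to a driver-level check (isValid() in JDBC).
    # If the driver offers none, validation fails in this sketch.
    if fallback_check is None:
        return False
    return bool(fallback_check(conn))


conn = sqlite3.connect(":memory:")
print(validate_connection(conn, test_query="SELECT 1"))
conn.close()
print(validate_connection(conn, test_query="SELECT 1"))
```

The last branch mirrors the case where a driver does not implement the validity check: without a testQuery there is nothing left to validate with, which is why specifying one is the suggested workaround.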

Add the database connection in Manager

Once you've defined the new database, proceed to "Add or manage a database connection" to continue configuration.


Configure and use Splunk DB Connect

Add or manage a database connection

Before working with a database, you have to establish a connection to it. Once a connection is configured, you will be able to use it in queries, lookups, inputs, and outputs.

Note: If you are adding a database connection for a database that is not supported out-of-the-box (not in the list in "About Splunk DB Connect"), follow the steps in "Define and add a new/custom database to DB Connect" before following the steps in this procedure.

Create a new database connection

1. Log into Splunk Manager as a user with the Admin role and navigate to Manager > External Databases (under the Data section of Manager).

2. Click Add New. The "Add new database" panel is displayed.

3. Supply the following information:

• Unique name: Enter a unique name that identifies this new database connection. You will reference the database using this name from Splunk DB Connect commands, lookups, and monitors.
• Database type: The type of database to which you want to connect.
• Hostname or IP address: The hostname or IP address of the database server. For local database types (such as SQLite or ODBC) you can use any value (for example, "localhost") here.
• Port: The TCP port to connect to. You can leave this field empty if you're using the default port of the selected database type or if it is a local database.
• Database name or Oracle SID: You can leave this field empty to connect to the default database, if the selected database type supports this. Press the button to get a list of available database names for the entered connection information. Note: When adding a local database such as SQLite, specify the fully qualified path to the database file. Alternatively, you can place the SQLite file into $SPLUNK_HOME/var/dbx (you might need to create this directory) and name it database_name.sqlitedb; you can then use "database_name" instead of the fully qualified path.
• Username and Password: If the database connection requires a username and password for authentication, provide them here. For Windows users, you can use the following notation in the username field: <DOMAIN>\<USERNAME>. arg.useNTLMv2 = true is implied if you use this notation. You can override this in the config file.
• Read-Only: You can set the database connection to read-only. If this is enabled, Splunk DB Connect will not run any modifying SQL statements against the database. The dbupdate command will not work.
• Validating the database connection information: When this checkbox is enabled, Splunk DB Connect will try to connect to the database before saving the connection information. If the connection doesn't succeed, you will see an error message.

Update or delete database connections

You can update and delete database connections via the same Manager panel. When you make a change, Splunk will automatically reload the database list in the Java Bridge Server (JBS).

Manage database connections via configuration files

You can manage your database connections by editing a copy of the database.conf file (and, if you're adding a custom database that is not supported out-of-the-box, the database_types.conf file).

Important: Do not edit these files in $SPLUNK_HOME/etc/apps/dbx/default, but instead create and then edit copies of them in $SPLUNK_HOME/etc/apps/dbx/local. For more information about precedence and Splunk configuration files, refer to "About configuration files" in the core Splunk platform documentation.

After editing database.conf, you must either restart Splunk (which will restart the JBS as well) or reload Splunk via the following command:

splunk cmd python $SPLUNK_HOME/etc/apps/dbx/bin/reload.py databases

The JBS will pick up the modifications and will automatically encrypt plain passwords in the configuration files.


Configure database inputs

Database inputs allow you to fetch data from databases and index that data with Splunk. Unlike standard inputs, data from database inputs is retrieved on a regular basis (based on a schedule) by the DBmon scheduler.

Note: Because Splunk DB Connect queries your database, there is a possibility that your queries may impact that database's performance. In particular, if the initial run of your tail query to the database retrieves a lot of data, this may affect the performance of your database. Subsequent runs of the query should generally be less impactful, as they are only retrieving data that is new since the previous run of the query.

To add a database input:

1. Log into Splunk Web and navigate to Manager > Data inputs and click Database Inputs, then New.

2. Specify a unique name for your input, and use the information in the following sections to configure your input.

Monitor Types

When configuring a database input, you have the following choices for input types:

Dump

A dump input is the simplest case. It executes the same query every time it runs and outputs all results. If you do not specify an interval in the schedule, it runs only once.

Tail

A tail monitor works similarly to the tail monitor in Splunk's file input functionality. It determines which records in the given table are new and outputs only those. In order for Splunk to know when there is new data in your database, you must provide a column within the monitored table (or within the query result) that has a value that will always be greater than the value in any older record.

Reasonable choices for this tail.rising.column are:

• Auto incremented values (like auto IDs or sequence filled columns)
• Creation or Update timestamps

Scheduling

This setting specifies how often the database inputs are executed. There are three ways to define a schedule:

Automatic

This is the default mode for tail and dump inputs. The database input will automatically select the delay between the executions of the data retrieval query depending on the quantity of results it produces and how long it takes to execute the query. It will execute more frequently on tables with many new records in every run.

To use this mode, select auto.

If this schedule type is used, the database input query runs the first time immediately after startup and will then determine the delay for subsequent executions.

Fixed delay

This is a static value: the amount of time the database input query should wait between executions. This can be expressed as a relative time expression or as a number of seconds.

Examples:

• 1h (a fixed delay of 1 hour)
• 3 (3 seconds)

If this schedule type is used, the input query will run immediately after startup for the first time and will then use the specified delay between subsequent executions.
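As a rough illustration of the two notations, the Python sketch below converts a fixed-delay value to seconds. The set of unit suffixes (s, m, h, d) is an assumption made for this example; this manual only shows "1h" and a bare number of seconds, and the function name is hypothetical.

```python
import re

# Assumed unit suffixes; only "h" and bare seconds appear in the examples above.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def delay_to_seconds(value):
    """Convert a fixed-delay setting such as "1h" or "3" to a number of seconds."""
    match = re.fullmatch(r"\s*(\d+)\s*([smhd]?)\s*", value)
    if match is None:
        raise ValueError("unsupported delay expression: %r" % value)
    amount, unit = match.groups()
    # A bare number has no unit and is treated as seconds.
    return int(amount) * UNIT_SECONDS.get(unit, 1)


print(delay_to_seconds("1h"), delay_to_seconds("3"))
```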

Cron expression

This schedule type allows you to specify a cron expression for when the monitor gets executed.

Examples:

0/5 * * * * (Every 5 minutes)
30 18 * MON-FRI * (Every weekday at 6.30 pm)

Note: Database inputs do not keep track of executions that are missed when they are not running (for example, when Splunk is stopped).

Query generation

You can specify just the table name or view you want to monitor and let the database input generate a SQL query for you, or you can specify the SQL query yourself, which provides flexibility and allows you to use certain conditions, join other tables, and so on.

To specify a custom query, place the where clause in double curly braces {{...}}. The literal $rising_column$ will be replaced with the name specified in the rising column setting. The literal ? will be substituted with the checkpoint value. Example: SELECT * FROM my_table {{WHERE $rising_column$ > ?}}

• For the initial run (that is, when there's no checkpoint state for the input yet), DB Connect will execute the query without the part within the curly braces {{...}}.
• For all subsequent runs, DB Connect will execute the query with the part in the curly braces.
• When your rising column is a date, make sure you wrap the checkpoint parameter in a to_date, such as: {{AND $rising_column$ > to_date(?,'YYYY-MM-DD"T"HH:MI:SS')}}. The format you use must be the same as the format that you selected.
• For Oracle, make sure you put the name of the rising column in uppercase.
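The template rules above can be sketched in Python as follows. This is a minimal model, not DB Connect's own code: the function name is hypothetical, an in-memory SQLite table stands in for the monitored database, and the ? is passed to the driver as a bind parameter, which is one plausible reading of "substituted with the checkpoint value".

```python
import re
import sqlite3

# Matches the optional part of the template: {{ ... }}
OPTIONAL_CLAUSE = re.compile(r"\{\{(.*?)\}\}", re.DOTALL)


def render_tail_query(template, rising_column, has_checkpoint):
    """Expand a tail-input query template into SQL.

    Without a checkpoint the {{...}} part is dropped; with one it is kept.
    $rising_column$ is replaced by the configured column name, and "?" is
    left in place as a bind parameter for the checkpoint value.
    """
    if has_checkpoint:
        sql = OPTIONAL_CLAUSE.sub(lambda m: m.group(1), template)
    else:
        sql = OPTIONAL_CLAUSE.sub("", template)
    return sql.replace("$rising_column$", rising_column).strip()


template = "SELECT * FROM my_table {{WHERE $rising_column$ > ?}}"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, msg TEXT)")
conn.executemany("INSERT INTO my_table (msg) VALUES (?)", [("a",), ("b",)])

# Initial run: no checkpoint yet, so the whole table is read.
first = conn.execute(render_tail_query(template, "id", False)).fetchall()
checkpoint = max(row[0] for row in first)

conn.execute("INSERT INTO my_table (msg) VALUES ('c')")

# Later run: only rows above the checkpoint are returned.
later = conn.execute(render_tail_query(template, "id", True), (checkpoint,)).fetchall()
print(first, later)
```

The initial run is the expensive one because nothing filters the result set, which matches the performance note earlier in this manual.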

Output formatting (including timestamps)

These settings determine how the results are converted into a text-based format Splunk can index.

Formats

• Key-Value based
• Multiline Key-Value based
• CSV
• Template based
• Timestamp output

Output Timestamps

You want to make sure to get this right, otherwise your line merging may notwork as desired and multiple events may get indexed as a single clump.

You can either enable or disable the output of a timestamp value. If enabled, theevent is prefixed with the timestamp value. If you specify a timestamp column,the timestamp value is fetched from the given column of each database resultrow. Otherwise the current time is used.

Splunk DB connect expects the timestamp column in your database to be of typedatetime/timestamp. If it is not (for example, it is in format char/varchar/etc.), youmust check the Output timestamp box and specify theoutput.timestamp.parse.format so that DB Connect can obey the timestampoutput format setting.

For example, if the database column EVENT_TIME contains strings (for example, CHAR, VARCHAR, or VARCHAR2) with values like 01/26/2013 03:03:25.255, then you must specify the parse format in the appropriate copy of inputs.conf:

output.timestamp = true
output.timestamp.column = EVENT_TIME
output.timestamp.parse.format = MM/dd/yyyy HH:mm:ss.SSS
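One way to sanity-check a parse pattern before committing it to inputs.conf is to parse a sample value by hand. The sketch below uses Python's strptime, which only approximates Java's SimpleDateFormat; the mapping of pattern letters shown in PY_FORMAT is an assumption made for illustration, and DB Connect itself evaluates the Java pattern.

```python
from datetime import datetime

# Sample value from the text above.
raw = "01/26/2013 03:03:25.255"

# Assumed strptime counterpart of the Java pattern MM/dd/yyyy HH:mm:ss.SSS.
# %f accepts 1 to 6 fractional digits, so ".255" is read as 255 milliseconds.
PY_FORMAT = "%m/%d/%Y %H:%M:%S.%f"

parsed = datetime.strptime(raw, PY_FORMAT)
print(parsed.isoformat())
```

If the sample value does not match the pattern, strptime raises ValueError, which is a quick signal that the month/day order or the fractional-seconds part of the pattern needs another look.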

Seeing incorrectly merged or split events? Check out "Issues with bad line breaking/line merging" in the Troubleshooting topic later in this manual for more information.

Working with Splunk DB Connect and custom source types

If the data from your database is not in a common format, you may want to create a custom source type to tell Splunk how to handle it. For more information about source types, refer to "Why source types matter (a lot)" in the core Splunk product documentation.

To create a custom source type, you must create line-breaking/timestamping settings manually in the appropriate copy of props.conf. In most cases, you can just copy the settings from the corresponding default source type. Here's what you need for the various output formats available in DB Connect:

Key-Value:

SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)

Multiline Key-Value:

KV_MODE = none
REPORT-mkv = dbx-mkv
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]---91827349873-dbx-end-of-event---[\r\n])
LINE_BREAKER_LOOKBEHIND = 10000

Template:

SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]---91827349873-dbx-end-of-event---[\r\n])
LINE_BREAKER_LOOKBEHIND = 10000

CSV:

SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
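To see what the end-of-event marker in the Multiline Key-Value and Template settings does, the following Python sketch splits a small piece of made-up output on the same pattern. The sample text is invented for illustration and may not match the exact layout DB Connect writes; Python's re.split is also only a stand-in for Splunk's own line breaker.

```python
import re

# Same pattern as in the Multiline Key-Value and Template stanzas, without the
# capturing group so re.split() returns only the event texts.
END_OF_EVENT = re.compile(r"[\r\n]---91827349873-dbx-end-of-event---[\r\n]")

# Made-up spool content: two multiline events, each followed by the marker.
raw = (
    "id=1\nname=alpha\n---91827349873-dbx-end-of-event---\n"
    "id=2\nname=beta\n---91827349873-dbx-end-of-event---\n"
)

# Drop the empty chunk left after the final marker.
events = [chunk for chunk in END_OF_EVENT.split(raw) if chunk.strip()]
print(events)
```

Because the marker, not a plain newline, separates events, each multiline record stays together as a single event. A custom source type that omits these settings would fall back to other line-breaking behavior.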

Set up a database lookup table

Splunk DB Connect allows you to define a lookup table that uses an external database as its source. Refer to "About lookups and field actions" in the core Splunk platform documentation for more information.

To set up a database lookup table:

1. Log in to Splunk Web, navigate to Manager > Lookups > Database Lookups, and click Add new.

2. Specify a unique name for the lookup and enter the database and table to use for the lookup.

3. At this point you have two main options:

• Specify fields directly: bring in all columns from the table using the Fill all columns button or specify fields to be used in the lookup. You can fill all the columns and then use the Delete link next to each field to trim the list.

• Use a specific SQL query to pull the data in: select the Configure advanced Database lookup settings checkbox, then define a SQL query and use $input_field$ as a placeholder for each input field value.

4. Click Save.

A corresponding scripted lookup definition is created and you can use it within Splunk as though it were a regular lookup by using the | lookup command.

You can also configure an automatic lookup. For information about automatic lookups, refer to the automatic lookups topic in the core Splunk product documentation.

Create a lookup manually via dblookup.conf

You can create a lookup manually through the dblookup.conf file. This is useful if you have a table with many columns that would be cumbersome to select using Manager. However, you must also create the lookup definition manually in transforms.conf with external_cmd = dblookup.py <name from dblookup.conf>

By default, only one result is fetched from the database for each lookup input row. If you want to return more than one row, change max_matches in dblookup.conf.

Lookups and Splunk DB Connect in a distributed environment

Some constraints exist when running DB Connect in a distributed Splunk environment and using lookups:

• If you are running DB Connect in a distributed environment, you must use the local=1 option in your lookup command, like this: | lookup local=1 .

• Automatic lookups are not supported.

Use the Splunk DB Connect search commands

Splunk DB Connect provides custom search commands to query data in your databases.


dbquery

The database query command allows you to execute any query against your databases and returns the rows as Splunk results. You can work with those results as you can with any other Splunk results (for example, results from searches). It is similar to the core Splunk platform inputlookup command.

Usage

| dbquery database "sql" [limit=limit]

where:

• database is the external database as specified in database.conf
• sql is the SQL query statement to execute
• limit is optional and allows you to specify the maximum count of results that should get returned

Example

| dbquery ASSET_DB "SELECT id, name, ip_address, owner, last_update FROM hosts WHERE active = 1"

dbinfo

The dbinfo command fetches schema information from the database.

Usage

dbinfo database=<database-spec> [table=<table-spec>] (tables|columns)

dboutput (beta feature)

This functionality is in beta and is currently not officially supported.

The dboutput command allows you to insert or update data in your database. Use with caution, as this will actually write data to and change the contents of your database.

This command is currently limited to outputting 50,000 search results.


Usage

dboutput type=<insert|update> database=<database> table=<table> [key=<key_field>] [fields?]

Note: When using type=update, you must specify a key field/column.

Troubleshoot Splunk DB Connect

This topic contains information about troubleshooting common issues with Splunk DB Connect.

Answers

Have questions? In addition to the common troubleshooting tips listed in this topic, you can visit Splunk Answers and see what questions and answers the Splunk community has about using Splunk DB Connect.

Java Bridge Server not running

A status error indicating that the Java Bridge Server is not running, together with errors in dbx.log relating to a failed REST keep-alive, typically occurs when Splunk DB Connect is running in a VM that has been suspended and then restarted. If you must suspend and restart the VM that Splunk DB Connect is running in, you must restart Splunk after the VM wakes.

Input not updating

For a dbmon-tail input, check the latest checkpoint value, which is stored in $SPLUNK_DB/persistentstorage/dbx ($SPLUNK_DB is, if not specified otherwise, $SPLUNK_HOME/var/lib/splunk). Each input has its own directory, whose name is a hash of the input name (for example, a 32-character hex string). This directory typically contains two files:

• manifest.properties: contains meta-information, such as the name of the input

• state.xml: contains the actual state in XML format

You must first identify the state directory and then you can inspect the XML file.
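Because the directory names are hashes, finding the right one by eye is tedious. The following Python sketch scans each directory's manifest.properties for the input name. The helper find_state_dirs is hypothetical, and it assumes the manifest stores the input name as plain text; since the exact key is not documented here, it does a simple substring search.

```python
import os


def find_state_dirs(dbx_store, input_name):
    """Return state directories whose manifest.properties mentions input_name.

    Assumption: the manifest stores the input name as plain text; the exact
    key is not documented here, so this does a simple substring search.
    """
    matches = []
    for entry in sorted(os.listdir(dbx_store)):
        manifest = os.path.join(dbx_store, entry, "manifest.properties")
        if not os.path.isfile(manifest):
            continue  # skip anything that is not an input state directory
        with open(manifest, encoding="utf-8", errors="replace") as handle:
            if input_name in handle.read():
                matches.append(os.path.join(dbx_store, entry))
    return matches
```

You would call it with the path to $SPLUNK_DB/persistentstorage/dbx on your system and the name of the input you are investigating, then open state.xml in the directory it returns.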


This state file looks like this:

<list>
  <value key="latest.record_update">
    <value class="sql-timestamp">2012-12-07 04:22:25.703</value>
  </value>
</list>
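If you need to check many inputs, reading the checkpoint programmatically can be quicker than opening each file. The sketch below parses the sample state shown above with Python's standard XML library. The helper read_checkpoints is hypothetical, and it assumes every state file follows the same nested value layout as the sample.

```python
import xml.etree.ElementTree as ET

# Contents copied from the sample state file above.
STATE_XML = """
<list>
  <value key="latest.record_update">
    <value class="sql-timestamp">2012-12-07 04:22:25.703</value>
  </value>
</list>
"""


def read_checkpoints(xml_text):
    """Map each checkpoint key to (type, value) from a state.xml document."""
    root = ET.fromstring(xml_text.strip())
    checkpoints = {}
    for entry in root.findall("value"):
        inner = entry.find("value")
        if inner is not None:
            checkpoints[entry.get("key")] = (inner.get("class"), inner.text)
    return checkpoints


checkpoints = read_checkpoints(STATE_XML)
print(checkpoints)
```

If the stored value never advances between runs, the input is probably not finding rows with a higher rising column value.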

Error creating PersistentValueStore

If you see the following error in the jbridge.log:

ERROR Java process returned error code 1! Error: Initializing Splunk context...
Environment: SplunkEnvironment{SPLUNK_HOME=/opt/splunk,SPLUNK_DB=/opt/splunk/var/lib/splunk}
Configuring Log4j...
[Fatal Error] :1:1: Premature end of file.
Exception in thread "main" com.splunk.config.SplunkConfigurationException: Error creating PersistentValueStore type xstream: com.thoughtworks.xstream.io.StreamException: : Premature end of file.

you may have inadvertently corrupted your persistent store file. To resolve the problem, remove $SPLUNK_DB/persistentstorage/dbx/global recursively.

Issues with bad line breaking/line merging

Splunk relies on certain heuristics for line breaking. Normally, log file data has a timestamp for each event, which Splunk handles well. If you have timestamps in your database rows, then you shouldn't have line breaking issues. Just be sure to enable Output timestamp and specify the column that contains the timestamp as the timestamp column.

If you don't have timestamps in your db rows

If you don't have timestamps in your database rows, you have two options:

• Click Output timestamp and leave the timestamp column blank. Splunk will output the current time when indexing.

• Use the default sourcetype in the input config. Leave it blank and Splunk DB Connect will use dbmon:kv as the sourcetype (in the normal case where you're using the key-value output format). However, if you put something custom in the sourcetype field, you should then tell Splunk how to linebreak for that sourcetype. Just copy over the props.conf settings for the default stanzas; specifically, add "SHOULD_LINEMERGE = false".

If your timestamp is not of type datetime/timestamp

Splunk DB Connect expects the timestamp column in your database to be of type datetime/timestamp. If it is not (for example, it is of type char, varchar, and so on), you must check the Output timestamp box and specify output.timestamp.parse.format so that DB Connect can parse the column value and obey the timestamp output format setting.

For example, if the database column EVENT_TIME contains strings (for example, CHAR, VARCHAR, or VARCHAR2) with values like 01/26/2013 03:03:25.255, then you must specify the parse format in the appropriate copy of inputs.conf:

output.timestamp = true
output.timestamp.column = EVENT_TIME
output.timestamp.parse.format = MM/dd/yyyy HH:mm:ss.SSS


Configuration file reference


Splunk DB Connect includes several custom configuration files. The following are the spec files associated with them:

• database.conf.spec
• database_types.conf.spec
• dblookup.conf.spec
• dboutput.conf.spec
• java.conf.spec

Splunk DB Connect also includes a custom set of input fields, so there is an additional inputs.conf.spec file to describe them.

The most current versions of these spec files are located in $SPLUNK_HOME/etc/apps/dbx/README.

database.conf.spec

# Copyright (C) 2005-2012 Splunk Inc. All Rights Reserved.
# The file contains the configured database connections

[<name>]

host = <string>
* The IP address or the hostname of the database.

port = <integer>
* The port number of the database. If omitted the default port number for the given database type is used.

username = <string>
* The username which is used for authenticating against the database.

password = <string>
* The password which is used for authenticating against the database. It will be automatically encrypted if it is set in
* clear-text.

database = <string>
* The database name or SID.

type = <database_type>
* The database type. References a stanza in database_types.conf

readonly = true|false
* Whether the database connection is read-only. If it is readonly, any modifying SQL statement will be blocked

database.sid = true|false
* Only applies to Oracle database connections (ie. type=oracle). Set to *true* if the Oracle database is only reachable
* using an SID. By default the service name format is used.

default.schema = <string>
* Sets the default schema for the database connection if the database type supports it (Currently only Oracle supports
* it).

testQuery = <string>
* Supply a specific test query for validating connections to this database
* If defined it overrides the testQuery of the database type (see database_types.conf)

validationDisabled = [true|false]
* Turn off connection validation for this database connection
* If defined it overrides the validationDisabled of the database type (see database_types.conf)
* Caution: disabling validation can lead to unpredictable results when using it with connection pooling

database_types.conf.spec

# @copyright@
# This file contains the database type definitions

[<name>]

displayName = <string>
* A descriptive display name for the database type.

typeClass = <string>
* The FQCN (fully qualified class-name) of a class implementing the com.splunk.dbx.sql.type.DatabaseType interface.

jdbcDriverClass = <string>
* The FQCN of the JDBC Driver class. Only used when no typeClass is specified.

defaultPort = <integer>
* The default TCP port for the database type. Only used when no typeClass is specified.

connectionUrlFormat = <string>
* The JDBC URL as a MessageFormat string. The following values will be replaced:
* {0} the database host
* {1} the database port (the port specified in database.conf or the default port)
* {2} the database name/catalog or SID
* Only used when no typeClass is specified.

testQuery = <string>
* A simple SQL that is used to validate the database connection. Only used when no typeClass is specified.

supportsParameterMetaData = [true|false]
* Whether the given JDBC driver supports metadata for java.sql.PreparedStatement.
* Only used when no typeClass is specified.

quoteChars = <string>
* Override the quote characters for the database type. If not specified the default ANSI-SQL quote characters will be used.
* Only used when no typeClass is specified.

defaultCatalogName = <string>
* Configure the default catalog name for a generic database type. Used for querying the catalog names (ie. databases)

local = true|false
* This flag marks a database type as local (ie. it is accessed via the filesystem instead of TCP)

defaultSchema = <string>
* Set the default schema prefix for the database type (defaults to null)

streamingFetchSize = <n>
* Number of results to be fetched at a time when streaming is enabled for a JDBC statement.

streamingAutoCommit = [true|false]
* Turn auto-commit on or off for java.sql.Connection instances in streaming mode

validationDisabled = [true|false]
* Turn off connection validation for database connections of this type
* Defaults to false
* Caution: this can lead to unpredictable results when using this with connection pooling

dblookup.conf.spec

# Copyright (C) 2005-2012 Splunk Inc. All Rights Reserved.
# This file contains the configured database lookup definitions

[<name>]

database = <database>
* The database. References a stanza in database.conf

table = <string>
* The database table name. Only used in simple mode (advanced = 0).

fields = <csv-list>
* A list of fields/columns for the lookup
* It is possible to simply specify the field or the field and the column in the form: <field> as <sql-column>

advanced = [1|0]
* Whether to perform a simple lookup against the table or use a custom SQL query

query = <string>
* A SQL query template. Expressions in the form of $fieldname$ are replaced with the input provided by splunk.

input_fields = <csv-list>
* List of fields/columns used as input for the SQL query template

max_matches = <n>
* Maximum number of results fetched from the database for each lookup input row
* Defaults to 1

dboutput.conf.spec

# Copyright (C) 2005-2012 Splunk Inc. All Rights Reserved.

[<name>]

database = <string>
* The database to use (references the database.conf stanza)

table = <string>
* The table to update

mode = insert|update
* for the simple mode

fields = <string>
* the fields used for the update in the form of <field_name> [AS <column_name>]. You can use * as a wildcard here.

key = <string>[ AS <string>]
* Only applies to mode=update. The key columns to use for the SQL UPDATE statement.

not.found = insert|ignore|fail
* Only applies to mode=update. If there are no records to update (ie. no matching
* key value has been found for the result), this option defines how dboutput
* should behave in that case.
* - insert: dboutput should insert the result instead
* - ignore: do nothing
* - fail: Rollback the changes (if possible) and fail the execution

sql = <string>

advanced = true|false
* true to use the predefined SQL statement, false to use the table and automatically generate the SQL
* Defaults to false.

sql.update = <string>

sql.insert = <string>

java.conf.spec

# @copyright@
# The master configuration file. Global settings for Java and Splunk DB Connect are
# configured in here.

################## Java settings ##################
[java]

home = <path>
* Path to the Java JRE or JDK installation directory

options = <string>
* Arbitrary Java command line options
* For example memory settings or system properties

[bridge]

addr = <bind address>
* The address/interface the Java Bridge server should listen for
* connections on.
* In most cases only 127.0.0.1 makes sense.

port = <bind port>
* The port the Java Bridge server should listen for connections on.

threads = <n>
* The size of the thread pool for Java Bridge command execution.
* This defines the number of commands that can run concurrently

debug = true|false
* Turn on debugging for the Java Bridge client

[logging]

level = INFO|DEBUG|WARN|ERROR|FATAL
* The global logging severity

file = <filename>
* The filename for the Splunk DB Connect logfile which is placed at
* $SPLUNK_HOME/var/log/splunk

console = true|false
* Enable or disable STDOUT output of log events. (only for debugging).

logger.<logger_name> = INFO|DEBUG|WARN|ERROR|FATAL
* Override the global log level for a specific logger

[persistence]

global = <store type>
* The type used for the global persistent store.
* - xstream: Data is stored in XML flatfiles. These files are readable
* and easy to inspect and change
* - jdbm: Data is stored in a btree key-value database. The performance
* is better for big amounts of data but the files are binary.

type.<type_name> = <fqcn>
* A type definition for an implementation of the PersistentValueStore interface

[config]

adapter = <config_adapter>
* A class implementing the com.splunk.config.ConfigurationAdapter interface

cache = true|false
* Enable or disable caching of configuration values

[output]

default.channel = <string>
* A channel is a way to get data into Splunk. Currently this is
* done using the "spool" channel. There will be other options in future.

default.timestamp.format = <string>
* The default format of timestamp values for events generated by DBmon.
* This can be overridden on a per-input definition basis. The format is
* expressed as Java SimpleDateFormat pattern.

type.<type> = <string>
* Allows the registration of an output channel
* (a class implementing com.splunk.output.SplunkOutputChannel)

format.<format> = <string>
* Allows the registration of an output format
* (a class implementing com.splunk.dbx.monitor.output.OutputFormat)

[cache]
default.type = <softref|lru>
cleaner.interval = <relative_time_expression>

[rest]
keep-alive.timeout = <relative_time_expression>

###################### Database settings ######################
[dbx]

database.factory = persistent|default
* The database connection factory to use

database.factory.pooled = true|false
* Enable database pooling

pool.maxActive = <n>
* The maximum number of active database connections

pool.maxIdle = <n>
* The maximum number of idle database connections

cache.tables = true|false
* Turn on caching of table metadata information

cache.tables.size = <n>
* The size of the table metadata cache

cache.tables.invalidation.timeout = <relative_time_expression>
* The amount of time before the cached metadata information of a table is
* considered invalid and fetched again

preload.config = [true|false]
* When enabled, the database factory will fetch and check all configured
* databases on startup. Otherwise they are fetched when they are used for the
* first time.

query.stream.limit = <n>
* Force streaming results for queries with a max. result limit greater than this (default is 10000).
* This setting affects only certain database types.

jdbc.streaming.fetch.size = <n>
* Number of results to be fetched at a time when streaming is enabled for a JDBC statement. It can be overridden on
* a per-database-type basis using the "streamingFetchSize" parameter in database_types.conf.
* This setting affects only certain database types
* Default is 500

[dbmon]

threads = <n>
* The size of the thread pool for database inputs

output.channel = <string>
* The output channel to use
* - spool: Temporary files, that are moved into a file monitor sinkhole
* - rest: Events are uploaded via REST to Splunkd

output.buffer.limit = <file-size-expression>
* Only applies to the spool output channel. The max. size of the temp files, before they are moved to the sinkhole.
* Defaults to 5MB

output.time.limit = <n>
* Only applies to the spool output channel. The time limit for moving files to the sinkhole in milliseconds.
* Defaults to 5000


[dblookup]

cache = true|false
* When set to true, database lookup definitions are cached in memory

cache.size = <n>
* The cache size for database lookups definitions (number of entries)

cache.invalidation.timeout = <relative_time_expression>
* The amount of time before a database lookup definition is considered invalid
* and removed from the cache.

[startup]
init.<n> = <FQCN>

[dboutput]

batch.size = <n>

inputs.conf.spec

# Copyright (C) 2005-2012 Splunk Inc. All Rights Reserved.
# This file contains the database monitor definitions

[dbmon-<type>://<database>/<unique_name>]

interval = <relative time expression>|<cron expression>|auto
* Use to configure the schedule for the given database monitor
* There are 3 different schedule types
* - auto - the scheduler will automatically choose an interval based on the number of generated results
* - fixed delay between runs - The number of milliseconds or a relative time expression
* Examples:
* interval = 5000 (run every 5 seconds)
* interval = 1h (run every hour)
* - a cron expression
* Examples:
* interval = 0/15 * * * * (run every 15 minutes)
* interval = 0 18 * * MON-FRI * (run every weekday at 6pm)

query = <string>
* The query option allows you to define the exact SQL query that is executed against the database

table = <string>
* If no query is specified DBmon will automatically create a SQL query from the given table name (ie. SELECT * FROM <table>)

output.format = [kv|mkv|csv|template]
* The output format to use. The following are available:
* - kv: Simple Key-Value pairs
* - mkv: Multiline Key-Value pairs (ie. each key-value pair will be printed on its own line)
* - csv: CSV formatted events
* - template: allows you to specify the generated events using the <output.template> or <output.template.file> options

output.template = <string>

output.template.file = <string>

output.timestamp = [true|false]
* Controls whether or not the generated event is prefixed with a timestamp value

output.timestamp.column = <string>
* The column of the result set where the timestamp is fetched from. If this is omitted the execution time of the monitor
* will be used as the timestamp value

output.timestamp.format = <string>
* The format of the output timestamp value expressed as a Java SimpleDateFormat pattern.

output.timestamp.parse.format = <string>
* Used for the case that the timestamp in the column defined by <output.timestamp.column> is a string value (ie.
* varchar, nvarchar, etc). It allows you to define the (SimpleDateFormat) pattern to parse the timestamp with.

output.fields = <list>
* The fields that should be printed in the generated event

# A Tail Database monitor will remember the value of a column in the result and will only fetch entries with higher value
# in future executions.
[dbmon-tail://<database>/<unique_name>]

tail.rising.column = <string>
* A column with a value that is always rising. The best option is to use an auto-incremented value or a sequence. A
* creation or last-update timestamp is also a good choice.

tail.follow.only = [true|false]
* This only affects the first execution of the monitor. If this option is set to true (default is false) nothing will be
* indexed at the first run.

[dbmon-dump://<database>/<unique_name>]

[dbmon-change://<database>/<unique_name>]

change.hash.algorithm = MD5|SHA256

[dbmon-batch://<database>/<unique_name>]
