Context Driven Scalable SIEM Solution

Dr. Ertuğrul AKBAŞ

Cyber-attacks have grown exponentially more frequent and sophisticated, demanding near real-

time, highly available, and automated responses to threats. The global cost of cybercrime has

already grown to $100 billion annually [1], not counting the intangible damage to enterprise

and government security. In addition to data loss, security breaches can cause

immeasurable, and sometimes irrevocable, damage to the brand.

Analyzing machine data from firewalls and perimeter devices in real time is vital to thwarting

and predicting threats. Every router, switch, firewall, intrusion prevention system (IPS), web

proxy, or other security element has a story to tell about the confidentiality, integrity, and

availability of the IT environment. Relevant data from across these systems is critical to

investigations as well as for continuous monitoring for situational awareness. However, the real

return on investment for security solutions lies in making them work together to provide a

comprehensive view of the enterprise security posture. This combined and chronological view

of all relevant data allows the security team to prioritize events and responses, and to effectively

engage with IT operations and other areas of the business.

The Methodology

SIEM solutions are usually used for real-time threat monitoring, incident forensics,

demonstrating regulatory compliance, and streamlining IT operations. In most organizations,

these functions are deployed primarily to protect sensitive data. In such

scenarios, a SIEM can be integrated effectively with:

• Application Security Solutions

• DDoS Protection Solutions

• Firewalls

• Secure Mail & Web Gateways

• DLP Systems

• IPS

• Endpoint Security Solutions

• Database Security Systems

• Operating Systems

The methodology presented in this paper is based on the ability to identify and understand the

flow of log streams. Understanding and decoding the log flow is the first step. The output of this step

is a set of categorized event streams such as:

Malicious->DNS->Attack

Compromised->Virus->Attachment->Not Cleaned

Informational->VPN->Tunnel->Failed

In this paper, the terms labeling, categorization, and identification are used interchangeably. This log identification

can be used for scenario-based correlation, but it can also support any number of other

controls.
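The labeling step can be sketched as a lookup against a signature table. This is a minimal illustration only: the regular expressions, the sample log line, and the `label` function are assumptions made for this example and are not SureLog signatures. Only the three category paths come from the text above.

```python
import re

# Hypothetical signature table: (compiled pattern, category path).
# The patterns are invented for illustration; they are not real product signatures.
SIGNATURES = [
    (re.compile(r"dns.*(tunnel|amplification)", re.I),
     ("Malicious", "DNS", "Attack")),
    (re.compile(r"virus.*attachment.*not cleaned", re.I),
     ("Compromised", "Virus", "Attachment", "Not Cleaned")),
    (re.compile(r"vpn.*tunnel.*fail", re.I),
     ("Informational", "VPN", "Tunnel", "Failed")),
]

def label(log_line):
    """Return the category path of the first matching signature, or None."""
    for pattern, category in SIGNATURES:
        if pattern.search(log_line):
            return "->".join(category)
    return None

# Invented log line, used only to exercise the function.
print(label("Jan 10 10:01:02 fw01 VPN tunnel to 10.0.0.5 failed: timeout"))
# prints Informational->VPN->Tunnel->Failed
```

A first-match table keeps the sketch short; a production engine with hundreds of thousands of signatures would presumably need an indexed or compiled matcher instead of a linear scan.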

This technology gives us the power to define human-readable correlation rules such as:

“Visit a website and suddenly make lots of connections”
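One way such a rule could be evaluated over labeled events is sketched below. The category names "Web->Visit" and "Network->Connection", the 60-second window, the threshold of five connections, and the function name are all assumptions chosen for the example, not values taken from the product.

```python
from collections import defaultdict

def burst_after_visit(events, window=60, threshold=5):
    """Return sources that open `threshold` or more connections within
    `window` seconds after a labeled website visit.

    `events` is an iterable of (timestamp_seconds, source, category)
    tuples sorted by timestamp. The category labels are hypothetical.
    """
    visits = defaultdict(list)  # source -> timestamps of website visits
    counts = defaultdict(int)   # (source, visit timestamp) -> connections seen
    flagged = set()
    for ts, src, category in events:
        if category == "Web->Visit":
            visits[src].append(ts)
        elif category == "Network->Connection":
            for visit_ts in visits[src]:
                if 0 <= ts - visit_ts <= window:
                    counts[(src, visit_ts)] += 1
                    if counts[(src, visit_ts)] >= threshold:
                        flagged.add(src)
    return flagged

# Invented events: one host visits a site and then opens six connections.
events = [(0, "10.0.0.7", "Web->Visit")]
events += [(t, "10.0.0.7", "Network->Connection") for t in range(1, 7)]
events += [(3, "10.0.0.8", "Network->Connection")]
events.sort()
print(burst_after_visit(events))  # prints {'10.0.0.7'}
```

Because the rule is written against category labels rather than raw log formats, the same function would apply to events from any device that the labeling step can categorize.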

Once a log or log stream has been labeled, it is no longer just a log: it represents a process

in your network. These labels describe the state of each SIEM-integrated device or application, such as

Application Security Solutions, DDoS Protection Solutions, Firewalls, Secure Mail & Web

Gateways, DLP Systems, IPS, Endpoint Security Solutions, and Database Security Systems.

Some previous work also analyzes log content and proposes classifications such as [2]:

• Authentication and Authorization Reports

• Systems and Data Change Reports

• Network Activity Reports

• Resource Access Reports

• Malware Activity Reports

• Failure and Critical Error Reports

Our taxonomy has nearly 300 categories, including sub-categories.

Once the feeds are incorporated and the best possible coverage has been achieved, the detected

categories are ready for rule definition. Correlation rules can also correlate events via their

taxonomy, allowing the creation of device-independent correlation rules [3].

The Taxonomy Algorithm

No matter the source of the event, or the format it originated in, there are types of system and

network events common across many system types. A security analyst who wants to see all user

logins within a certain time period should not have to know the specific attributes of

each event type on each system type to retrieve that information. SureLog maintains a

taxonomy of event types against which normalized fields can be matched and through which they can be retrieved. Correlation

directives can also correlate events via their taxonomy, allowing the creation of

device-independent correlation rules.

A taxonomy aids in pattern recognition and also improves the scope and stability of correlation

rules. Our comprehensive log taxonomy is then applied to enable cross-device,

cross-infrastructure correlation. This log taxonomy takes into account more than 400,000

distinct signatures to make sure that no matter the device, the message can be categorized.

Signatures are a way to match information in the log streams. Once the data are categorized,

the advanced correlation and alerting intelligence can be applied for prioritization of the logs.

The taxonomy is constructed of high-level, first-tier groups such as Access, Application,

Authentication, DoS, Exploit, Informational, Malware, Policy, Recon, Suspicious Activity,

System, etc. Each first-tier group is then broken down further into sub-groups and even further

as necessary, each lower tier representing more specific event classification. By referring to the

highest level of the Normalized Taxonomy, all lower-tier event classifications in that branch

are included in the selection. This allows the operator to select a more general event group, such

as Authentication, and all sub-group branches (Login, Logout, Password, etc.) and their

children (Admin Login, Database Login, Domain Login, etc.) of the Authentication parent will

also be included in the selection.
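The branch selection described above can be sketched with category paths and prefix matching. The taxonomy fragment below is hypothetical: the node names come from the text, but placing the three login types under Login is an assumption, and the function names are invented for this example.

```python
# Hypothetical taxonomy fragment, written as paths from the first tier down.
# Placing Admin/Database/Domain Login under Login is an assumption.
TAXONOMY = [
    ("Authentication",),
    ("Authentication", "Login"),
    ("Authentication", "Login", "Admin Login"),
    ("Authentication", "Login", "Database Login"),
    ("Authentication", "Login", "Domain Login"),
    ("Authentication", "Logout"),
    ("Authentication", "Password"),
    ("Malware",),
]

def select(node):
    """Return every category at or below the given taxonomy node."""
    node = tuple(node)
    return [path for path in TAXONOMY if path[:len(node)] == node]

def matches(event_category, selected_node):
    """True if an event's category falls under the selected node."""
    selected_node = tuple(selected_node)
    return tuple(event_category)[:len(selected_node)] == selected_node

# Selecting the first-tier group pulls in the whole branch.
for path in select(("Authentication",)):
    print("->".join(path))
```

Selecting ("Authentication", "Login") instead would return only the Login node and its three children, which is how an operator could narrow a rule without naming individual devices.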

Sample Execution:

For a web load-balancing switch, the identification algorithm analyzes the logs

from that log source. To identify an abnormal health status condition, the intelligent key

search looks for ERROR<vrrp>transmit-cannot-receive within the log streams.
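In its simplest form, the key search above amounts to a substring scan over the log stream. The sketch below assumes exactly that; the sample log lines and the function name are invented, and only the key string comes from the text.

```python
HEALTH_KEY = "ERROR<vrrp>transmit-cannot-receive"

def abnormal_health_events(log_lines, key=HEALTH_KEY):
    """Yield (line_number, line) for every log line containing the key."""
    for number, line in enumerate(log_lines, start=1):
        if key in line:
            yield number, line

# Invented log lines; real load balancer output will differ.
sample = [
    "lb01 INFO<vrrp>state-master",
    "lb01 ERROR<vrrp>transmit-cannot-receive",
]

for number, line in abnormal_health_events(sample):
    print(number, line)  # prints 2 lb01 ERROR<vrrp>transmit-cannot-receive
```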

The correlation engine has thousands of signatures for most Application Security

Solutions, DDoS Protection Solutions, Firewalls, Secure Mail & Web Gateways, DLP Systems,

IPS, Endpoint Security Solutions, Database Security Systems, and OSs.

Attack Classification

Classifying attacks against log anonymization is an early step towards a comprehensive study

of the security of anonymization policies. If network owners can select classes of attacks that

they wish to prevent, they can then ensure that their anonymization policies meet their security

constraints, while allowing as much non-private information as possible to be revealed, thus

increasing a data set’s utility.

As described previously, we wish to provide network owners with a taxonomy of attacks, the

classes of which they can select to prevent, rather than having to focus on individual attacks.

We also wish to formally express relationships between attacks, allowing for expression of

attack groupings in a logic about anonymization. This taxonomy must be complete (every

known attack can be placed in at least one class) and mutually exclusive (no attack can be a

member of more than one class). The classes must be fine-grained enough for network owners

to select specific classes without seriously impacting the utility of a log. Finally, the classes

must be tied together in a more concrete way than a description in natural language.
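The completeness and mutual-exclusivity requirements together say that the classes must partition the set of known attacks. A minimal check of that property could look like the sketch below; the attack and class names are placeholders, and the function name is an assumption for this example.

```python
def check_partition(attacks, classes):
    """Check that `classes` (class name -> set of attacks) partitions `attacks`.

    Returns (unclassified, overlapping): attacks that belong to no class,
    and attacks that belong to more than one class. Both sets are empty
    when the taxonomy is complete and mutually exclusive.
    """
    membership = {attack: [] for attack in attacks}
    for name, members in classes.items():
        for attack in members:
            membership.setdefault(attack, []).append(name)
    unclassified = {a for a, names in membership.items() if not names}
    overlapping = {a for a, names in membership.items() if len(names) > 1}
    return unclassified, overlapping

# Placeholder names only; these are not real attack classes.
attacks = {"attack_a", "attack_b", "attack_c"}
classes = {"class_x": {"attack_a", "attack_b"}, "class_y": {"attack_b"}}
print(check_partition(attacks, classes))
# prints ({'attack_c'}, {'attack_b'})
```

Here attack_c violates completeness and attack_b violates mutual exclusivity, so this candidate taxonomy would be rejected on both counts.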

References

1. “The Economic Impact of Cybercrime and Cyber Espionage”, July 2013, Center for Strategic

and International Studies

2. “Top 6 SANS Essential Categories of Log Reports 2013”, v 3.01

3. http://www.anetusa.net/surelog
