A Novel Mind Map Based Approach for Log Data Extraction

Dileepa Jayathilake, Department of Electrical Engineering, University of Moratuwa, Sri Lanka. ICIIS 2011.


Software log file analysis helps immensely in software testing and troubleshooting. The first step in automated log file analysis is extracting log data. This requires decoding the log file syntax and interpreting the data semantics. The expected output of this phase is the extracted data, organized for further processing. Log data extractors can be developed in popular programming languages, each targeting one or a few log file formats. Rather than repeating this process for each log file format, it is desirable to have a generic scheme for interpreting the elements of a log file and filling a data structure suitable for further processing. The new log data extraction scheme introduced in this paper is an attempt to provide the advanced features demanded by modern log file analysis procedures. It is a generic scheme capable of handling both text and binary log files with complex structures and difficult syntax. Its output is a tree filled with the information of interest for the particular case. My speech at ICSCA 2011: http://dileepaj.blogspot.com/2011/07/speech-in-icsca-2011.html

Transcript of A Novel Mind Map Based Approach for Log Data Extraction

Page 1: A Novel Mind Map Based Approach for Log Data Extraction

Dileepa Jayathilake

A Novel Mind Map Based Approach for Log Data Extraction

Department of Electrical Engineering, University of Moratuwa, Sri Lanka

ICIIS 2011

Page 2: A Novel Mind Map Based Approach for Log Data Extraction

AGENDA

Background

Problem Identification

Solution Overview

Solution Design

Implementation

Conclusion

Page 3: A Novel Mind Map Based Approach for Log Data Extraction

BACKGROUND

LOG FILE ANALYSIS

Functional Conformance

Quality Verification

Troubleshooting

System Administrators

Domain Experts

Developers

Testers

Application Logs

Monitoring Tool Logs

Page 4: A Novel Mind Map Based Approach for Log Data Extraction

BACKGROUND

PITFALLS IN MANUAL APPROACH

Require Expertise

Labor Intensive

Error-prone

Advantage of Recurrence not used

Page 5: A Novel Mind Map Based Approach for Log Data Extraction

PROBLEM IDENTIFICATION

CHALLENGES

Challenges

Different log formats & structure

Lack of a common platform

Making rules human & machine readable

Result

Automation abandoned

Proprietary Implementation

Costly

Rules not human readable

Difficult to add new rules

Less resilient to format changes

Reports not customizable

Page 6: A Novel Mind Map Based Approach for Log Data Extraction

PROBLEM IDENTIFICATION

EXISTING SUPPORT

XML

Universal format

Ubiquitous use

Many tools available

Costly meta data

Less human readable

Associated languages are complex

Not every log is XML

Log File Grammars

Formal definitions

Regular expression based

Assume line logs

Fail with complex log file structures

Unable to handle difficult syntax

Distant from XML

Page 7: A Novel Mind Map Based Approach for Log Data Extraction

SOLUTION OVERVIEW

EXPECTATIONS

A GENERIC LOG ANALYSIS FRAMEWORK

Handle arbitrary formats and structures of log files

Resilient to log file format and structure changes

In line with XML

Friendly for non-developers

Ability to generate custom reports

+

A knowledge representation which is both human and machine readable

Page 8: A Novel Mind Map Based Approach for Log Data Extraction

SOLUTION OVERVIEW

Log Files

Interpretation: Unified mechanism for extracting information of interest from both text and binary log files with arbitrary structure and format

Processing: Easy mechanism to build and maintain a rule base for inferences

Presentation: Flexible means for generating custom reports from inferences

Knowledge Representation Schema

Page 9: A Novel Mind Map Based Approach for Log Data Extraction

SOLUTION DESIGN

MIND MAP AS KNOWLEDGE UNIT

MIND MAPS

Resembles human knowledge organization better

Easy to add content

Easy to visualize

Easy access to computers

Tree

Can utilize existing tree algorithms

Easily convertible to XML

Can utilize existing tools

Easy to combine
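The tree properties listed above can be illustrated with a minimal Python sketch. It is not the author's implementation: the `Node` class, the `walk`, `to_xml` and `combine` helpers, and the `node`/`TEXT` XML names are all assumptions chosen for illustration. It shows a mind map held as a plain tree, traversed with an ordinary depth-first algorithm, serialized to XML with the standard library, and combined with another map under a new root.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Node:
    """One mind map node: a text label plus child nodes (hypothetical type)."""
    text: str
    children: list = field(default_factory=list)

    def add(self, text):
        """Append a child with the given text and return it."""
        child = Node(text)
        self.children.append(child)
        return child


def walk(node, depth=0):
    """Ordinary depth-first traversal, yielding (depth, text) pairs."""
    yield depth, node.text
    for child in node.children:
        yield from walk(child, depth + 1)


def to_xml(node):
    """Convert a node and its subtree to an ElementTree element.

    The element name "node" and attribute "TEXT" are assumed names.
    """
    elem = ET.Element("node", {"TEXT": node.text})
    for child in node.children:
        elem.append(to_xml(child))
    return elem


def combine(root_text, *maps):
    """Combine several maps by hanging them under one new root."""
    return Node(root_text, list(maps))


entry = Node("entry")
entry.add("val").add("2.3")
combined = combine("log", entry, Node("summary"))
xml_text = ET.tostring(to_xml(combined), encoding="unicode")
```

Because the map is just a tree, no mind-map-specific machinery is needed for traversal or XML conversion; that is the design point the slide appears to make.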

Page 10: A Novel Mind Map Based Approach for Log Data Extraction

SOLUTION DESIGN

GENERIC INTERPRETATION

Log Files

Interpretation: Unified mechanism for extracting information of interest from both text and binary log files with arbitrary structure and format

Page 11: A Novel Mind Map Based Approach for Log Data Extraction

SOLUTION IMPLEMENTATION

LOG FILE GRAMMAR

Assumes knowledge of file structure and syntax

Able to handle a spectrum of log file types

Based on hierarchical log entries

Log entries identified by attribute combination

Translates a log file into a mind map

Resilient to malformed log files
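A rough Python sketch of two of the points above: translating hierarchical log entries into a tree, and staying resilient when a line is malformed. The toy log format (`BEGIN name`, `key = value`, `END`), the `parse_log` function and the `UNPARSED:` label are all hypothetical; the slides do not specify any of them. The only idea taken from the slide is that bad input should end up in the tree rather than abort the extraction.

```python
def parse_log(lines):
    """Build a nested dict tree from a toy hierarchical log.

    Malformed lines are kept as "UNPARSED: ..." nodes instead of raising.
    """
    root = {"text": "log", "children": []}
    stack = [root]  # stack[-1] is the entry currently being filled
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("BEGIN "):
            # Open a nested entry.
            node = {"text": line[len("BEGIN "):], "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
        elif line == "END" and len(stack) > 1:
            # Close the current entry.
            stack.pop()
        elif "=" in line:
            # Attribute: the key becomes a node, the value its child.
            key, value = line.split("=", 1)
            leaf = {"text": value.strip(), "children": []}
            stack[-1]["children"].append({"text": key.strip(), "children": [leaf]})
        else:
            # Anything else (including a stray END) is recorded, not fatal.
            stack[-1]["children"].append({"text": "UNPARSED: " + line, "children": []})
    return root


tree = parse_log(["BEGIN order", "id = 7", "@@garbage@@", "END", "END"])
```

In this sketch the garbage line and the unmatched second `END` both survive as marked nodes, so a later processing stage can still see that the log was damaged.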

Page 12: A Novel Mind Map Based Approach for Log Data Extraction

SOLUTION IMPLEMENTATION

PARSER

Page 13: A Novel Mind Map Based Approach for Log Data Extraction

SOLUTION IMPLEMENTATION

EXAMPLE

LE ≡ ([A, S, E, S, B], NO)

A ≡ ([A1, A2, A3], NO)

A1 ≡ (‘v’); A2 ≡ (‘a’); A3 ≡ (‘l’)

S ≡ ({SPACE, TAB}, -1, 0, NO)

SPACE ≡ (‘ ‘); TAB ≡ (‘\t’)

E ≡ (‘=’)

B ≡ ({ZERO, ONE, …, NINE, DECIMAL_POINT}, -1, 1)

ZERO ≡ (‘0’); ONE ≡ (‘1’); …; NINE ≡ (‘9’); DECIMAL_POINT ≡ (‘.’)

val = 2.3
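The slide does not define the notation, so the following Python sketch rests on a guessed reading, and every part of it is an assumption: square brackets are taken as an ordered sequence, curly braces as a repeated choice, the two numbers as maximum and minimum repetition counts with -1 meaning unbounded, and a trailing NO as "do not emit this element into the output". The class names `Lit`, `Seq` and `Rep`, the `match` and `extract` functions, and the reading of ZERO … NINE as the ten decimal digits are likewise illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lit:
    """A literal element, e.g. E ≡ ('=')."""
    text: str


@dataclass(frozen=True)
class Seq:
    """An ordered list of elements, e.g. LE ≡ ([A, S, E, S, B], NO)."""
    name: str
    parts: tuple
    emit: bool = True  # assumed meaning of a missing NO flag


@dataclass(frozen=True)
class Rep:
    """A repeated choice, e.g. S ≡ ({SPACE, TAB}, -1, 0, NO)."""
    name: str
    choices: tuple
    max_count: int  # assumed: -1 means no upper bound
    min_count: int
    emit: bool = True


def match(rule, text, pos, out):
    """Match rule at text[pos:]; return the end offset, or None on failure.

    Text matched by rules with emit=True is appended to out as (name, text).
    """
    if isinstance(rule, Lit):
        return pos + len(rule.text) if text.startswith(rule.text, pos) else None
    start, local = pos, []
    if isinstance(rule, Seq):
        for part in rule.parts:
            pos = match(part, text, pos, local)
            if pos is None:
                return None
    else:  # Rep: greedily take the first choice that consumes input
        count = 0
        while rule.max_count < 0 or count < rule.max_count:
            for choice in rule.choices:
                trial = []
                end = match(choice, text, pos, trial)
                if end is not None and end > pos:
                    local.extend(trial)
                    pos, count = end, count + 1
                    break
            else:
                break  # no choice matched; stop repeating
        if count < rule.min_count:
            return None
    if rule.emit:
        out.append((rule.name, text[start:pos]))
    out.extend(local)
    return pos


# The grammar from the slide, under the assumptions stated above.
SPACE, TAB, E = Lit(" "), Lit("\t"), Lit("=")
A = Seq("A", (Lit("v"), Lit("a"), Lit("l")), emit=False)
S = Rep("S", (SPACE, TAB), -1, 0, emit=False)
B = Rep("B", tuple(Lit(c) for c in "0123456789."), -1, 1)
LE = Seq("LE", (A, S, E, S, B), emit=False)


def extract(line):
    """Return emitted (name, text) pairs, or None if the line does not match."""
    out = []
    end = match(LE, line, 0, out)
    return out if end == len(line) else None
```

Under these assumptions, `extract("val = 2.3")` returns `[("B", "2.3")]`: only the value element is emitted, and the optional whitespace around `=` is consumed silently. A line such as `val = x` returns `None`, because B requires at least one digit or decimal point.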

Page 14: A Novel Mind Map Based Approach for Log Data Extraction

SOLUTION IMPLEMENTATION

MICROSOFT SHAREPOINT LOG FILE

Difficult syntax

Page 15: A Novel Mind Map Based Approach for Log Data Extraction

SOLUTION IMPLEMENTATION

MICROSOFT APPLICATION VERIFIER LOG

XML

Page 16: A Novel Mind Map Based Approach for Log Data Extraction

SOLUTION IMPLEMENTATION

TRADING SYSTEM LOG

Corrupted Log

Page 17: A Novel Mind Map Based Approach for Log Data Extraction

CONCLUSION

The new scheme is capable of expressing both text and binary log files with different structures and formats, ranging from flat messages to complex hierarchies.

Page 19: A Novel Mind Map Based Approach for Log Data Extraction

QUESTIONS