Large Scale Log collection using LogStash & mongoDB
![Page 1: Large Scale Log collection using LogStash & mongoDB](https://reader031.fdocuments.in/reader031/viewer/2022020714/5552c1a5b4c90581158b47f8/html5/thumbnails/1.jpg)
Large scale log collection
Guided by Professor Simon Shim
Team #14
- Gaurav Bhardwaj <009297431>
- Vaibhav Bhor <009313434>
- Sumant Murke <009303879>
- Amod Rege <009259692>
CMPE 283: VIRTUALIZATION TECHNOLOGIES
![Page 2: Large Scale Log collection using LogStash & mongoDB](https://reader031.fdocuments.in/reader031/viewer/2022020714/5552c1a5b4c90581158b47f8/html5/thumbnails/2.jpg)
AGENDA
1. Project Overview
2. Objective
3. Project Part-2
4. Project Part-1 (DRS-DPM)
5. Screenshots
6. Lessons learnt
7. Conclusion
![Page 3: Large Scale Log collection using LogStash & mongoDB](https://reader031.fdocuments.in/reader031/viewer/2022020714/5552c1a5b4c90581158b47f8/html5/thumbnails/3.jpg)
Objective
- Manage and test virtual machines.
- Simulate DRS-DPM functionality.
- Develop a large-scale analysis tool that collects VM as well as host performance data.
- Understand the need to gather and analyze log data.
- Come up with a framework that provides a complete solution for virtual machine log file collection and analysis.
![Page 4: Large Scale Log collection using LogStash & mongoDB](https://reader031.fdocuments.in/reader031/viewer/2022020714/5552c1a5b4c90581158b47f8/html5/thumbnails/4.jpg)
Design
![Page 5: Large Scale Log collection using LogStash & mongoDB](https://reader031.fdocuments.in/reader031/viewer/2022020714/5552c1a5b4c90581158b47f8/html5/thumbnails/5.jpg)
![Page 6: Large Scale Log collection using LogStash & mongoDB](https://reader031.fdocuments.in/reader031/viewer/2022020714/5552c1a5b4c90581158b47f8/html5/thumbnails/6.jpg)
Components
- Agent
- Collector
- Aggregator
- Local storage (mongoDB)
- Central storage (MySQL)
- Visualization
![Page 7: Large Scale Log collection using LogStash & mongoDB](https://reader031.fdocuments.in/reader031/viewer/2022020714/5552c1a5b4c90581158b47f8/html5/thumbnails/7.jpg)
Agent
- Uses the Java VI API to collect system metrics
- Collects host as well as virtual machine stats
- Writes to a text file every 5 seconds
- Takes the following parameters: VM name, vHost name, y/n
  - VM Name: name of the virtual machine it has to monitor
  - vHost Name: name of the vHost it has to monitor
  - y: collect stats for both the vHost and the VM; n: collect only VM stats

java -jar Agent.jar "vHost Name" "VM Name" "y/n"
![Page 8: Large Scale Log collection using LogStash & mongoDB](https://reader031.fdocuments.in/reader031/viewer/2022020714/5552c1a5b4c90581158b47f8/html5/thumbnails/8.jpg)
Agent flow
![Page 9: Large Scale Log collection using LogStash & mongoDB](https://reader031.fdocuments.in/reader031/viewer/2022020714/5552c1a5b4c90581158b47f8/html5/thumbnails/9.jpg)
Parsing file using LogStash
LogStash reads the log file written by the agent. For every append to the log file it detects the change and generates an event, parses each line of the log file, and stores it in mongoDB.
Conf file (logshipper.conf) supplied to LogStash, in outline:
Input {file => "*.log"}
Filter {filter => json}
Output {output => mongoDB}
bin/logstash -f logshipper.conf
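As a rough illustration of the parse step, the sketch below mimics in plain Python what a JSON filter does to each appended line. It is not LogStash code: the function name `parse_log_lines` and the sample field names are assumptions, and it assumes the agent writes one JSON object per line. Lines that fail to parse are kept as raw text and tagged `_jsonparsefailure`, the tag LogStash's json filter applies by default.

```python
import json


def parse_log_lines(lines):
    """Turn raw log lines into event dicts, one per non-empty line."""
    events = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            # Assumption: each line is a JSON object of stats.
            events.append(json.loads(line))
        except json.JSONDecodeError:
            # Keep the raw text so nothing is silently dropped.
            events.append({"message": line, "tags": ["_jsonparsefailure"]})
    return events
```

Each resulting dict is the kind of document that would then be written to the local mongoDB instance.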
![Page 10: Large Scale Log collection using LogStash & mongoDB](https://reader031.fdocuments.in/reader031/viewer/2022020714/5552c1a5b4c90581158b47f8/html5/thumbnails/10.jpg)
Collector
- Takes the IPs of all agents
- Connects to the local storage of each VM
- Pulls data in a round-robin manner
- Clears data from mongoDB after reading
- Stores the data in MySQL
- Uses a configuration file for connection information
- Runs automatically every 5 minutes using crontab

python collector.py "conf file"
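The collector's pull, store and clear cycle can be sketched as follows. This is a minimal sketch, not the project's collector.py: plain Python lists stand in for each VM's mongoDB collection and for the MySQL table, and the name `collect_round_robin` and the sample IPs are assumptions made for illustration.

```python
def collect_round_robin(local_stores, central_store):
    """Visit every agent in turn and move its documents to central storage.

    local_stores: dict mapping agent IP -> list of documents
                  (stand-in for the mongoDB instance on each VM).
    central_store: list receiving the rows (stand-in for MySQL).
    Returns the number of documents moved.
    """
    moved = 0
    for ip, docs in local_stores.items():
        batch = list(docs)  # snapshot of what is there right now
        for doc in batch:
            # Record which agent the row came from.
            central_store.append({"agent": ip, **doc})
        # Remove only what was copied, and only after it was stored.
        del docs[:len(batch)]
        moved += len(batch)
    return moved


if __name__ == "__main__":
    local = {"10.0.0.1": [{"cpu": 1}], "10.0.0.2": [{"cpu": 2}, {"cpu": 3}]}
    central = []
    print(collect_round_robin(local, central), central)
```

Deleting after the write, rather than before, means a failed insert leaves the data in local storage for the next 5-minute run.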
![Page 11: Large Scale Log collection using LogStash & mongoDB](https://reader031.fdocuments.in/reader031/viewer/2022020714/5552c1a5b4c90581158b47f8/html5/thumbnails/11.jpg)
Aggregator & Central DB design
- 24-hour, 1-hour and 5-minute data
- VM and vHost stats
- Schema
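One way to produce the 5-minute, 1-hour and 24-hour views is to bucket samples by time window. The sketch below assumes that the aggregate is a simple average and that samples arrive as `(epoch_seconds, value)` pairs; the slides do not state either, and the function name `rollup` is hypothetical.

```python
from collections import defaultdict


def rollup(samples, bucket_seconds):
    """Average samples per time bucket.

    samples: iterable of (epoch_seconds, value) pairs.
    Returns {bucket_start_epoch: average_value}, ordered by bucket start.
    """
    sums = defaultdict(lambda: [0.0, 0])  # bucket start -> [total, count]
    for ts, value in samples:
        start = ts - ts % bucket_seconds  # floor to the bucket boundary
        sums[start][0] += value
        sums[start][1] += 1
    return {start: total / count
            for start, (total, count) in sorted(sums.items())}
```

Calling `rollup(samples, 300)`, `rollup(samples, 3600)` and `rollup(samples, 86400)` would give the three granularities.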
![Page 12: Large Scale Log collection using LogStash & mongoDB](https://reader031.fdocuments.in/reader031/viewer/2022020714/5552c1a5b4c90581158b47f8/html5/thumbnails/12.jpg)
DRS-DPM (Part-1)
Initialize the environment and get the number of VMs and hosts.
Initialize the standard variables vmCount and hostCount.

    If number of virtual machines is greater than vmCount
        If new machine is powered on
            Move newly added virtual machine to host with minimum load
        End if
    End if
    If number of host machines is greater than hostCount
        If CPU load of new host is less than 30%
            Migrate the virtual machine to host with minimum load
            Power off the host
        End if
        Find the VM with minimum load
        Migrate the virtual machine under new host
    End if

Avoided ping-pong migration
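The placement and power-off steps above can be sketched in Python. This is one reading of the steps, not the project's code: it models a host's load as the sum of its VMs' CPU percentages, covers only new-VM placement and the 30% power-off rule, and all class and function names are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    powered_on: bool = True
    vms: dict = field(default_factory=dict)  # VM name -> CPU load in percent

    @property
    def load(self):
        return sum(self.vms.values())


def least_loaded(hosts, exclude=None):
    """Return the powered-on host with the minimum load, or None."""
    candidates = [h for h in hosts if h.powered_on and h is not exclude]
    return min(candidates, key=lambda h: h.load) if candidates else None


def place_new_vm(hosts, vm_name, vm_load):
    """DRS step: put a newly powered-on VM on the least loaded host."""
    target = least_loaded(hosts)
    target.vms[vm_name] = vm_load
    return target


def power_down_if_idle(hosts, host, threshold=30.0):
    """DPM step: if the host is below the threshold, move its VMs away
    and power it off. Returns True if the host was powered off."""
    if host.load >= threshold or least_loaded(hosts, exclude=host) is None:
        return False
    for vm_name, vm_load in list(host.vms.items()):
        # Re-evaluate the target each time, since loads change as VMs move.
        least_loaded(hosts, exclude=host).vms[vm_name] = vm_load
        del host.vms[vm_name]
    host.powered_on = False
    return True
```

Excluding powered-off hosts from `least_loaded` keeps a VM from being migrated straight back to the host it just left, which is one simple guard against ping-pong migration.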
![Page 13: Large Scale Log collection using LogStash & mongoDB](https://reader031.fdocuments.in/reader031/viewer/2022020714/5552c1a5b4c90581158b47f8/html5/thumbnails/13.jpg)
Is our design good?
- Agents: do not append; they rewrite the file.
- Database (mongoDB): the Collector collects data, stores it in MySQL and removes it from local storage.
- Collector: can connect to as many clients as are specified in the conf file.
- Aggregator: purges the main table.
- Database (MySQL): the Aggregator clears the main table.
- Visualization: the module is totally decoupled from server and storage.
![Page 14: Large Scale Log collection using LogStash & mongoDB](https://reader031.fdocuments.in/reader031/viewer/2022020714/5552c1a5b4c90581158b47f8/html5/thumbnails/14.jpg)
Visualization approach
Library: CanvasJS
- We used canvas.js, a JavaScript library, to plot the graphs.
- We chose canvas.js because it is easy to use and provides different types of visualization.
Data source: MySQL database
- MySQL supplies the data in a structured format, which is then plotted on the graphs.
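To make the hand-off from MySQL to the chart concrete, the sketch below shapes query rows into the `{x, y}` objects that a CanvasJS `dataPoints` array holds. It is an illustration only: the project used JSP for this layer, the row layout `(epoch_seconds, value)` is an assumption, and `to_data_points` is a hypothetical name.

```python
import json


def to_data_points(rows):
    """Convert (epoch_seconds, value) rows into CanvasJS-style data points.

    x is emitted in epoch milliseconds, the unit JavaScript dates use.
    """
    return [{"x": int(ts) * 1000, "y": value} for ts, value in rows]


def to_json(rows):
    """Serialize the data points so a page can load them as JSON."""
    return json.dumps(to_data_points(rows))
```

Keeping this conversion separate from both the query and the page is in line with the decoupled visualization module described earlier.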
![Page 15: Large Scale Log collection using LogStash & mongoDB](https://reader031.fdocuments.in/reader031/viewer/2022020714/5552c1a5b4c90581158b47f8/html5/thumbnails/15.jpg)
Output Graphs
![Page 16: Large Scale Log collection using LogStash & mongoDB](https://reader031.fdocuments.in/reader031/viewer/2022020714/5552c1a5b4c90581158b47f8/html5/thumbnails/16.jpg)
Output Graphs
![Page 17: Large Scale Log collection using LogStash & mongoDB](https://reader031.fdocuments.in/reader031/viewer/2022020714/5552c1a5b4c90581158b47f8/html5/thumbnails/17.jpg)
Output Graphs
![Page 18: Large Scale Log collection using LogStash & mongoDB](https://reader031.fdocuments.in/reader031/viewer/2022020714/5552c1a5b4c90581158b47f8/html5/thumbnails/18.jpg)
Output Graphs
![Page 19: Large Scale Log collection using LogStash & mongoDB](https://reader031.fdocuments.in/reader031/viewer/2022020714/5552c1a5b4c90581158b47f8/html5/thumbnails/19.jpg)
Output Graphs
![Page 20: Large Scale Log collection using LogStash & mongoDB](https://reader031.fdocuments.in/reader031/viewer/2022020714/5552c1a5b4c90581158b47f8/html5/thumbnails/20.jpg)
Tools & Technology
- Agents: Java VI API
- Collectors: Python script automated with crontab
- Log file parsing: LogStash with mongoDB plugin
- Stress API: manually increase CPU, IO and RAM consumption
  stress --cpu 2 --io 1 --vm 1 --vm-bytes 128M --timeout 10s --verbose
- Visualization tools: CanvasJS JavaScript library, JSP & HTML5
- Programming languages: Java, Python, JavaScript
- Utilities: PuTTY, WinSCP
- Database: MySQL, mongoDB
![Page 21: Large Scale Log collection using LogStash & mongoDB](https://reader031.fdocuments.in/reader031/viewer/2022020714/5552c1a5b4c90581158b47f8/html5/thumbnails/21.jpg)
Lessons learnt
- Using the VI Java API
- The concept behind DRS-DPM
- Never clone a vHost
- Not every virtual machine is Linux
- Automation using crontab
- Awareness of ESX log files
- Designing systems
- Working with SQL and NoSQL databases and understanding their usage context
![Page 22: Large Scale Log collection using LogStash & mongoDB](https://reader031.fdocuments.in/reader031/viewer/2022020714/5552c1a5b4c90581158b47f8/html5/thumbnails/22.jpg)
THANK YOU...