Splunk Live University of Alberta 2015
![Page 1: Splunk live university of alberta 2015](https://reader034.fdocuments.in/reader034/viewer/2022042716/55cf0dd1bb61eb6e1b8b459c/html5/thumbnails/1.jpg)
Greg Dostatni
Team Lead, Application Hosting
Splunk at the University of Alberta
Copyright © 2015 Splunk Inc.
![Page 2: Splunk live university of alberta 2015](https://reader034.fdocuments.in/reader034/viewer/2022042716/55cf0dd1bb61eb6e1b8b459c/html5/thumbnails/2.jpg)
2
• At U of A since 2007
• Responsible for 10-person team managing applications and databases university-wide
• Splunk user since 2013
• I’ve eaten BBQ chicken intestines on a stick. Yummy.
• splunk> take the sh out of IT
![Page 3: Splunk live university of alberta 2015](https://reader034.fdocuments.in/reader034/viewer/2022042716/55cf0dd1bb61eb6e1b8b459c/html5/thumbnails/3.jpg)
3
The University of Alberta
• Public research university based in Edmonton and founded in 1908
• 39,000+ students and 18,000 employees
• 5 campuses and 18 faculties
• One of the top 100 universities worldwide
![Page 4: Splunk live university of alberta 2015](https://reader034.fdocuments.in/reader034/viewer/2022042716/55cf0dd1bb61eb6e1b8b459c/html5/thumbnails/4.jpg)
4
IT at the University of Alberta
Central IT group for authentication, wireless and core services
Independent IT groups for most faculties and departments
University-wide initiative to consolidate more of IT
Need to standardize IT operations and tame diverse technology stacks
![Page 5: Splunk live university of alberta 2015](https://reader034.fdocuments.in/reader034/viewer/2022042716/55cf0dd1bb61eb6e1b8b459c/html5/thumbnails/5.jpg)
5
Application Hosting Objectives
• Centralize more of IT
• Build and manage shared environments
• Develop custom services as needed
• Roll out/upgrade applications
• Investigate performance problems
IT
Libraries
LMS
Public website + CMS
Ticketing
Billing systems
Research group servers
Other applications and databases
![Page 6: Splunk live university of alberta 2015](https://reader034.fdocuments.in/reader034/viewer/2022042716/55cf0dd1bb61eb6e1b8b459c/html5/thumbnails/6.jpg)
6
Challenges after Restructuring IT
• More interdependencies among teams
• Massive volume of data, housed in silos
• “Running blind” – no understanding of the data
• Time-consuming to gather data for incidents
![Page 7: Splunk live university of alberta 2015](https://reader034.fdocuments.in/reader034/viewer/2022042716/55cf0dd1bb61eb6e1b8b459c/html5/thumbnails/7.jpg)
7
Splunk Timeline
April 2015
• Funding to rebuild Splunk environment
• New hardware, clustering with dedicated storage
• 400 data sources
• 133 sourcetypes
Sept. 2014
• Management notification of syslog data loss
• Incidents escalated
• Splunk in production?
March 2014
• Data loss concerns from restarting Splunk
• Management relying on Splunk reports
• Splunk not in production
Sept. 2013
• Pilot deployed
• Splunk as syslog target
• Log aggregation test; no need for backup
![Page 8: Splunk live university of alberta 2015](https://reader034.fdocuments.in/reader034/viewer/2022042716/55cf0dd1bb61eb6e1b8b459c/html5/thumbnails/8.jpg)
8
Splunk at the University of Alberta
Infrastructure Applications
(mail, authentication)
Networking and Security
(switches, IPS)
Application Hosting
(apps, databases)
![Page 9: Splunk live university of alberta 2015](https://reader034.fdocuments.in/reader034/viewer/2022042716/55cf0dd1bb61eb6e1b8b459c/html5/thumbnails/9.jpg)
9
Example: Troubleshooting Authentication Systems
Before
• 12 GB/day, 20 machines
• No aggregation
• Reactive issue response based on user feedback
• Manual investigations
• Delay in getting data
After
• Centralized data
• ½ hour to troubleshoot
• Proactive alerts for issues
• Easy access to infrastructure data
• Real-time reporting
![Page 10: Splunk live university of alberta 2015](https://reader034.fdocuments.in/reader034/viewer/2022042716/55cf0dd1bb61eb6e1b8b459c/html5/thumbnails/10.jpg)
10
Example: Performance Monitoring
Track and correlate request response times to gauge user satisfaction
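One simple way to turn response times into a satisfaction signal is to measure, per application, the share of requests answered within a tolerable time. The Python sketch below illustrates that idea only; the sample data, the application names, and the 1-second threshold are assumptions and are not taken from the slide or from the dashboards shown in the talk.

```python
from collections import defaultdict

# Hypothetical (application, response time in seconds) samples.
samples = [
    ("lms", 0.4), ("lms", 0.9), ("lms", 3.2), ("lms", 0.6),
    ("website", 0.2), ("website", 0.3),
]


def satisfied_share(samples, threshold=1.0):
    """Per application, the fraction of requests answered within `threshold` seconds."""
    fast = defaultdict(int)
    total = defaultdict(int)
    for app, seconds in samples:
        total[app] += 1
        if seconds <= threshold:
            fast[app] += 1
    return {app: fast[app] / total[app] for app in total}


print(satisfied_share(samples))  # {'lms': 0.75, 'website': 1.0}
```

Tracking this share over time, rather than a plain average, keeps a handful of very slow requests from hiding behind many fast ones.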
![Page 11: Splunk live university of alberta 2015](https://reader034.fdocuments.in/reader034/viewer/2022042716/55cf0dd1bb61eb6e1b8b459c/html5/thumbnails/11.jpg)
11
Example: First Responders App
Dashboards for initial incident review
![Page 12: Splunk live university of alberta 2015](https://reader034.fdocuments.in/reader034/viewer/2022042716/55cf0dd1bb61eb6e1b8b459c/html5/thumbnails/12.jpg)
12
Example: Proactive Alerts
Trigger alerts on both the count and percentage of messages
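The slide does not show the alert definition itself. As a rough illustration, one reading of "count and percentage" is that an alert fires only when matching messages are both numerous and a meaningful share of all traffic. The Python sketch below assumes that reading; `AlertRule`, `should_alert`, and the threshold values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AlertRule:
    """Hypothetical thresholds; the slide does not give actual values."""
    min_count: int      # absolute number of matching messages
    min_percent: float  # matching messages as a percentage of all messages


def should_alert(matching: int, total: int, rule: AlertRule) -> bool:
    """Fire only when both the count and the percentage thresholds are met."""
    if total == 0:
        return False
    percent = 100.0 * matching / total
    return matching >= rule.min_count and percent >= rule.min_percent


rule = AlertRule(min_count=50, min_percent=5.0)
print(should_alert(matching=60, total=100_000, rule=rule))  # False: 0.06% of traffic
print(should_alert(matching=60, total=1_000, rule=rule))    # True: 6% of traffic
```

Requiring both conditions avoids paging on a few errors during quiet hours and on a large but proportionally normal number of errors during peak load.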
![Page 13: Splunk live university of alberta 2015](https://reader034.fdocuments.in/reader034/viewer/2022042716/55cf0dd1bb61eb6e1b8b459c/html5/thumbnails/13.jpg)
13
Example: Executive Dashboards
![Page 14: Splunk live university of alberta 2015](https://reader034.fdocuments.in/reader034/viewer/2022042716/55cf0dd1bb61eb6e1b8b459c/html5/thumbnails/14.jpg)
14
Splunk Deployment Takeaways
Successes
• Visibility cutting through team boundaries
• More advanced initial incident investigation
• Openness - signed standard IT agreement for access to Splunk data
• Management loves reports
• Defusing situations with rapid access to facts
Challenges
• Accepting syslog data directly
• Log standardization
• Figuring out what to look at in the logs to understand “good” system behavior
![Page 15: Splunk live university of alberta 2015](https://reader034.fdocuments.in/reader034/viewer/2022042716/55cf0dd1bb61eb6e1b8b459c/html5/thumbnails/15.jpg)
15
Aha! Moments
Transactions
• End-to-end monitoring of 4M+ email messages per day (greylisting, spam filtering, Google)
• Used transactions to combine logs across systems into single, message-centric log
• Ability to easily search for anomalies
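The idea of a message-centric log can be sketched outside Splunk as grouping events from several systems by a shared message identifier. The Python below is an illustration under assumptions: the field names (`msg_id`, `system`, `time`), the sample events, and the helper functions are hypothetical and do not reproduce the actual searches used at the University of Alberta.

```python
from collections import defaultdict

# Hypothetical events from three systems; field names are illustrative only.
events = [
    {"time": 3, "system": "google", "msg_id": "A1", "text": "delivered"},
    {"time": 1, "system": "greylist", "msg_id": "A1", "text": "accepted"},
    {"time": 2, "system": "spamfilter", "msg_id": "A1", "text": "clean"},
    {"time": 1, "system": "greylist", "msg_id": "B2", "text": "accepted"},
    {"time": 4, "system": "spamfilter", "msg_id": "B2", "text": "rejected"},
]


def build_transactions(events):
    """Group events by message ID and order each group by time."""
    grouped = defaultdict(list)
    for event in events:
        grouped[event["msg_id"]].append(event)
    return {
        msg_id: sorted(group, key=lambda e: e["time"])
        for msg_id, group in grouped.items()
    }


def incomplete(transactions, final_system="google"):
    """Return IDs of messages that never reached the final system."""
    return sorted(
        msg_id
        for msg_id, group in transactions.items()
        if all(e["system"] != final_system for e in group)
    )


transactions = build_transactions(events)
print([e["system"] for e in transactions["A1"]])  # ['greylist', 'spamfilter', 'google']
print(incomplete(transactions))                   # ['B2']
```

Once each message has a single ordered record, anomalies such as messages that stop partway through the pipeline become a simple filter.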
Generic Alerts
• Created alert to catch errors across systems in real time
• Used existing alert and removed host specification to create the generic alert
• Catches errors that were not in Splunk at the moment the alert was created
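The effect of dropping the host restriction can be illustrated with a small predicate: the same error pattern, with and without a host filter. This Python sketch is an analogy only; `make_alert`, the host names, and the sample event are hypothetical.

```python
def make_alert(pattern, host=None):
    """Return a predicate over events; host=None matches events from any host."""
    def matches(event):
        if host is not None and event["host"] != host:
            return False
        return pattern in event["message"].lower()
    return matches


host_specific = make_alert("error", host="mail01")  # hypothetical original alert
generic = make_alert("error")                       # same alert, host filter removed

# An event from a host that was onboarded after the alerts were written.
new_event = {"host": "db07", "message": "ERROR: connection pool exhausted"}

print(host_specific(new_event))  # False
print(generic(new_event))        # True
```

Because the generic predicate makes no assumption about where an event comes from, it covers sources added later without any change to the alert.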
10-second Query
• 10-second window = ~35,000 events
• Statistics to rank likely events triggering issues
• New Splunk window to analyze unusual messages
• Ability to examine small slice of time in detail while running statistics over longer period of time
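A minimal sketch of the ranking idea, assuming events are reduced to a timestamp and a message type: compare how common each message type is inside the 10-second window with how common it is over the whole period, and rank by that ratio. The scoring formula, the function name, and the sample data are assumptions, not the statistics used in the talk.

```python
from collections import Counter


def rank_unusual(events, window_start, window_len=10, top=3):
    """Rank message types by how over-represented they are in the window
    relative to their share of the whole period."""
    window = Counter()
    overall = Counter()
    for ts, msg_type in events:
        overall[msg_type] += 1
        if window_start <= ts < window_start + window_len:
            window[msg_type] += 1
    window_total = sum(window.values())
    overall_total = sum(overall.values())
    if window_total == 0:
        return []
    scores = []
    for msg_type, n in window.items():
        window_share = n / window_total
        overall_share = overall[msg_type] / overall_total
        scores.append((window_share / overall_share, msg_type))
    scores.sort(reverse=True)
    return [msg_type for _, msg_type in scores[:top]]


# Hypothetical data: steady logins for 100 seconds, a burst of timeouts at 50-54 s.
events = [(t, "login ok") for t in range(100)]
events += [(t, "timeout") for t in range(50, 55)]

print(rank_unusual(events, window_start=50))  # ['timeout', 'login ok']
```

Message types that dominate the window but are rare overall rise to the top, which narrows tens of thousands of events down to a short list worth reading.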
![Page 16: Splunk live university of alberta 2015](https://reader034.fdocuments.in/reader034/viewer/2022042716/55cf0dd1bb61eb6e1b8b459c/html5/thumbnails/16.jpg)
16
“Splunk allows us to erase these lines and any analyst can see all the data from
anywhere and investigate a problem from end to end.”
![Page 17: Splunk live university of alberta 2015](https://reader034.fdocuments.in/reader034/viewer/2022042716/55cf0dd1bb61eb6e1b8b459c/html5/thumbnails/17.jpg)
Thank you