Monitoring in Big Data Frameworks @ Big Data Meetup, Timisoara, 2015
Big Data Monitoring Cockpit
Transcript of Big Data Monitoring Cockpit
© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.
Big Data Monitoring Cockpit
Sanjay Chaudhary - HP Software product management
Stefan Bergstein - HP Software R&D
Draft as of 5/13/13
Changing face of data..
Every 60 seconds:
• 695,000 status updates
• 98,000+ tweets
• 698,445 Google searches
• 1,820 TB of data created
• 11 million instant messages
• 168 million+ emails sent
• 217 new mobile web users
Data scale by computing era:
• Mainframe: Kilobytes
• Client/Server: Megabytes
• The Internet: Gigabytes
• Mobile, Social, Big Data & The Cloud: Zettabytes
• Yottabytes
[Figure: logo cloud of vendors, online services, enterprise application types (ERP, CRM, SCM, HCM, PLM), and mobile app categories associated with each era]
.. is changing the infrastructure landscape
Operating System
Hadoop Distributions (Apache, Cloudera, MapR, Hortonworks)
Infrastructure
Meaning Based Analytics
User segmentation
Software testing
Market research
Vertica
Autonomy IDOL
Ad hoc SQL Compliant Analytics
Business Users
Multi-dimensional analysis
Predictive analysis
Geographical analysis
Data Assimilation
Data Consolidation, Aggregation
Transformation into structured data
Unstructured
Click Stream Data
In which stage is your Big Data deployment?
• Prove business value
• Return on investment
• Realize full operational potential
• Production-readiness
• Management of Big Data platforms
• Delivery in production IT
Move beyond POC phase
From experimentation to real production environments
The road ahead
Availability
Security
Robustness
End to end hardening
Data volumes growth
On demand jobs
24x7 support service
You are here!
Stakeholders' expectations
Are these new management challenges?
IT management drives Production-readiness
Automated | End to end | Integrated
HP Solution: Big Data Monitoring Cockpit
Ensure performance and high availability for big-data middleware across network, storage, and compute resources
Fast, dynamic, and cost-effective monitoring
Monitoring scales up and scales out with the big-data platform
Quick and consistent trouble response integrated in existing 24x7 support operations
Visibility for all stakeholders
Keep up with demand in no time
Seamless support
HP Big Data Monitoring Cockpit
• Event dashboard
• Topology & stream-based event correlation
• Performance Graphing
• 3rd party integrations
• Closed-loop incident management
• “Monitoring Automation” - simplified, automated monitoring configuration
• Management Packs for Infrastructure, Oracle, Vertica, and Hadoop
• HP NMC and SE integration
Network | Storage | Compute | Middleware
HP Operations Manager i
OMi capabilities for your big data environment
OMi Management Pack
Real-time visualization of your end-to-end big data environment
Monitor availability & performance of big data applications & infrastructure components
Isolate, diagnose & remediate problems impacting your complex big data environment
Example use case
Hadoop Monitoring at work
Visualize
• Service availability displayed on WatchList (e.g., NameNode, JobTracker)
• Topology view provides the impact of the service unavailability on the entire big data environment
Monitor
• Event dashboard groups events related to a single incident
• Root cause is identified by grouping events into cause & symptom
Remediate
• Detailed instructions guide operator to resolve the incident manually
• Routine incidents can be resolved automatically through orchestration
“As a Hadoop Administrator, I would like to know the availability of the Hadoop Services at any given point of time”
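The administrator's question above (is each Hadoop service available right now?) can be illustrated with a plain TCP reachability probe. This is only a minimal sketch and not how the OMi Management Pack collects availability; the host names and ports below are hypothetical placeholders, and a successful TCP connection shows only that something is listening, not that the service is healthy.

```python
import socket
from typing import Callable, Dict, Tuple

Endpoint = Tuple[str, int]


def tcp_probe(endpoint: Endpoint, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection(endpoint, timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_services(
    services: Dict[str, Endpoint],
    probe: Callable[[Endpoint], bool] = tcp_probe,
) -> Dict[str, bool]:
    """Map each service name to True (reachable) or False (unreachable)."""
    return {name: probe(endpoint) for name, endpoint in services.items()}


# Hypothetical endpoints; replace with the hosts and ports of your own cluster.
SERVICES: Dict[str, Endpoint] = {
    "NameNode": ("namenode.example.com", 8020),
    "JobTracker": ("jobtracker.example.com", 8021),
}
```

Calling `check_services(SERVICES)` returns a dictionary such as a watch list could display. The `probe` parameter is injectable so the logic can be exercised without network access.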
Real time visualization of your end to end big data environment
Visualize
[Figure: topology view of a Hadoop cluster. Labels shown: Cluster, Master, Secondary Master, Slave 1, Slave 2, with JobTracker and NameNode services on the master side and DataNode and TaskTracker services on each slave]
Obtain the health & performance view in a single dashboard
View from the top..
Consolidated Event Dashboard
• Consolidated event management with a single pane of glass
• Events from different monitoring systems (e.g., network, storage, system ..)
• View performance and health of specific nodes
Obtain the health & performance summary of specific Big Data technology element
View from the top..
Hadoop Dashboard
• Real time reflection of topology & health
• Instant information on availability of services, e.g., HDFS, MapReduce
• Notifications on performance & workload
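One common way to reflect health on a topology is to roll up the worst status from the leaves to the top. The sketch below illustrates that idea only; it is an assumption about how such a dashboard could be modelled, not a description of OMi internals. The node names, the `Health` levels, and the reported statuses are all hypothetical.

```python
from enum import IntEnum
from typing import Dict, List


class Health(IntEnum):
    NORMAL = 0
    WARNING = 1
    CRITICAL = 2


def rollup(node: str, children: Dict[str, List[str]], own: Dict[str, Health]) -> Health:
    """Return the worst health among a node and all of its descendants."""
    worst = own.get(node, Health.NORMAL)
    for child in children.get(node, []):
        worst = max(worst, rollup(child, children, own))
    return worst


# Hypothetical topology: parent -> list of children.
TOPOLOGY: Dict[str, List[str]] = {
    "Cluster": ["Master", "Slave 1", "Slave 2"],
    "Master": ["Master/NameNode", "Master/JobTracker"],
    "Slave 1": ["Slave 1/DataNode", "Slave 1/TaskTracker"],
    "Slave 2": ["Slave 2/DataNode", "Slave 2/TaskTracker"],
}

# Hypothetical statuses reported by monitoring; unlisted nodes count as NORMAL.
REPORTED: Dict[str, Health] = {
    "Master/JobTracker": Health.WARNING,
    "Slave 2/DataNode": Health.CRITICAL,
}
```

With these inputs, `rollup("Cluster", TOPOLOGY, REPORTED)` yields `Health.CRITICAL` because one DataNode is critical, while `rollup("Slave 1", TOPOLOGY, REPORTED)` stays `Health.NORMAL`.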
Root cause analysis
.. and drill down specific events
Cause and Symptoms
• Reduce MTTR by identifying cause & symptoms
• Group events related to a single incident
• Adapt to changes in infrastructure automatically
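Grouping events into cause and symptoms can be illustrated with a dependency map: a failing component whose upstream dependencies are all healthy is treated as a cause, and failing components downstream of it are its symptoms. This is a deliberately simplified sketch, not OMi's correlation engine; the dependency map is hypothetical and assumed to be acyclic.

```python
from typing import Dict, List, Set


def upstream(component: str, depends_on: Dict[str, List[str]]) -> Set[str]:
    """Return every component that `component` depends on, directly or indirectly."""
    seen: Set[str] = set()
    stack = list(depends_on.get(component, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(depends_on.get(node, []))
    return seen


def group_events(failing: Set[str], depends_on: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each root-cause component to the sorted list of its symptom components."""
    # A cause is a failing component with no failing upstream dependency.
    causes = {c for c in failing if not (upstream(c, depends_on) & failing)}
    groups: Dict[str, List[str]] = {c: [] for c in causes}
    for component in failing - causes:
        for cause in upstream(component, depends_on) & causes:
            groups[cause].append(component)
    return {cause: sorted(symptoms) for cause, symptoms in groups.items()}


# Hypothetical dependencies: component -> components it relies on.
DEPENDS_ON: Dict[str, List[str]] = {
    "NameNode": [],
    "JobTracker": [],
    "HDFS": ["NameNode"],
    "MapReduce": ["JobTracker", "HDFS"],
}
```

For example, if events arrive for NameNode, HDFS, and MapReduce, `group_events({"NameNode", "HDFS", "MapReduce"}, DEPENDS_ON)` reports NameNode as the single cause with HDFS and MapReduce as symptoms, so the operator works one incident instead of three.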
Guided and automated problem resolution
Remediate
• Automate routine remediation tasks and processes
• Audit compliance through documentation generation and reporting
Guided problem resolution
• Tools: Create tools to help users perform common tasks. For example, you can run a command tool to check the status of an infrastructure element
• Custom actions: Administrators can define a variety of custom actions for the operator to use when resolving certain types of events
• Instructions: Event-specific resolution instructions embedded with the event
Automate resolution: Operation orchestration
Analyze short term trends in your big data environments
Graphs
Performance Graphs
• Analyze real time & historical data together
• Check for performance trends
• Compare multiple big data components simultaneously in the same graph
Analyze long term trends of your big data environments
Reports
Service Health Reports
• Executive reporting: performance highlights, metrics & trends
• Analyze impacts to service delivery
• Consolidation, capacity planning & forecasting
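Capacity forecasting is often approximated by fitting a straight line to historical usage and extrapolating to the capacity limit. The sketch below shows that idea with an ordinary least-squares fit; it is an illustrative assumption, not the method used by Service Health Reports, and the function names and sample figures are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def days_until_full(
    days: Sequence[float], used_tb: Sequence[float], capacity_tb: float
) -> Optional[float]:
    """Estimate days from the last sample until usage reaches capacity.

    Returns None when usage is flat or shrinking, since no date can be projected.
    """
    slope, intercept = fit_line(days, used_tb)
    if slope <= 0:
        return None
    day_full = (capacity_tb - intercept) / slope
    return max(0.0, day_full - days[-1])
```

As a usage example, daily samples of 10, 12, 14, and 16 TB on days 0 to 3 against a 20 TB limit grow by 2 TB per day, so `days_until_full([0, 1, 2, 3], [10, 12, 14, 16], 20)` estimates two more days. A linear fit ignores seasonality and bursts, so it is best read as a rough planning signal.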
Big Data monitoring use cases
Ensure Hadoop jobs are completed in optimal time
• Monitor available M/R slots, slots used, completed and waiting jobs
• Alert on high wait time or failure of jobs
• Visualize map and reduce slot usage trends; understand the impact of failed jobs
• Remediate: Realign the Hadoop cluster for optimal performance
Ensure analytics reports are executed on time on Vertica
• Monitor the resource queues and rejection, lock, and query execution time
• Alert on lock, execution time, long queue, or resource rejection
• Analyze real time & historical data together; check for performance trends
• Remediate: Create a resource pool for high-priority requests
Avoid any data loss on the Vertica cluster
• Monitor and alert on node down, not K-Safe, and critical node state
• Visualize the cluster availability and impact on your service topology
• Report on capacity planning & forecasting
• Remediate: Immediately restore any failed or shut-down node
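The Hadoop use case above (monitor slots and waiting jobs, alert on high wait time or failures) can be sketched as a simple threshold evaluation over one sample of cluster metrics. The field names, thresholds, and alert texts are hypothetical and chosen only for illustration; they are not taken from the HP management pack.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterSample:
    """One hypothetical snapshot of MapReduce scheduling metrics."""
    slots_total: int
    slots_used: int
    waiting_jobs: int
    failed_jobs: int
    max_wait_seconds: float


def evaluate(
    sample: ClusterSample,
    max_wait_threshold: float = 600.0,
    utilization_threshold: float = 0.9,
) -> List[str]:
    """Return alert messages for the sample; an empty list means no alerts."""
    alerts: List[str] = []
    if sample.failed_jobs > 0:
        alerts.append(f"{sample.failed_jobs} job(s) failed")
    if sample.max_wait_seconds > max_wait_threshold:
        alerts.append(f"job wait time {sample.max_wait_seconds:.0f}s exceeds threshold")
    if sample.slots_total > 0:
        utilization = sample.slots_used / sample.slots_total
        # High slot usage only matters when jobs are actually queued behind it.
        if utilization >= utilization_threshold and sample.waiting_jobs > 0:
            alerts.append(
                f"slot utilization {utilization:.0%} with {sample.waiting_jobs} waiting job(s)"
            )
    return alerts
```

A sample such as `ClusterSample(100, 95, 4, 1, 900.0)` would raise all three alerts, while an idle cluster raises none. In practice the thresholds would be tuned per cluster rather than fixed.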
OMi: Big Data Monitoring Cockpit
Hadoop: Distributed storage & application processing
Vertica: Interactive real-time analytics
Discovery | Data Collection | Availability monitoring | Performance monitoring | Event Correlation | Real time graphs | Trend Reporting
Infrastructure: System, OS, Virtual OS, Cluster
Variety of supported operating systems / virtual operating systems
• CPU
• Memory
• Disk
• Network
• Resources
Monitor 1000s of Hadoop nodes & clusters through a single dashboard
• MapReduce
• Job Statistics
• RW ops Throughput
• HDFS availability
Maintain the infrastructure analyzing your big data
• Query Performance
• Resource rejection
• Process status
• Node status
Find out more about HP BSM
Session | Description | Date/Time
RT3429 | Operations Analytics Roundtable | Tue 1:00
BB2647 | Latest Innovations in BSM | Wed 10:30
CDA3397 | HP Business Service Management: roadmap and strategy | Wed 12:00
RT3431 | Event Correlation Makes A Difference - Meet the Experts on OMi | Wed 12:00
TB2913 | The management cockpit for Big Data platforms | Wed 3:00
CDA2800 | Get your Big Data for IT - HP Operational Analytics | Wed 3:00
TB2909 | BigData for IT: All you wanted to know about IT Ops Analytics | Wed 4:30
RT3430 | What do BigData techniques for your IT really mean? | Thu 9:30
TB2897 | End to End Monitoring - Capability to Maturity | Thu 9:30
BB2916 | A success story on how a partner successfully leveraged the HP BSM partner community program to deliver BSM content | Thu 9:30
TB3061 | SHR 9.30 and Migrating from OVPI to SHR | Thu 9:30
BB2798 | How PlayTech uses Predictive Analytics to Prevent Business Outages | Thu 11:00
RT3432 | OMi Roundtable - How can Automated Monitoring make a difference to IT Ops | Thu 11:00
BB2799 | Customer Experiences exploring Operational Analytics | Thu 12:30
RT2797 | The Business Value of BSM | Thu 2:00
BB2922 | Monitoring Automation – An Early Evaluation at Fidelity | Thu 2:00
BB2969 | OMi, Bringing IT all together: Metric, Events & Topology | Thu 2:00
DT3438 | Get your Bigdata for IT - HP Operational Analytics | Thu 3:00
BB3059 | Measuring and Realizing Value from Your BSM Solution | Thu 3:30
TB2919 | Automating Monitoring with HP Operations Manager i | Thu 3:30
Demo | Booth
Application Performance Management for hybrid environments | 31
Performance Anywhere | 32
BSM Mobility | 33
Predictive Analytics / Cross IT Domain Reporting | 34
Systems Management | 35
Configuration Manager & UCMDB | 36
Operation Bridge (OMi) - Monitoring Automation + CLIP | 37
Operations Analytics v2 | 38
Storage Essentials 9.6 | 39
Thank You!