Streams Monitoring

Zbigniew Baranowski CERN, IT-PSS Streams Monitoring 3D mini Workshop January 26th 2007 Zbigniew Baranowski

Page 1: Streams Monitoring

Zbigniew Baranowski

CERN, IT-PSS

Streams Monitoring

3D mini Workshop January 26th 2007

Zbigniew Baranowski

Page 2: Streams Monitoring

What do we use Streams Monitoring for?

• Replication topology
• State of Streams connections
• Process error notifications
• Monitoring Streams performance (latency, throughput, etc.) in each phase of replication
• Monitoring resources that affect Streams performance (Streams pool, redo generation)

Page 3: Streams Monitoring

Streams Performance?

• Database availability
• State of each process (Enabled, Disabled, ...), error checking
• Queue state (number of messages, spilling)
• Number of LCRs in each phase (captured, propagated and applied)
• Replication latency
• Number of bytes propagated
• Redo generated in the databases
• Streams pool size and usage
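Several of the quantities listed above (LCRs per phase, bytes propagated) are naturally cumulative counters, so throughput has to be derived from two successive readings. The following is a minimal sketch of that arithmetic in modern Python 3, not the monitor's actual code. The `Sample` layout, its field names and the `rates` function are assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Sample:
    """One reading of cumulative counters for a stream (hypothetical layout)."""
    taken_at: datetime                 # when the monitor took the reading
    lcrs_captured: int                 # cumulative LCRs captured at the source
    lcrs_applied: int                  # cumulative LCRs applied at the destination
    bytes_propagated: int              # cumulative bytes sent by propagation
    last_applied_created_at: datetime  # source-side creation time of the last applied LCR
    last_applied_at: datetime          # time that LCR was applied at the destination


def rates(prev: Sample, curr: Sample) -> dict:
    """Turn two cumulative readings into per-second rates, a latency and a backlog."""
    elapsed = (curr.taken_at - prev.taken_at).total_seconds()
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    return {
        "captured_per_s": (curr.lcrs_captured - prev.lcrs_captured) / elapsed,
        "applied_per_s": (curr.lcrs_applied - prev.lcrs_applied) / elapsed,
        "bytes_per_s": (curr.bytes_propagated - prev.bytes_propagated) / elapsed,
        # Latency: how long the most recently applied LCR took from creation to apply.
        "latency_s": (curr.last_applied_at - curr.last_applied_created_at).total_seconds(),
        # Backlog: LCRs captured so far but not yet applied.
        "backlog_lcrs": curr.lcrs_captured - curr.lcrs_applied,
    }
```

Working from differences between readings keeps each poll cheap: the monitor only needs the latest counter values, and a graph of the resulting rates shows bursts and stalls directly.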

Page 4: Streams Monitoring

Monitor architecture

• "Strmmon" daemon
  – written in Python 2.3.4
  – runs on the central repository server together with the 3D OMS
  – collects Streams and instance information
  – generates logs of these activities and stores them in the database
  – reports errors and warnings
• End-user web application
  – written in PHP 5 (using JpGraph and GraphViz)
  – presents the data to the end user:
    • performance graphs
    • connection status
    • diagnostics
    • etc.
• Current production node
  – Intel(R) Xeon(TM) CPU 2.40GHz
  – 1024 MB of RAM
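The daemon's duties listed above (collect, log, report) can be pictured as one polling cycle repeated at a fixed interval. The sketch below is written in modern Python 3 rather than the Python 2.3.4 of the real daemon, and it is not taken from Strmmon. The names `run_cycle`, `run_forever` and the collector callables are hypothetical, the in-memory list stands in for the log tables in the database, and treating `ABORTED` as an error is an assumption based on the status values Oracle reports for Streams processes.

```python
import time
from typing import Callable, Dict, List

# A collector returns {process_name: status} for one database (hypothetical interface).
Collector = Callable[[], Dict[str, str]]


def run_cycle(collectors: Dict[str, Collector], log: List[dict], now: float) -> List[str]:
    """Poll every database once, append readings to `log`, return alert messages."""
    alerts = []
    for db_name, collect in collectors.items():
        try:
            statuses = collect()
        except Exception as exc:  # database unreachable or query failed
            log.append({"time": now, "db": db_name, "available": False})
            alerts.append(f"ERROR {db_name}: unavailable ({exc})")
            continue
        log.append({"time": now, "db": db_name, "available": True, "statuses": statuses})
        for process, status in sorted(statuses.items()):
            if status == "ABORTED":
                alerts.append(f"ERROR {db_name}: {process} is ABORTED")
            elif status != "ENABLED":
                alerts.append(f"WARNING {db_name}: {process} is {status}")
    return alerts


def run_forever(collectors: Dict[str, Collector], log: List[dict], interval_s: float = 60.0) -> None:
    """Daemon loop: one cycle per interval; alerts are simply printed here."""
    while True:
        for message in run_cycle(collectors, log, time.time()):
            print(message)
        time.sleep(interval_s)
```

Keeping a single cycle in its own function, separate from the endless loop, means it can be exercised with fake collectors, and a failure at one site does not stop the polling of the others.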

Page 5: Streams Monitoring

Monitor architecture

[Diagram. Labels: CERN, CNAF, RAL, IN2P3; Server running script; End User (Web Browser); 3D CERN; PHP]

Page 6: Streams Monitoring

Web User Interface

http://oms3d.cern.ch:4889/streams/main.php

Username: *****
Passwd: *****

Features:
• Monitor summary
• Experiments connection maps
• Database list
• Active Streams
• Graphs
• Process and queue connections
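Since the web application uses GraphViz, a connection map like the one listed above could be produced by emitting DOT text and letting GraphViz lay it out. The sketch below shows the idea in Python (the real front end is PHP). The function name `to_dot`, the colour scheme, and the example direction and states of the connections are all assumptions for illustration; only the site names come from the slides.

```python
def to_dot(connections):
    """Render (source, destination, state) triples as GraphViz DOT text."""
    colors = {"ENABLED": "green", "DISABLED": "orange"}  # any other state is drawn red
    lines = ["digraph streams {", "  rankdir=LR;"]
    for source, destination, state in connections:
        color = colors.get(state, "red")
        lines.append(f'  "{source}" -> "{destination}" [label="{state}", color={color}];')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical topology and states, for illustration only.
    example = [
        ("CERN", "CNAF", "ENABLED"),
        ("CERN", "RAL", "DISABLED"),
        ("CERN", "IN2P3", "ABORTED"),
    ]
    print(to_dot(example))
```

The printed text can be saved to a file and rendered with the GraphViz `dot` command, for example `dot -Tpng map.dot -o map.png`.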

Page 7: Streams Monitoring

Monitor View

Page 8: Streams Monitoring

Connection view

Page 9: Streams Monitoring

Database list

Page 10: Streams Monitoring

Database detailed view

Page 11: Streams Monitoring

Active Streams connection dashboard view

Page 12: Streams Monitoring

Detailed stream view

Page 13: Streams Monitoring

Graph generator

Page 14: Streams Monitoring

Graph examples: replication of a single transaction

Page 15: Streams Monitoring

Graph examples: redo generated

Page 16: Streams Monitoring

Graph examples: LHCb test (propagation)

Page 17: Streams Monitoring

Burst of transactions

Page 18: Streams Monitoring

What next?

• Migration to a faster machine
  – Intel(R) Xeon(TM) CPU 3.00GHz
  – 4 GB of RAM
• Improving web script performance
• Collecting user input to improve the front page (map with connections and states) and to make the other pages clearer for users
• Reuse of the daemon script for other monitoring of the T1s, for cross-checking with OMS
• More data? CPU load, disk I/O, etc.
• 3D headline with logo is missing