From 10 standalone Zabbix platforms to a major one in the cloud
Zabbix-Summit 2019 – 11/12 October
The story begins for me at 3DS Outscale
2014: My job interview
• “What will you do during your first weeks?”
3DS Outscale
• Dassault Systèmes IaaS Cloud Provider
• Several datacenters around the world
• Tons of equipment, VMs, expert workers
My first weeks
• Let’s focus on the monitoring subject!
First steps with Zabbix
• Several Zabbix Servers
• 1 Datacenter:
  - 1 Zabbix-Server + Frontend
  - 1 PostgreSQL Database
What was my first year like?
My daily work:
• Zabbix configuration
• Zabbix templating
• Scripting for UserParameters
• Reworked the alerting
• Dashboarding using Ruby / Dashing
I soon became the monitoring guy.
Multiple Zabbix platforms everywhere!
[Diagram: Zabbix Server + Frontends and Zabbix DB, x10+]
Not totally satisfied
Minor issues:
- Configuration management
- Template deployment
- Dogfooding
Major issues:
- Performance
- Resilience
The Zabbix-Cloud project was born…
• Create a better Zabbix platform with:
  - Performance
  - Resilience
  - Convenient to use for the teams
I started by requesting politely…
A Zabbix Training
• And became certified
A small team
• That’s when Kévin, Kévin and Romain joined me.
The monitoring team.
Manages the monitoring platforms.
Provides the monitoring work requested by the other teams.
Works on various projects.
Our toolkit!
• Saltstack
• Salt-cloud
• Git
• Backup tools
• Cloud accounts to query the API
We kept our old Zabbix platforms.
For the moment, they will survive.
We then created our first instance: “The Adm”
• We deployed the ADM in the EUW2 Region
• Deployed our toolkit
• Started writing Salt states and cloud infrastructure configuration
All proxies to par1.
The Monitoring VPCs
Connectivity to private subnets with production equipment and VMs.
[Diagram labels: EU-WEST Monitoring VPC, SOUTHEAST-ASIA Monitoring VPC, US-EAST Monitoring VPC; Firewall FR, Firewall Asia, Firewall US; VPN links; VLANs to monitor]
A few months later…
[Diagram labels: Zabbix-Proxies, Zabbix-Cloud server platform, All proxies to par1.]
Remember the small architecture? Here is what it looks like now!
Add two hundred proxies (poor config syncer!)
Collecting data everywhere
Sending it to the Zabbix-Server
Tips for better performance
Database
• DB partitioning and tuning
• High-IOPS volumes for the DB
• Strong CPU
• Massive RAM for the InnoDB cache
Specific to Zabbix
• Active items
• Proxies (active too)
• Internal process monitoring
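To make the "DB partitioning" tip concrete, here is a minimal sketch of how a daily partition could be added to a large Zabbix table. The slides do not say how the partitioning is done, so this is an assumption: it supposes a MySQL table already partitioned by RANGE on the integer `clock` column. The function name `daily_partition_sql` and the partition naming scheme are hypothetical.

```python
from datetime import date, datetime, timedelta, timezone


def daily_partition_sql(table: str, day: date) -> str:
    """Build the DDL that adds one daily partition for `day` to `table`.

    Assumption: the table is already partitioned by RANGE on the integer
    `clock` column, which holds Unix timestamps.
    """
    # A RANGE partition holds rows strictly below its upper bound, so the
    # bound is midnight UTC of the following day.
    next_midnight = datetime.combine(
        day + timedelta(days=1), datetime.min.time(), tzinfo=timezone.utc
    )
    upper_bound = int(next_midnight.timestamp())
    name = f"p{day:%Y_%m_%d}"  # hypothetical naming scheme
    return (
        f"ALTER TABLE {table} ADD PARTITION "
        f"(PARTITION {name} VALUES LESS THAN ({upper_bound}));"
    )


if __name__ == "__main__":
    print(daily_partition_sql("history", date(2019, 10, 11)))
```

Old data can then be removed by dropping a whole partition instead of deleting rows one by one, which is the usual motivation for partitioning history tables.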
Tips for resilience
Multiple frontends
Multiple DB nodes
Multiple Asterisk
LBs
Send ALL your monitoring data twice, to multiple platforms!
Two Zabbix-Cloud platforms
[Diagram labels: Zabbix-Proxies, Zabbix-Cloud Server platforms, All proxies to par1.]
Every Zabbix agent has 2 proxies in ServerActive=
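As an illustration of the point about `ServerActive=`, the sketch below renders the two relevant lines of an agent configuration. `Hostname` and `ServerActive` are real Zabbix agent parameters, and `ServerActive` accepts a comma-separated list; the function name, the host names and the default port 10051 used here are assumptions, not taken from the slides.

```python
from typing import Sequence


def agent_active_config(hostname: str, proxies: Sequence[str], port: int = 10051) -> str:
    """Render the agent settings that send active checks to every proxy listed."""
    if len(proxies) < 2:
        raise ValueError("expected at least two proxies, one per platform")
    # Each comma-separated entry is handled by the agent as its own destination.
    targets = ",".join(f"{proxy}:{port}" for proxy in proxies)
    return f"Hostname={hostname}\nServerActive={targets}\n"


if __name__ == "__main__":
    # Hypothetical host and proxy names.
    print(agent_active_config("web-01", ["zabbix-proxy-1", "zabbix-proxy-2"]))
```

With two entries, the agent reports its active items to both proxies, so each platform receives its own copy of the data.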
Both monitoring platforms operating at the same time
[Diagram labels: Zabbix-proxy-1, FR Server, Zabbix-proxy-2, US Server, Grafana-1]
Twin monitoring platforms
US Zabbix Server, featuring the instances:
• Server
• Databases + LB
• Frontends + LB
• Grafana + LB
• Jenkins + LB
• Asterisk
• Smashing
• Custom dashboards in NodeJS, PHP...
Jenkins jobs, 2x a day: config sync + dashboards
FR Zabbix Server, featuring the instances:
• Server
• Databases + LB
• Frontends + LB
• Grafana + LB
• Jenkins + LB
• Asterisk
• Smashing
• Custom dashboards in NodeJS, PHP...
How is the configuration sync done?
MySQL dump
• Configuration tables.
Copy the dump
• To a node of the other Zabbix platform.
Stop everything that queries the DB
Restore the tables in parallel
Restart everything
• Server, frontends, Grafana…
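The sequence above can be sketched as an ordered list of commands. This is only an illustration under stated assumptions: the talk runs the sync as a Salt orchestration state, whereas this sketch uses plain `mysqldump`, `scp`, `ssh` and `systemctl` calls; the function name, dump path, node name, service names and table list are all hypothetical, and the restore is shown as a single serial step rather than in parallel. Nothing is executed; the function only builds the plan.

```python
from typing import List, Sequence


def build_sync_plan(
    database: str,
    config_tables: Sequence[str],
    target_node: str,
    services: Sequence[str],
    dump_path: str = "/tmp/zabbix_config.sql",  # hypothetical location
) -> List[List[str]]:
    """Return the commands of one configuration sync, in execution order."""
    # 1. Dump only the configuration tables from the source platform.
    dump = ["mysqldump", "--single-transaction",
            f"--result-file={dump_path}", database, *config_tables]
    # 2. Copy the dump to a node of the other platform.
    copy = ["scp", dump_path, f"{target_node}:{dump_path}"]
    # 3. Stop everything that queries the DB on the target platform.
    stop = ["ssh", target_node, "systemctl", "stop", *services]
    # 4. Restore the tables (serially here; the talk restores in parallel).
    restore = ["ssh", target_node, f"mysql {database} < {dump_path}"]
    # 5. Restart everything.
    start = ["ssh", target_node, "systemctl", "start", *services]
    return [dump, copy, stop, restore, start]


if __name__ == "__main__":
    plan = build_sync_plan(
        "zabbix",
        ["hosts", "items"],  # illustrative subset, not the full table list
        "zbx-us-db-1",
        ["zabbix-server", "grafana-server"],
    )
    for command in plan:
        print(" ".join(command))
```

Returning the plan as data keeps the order of the five steps easy to review before anything touches a database.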
Pack everything in an orchestration state.
Play it regularly through Jenkins. We do it 2x a day.
It failed 2 times in 2 years.
Name your -1 and -2 proxies with the same proxy name.
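The naming tip can be illustrated with a small sketch that renders one proxy configuration per platform. The slides do not explain why the names must match; one plausible reading, stated here as an assumption, is that since the configuration tables are copied from one platform to the other, both servers know each proxy under a single name. `ProxyMode`, `Server` and `Hostname` are real Zabbix proxy parameters; the function names and the host names are hypothetical.

```python
from typing import Dict, Sequence


def proxy_config(proxy_name: str, server: str) -> str:
    """Render the core settings of an active proxy reporting to `server`."""
    lines = [
        "ProxyMode=0",             # 0 means active mode
        f"Server={server}",        # the platform this proxy sends data to
        f"Hostname={proxy_name}",  # the name the server knows the proxy by
    ]
    return "\n".join(lines) + "\n"


def twin_proxy_configs(proxy_name: str, servers: Sequence[str]) -> Dict[str, str]:
    """Build one config per platform; every proxy shares the same Hostname."""
    return {server: proxy_config(proxy_name, server) for server in servers}


if __name__ == "__main__":
    # Hypothetical proxy and server names.
    configs = twin_proxy_configs("proxy-par1", ["zabbix-fr.example", "zabbix-us.example"])
    for server, text in configs.items():
        print(f"# proxy reporting to {server}\n{text}")
```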
Simplicity benefits
Grafana dashboards
• Our users can check their data.
• Build their dashboards.
• With no access to Zabbix.
A single Zabbix interface
• Fast updates and configuration.
Single API
Firing alerts… avoiding x2 calls
[Diagram labels: Zabbix Server US (Main), Zabbix Server FR (Secondary), Asterisk-FR-1, Asterisk-FR-2, Asterisk-US-1, Asterisk-US-2]
Where we are in summer 2019
• Availability: 100%
• Max processed values/s: 250 000 during a burst
• Items: 1 300 000 active
• Proxies deployed: 160+
• AVG processed values: 10 000
• AVG bandwidth on server: 35 Mbps
• Max tested processed alerts: up to 30k
• Number of TVs using Grafana dashboards: 200+
• Load on the server: around 5%
• DB size: 1500 GB
And about the old Zabbix Platforms?
Audited
Extracted the templates and items
Abandoned
Deleted
Daily mission: monitoring the monitoring platforms.
What is coming next?
• Docker migration of everything (partly done)
• Migration to Zabbix 4.2
• Tons of new proxies and new cloud sites
• Opensourcing some of our home-made tools
Any questions?
• Let’s talk about it.
• Or mail [email protected]
OUTSCALE
1 rue Royale
319 bureaux de la Colline
92210 Saint-Cloud - France
tel: +33 1 53 27 52 70
outscale.com