MONITORING 101: POSTGRESQL · POSTGRESQL PERFORMANCE SUMMARY 1. Understand the difference between...
MONITORING 101: POSTGRESQL
JASON YEE, DATADOG @gitbisect
TW: @gitbisect @datadoghq
COLLECTING DATA IS CHEAP; NOT HAVING IT WHEN YOU NEED IT CAN BE EXPENSIVE
SO INSTRUMENT ALL THE THINGS!
@gitbisect Technical Writer/Evangelist “Docs & Talks” Travel Hacker & Whiskey Hunter
@datadoghq SaaS-based monitoring Trillions of data points per day http://jobs.datadoghq.com
SCALING & MONITORING POSTGRESQL AT DATADOG
MOAR RESOURCES!
MOAR INSTANCES!
[Diagram: Writes, Repl, Reads]
HOW WE DO IT
REQUIREMENTS
▸ Write master is writeable, read replicas are readable!
▸ Read replicas are up to date and don’t lag
▸ Additional read replicas can be provisioned quickly
HOW WE DO IT
SOLUTIONS
▸ PostgreSQL!
▸ http://bit.ly/pg-repl-docs
▸ WAL-E
▸ https://github.com/wal-e/wal-e
[Diagram: Writes, Repl, Standbys, App1 Reads, App2 Reads]
HOW DO WE THINK ABOUT MONITORING?
4 QUALITIES OF GOOD METRICS
NOT ALL METRICS ARE EQUAL
1. WELL UNDERSTOOD
2. SUFFICIENT GRANULARITY
1 second Peak 46%
1 minute Peak 36%
5 minutes Peak 12%
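The effect shown on this slide can be reproduced with a small sketch. The series below is synthetic (an assumption, not the data behind the chart): a mostly idle resource with one short burst. Averaging it into wider windows shrinks the peak that gets reported.

```python
from statistics import mean

def peak_at_granularity(samples, window):
    """Average per-second samples into fixed windows and return the highest average."""
    buckets = [samples[i:i + window] for i in range(0, len(samples), window)]
    return max(mean(bucket) for bucket in buckets)

# Hypothetical per-second utilization (%): 10% baseline with a 20-second burst at 50%.
samples = [10.0] * 600
for second in range(100, 120):
    samples[second] = 50.0

peak_1s = peak_at_granularity(samples, 1)    # every sample is its own bucket
peak_1m = peak_at_granularity(samples, 60)   # one-minute rollup
peak_5m = peak_at_granularity(samples, 300)  # five-minute rollup
print(peak_1s, peak_1m, peak_5m)
```

The burst is identical in all three cases; only the rollup window changes, and the widest window reports the lowest peak.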
3. TAGGED & FILTERABLE
Query Based Monitoring
“What’s the average throughput of application:nginx per version?”
“How many requests per second is my role:accounting-app running application:postgresql hosted in region:us-west-1 compared to region:us-east-1?”
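A question like the second one is a filter on some tags plus a grouping on another. The sketch below illustrates the idea on in-memory data; the data points, the metric name `requests_per_second`, and the helper `average_by` are all hypothetical and do not represent any particular monitoring product's query API.

```python
from collections import defaultdict

# Hypothetical tagged data points: (metric name, value, tags).
points = [
    ("requests_per_second", 120.0,
     {"role": "accounting-app", "application": "postgresql", "region": "us-west-1"}),
    ("requests_per_second", 80.0,
     {"role": "accounting-app", "application": "postgresql", "region": "us-west-1"}),
    ("requests_per_second", 150.0,
     {"role": "accounting-app", "application": "postgresql", "region": "us-east-1"}),
    ("requests_per_second", 300.0,
     {"role": "web", "application": "nginx", "region": "us-east-1"}),
]

def average_by(points, metric, filters, group_by):
    """Average a metric over points matching every filter tag, grouped by one tag."""
    groups = defaultdict(list)
    for name, value, tags in points:
        if name != metric:
            continue
        if all(tags.get(key) == wanted for key, wanted in filters.items()):
            groups[tags.get(group_by)].append(value)
    return {key: sum(values) / len(values) for key, values in groups.items()}

result = average_by(
    points,
    "requests_per_second",
    {"role": "accounting-app", "application": "postgresql"},
    group_by="region",
)
print(result)
```

Because every point carries its tags, the same data answers other questions by changing only the filter and the grouping key.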
4. LONG-LIVED
[Chart: four weeks of weekdays (M T W TH F); annotations: OUTAGE? TUESDAY HOLIDAY?]
TYPES OF METRICS
A FRAMEWORK
P.S. - June 1! Mark your calendar!
WHAT TO PAGE ON?
RECURSE UNTIL YOU FIND THE TECHNICAL CAUSES
WHAT DO WE MONITOR AT DATADOG?
METRICS
WHAT METRICS DO WE GATHER?
connections commits rollbacks disk_read buffer_hit rows_returned rows_fetched rows_inserted rows_updated rows_deleted database_size deadlocks temp_bytes
temp_files bgwriter.checkpoints_timed bgwriter.checkpoints_requested bgwriter.buffers_checkpoint bgwriter.buffers_clean bgwriter.maxwritten_clean bgwriter.buffers_backend bgwriter.buffers_alloc bgwriter.buffers_backend_fsync bgwriter.write_time bgwriter.sync_time locks seq_scans
seq_rows_read index_scans index_rows_fetched rows_hot_updated live_rows dead_rows index_rows_read table_size index_size total_size table.count max_connections percent_usage_connections
replication_delay replication_delay_bytes heap_blocks_read heap_blocks_hit index_blocks_read index_blocks_hit toast_blocks_read toast_blocks_hit toast_index_blocks_read toast_index_blocks_hit
Alert on work metrics, but resource metrics become work metrics? Alert on everything?
How resource metrics become work metrics (and who to alert)
WHO TO ALERT?
▸ Leadership
▸ Developers
▸ PostgreSQL Team (Ops/DRE)
▸ Ops/SRE
POSTGRESQL WORK METRICS (AVAILABILITY)
HOW WE DO IT
REQUIREMENTS
▸ Write master is writeable, read replicas are readable!
▸ Read replicas are up to date and don’t lag
▸ Additional read replicas can be provisioned quickly
ALERT ON WORK METRICS
WHAT ARE WE ALERTING ON?
▸ Write master is writeable, read replicas are readable!
  ▸ Up/Down checks
  ▸ Latency
ALERT ON WORK METRICS
WHAT ARE WE ALERTING ON?
▸ Read replicas are up to date and don’t lag
  ▸ Write master standby availability
  ▸ Write master standby replication lag
  ▸ Read replica lag
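A minimal sketch of how replication lag in bytes could be derived, assuming you can read a WAL position (LSN) from the write master and from a replica, for example with `pg_current_wal_lsn()` and `pg_last_wal_replay_lsn()` in PostgreSQL 10 and later. The helper names and the 16 MiB threshold are hypothetical; the slides do not say how `replication_delay_bytes` is actually computed or what threshold is used.

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '16/B374D848' to an absolute byte position.

    An LSN is printed as two hexadecimal numbers separated by a slash:
    the high 32 bits and the low 32 bits of a 64-bit WAL position.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) + int(low, 16)

def replication_lag_bytes(primary_lsn: str, replica_lsn: str) -> int:
    """Bytes of WAL the replica still has to replay (never negative)."""
    return max(0, lsn_to_bytes(primary_lsn) - lsn_to_bytes(replica_lsn))

def lag_alert(lag_bytes: int, threshold_bytes: int = 16 * 1024 * 1024) -> bool:
    """True when lag exceeds the (hypothetical) threshold."""
    return lag_bytes > threshold_bytes

lag = replication_lag_bytes("0/3000060", "0/3000000")
print(lag, lag_alert(lag))
```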
ALERT ON WORK METRICS
WHAT ARE WE ALERTING ON?
▸ Additional read replicas can be provisioned quickly
  ▸ Base backup is too old
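One way to express a "base backup is too old" check is to compare the timestamp of the newest base backup against a maximum age. This is only a sketch: where the timestamp comes from (for example, the backup tool's listing) and the 24-hour limit are assumptions, not details given in the talk.

```python
from datetime import datetime, timedelta, timezone

def base_backup_too_old(last_backup_at: datetime,
                        now: datetime,
                        max_age: timedelta = timedelta(hours=24)) -> bool:
    """True when the newest base backup is older than max_age (hypothetical limit)."""
    return now - last_backup_at > max_age

now = datetime(2017, 5, 2, 12, 0, tzinfo=timezone.utc)
fresh = datetime(2017, 5, 2, 3, 0, tzinfo=timezone.utc)   # 9 hours old
stale = datetime(2017, 4, 30, 3, 0, tzinfo=timezone.utc)  # more than 2 days old
print(base_backup_too_old(fresh, now), base_backup_too_old(stale, now))
```

An old base backup means a new replica has more WAL to replay before it catches up, which is why it counts against the "provisioned quickly" requirement.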
POSTGRESQL RESOURCE METRICS (CAPACITY)
ALERT ON WORK METRICS
WHAT ARE WE ALERTING ON?
▸ CPU
▸ Memory
▸ Disk space
▸ Connection limit
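The connection-limit check can be sketched from two of the gathered metrics, `connections` and `max_connections`. Treating `percent_usage_connections` as the ratio of the two, and warning at 80%, are assumptions made for illustration.

```python
def percent_usage_connections(connections: int, max_connections: int) -> float:
    """Fraction of the connection limit currently in use (0.0 to 1.0)."""
    if max_connections <= 0:
        raise ValueError("max_connections must be positive")
    return connections / max_connections

def connection_limit_alert(connections: int, max_connections: int,
                           warn_at: float = 0.8) -> bool:
    """True when usage reaches the (hypothetical) warning fraction."""
    return percent_usage_connections(connections, max_connections) >= warn_at

print(connection_limit_alert(50, 100), connection_limit_alert(95, 100))
```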
MONITORING TO IMPROVE PERFORMANCE
POSTGRESQL PERFORMANCE
WHERE TO GET THE MOST PERFORMANCE GAINS?
Josh Berkus: http://bit.ly/pg-perf-15m
1. Cut Activity
2. Slow Queries
3. Scale Stack
4. Fix Hardware
5. Postgresql.conf
CUT ACTIVITY
SLOW QUERIES
PERFORMANCE: LATENCY VS POTENTIAL
LATENCY VS POTENTIAL
HOW DO YOU DEFINE PERFORMANCE?
SELECT * FROM tiny_table WHERE nonindexed_col=1
VERSUS
SELECT * FROM massive_table WHERE indexed_col=1
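The contrast between the two queries can be seen in their plans. The sketch below uses Python's built-in SQLite as a stand-in (an assumption made so the example runs anywhere; it is not PostgreSQL, and the schemas, index name, and row counts are invented). The tiny-table query may return quickly but still scans every row, while the massive-table query uses its index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tiny_table (id INTEGER PRIMARY KEY, nonindexed_col INTEGER)")
conn.execute(
    "CREATE TABLE massive_table (id INTEGER PRIMARY KEY, indexed_col INTEGER, payload TEXT)"
)
conn.execute("CREATE INDEX massive_indexed_col ON massive_table (indexed_col)")
conn.executemany(
    "INSERT INTO tiny_table (nonindexed_col) VALUES (?)",
    [(i % 3,) for i in range(10)],
)
conn.executemany(
    "INSERT INTO massive_table (indexed_col, payload) VALUES (?, ?)",
    [(i % 1000, "x") for i in range(10_000)],
)

def plan(sql: str) -> str:
    """Return SQLite's query plan for a statement as one line of text."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)  # the fourth column holds the plan detail

tiny_plan = plan("SELECT * FROM tiny_table WHERE nonindexed_col=1")
massive_plan = plan("SELECT * FROM massive_table WHERE indexed_col=1")
print(tiny_plan)     # a full scan of tiny_table
print(massive_plan)  # a search using massive_indexed_col
```

Latency alone would rate both queries as fine today; the plan shows which one degrades as its table grows.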
PERFORMANCE: RAM VS DISK
“Aside from shared_buffers, the most important memory-allocation parameter is work_mem… Raising this value can dramatically improve the performance of certain queries, but it's important not to overdo it.”
ROBERT HAAS
FINDING **INEFFICIENT** QUERIES
LATENCY VS POTENTIAL
EXPLAIN ANALYZE
http://bit.ly/pg-explain
‣ Explain displays the execution plan
‣ Analyze runs it and gathers stats
LATENCY VS POTENTIAL
EXPLAIN ANALYZE
Merge Right Join (cost=25870.55..31017.51 rows=229367 width=92) (actual time=2884.501..5147.047 rows=354834 loops=1)
  Merge Cond: (a.uid = b.uid)
  -> Index Scan using foo on bar a (cost=0.00..537.29 rows=9246 width=27) (actual time=0.049..41.782 rows=9246 loops=1)
  -> Materialize (cost=25870.49..27204.80 rows=106745 width=81) (actual time=2884.413..3804.537 rows=354834 loops=1)
    -> Sort (cost=25870.49..26137.35 rows=106745 width=81) (actual time=2884.406..3099.732 rows=111878 loops=1)
      Sort Key: b.uid
      Sort Method: external merge Disk: 8928kB
      …
Total runtime: 5588.105 ms
(14 rows)
http://bit.ly/pg-auto-explain
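The line to notice in this plan is `Sort Method: external merge Disk: 8928kB`: the sort did not fit in `work_mem` and spilled to disk. A small sketch of scanning plan text for such spills follows; the helper name `disk_sort_kb` is hypothetical, and the regular expression assumes the text format shown on the slide.

```python
import re

# Matches sorts that spilled to disk, e.g. "Sort Method: external merge  Disk: 8928kB".
DISK_SORT = re.compile(r"Sort Method: external (?:merge|sort)\s+Disk: (\d+)kB")

def disk_sort_kb(plan_text: str) -> list:
    """Return the kB written to disk by each spilled sort, in plan order."""
    return [int(kb) for kb in DISK_SORT.findall(plan_text)]

plan_text = """
-> Sort (cost=25870.49..26137.35 rows=106745 width=81) (actual time=2884.406..3099.732 rows=111878 loops=1)
   Sort Key: b.uid
   Sort Method: external merge Disk: 8928kB
Total runtime: 5588.105 ms
"""

spills = disk_sort_kb(plan_text)
print(spills)
```

Run over plans captured in the server log (for instance by auto_explain), a check like this surfaces queries that have outgrown the memory limit even when their latency still looks acceptable.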
POSTGRESQL PERFORMANCE
SUMMARY
1. Understand the difference between work metrics, resource metrics & events
2. Alert on the appropriate work metrics
3. Consider potential, not just latency
4. Keep smaller (reasonable) limits to find less than optimal queries
POSTGRESQL PERFORMANCE
RESOURCES
‣ http://dtdg.co/monitor-postgres
‣ https://dtdg.co/rds-postgresql
‣ https://dtdg.co/postgresql-vacuums
RATE MY SESSION