Performance Analysis and Troubleshooting Method for Databases



1. Performance Analysis and Troubleshooting Methodologies for Databases. Peter Zaitsev, CEO, Percona. June 13, 2018. Percona Technical Webinars. © 2018 Percona.

2. Databases and Performance: Databases are frequent performance troublemakers

3. Why Databases Are Painful: Generally Non-Linear Scalability; Complex; Often Poorly Understood by Developers

4. Performance Work with Databases: Troubleshooting; Capacity Planning; Cost and Efficiency Optimization; Change Management

5. Points of View: BlackBox – “Application Developer”; WhiteBox – “DBA, Ops”

6. Developer Point of View: Database as a BlackBox; I throw queries at it and it responds; DBaaS brings this “promise” to Ops too

7. BlackBox Success Criteria for Databases: Availability; Response Time; Correctness; Cost

8. Ops Point of View: Load; Resource Utilization; System/Hardware Problems; Scaling/Capacity Planning

9. Methodologies for Performance Troubleshooting and Analyses

10. Typical Default Troubleshooting by Random Googling

11. Problems with Typical Approach: Hard to Assure Outcome; Hard to Train People; Hard to Automate

12. Methodologies Save the Day: USE (Utilization, Saturation, Errors) Method by Brendan Gregg; Golden Signals (Latency - Traffic - Errors - Saturation) Method by Rob Ewaschuk; RED (Rate, Errors (Rate), Duration) Method by Tom Wilkie
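As a rough illustration of the RED method named above, the following Python sketch derives rate, errors and duration from a list of request records for one time window. The `Request` record and the `red_summary` function are hypothetical names chosen for this example, not part of any tool mentioned in the talk, and the 95th percentile uses the simple nearest-rank rule.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    duration_s: float  # time taken to serve the request, in seconds
    ok: bool           # False if the request ended in an error


def red_summary(requests, window_s):
    """Summarize Rate, Errors and Duration for one observation window."""
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    n = len(requests)
    errors = sum(1 for r in requests if not r.ok)
    durations = sorted(r.duration_s for r in requests)
    # Nearest-rank 95th percentile; 0.0 when the window is empty.
    p95 = durations[max(0, math.ceil(0.95 * n) - 1)] if n else 0.0
    return {
        "rate_per_s": n / window_s,
        "errors_per_s": errors / window_s,
        "avg_duration_s": sum(durations) / n if n else 0.0,
        "p95_duration_s": p95,
    }
```

For a database, the "requests" would typically be queries, so the same summary can be computed per query class rather than per server.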

13. USE Method

14. USE Method Basics: Developed to Troubleshoot Server Performance Issues; Resolve 80% of Problems with 5% of Effort; Operating System Specific Checklists Available

15. USE Method in One Sentence: “For every resource, check utilization, saturation, and errors.”

16. USE Method Terminology Definitions
Resource • all physical server functional components (CPUs, disks, busses, ...)
Utilization • the average time that the resource was busy servicing work
Saturation • the degree to which the resource has extra work which it can't service, often queued
Errors • the count of error events
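The three definitions above can be turned into a small per-resource check. This is a minimal sketch, not a published checklist: `ResourceSample`, `use_check` and the 80% utilization threshold are assumptions made for illustration, and real thresholds depend on the resource.

```python
from dataclasses import dataclass


@dataclass
class ResourceSample:
    name: str
    busy_s: float      # time the resource spent servicing work
    interval_s: float  # length of the observation interval
    queued: int        # work items waiting because they could not be serviced
    errors: int        # count of error events in the interval


def use_check(sample, util_threshold=0.8):
    """Return (utilization, findings) for one resource.

    util_threshold is an illustrative assumption, not a rule from the talk.
    """
    if sample.interval_s <= 0:
        raise ValueError("interval_s must be positive")
    utilization = sample.busy_s / sample.interval_s
    findings = []
    if utilization >= util_threshold:
        findings.append("high utilization")
    if sample.queued > 0:
        findings.append("saturated")
    if sample.errors > 0:
        findings.append("errors")
    return utilization, findings
```

Running `use_check` over every resource in turn is the "for every resource" part of the method; an empty findings list means that resource can be ruled out quickly.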

17. USE Method Resources. CPUs: sockets, cores, hardware threads (virtual CPUs); Memory: capacity; Network interfaces; Storage devices: I/O, capacity; Controllers: storage, network cards; Interconnects: CPUs, memory, I/O

18. USE Method with Software: Same Basic Resources Apply; Additional Software Resources Apply: Mutex Locks, File Descriptors, Connections

19. USE Method Benefits: Proven Track Record; Broad Applicability; Detailed Checklists Available

20. USE Method Drawbacks: Requires Good Understanding of System Architecture; Requires Access to Low-Level Resource Monitoring; Hard to Apply in Service “BlackBox” Environments

21. Cloud Computing: Limits in Place to Isolate Tenants; Dynamic Resource Management; True “Hardware” Properties on Hypervisor Level Only

22. Understanding Queueing

23. Many Different Levels
Process running on CPU • Actually waiting on the Memory
Process Running from User Standpoint • May be put to queue due to CPU saturation
Disk request issued to EBS • Can be waiting on network
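To see why queueing makes response time grow non-linearly as a resource fills up, one common textbook model is the single-server M/M/1 queue, where mean response time is service time divided by (1 - utilization). The talk does not name a specific model, so M/M/1 is an assumption here, and `mm1_response_time` is a hypothetical helper.

```python
def mm1_response_time(service_time_s, utilization):
    """Mean response time of an M/M/1 queue (assumed model, see lead-in).

    service_time_s: mean time to service one request with no queueing
    utilization:    fraction of time the server is busy, in [0, 1)
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return service_time_s / (1.0 - utilization)


if __name__ == "__main__":
    # Print how a 10 ms service time stretches as utilization rises.
    for rho in (0.5, 0.8, 0.9, 0.95):
        print(f"utilization {rho:.2f}: {mm1_response_time(0.010, rho) * 1000:.1f} ms")
```

Under this model a request that takes 10 ms on an idle server takes about twice as long at 50% utilization and about ten times as long at 90%, which is why saturation is checked separately from utilization.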

24. USE Method for Linux

25. Percona Monitoring and Management: 100% Free and Open Source; Purpose Built for Open Source Database Monitoring; Based on Leading Open Source Technologies – Grafana, Prometheus; Easy to Set Up

26. CPU Utilization
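One way to derive CPU utilization on Linux is from two readings of the aggregate `cpu` line in `/proc/stat`. The sketch below is only an illustration of that calculation, not how any particular monitoring tool does it; the function names are hypothetical, and counting `iowait` as idle time is a design choice made for this example.

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into 8 tick counters.

    Fields kept: user, nice, system, idle, iowait, irq, softirq, steal.
    The guest counters are skipped because they are already included
    in user and nice.
    """
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(x) for x in parts[1:9]]
    # Older kernels report fewer fields; pad the missing ones with zero.
    return values + [0] * (8 - len(values))


def cpu_utilization(before_line, after_line):
    """Fraction of CPU time spent non-idle between two samples."""
    before = parse_cpu_line(before_line)
    after = parse_cpu_line(after_line)
    deltas = [a - b for a, b in zip(after, before)]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    idle = deltas[3] + deltas[4]  # idle + iowait (assumption: iowait is idle)
    return 1.0 - idle / total
```

In practice the two lines would come from reading the first line of `/proc/stat` twice, a second or more apart.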

27. CPU Saturation
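A simple proxy for CPU saturation on Linux is the number of runnable tasks in excess of the available CPUs. The sketch below reads the `procs_running` counter from the text of `/proc/stat`; treating "runnable tasks minus CPU count" as the saturation figure is an assumption for illustration, and the function names are hypothetical.

```python
def runnable_tasks(proc_stat_text):
    """Return the procs_running counter from the text of /proc/stat."""
    for line in proc_stat_text.splitlines():
        if line.startswith("procs_running"):
            return int(line.split()[1])
    raise ValueError("procs_running not found")


def cpu_saturation(proc_stat_text, cpu_count):
    """Runnable tasks beyond what the CPUs can run at once (0 if none)."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return max(0, runnable_tasks(proc_stat_text) - cpu_count)
```

On a live system `cpu_count` could come from `os.cpu_count()`. A single reading is noisy, so several samples over time give a more reliable picture.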

28. Process View

29. Physical Memory
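For physical memory, a utilization figure can be sketched from the `MemTotal` and `MemAvailable` fields of `/proc/meminfo`. This is an illustrative calculation under the assumption that `MemAvailable` is present (it is reported by reasonably recent Linux kernels); the function names are hypothetical.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo text into a dict of integer values.

    Most values are in kB; the unit suffix is ignored.
    """
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields:
            values[key.strip()] = int(fields[0])
    return values


def memory_utilization(text):
    """Fraction of physical memory that is not available for new work."""
    info = parse_meminfo(text)
    total = info["MemTotal"]
    if total <= 0:
        raise ValueError("MemTotal must be positive")
    return 1.0 - info["MemAvailable"] / total
```

Using `MemAvailable` rather than `MemFree` avoids counting reclaimable page cache as used memory, which matters on database hosts where the cache is usually large.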