Taking Percona Monitoring and Management (PMM) to the next level

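Getting Percona Monitoring and Management (PMM) running on Docker takes only a few simple steps. However, if your goal is a production-ready, long-term monitoring solution, there are more points to consider.
In addition, using PMM's stack for other components of your infrastructure can be a good way to consolidate monitoring across systems.
In this session, we will discuss options for tuning a PMM installation for production and for extending its default functionality.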


1.Taking PMM to the next level Gabriel Ciciliani - Fernando Ipar

2.AGENDA
● PMM in 2 minutes
● Production notes
● Beyond what's in the box
● Q&A
© The Pythian Group Inc., 2017

3.PMM in 2 minutes

4.https://www.percona.com/doc/percona-monitoring-and-management/architecture.html

5.PMM Server
Distributed as:
● A Docker image
● An Open Virtual Appliance (OVA) package
● An Amazon Machine Image (AMI)

6.PMM Client
Distributed as:
● An RPM or deb package
● Generic Linux binaries

7.Quick Start Reference
1. Create the pmm-data container (lives "forever")
2. Create the pmm-server container (destroyed on every upgrade)
3. Install at least one agent
4. Point the agent to pmm-server
5. Deploy at least one service to monitor
https://www.percona.com/doc/percona-monitoring-and-management/deploy/server/docker.setting-up.html
https://www.percona.com/doc/percona-monitoring-and-management/deploy/index.html#installing-clients

8.Quick Start Reference: Step 1
    # docker pull percona/pmm-server:latest
    # docker create \
        -v /opt/prometheus/data \
        -v /opt/consul-data \
        -v /var/lib/mysql \
        -v /var/lib/grafana \
        --name pmm-data \
        percona/pmm-server:latest /bin/true

9.Quick Start Reference: Step 2
    # docker run -d \
        -p 80:80 \
        --volumes-from pmm-data \
        --name pmm-server \
        --restart always \
        percona/pmm-server:latest

10.Quick Start Reference: Steps 3 through 5
    # yum install pmm-client
    # pmm-admin config --server <pmm-server-host>:<pmm-server-port>
    # pmm-admin add mysql

11.Production notes Running PMM in production

12.Production notes
● Store the pmm-data volumes on a dedicated volume
● Adjust total memory and, if possible, Prometheus memory on shared hosts
● Adjust metrics and queries retention
● Adjust metrics resolution if needed
● Enable authentication
● Backups
● Capacity planning

13.Production notes: pmm-data
    # docker create \
        -v /pmmdata/prometheus/data:/opt/prometheus/data \
        -v /pmmdata/consul-data:/opt/consul-data \
        -v /pmmdata/mysql:/var/lib/mysql \
        -v /pmmdata/grafana:/var/lib/grafana \
        --name pmm-data \
        percona/pmm-server:latest /bin/true
There are gotchas:
● When using docker.io (i.e. CentOS < 7), bind mounts will hide any files that already exist in the image at the mount point.
● On CentOS, file privileges and owners set from within the container are visible from the host OS and may overlap with existing

14.Production notes: adjusting memory
    # docker run --memory=4G -e METRICS_MEMORY=209152 ...
● METRICS_MEMORY only works on Prometheus v1 (PMM < 1.13)
● Set it to 2/3 of the total memory assigned
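The 2/3 guideline can be turned into a concrete value with a small helper. This is a minimal sketch, assuming METRICS_MEMORY is expressed in kilobytes (as the PMM 1.x documentation describes it); `metrics_memory_kb` is a hypothetical helper name, not part of PMM.

```shell
#!/bin/sh
# Hypothetical helper: given the container memory limit in GB,
# print 2/3 of it in kilobytes (the assumed unit for METRICS_MEMORY).
metrics_memory_kb() {
    total_gb=$1
    total_kb=$((total_gb * 1024 * 1024))
    echo $((total_kb * 2 / 3))
}

# Example: print the docker run options for a 4 GB container.
# This only prints the options; it does not start anything.
echo "--memory=4G -e METRICS_MEMORY=$(metrics_memory_kb 4)"
```

The result is rounded down by integer division, which keeps the value just under the 2/3 mark.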

15.Production notes: adjusting retention
    # docker run -e METRICS_RETENTION=4400h -e QUERIES_RETENTION=4400h ...
● METRICS_RETENTION sets storage.local.retention for Prometheus 1.x (PMM < 1.13) or storage.tsdb.retention for Prometheus 2.x.
● QUERIES_RETENTION is used as the argument for the purge-qan-data script.
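Since the METRICS_RETENTION value in the example is written in hours, it can help to derive it from a number of days. A minimal sketch; `retention_hours` is a hypothetical helper, and it only covers METRICS_RETENTION.

```shell
#!/bin/sh
# Hypothetical helper: convert a retention period in days to the
# hour-based form used for METRICS_RETENTION in the slide (e.g. 4400h).
retention_hours() {
    days=$1
    echo "$((days * 24))h"
}

# Example: print the option for roughly six months (183 days).
echo "-e METRICS_RETENTION=$(retention_hours 183)"
```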

16.Production notes: metrics resolution
    # docker run -e METRICS_RESOLUTION=5 ...
● Can be used when network latency between PMM Server and the monitored hosts is above 1 second

17.Production notes: enabling authentication
    # docker run -e SERVER_USER=<username> -e SERVER_PASSWORD=<password> ...
    # pmm-admin config --server 127.0.0.1 --server-user <username> --server-password <password>
    OK, PMM server is alive.
    PMM Server      | 127.0.0.1 (password-protected)
    Client Name     | coltrane
    Client Address  | 172.17.0.1

18.Production notes: backups
● pmm-server container is ephemeral by design
● Cold backup is the official method
● pmm-data should be backed up while pmm-server is not running
https://www.percona.com/doc/percona-monitoring-and-management/deploy/server/docker.backing-up.html
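The cold backup described above can be sketched as a small script. This is an illustration under stated assumptions, not the official procedure: the container names and volume paths are the ones from the quick start slides, while `run`, `pmm_cold_backup`, and the `DRY_RUN` switch are hypothetical names introduced here.

```shell
#!/bin/sh
# Run a command, or only print it when DRY_RUN is set (hypothetical switch).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Cold backup sketch: stop pmm-server, copy each pmm-data volume into
# backup_dir on the host, then start pmm-server again even if a copy failed.
pmm_cold_backup() {
    backup_dir=$1
    status=0
    run docker stop pmm-server || return 1
    for vol in /opt/prometheus/data /opt/consul-data /var/lib/mysql /var/lib/grafana; do
        # Mirror the container path layout under the backup directory.
        dest="$backup_dir$(dirname "$vol")"
        run mkdir -p "$dest" && run docker cp "pmm-data:$vol" "$dest/" || status=1
    done
    run docker start pmm-server || status=1
    return $status
}
```

Setting `DRY_RUN=1` before calling `pmm_cold_backup /some/backup/dir` prints the commands without touching any container, which is a cheap way to review the plan first.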

19.Production notes: capacity planning
As of v1.13:
● At least 2 GB of memory for a production environment (up to 5 monitored target systems)
● From there, consider 8 target systems per GB of memory available for pmm-server
● CPU: ~50 monitored systems per core (or 100K metrics/sec per CPU core)
● ~1 GB of storage for each monitored database node (data retention = 1 week)
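These rules of thumb can be combined into a rough sizing estimate. The sketch below assumes one particular reading of the memory rule: 2 GB covers the first 5 targets, and each further group of up to 8 targets adds 1 GB. `pmm_capacity` is a hypothetical helper, and its output is a starting point for planning rather than a guarantee.

```shell
#!/bin/sh
# Hypothetical sizing helper based on the rules of thumb above.
# Assumed interpretation: 2 GB covers the first 5 targets, and each
# further group of up to 8 targets adds 1 GB.
pmm_capacity() {
    targets=$1
    if [ "$targets" -le 5 ]; then
        mem_gb=2
    else
        # Round (targets - 5) / 8 up to the next whole GB.
        mem_gb=$((2 + (targets - 5 + 7) / 8))
    fi
    # ~50 monitored systems per core, rounded up, with a minimum of 1 core.
    cores=$(( (targets + 49) / 50 ))
    if [ "$cores" -lt 1 ]; then
        cores=1
    fi
    # ~1 GB of storage per monitored node at one week of retention.
    storage_gb=$targets
    echo "memory_gb=$mem_gb cpu_cores=$cores storage_gb_1w=$storage_gb"
}

# Example: estimate for 40 monitored systems.
pmm_capacity 40
```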

20.Production notes: capacity planning
Monitoring PMM resources
● System Overview dashboard, choosing container host or container name
● If container name chosen, memory limits not reflected (full host memory will be displayed)
● Visible with docker stats pmm-server
● Keep an eye on:
    ● CPU Saturation and Max Core Usage
    ● Memory Utilization
    ● Disk IO Load
https://www.percona.com/blog/2018/09/28/scaling-percona-monitoring-management-pmm/

21.Beyond what's in the box

22.Beyond what's in the box: Application Instrumentation 1

23.Beyond what's in the box: Application Instrumentation 1
    $ cat /usr/local/percona/pmm-client/queries-mysqld.yml
    ---
    mysql_orders_placed_last_minute:
      query: "select count(id) as orders from pleu18.orders where created_at between now() - interval 1 minute and now()"
      metrics:
        - orders:
            usage: "GAUGE"
            description: "orders placed per minute"
Callouts: mysql_orders_placed_last_minute is the metric name; orders is the legend.

24.Beyond what's in the box: Application Instrumentation 2

25.Beyond what's in the box: Application Instrumentation 2
● Using Grok exporter https://github.com/fstab/grok_exporter
● Config:
    global:
      config_version: 2
    input:
      type: file
      path: ./magento_exception.log
      readall: true

26.Beyond what's in the box: Application Instrumentation 2
Config (continued):
    grok:
      patterns_dir: ./patterns
      additional_patterns:
        - 'MAGENTO_MESSAGE .*'
        - 'NUMBER [0-9]*'

27.Beyond what's in the box: Application Instrumentation 2
Config (continued):
    metrics:
      - type: counter
        name: magento_critical_errors
        help: Total number of main.CRITICAL entries in the magento log
        match: '.* main.CRITICAL: %{MAGENTO_MESSAGE:message}'
        labels:
          error_message: '{{.message}}'
    server:
      host: localhost
      port: 9144
Callout: tweak the MAGENTO_MESSAGE pattern to get more info.

28.Beyond what's in the box: Application Instrumentation 2
Start and add it:
    # ./grok_exporter --config ./config.yml
    # pmm-admin add external:service magento --service-port=9144 --path /metrics
Add the graph, using the metric name and the labels from config.yml.
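Before building the graph, it can be useful to confirm that the exporter is actually exposing the counter. The sketch below sums every magento_critical_errors sample from Prometheus text-format output read on stdin. `sum_magento_errors` is a hypothetical helper, and it assumes the sample value is the last field on each line (that is, no trailing timestamps).

```shell
#!/bin/sh
# Sum all magento_critical_errors samples read from stdin.
# Assumes Prometheus text format with the value as the last field;
# "# HELP" and "# TYPE" lines start with '#' and are skipped by the pattern.
sum_magento_errors() {
    awk '/^magento_critical_errors[{ ]/ { total += $NF } END { print total + 0 }'
}

# Usage against the exporter (host, port and path as in the slides):
#   curl -s http://localhost:9144/metrics | sum_magento_errors
```

A result of 0 means either that no main.CRITICAL lines have matched yet or that the metric name differs from the one in config.yml.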

29.Beyond what's in the box: Cassandra