- 快召唤伙伴们来围观吧
- 微博 QQ QQ空间 贴吧
- 文档嵌入链接
- 复制
- 微信扫一扫分享
- 已成功复制到剪贴板
Troubleshooting Crashing Issues
这个网络研讨会是为每个MySQL用户准备的!在本文中,我不会重点讨论如何分析核心文件、读取源代码和设置断点。相反,我将关注任何人都可以使用的技术,即使是新手。
许多教程,包括我自己的,基于RoelvandePaar的视频编写的,建议如何创建和分析崩溃时创建的核心文件。虽然这是最强大的故障排除技术之一,但并不是唯一的技术。对于任何级别的MySQL用户,都有更简单和可访问的方法。
您将了解MySQL服务器崩溃的原因、如何识别崩溃的来源以及如何在将来防止崩溃。
“与MySQL服务器的连接断开”和“服务器已断开”是应用程序在MySQL服务器崩溃时看到的消息。
在本次网络研讨会中,我将讨论:
-先检查什么
-如何理解错误消息
-车祸的典型原因
-如何在不接触mysql源代码的情况下找到崩溃的原因
-如何读取回溯
我还将介绍这些技术的局限性,例如它们没有揭示什么,以及它们中最复杂的部分是什么。
展开查看详情
1 .Troubleshooting Crashing Issues January, 25, 2018 Sveta Smirnova
2 .Table of Contents •Is this Really Crash? •What to Check? •Typical Reasons for Crashes •How to Get and Read Backtraces •What to Do Next 2
3 .Is this Really Crash?
4 .Two main symptoms • 2013: Lost connection to MySQL server... • Got an error reading communication packets 4
5 .They can happen, because • Connection error • Timeout • Server hangs • Normal shutdown • Crash 5
6 .What to Check?
7 .Was Server Restarted? • Uptime mysql> show global status like ’Uptime’; +---------------+-------+ | Variable_name | Value | +---------------+-------+ | Uptime | 8690 | +---------------+-------+ 1 row in set (0.00 sec) 7
8 .Was Server Restarted? • Uptime mysql> show global status like ’Uptime’; +---------------+-------+ | Variable_name | Value | +---------------+-------+ | Uptime | 8690 | +---------------+-------+ 1 row in set (0.00 sec) • How much time passed after planned restart? 7
9 .Was Server Restarted? • Uptime mysql> show global status like ’Uptime’; +---------------+-------+ | Variable_name | Value | +---------------+-------+ | Uptime | 8690 | +---------------+-------+ 1 row in set (0.00 sec) • How much time passed after planned restart? • Not reliable 7
10 .Was Server Restarted? • Uptime • Error log file • Normal shutdown 2018-01-22T17:40:29.557089Z 0 [Note] Giving 1 client threads a chance to die gracefully ... 2018-01-22T17:40:31.572000Z 0 [Note] InnoDB: Starting shutdown... ... 2018-01-22T17:40:33.313018Z 0 [Note] /home/sveta.smirnova/5.7.14/bin/mysqld: Shutdown complete 7
11 .Was Server Restarted? • Uptime • Error log file • Crash 2018-01-18T18:47:46.749649Z 0 [ERROR] [FATAL] InnoDB: Semaphore wait has lasted > 600 seconds. We intentionally crash the server because it appears to be hung. 2018-01-18 13:47:46 0x7eff727ff700 InnoDB: Assertion failure in thread 139635602749184 in file ut0ut.cc line 917 InnoDB: We intentionally generate a memory trap. ... Version: ’5.7.14-8’ socket: ’/tmp/mysql_sandbox5714.sock’ port: 5714 Percona Server (GPL), Release 8, Revision 1f84ccd ... 7
12 .Communication Errors and Timeouts • Aborted % mysql> show status like ’Aborted%’; +------------------+-------+ | Variable_name | Value | +------------------+-------+ | Aborted_clients | 541 | | Aborted_connects | 683 | +------------------+-------+ 2 rows in set (0.00 sec) 8
13 .Communication Errors and Timeouts • Aborted % • log warnings > 1 2018-01-22T17:51:58.542771Z 3 [Note] Aborted connection 3 to db: ’unconnected’ user: ’root’ host: ’localhost’ (Got an error reading communication packets) 8
14 .Communication Errors and Timeouts • Aborted % • log warnings > 1 • Processlist mysql> show processlist; +----+------+-----------------+------+---------+------+----------+------------------+ | Id | User | Host | db | Command | Time | State | Info | +----+------+-----------------+------+---------+------+----------+------------------+ | 4 | root | localhost:38620 | NULL | Query | 0 | starting | show processlist | +----+------+-----------------+------+---------+------+----------+------------------+ 1 row in set (0.00 sec) 8
15 .Communication Errors and Timeouts • Aborted % • log warnings > 1 • Processlist mysql> show processlist; +----+------+-----------------+------+---------+------+----------+------------------+ | Id | User | Host | db | Command | Time | State | Info | +----+------+-----------------+------+---------+------+----------+------------------+ | 4 | root | localhost:38620 | NULL | Query | 0 | starting | show processlist | +----+------+-----------------+------+---------+------+----------+------------------+ 1 row in set (0.00 sec) • Watch if connections still open 8
16 .Server Hangs • Processlist ... | 732 | msa ... | 8645 | statistics | SELECT c FROM sbtest1 WHERE id=5028787 | | 733 | msa... | 8645 | statistics | SELECT c FROM sbtest1 WHERE id=3023457 | | 734 | msa... | 8645 | statistics | SELECT c FROM sbtest1 WHERE id=4980329 | | 735 | msa... | 8645 | statistics | SELECT c FROM sbtest1 WHERE id=4995094 | | 736 | msa... | 8645 | statistics | SELECT c FROM sbtest1 WHERE id=5263263 | | 739 | msa... | 0 | starting | show processlist | +-----+----------+------+------------------------+----------------------------------------+ 735 rows in set (0.00 sec) 9
17 .Server Hangs • Processlist • Error log 2018-01-19T20:38:43.381127Z 0 [Warning] InnoDB: A long semaphore wait: --Thread 139970010412800 has waited at ibuf0ibuf.cc line 3454 for 321.00 seconds the semaphore: S-lock on RW-latch at 0x7f4dde2ea310 created in file buf0buf.cc line 1453 a writer (thread id 139965530261248) has reserved it in mode exclusive number of readers 0, waiters flag 1, lock_word: fffffffff0000000 Last time read locked in file ibuf0ibuf.cc line 3454 Last time write locked in file /mnt/workspace/percona-server-5.7-binaries-release/label_exp/ debian-wheezy-x64/percona-server-5.7.14-8/storage/innobase/btr/btr0btr.cc line 177 2018-01-19T20:38:43.381143Z 0 [Warning] InnoDB: A long semaphore wait: --Thread 139965135804160 has waited at buf0buf.cc line 4196 for 321.00 seconds the semaphore: S-lock on RW-latch at 0x7f4f257d33c0 created in file hash0hash.cc line 353 a writer (thread id 139965345621760) has reserved it in mode exclusive number of readers 0, waiters flag 1, lock_word: 0 Last time read locked in file buf0buf.cc line 4196 9 Last time write locked in file ...
18 .Crash • Error Log Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... stack_bottom = 7fdd70bf1df0 thread_stack 0x800000 /home/sveta.smirnova/5.7.14/bin/mysqld(my_print_stacktrace+0x2e)[0xfeb29e] /home/sveta.smirnova/5.7.14/bin/mysqld(handle_fatal_signal+0x4a1)[0x942931] /lib/x86_64-linux-gnu/libpthread.so.0(+0x10330)[0x7fe00f5ee330] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37)[0x7fe00d775c37] /lib/x86_64-linux-gnu/libc.so.6(abort+0x148)[0x7fe00d779028] /home/sveta.smirnova/5.7.14/bin/mysqld[0x9130f3] /home/sveta.smirnova/5.7.14/bin/mysqld(_Z26btr_pcur_move_to_next_pageP10btr_pcur_tP5mtr_t+0x1ff)[0 /home/sveta.smirnova/5.7.14/bin/mysqld[0x1187413] /home/sveta.smirnova/5.7.14/bin/mysqld(_Z15row_search_mvccPh15page_cur_mode_tP14row_prebuilt_tmm+0 /home/sveta.smirnova/5.7.14/bin/mysqld(_ZN11ha_innobase13general_fetchEPhjj+0x1d2)[0x10933d2] /home/sveta.smirnova/5.7.14/bin/mysqld(_ZN7handler13ha_index_nextEPh+0xe1)[0x99be61] 10
19 .Crash • Error Log /home/sveta.smirnova/5.7.14/bin/mysqld(_ZN7handler15read_range_nextEv+0x29)[0x9a0349] /home/sveta.smirnova/5.7.14/bin/mysqld(_ZN7handler21multi_range_read_nextEPPc+0xd2)[0x997762] /home/sveta.smirnova/5.7.14/bin/mysqld(_ZN18QUICK_RANGE_SELECT8get_nextEv+0x52)[0xf0cee2] /home/sveta.smirnova/5.7.14/bin/mysqld[0xd6b3f6] /home/sveta.smirnova/5.7.14/bin/mysqld(_Z10sub_selectP4JOINP7QEP_TABb+0x276)[0xdd0c06] /home/sveta.smirnova/5.7.14/bin/mysqld(_ZN4JOIN4execEv+0x26f)[0xdcac4f] /home/sveta.smirnova/5.7.14/bin/mysqld(_Z12handle_queryP3THDP3LEXP12Query_resultyy+0x18d)[0xe2b75d /home/sveta.smirnova/5.7.14/bin/mysqld[0x909afa] /home/sveta.smirnova/5.7.14/bin/mysqld(_Z21mysql_execute_commandP3THDb+0x31a4)[0xdf62d4] /home/sveta.smirnova/5.7.14/bin/mysqld(_Z11mysql_parseP3THDP12Parser_state+0x5d5)[0xdf8a25] /home/sveta.smirnova/5.7.14/bin/mysqld(_Z16dispatch_commandP3THDPK8COM_DATA19enum_server_command+0 /home/sveta.smirnova/5.7.14/bin/mysqld(_Z10do_commandP3THD+0x1cd)[0xdfa40d] /home/sveta.smirnova/5.7.14/bin/mysqld(handle_connection+0x2a4)[0xeaf8a4] /home/sveta.smirnova/5.7.14/bin/mysqld(pfs_spawn_thread+0x1b4)[0x101a3a4] /lib/x86_64-linux-gnu/libpthread.so.0(+0x8184)[0x7fe00f5e6184] /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fe00d83cffd] 10
20 .Crash • Error Log Trying to get some variables. Some pointers may be invalid and cause the dump to abort. Query (7fde2ba52030): SELECT DISTINCT c FROM sbtest1 WHERE id BETWEEN 5008118 AND 5008118+99 ORDER BY c Connection ID (thread ID): 234 10
21 .Crash • Error Log • General query log $ tail var/mysqld.1/mysqld.log 2018-01-22T21:23:21.493490Z 4 Query insert into big1 select * from a 2018-01-22T21:23:47.847059Z 4 Query insert into big2 select * from a 2018-01-22T21:24:22.695518Z 4 Query set global log_output=’file’ 2018-01-22T21:24:27.741938Z 4 Query set global general_log=0 /home/sveta/build/ps-5.7/bin/mysqld, Version: 5.7.20-19-debug-log. started with: Tcp port: 13001 Unix socket: /home/sveta/build/ps-5.7/mysql-test/var/tmp/mysqld.1.sock Time Id Command Argument 2018-01-22T21:24:38.289933Z 4 Query set session optimizer_switch=’duplicateweedout=off, materialization=on’ 2018-01-22T21:24:49.757008Z 4 Query select * from big1 where id in (select id from big2 ignore index (primary)) limit 10 10
22 .Crash • Error Log • General query log • Audit log $ tail -n 14 var/mysqld.1/data/audit.log <AUDIT_RECORD NAME="Query" RECORD="7727_2018-01-22T21:33:54" TIMESTAMP="2018-01-22T21:34:26 UTC" ... SQLTEXT="set session optimizer_switch=’duplicateweedout=off,materialization=on’" USER="root[root] @ localhost [127.0.0.1]" HOST="localhost" OS_USER="" IP="127.0.0.1" DB="test" 10 />
23 .Crash • Error Log • General query log • Audit log • May not include crashing statement! 10
24 .Crash • Error Log • General query log • Audit log • Your application logs 10
25 .Crash • Error Log • General query log • Audit log • Your application logs • System logs • Redirect from error and audit logs 10
26 .Typical Reasons for Crashes
27 .Major Groups • Data corruption • Hardware access failures • Hardware broken • Hardware not available No disk space Out of memory OS killed mysqld • Software bug 12
28 .Data Corruption • Always reported in the error log file 13
29 .Data Corruption • Always reported in the error log file InnoDB: but the tablespace does not exist or is just being dropped. 2017-03-27 20:04:36 7fd1c4d3e700 InnoDB: Error: trying to access tablespace 1731 page no. 3, InnoDB: but the tablespace does not exist or is just being dropped. InnoDB: Error: Unable to read tablespace 1731 page no 3 into the buffer pool after 100 attempts InnoDB: The most probable cause of this error may be that the table has been corrupted. InnoDB: You can try to fix this problem by using innodb_force_recovery. InnoDB: Please see reference manual for more details. InnoDB: Aborting... 2017-03-27 20:04:36 7fd1c4d3e700 InnoDB: Assertion failure in thread 140538927113984 in file buf0buf.cc line 2740 13