Advanced Disk Activity Tracking Tool-iotrace
1 .ioTrace: Another Disk Activity Tracing Tool. Ahao Mu (ahao.mah@alibaba-inc.com), June 26, 2018
2 .Background • Requirement from Alibaba's business lines: a process-centric view of disk activity. • Existing tools cannot meet this requirement.
3 .Pain • When disk bandwidth is saturated, the PID/TID responsible is unknown. • This makes it hard to narrow down the problematic processes/threads.
4 .Disk IO Toolset • iotop – Written in Python; reads /proc/<pid>/io and /proc/diskstats. – Lacks the DEVICE dimension. • iostat – Written in C; reads /proc/diskstats (see Documentation/iostats.txt). – Has no per-process information. • blktrace – Written in C; produces voluminous, noisy output. – Imposes heavy performance overhead. None of these is ideal for our production environment.
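To illustrate why a /proc/diskstats reader such as iostat cannot attribute IO to a process, here is a minimal sketch that parses one diskstats line. The field positions follow the kernel's iostats documentation; the function name, the `DiskCounters` type, and the sample numbers are made up for illustration. Note that the only key available is the device name.

```python
from typing import Dict, NamedTuple

SECTOR_BYTES = 512  # /proc/diskstats reports sectors in 512-byte units


class DiskCounters(NamedTuple):
    reads_completed: int
    read_bytes: int
    writes_completed: int
    write_bytes: int


def parse_diskstats(text: str) -> Dict[str, DiskCounters]:
    """Parse /proc/diskstats-style text into per-device counters.

    Hypothetical helper: the result is keyed by device name only,
    because the file carries no PID/TID information at all.
    """
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:  # skip blank or truncated lines
            continue
        stats[fields[2]] = DiskCounters(
            reads_completed=int(fields[3]),
            read_bytes=int(fields[5]) * SECTOR_BYTES,
            writes_completed=int(fields[7]),
            write_bytes=int(fields[9]) * SECTOR_BYTES,
        )
    return stats


# Invented sample line (major, minor, name, then the counters).
SAMPLE = "   8       0 sda 1200 30 96000 450 800 20 64000 900 0 700 1350\n"

if __name__ == "__main__":
    # On a Linux host, pass open("/proc/diskstats").read() instead of SAMPLE.
    print(parse_diskstats(SAMPLE))
```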
5 .Goal of ioTrace • Aware of the PID/TID and DEVICE dimensions. • Debug and monitor disk activity. • Lightweight, agile, and easy to run as a daemon in a production environment.
6 .IO Stack
7 .Techniques of ioTrace • Works on top of the generic block layer. • Based on the kernel blktrace API. • Built on kernel tracepoints.
8 .The API provided by the kernel

The statistics that ioTrace collects and manipulates:

    struct blk_io_trace {
        __u32 magic;      /* MAGIC << 8 | version */
        __u32 sequence;   /* event number */
        __u64 time;       /* in nanoseconds */
        __u64 sector;     /* disk offset */
        __u32 bytes;      /* transfer length */
        __u32 action;     /* what happened */
        __u32 pid;        /* who did it */
        __u32 device;     /* device identifier (dev_t) */
        __u32 cpu;        /* on what cpu did it happen */
        __u16 error;      /* completion error */
        __u16 pdu_len;    /* length of data after this trace */
    };

The stages of IO requests are represented by:

    enum {
        BLK_TC_READ     = 1 << 0,   /* reads */
        BLK_TC_WRITE    = 1 << 1,   /* writes */
        BLK_TC_FLUSH    = 1 << 2,   /* flush */
        BLK_TC_SYNC     = 1 << 3,   /* sync */
        BLK_TC_QUEUE    = 1 << 4,   /* queueing/merging */
        BLK_TC_REQUEUE  = 1 << 5,   /* requeueing */
        BLK_TC_ISSUE    = 1 << 6,   /* issue */
        BLK_TC_COMPLETE = 1 << 7,   /* completions */
        BLK_TC_FS       = 1 << 8,   /* fs requests */
        BLK_TC_PC       = 1 << 9,   /* pc requests */
        BLK_TC_NOTIFY   = 1 << 10,  /* special message */
        BLK_TC_AHEAD    = 1 << 11,  /* readahead */
        BLK_TC_META     = 1 << 12,  /* metadata */
        BLK_TC_DISCARD  = 1 << 13,  /* discard requests */
        BLK_TC_DRV_DATA = 1 << 14,  /* binary driver data */
        BLK_TC_FUA      = 1 << 15,  /* fua requests */
        BLK_TC_END      = 1 << 15,  /* we've run out of bits! */
    };
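As a rough illustration of how a user-space consumer might decode one such record, the sketch below unpacks the struct blk_io_trace fields with Python's struct module and maps the category bits to names. It assumes little-endian, unpadded 48-byte records, and it takes the magic value (0x65617400) and the 16-bit category shift from the kernel's blktrace_api.h header; neither constant appears on the slide. The helper names and the sample record are hypothetical, and this is not ioTrace's own code.

```python
import struct
from collections import namedtuple

# Assumption: little-endian layout without padding, in the field order shown above.
BLK_IO_TRACE_FMT = "<IIQQIIIIIHH"
BLK_IO_TRACE_SIZE = struct.calcsize(BLK_IO_TRACE_FMT)

# Constants as defined in the kernel's blktrace_api.h (not shown on the slide).
BLK_IO_TRACE_MAGIC = 0x65617400
BLK_TC_SHIFT = 16  # category bits sit in the upper 16 bits of `action`

# Index i corresponds to the bit 1 << i of the enum above.
BLK_TC_NAMES = [
    "READ", "WRITE", "FLUSH", "SYNC", "QUEUE", "REQUEUE", "ISSUE", "COMPLETE",
    "FS", "PC", "NOTIFY", "AHEAD", "META", "DISCARD", "DRV_DATA", "FUA",
]

BlkIoTrace = namedtuple(
    "BlkIoTrace",
    "magic sequence time sector bytes action pid device cpu error pdu_len",
)


def parse_record(buf: bytes, offset: int = 0) -> BlkIoTrace:
    """Unpack one trace record and check its magic (low byte is the version)."""
    rec = BlkIoTrace._make(struct.unpack_from(BLK_IO_TRACE_FMT, buf, offset))
    if (rec.magic & 0xFFFFFF00) != BLK_IO_TRACE_MAGIC:
        raise ValueError("not a blk_io_trace record")
    return rec


def categories(action: int) -> list:
    """Return the names of the BLK_TC_* bits set in an action word."""
    bits = action >> BLK_TC_SHIFT
    return [name for i, name in enumerate(BLK_TC_NAMES) if bits & (1 << i)]


# Invented sample: a 4 KiB write completion attributed to PID 125872.
SAMPLE_ACTION = ((1 << 1) | (1 << 7)) << BLK_TC_SHIFT
SAMPLE_RECORD = struct.pack(
    BLK_IO_TRACE_FMT,
    BLK_IO_TRACE_MAGIC | 0x07, 1, 1000, 2048, 4096,
    SAMPLE_ACTION, 125872, 8 << 20, 0, 0, 0,
)

if __name__ == "__main__":
    rec = parse_record(SAMPLE_RECORD)
    print(rec.pid, rec.bytes, categories(rec.action))
```

Because each record carries both `pid` and `device`, a consumer can aggregate along exactly the two dimensions the earlier slides ask for.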
9 .The design of iotrace Key objects and components: 1. CPU List 2. Disk group 3. Epoll 4. Collect thread 5. Analyzer thread 6. Hash table record 7. Ranking logic
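The slide only names the components, so the following is a guess at how they could fit together rather than ioTrace's actual design: a collect thread feeds events into a queue, an analyzer thread folds them into a hash table keyed by (PID, device), and a ranking step picks the top entries. The epoll-driven reading of per-CPU trace buffers is replaced here by a hard-coded event list; all names and sample values are hypothetical.

```python
import heapq
import queue
import threading
from collections import defaultdict

_STOP = None  # sentinel: tells the analyzer that collection has finished


def collect(sample_events, events):
    """Collect thread. A real tool would read trace buffers when epoll reports them ready."""
    for ev in sample_events:
        events.put(ev)
    events.put(_STOP)


def analyze(events, table):
    """Analyzer thread: accumulate bytes per (pid, device) in the hash table."""
    while True:
        ev = events.get()
        if ev is _STOP:
            break
        pid, device, nbytes = ev
        table[(pid, device)] += nbytes


def top_candidates(table, n=3):
    """Ranking logic: the n heaviest (pid, device) pairs by bytes transferred."""
    return heapq.nlargest(n, table.items(), key=lambda item: item[1])


def run_pipeline(sample_events, n=3):
    events = queue.Queue()
    table = defaultdict(int)
    threads = [
        threading.Thread(target=collect, args=(sample_events, events)),
        threading.Thread(target=analyze, args=(events, table)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return top_candidates(table, n)


# Invented events: (pid, device, bytes).
SAMPLE_EVENTS = [
    (100, "sda", 4096),
    (200, "sda", 8192),
    (100, "sda", 4096),
    (300, "sdb", 512),
    (200, "sda", 8192),
]

if __name__ == "__main__":
    for (pid, device), nbytes in run_pipeline(SAMPLE_EVENTS, n=2):
        print(pid, device, nbytes)
```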
10 .Functions of ioTrace • Supports the TID, PID and DEVICE dimensions. • Collects read_iops, write_iops, read_bytes, write_bytes, total_counts. • Supports immediate output to the console and delayed JSON output to a remote database. • Supports running as a daemon or as a cron job, with systemd. • Supports specifying the target DEVICE name to monitor.
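The slide lists the metric names but not their definitions or the JSON schema. The sketch below shows one plausible reading, assuming the IOPS figures are request counts divided by the sampling interval and total_counts is the sum of read and write requests. The record layout and function names are hypothetical.

```python
import json


def interval_record(pid, tid, device, counters, interval_s):
    """Build one output record for a sampling interval.

    Assumed definitions (not confirmed by the slides):
    *_iops = requests / interval, total_counts = reads + writes.
    """
    return {
        "pid": pid,
        "tid": tid,
        "device": device,
        "read_iops": counters["read_count"] / interval_s,
        "write_iops": counters["write_count"] / interval_s,
        "read_bytes": counters["read_bytes"],
        "write_bytes": counters["write_bytes"],
        "total_counts": counters["read_count"] + counters["write_count"],
    }


def to_json_line(record):
    """Serialize a record as one JSON line, e.g. for shipping to a remote store."""
    return json.dumps(record, sort_keys=True)


# Invented counters for a 2-second interval.
SAMPLE_COUNTERS = {
    "read_count": 20,
    "write_count": 10,
    "read_bytes": 81920,
    "write_bytes": 40960,
}

if __name__ == "__main__":
    print(to_json_line(interval_record(125872, 125872, "sda", SAMPLE_COUNTERS, 2)))
```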
11 .Usage Supports multiple arguments: target device, live output mode, daemon or cron running mode, ranking output.

#iotrace
Usage: iotrace [ -d <dev> | --dev=<dev> ]
               [ -m | --daemon ]
               [ -c | --cron ]
               [ -n <number> | --top_candidates=<pid top max>]
               [ -f <filename> | --file=<configure file> ]
               [ -v <version> | --version ]
               [ -l <live> | --live ]
               [ -i <interval> | --interval=<seconds> ]
               [ -p <thread> | --thread=<count> ]
  -d Used to specify device
  -m Used to specify daemonize running or not
  -c Used to specify cron running or not
  -n Used to specify top candidates, defaults is 3
  -l Used to specify show data live or not
  -p Used to specify multiple thread max count
  -i Used to specify interval(second)
  -f Path to iotrace configure file, defaults to /etc/iotrace/iotrace.conf

e.g.:
#./iotrace -d all -li1
#./iotrace -d /dev/sda,/dev/sdc -li1
#./iotrace -c
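To show how an invocation such as `-d /dev/sda,/dev/sdc -li1` breaks down, here is a hypothetical argparse mirror of the options listed above. It is not ioTrace's parser (the tool itself is not shown in the slides). Only the defaults for -n and -f come from the usage text; the other defaults are left as None because the slides do not state them, and treating -l as a plain flag is an assumption based on the examples.

```python
import argparse


def build_parser():
    """Hypothetical re-creation of the documented option set, for illustration only."""
    p = argparse.ArgumentParser(prog="iotrace-sketch")
    p.add_argument("-d", "--dev", default=None, help="device(s), comma separated, or 'all'")
    p.add_argument("-m", "--daemon", action="store_true", help="run as a daemon")
    p.add_argument("-c", "--cron", action="store_true", help="run in cron mode")
    p.add_argument("-n", "--top_candidates", type=int, default=3, help="top candidates")
    p.add_argument("-l", "--live", action="store_true", help="show data live")
    p.add_argument("-i", "--interval", type=int, default=None, help="interval in seconds")
    p.add_argument("-p", "--thread", type=int, default=None, help="max thread count")
    p.add_argument("-f", "--file", default="/etc/iotrace/iotrace.conf", help="config file")
    return p


def parse(argv):
    args = build_parser().parse_args(argv)
    # Split the comma-separated device list used in the examples above.
    args.devices = args.dev.split(",") if args.dev else []
    return args


if __name__ == "__main__":
    # "-li1" is the short-option cluster "-l -i 1".
    print(parse(["-d", "/dev/sda,/dev/sdc", "-li1"]))
```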
12 .Data Accuracy: ioTrace vs. iostat

Timestamp          Metric   ioTrace  iostat   Offset
20180529 13:11:03  r_bytes  2890KB   2737KB   +5.5%
20180529 13:11:04  r_bytes  13542KB  14052KB  -3.6%
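The slide does not say how the Offset column is computed. Assuming it is the difference relative to the iostat reading, the calculation can be sketched as follows; with the two rows above it yields values close to the reported offsets. The function name is made up.

```python
def relative_offset_pct(iotrace_kb, iostat_kb):
    """Deviation of the ioTrace reading from iostat, in percent.

    Assumption: iostat is the baseline (denominator).
    """
    return (iotrace_kb - iostat_kb) / iostat_kb * 100.0


if __name__ == "__main__":
    for iotrace_kb, iostat_kb in [(2890, 2737), (13542, 14052)]:
        print(f"{relative_offset_pct(iotrace_kb, iostat_kb):+.2f}%")
```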
13 .Case Output from ioTrace: Output from SAR: disk util 100% Conclusion: kworker is the bottleneck
14 .Case Output from ioTrace: Output from SAR: Conclusion: PID 125872 is suspicious
15 .Thanks & Questions