- 快召唤伙伴们来围观吧
- 微博 QQ QQ空间 贴吧
- 文档嵌入链接
- 复制
- 微信扫一扫分享
- 已成功复制到剪贴板
【分会场三05-李钰 唐云】Flink中的两类新型状态存储
展开查看详情
1 . FlinkӾጱӷᔄෛࣳᇫாਂؙ Introduction of Two New StateBackends ݪلғᴨ᯾૬૬ ݪلғᴨ᯾૬૬ ᘳ֖ғṛᕆದӫਹ ᘳ֖ғṛᕆݎૡᑕ ᄍᦖᘏғ᰷ҁᕷᶮ҂ ᄍᦖᘏғࠈԯҁଗ҂ $OLEDED 6HQLRU([SHUW $OLEDED 6HQLRU6RIWZDUH(QJLQHHU $SDFKH+%DVH 30& $SDFKH)OLQN &RQWULEXWRU <X/LOL\X#DSDFKHRUJ <XQ7DQJFKDJDQW\#DOLEDEDLQFFRP
2 . य़ᕐ 2XWOLQH Ø )OLQNӾጱᇫாහഝٌ݊ਂࢧؙᶶ 5HYLHZRI)OLQN 6WDWHDQG6WDWH%DFNHQG Ø Ӟᐿӧ)XOO*&ጱٖਂᇫாਂؙᥴ٬ොໜ $+HDS.H\HG6WDWH%DFNHQG WKDW1HYHU)XOO*& Ø Ӟᐿᦇᓒਂړؙᐶጱᇫாਂؙᥴ٬ොໜ 'HFRXSOH6WRUDJHDQG &RPSXWHRI 6WDWH0DQDJHPHQWLQ6WUHDP&RPSXWLQJ
3 . )OLQNӾጱᇫாහഝ Review of Flink State Input Stream Output Stream Keyed vs. Operator Event Pattern ML Model State Historic Data Raw vs. Managed
4 . )OLQNጱᇫாਂؙ Review of Flink State Backends HeapKeyedStateBackend RocksDBKeyedStateBackend Ø State lives as Java objects on the heap Ø State lives as serialized bytes in offheap memory Ø Data is de/serialized only during state and on local disk snapshot and restore Ø Data is de/serialized on every read and update Ø Highest Performance Ø Lower Performance (~order of magnitude) Ø Memory overhead of representation Ø Relative low overhead of representation Ø State size limited by available heap memory Ø State size limited by available local disk space Ø Affected by GC overheads Ø Not affected by GC overheads Ø Currently no incremental checkpoints Ø Naturally supports incremental checkpoints
5 .ਁ֛ ӧ)XOO*&ጱᇫாਂؙᥴ٬ොໜ A HEAP KEYED STATEBACKEND THAT NEVER FULLGC
6 . ᭌೠᇫாਂؙ੫੩ጱᔿᕮ 7KH+HVLWDWLRQ:KHQ&KRRVLQJ%DFNHQGV ԅՋԍểܩአᇍڏҘ ጱᦡᦇአԭଫጱܧᴾ :K\XVLQJVOHGJHKDPPHUWRFUDFNDQXW" FRQVLGHUDWLRQIRUWKHFDVH
7 . ໐ஞದғٖਂᇫாፊഴ Key Tech: Heap Status Monitor Ø ྯӻ.*؉ٖਂᕹᦇҁ݇ᘍ/XQFHQH 5DP8VDJH(VWLPDWRU҂ 'RKHDSDFFRXQWLQJIRUHDFK.*VLPLODUWR/XQFHQHŇV 5DP8VDJH(VWLPDWRU Ø ਧ๗᭗ᬦ-DYD0;%HDQV឴ݐञٖਂֵአఘེ&*҅٭හ؊ᳵ ٖਂֵአఘ٭ፊഴ *HWKHDSXVHGJF FRXQWDQGSDXVHWLPHZLWKIL[HGLQWHUYDO Heap Status Monitor Ø ༄ັฎํވROGJFݎኞଚӬٖਂֵአ᩻ᬦᯈᗝᴀ꧊ &KHFNZKHWKHUKHDSXVHGH[FHHGVWKUHVKROGDIWHUROGJF Ø ༄ັฎํވJF؊ᳵ᩻ᬦᯈᗝᴀ꧊ &KHFNZKHWKHUJF SDXVHWLPHH[FHHGVWKUHVKROG
8 . ໐ஞದғහഝ៧ፏےᒽኼ Key Tech: Data Spill and Load Policy Ø ᴻٖਂᕹᦇक़҅ྯӻ.*؉ᦢᳯེ᷇ᕹᦇ 5HFRUGDQGFRPSXWHUHTXHVWUDWHIRUHDFK.* Ø ےݐଘ࣐ጱොୗᦇᓒӧݶ.*ጱ᯿ හഝ៧ፏےᒽኼ &RPSXWHWKHZHLJKWHGDYHUDJHRQWKHQRUPDOL]HG.*VL]HDQGUHTXHVWUDWH Data Spill and Load Policy Ø ៧ፏᭌೠᒽኼғսض៧ፏܛአٖਂ๋ग़҅᧗๋ጱ.* 6SLOOSROLF\FKRRVH.*ZLWKPRUHKHDSRFFXSDQF\DQGORZHUUHTXHVWUDWH Ø ےᭌೠᒽኼғսےض᧗๋᷇ᔺ҅ܛአٖਂ๋ጱ.* /RDGSROLF\FKRRVH.*ZLWKKLJKHUUHTXHVWUDWHDQGOHVVKHDSRFFXSDQF\
9 . ໐ஞದғ៧ፏහഝਂؙᕮ݊ीᰁFKHFNSRLQWඪ೮ Key Tech: Data Structure on Disk and Incremental Checkpoint Ø ֵአӞᐿᔲٻጱ᪡ᤒᕮ $FRPSDFWHG6NLS/LVW VWUXFWXUHIRUGDWDRQGLVN ᏺፏහഝਂؙᕮ Ø ᭗ᬦGHOWDFKDLQጱොୗඪ೮ٟ॔҅ګعፗളๅෛ 'DWD6WUXFWXUHRQ'LVN 8VHGHOWDFKDLQLQVWHDGRIXSGDWHLQSODFHWRVXSSRUWFRS\RQZULWH Ø ݇ᘍᦞғ1LWUR IURP9/'% 5HIHUULQJWRWKH1LWUR SDSHUIURP9/'% Ø ीےVQDSVKRWᇇଧݩ $GGDQGUHFRUGVQDSVKRWYHUVLRQ ीᰁFKHFNSRLQWඪ೮ Ø ࣁහഝӾीےᇇֵ҅௳מአGHOWDFKDLQጱොୗכኸVQDSVKRWӾጱහഝᇇ ,QFUHPHQWDO&KHFNSRLQW 6DYHYHUVLRQLQYDOXHDQGUHVHUYHGDWDLQVQDSVKRWWKURXJKGHOWDFKDLQ Ø ٟفளᆙ෯ᇇᬰᤈ༄ັႴቘ &KHFNGDWDLQGHOWDFKDLQIRUFOHDQXSGXULQJSXWRUVQDSVKRW
10 . ᚆྲრᦇښ Performance and Upstreaming Plan Ø :RUGFRXQWMREVWDWHVL]H 0%7HVWHGZLWK3&,H66' QPS (K/s) Note ᚆྲ Heap (all in memory) 400 -Xmx=10GB 3HUIRUPDQFH RocksDB (all in memory) 100 Cache=10GB Heap (memory + disk) 160 -Xmx=3GB (spill ratio=57%) RocksDB (memory + disk) 100 Cache=3GB Ø ੴᴴғFKHFNSRLQW๗ᳵ&38ၾᘙṛ რᦇښ 0RUH&38FRVWGXULQJFKHFNSRLQWEHFDXVHRIVFDQ 8SVWUHDPLQJ 3ODQ Ø ਖ਼ࣁᬪ๗ᨯሠࢧᐒ܄ :LOOVWDUWWKHXSVWUHDPLQJ VRRQ
11 .ਁ֛ Ӟᐿᦇᓒਂړؙᐶጱᇫாਂؙᥴ٬ොໜ A SLOUTION TO DECOUPLE STORAGE AND COMPUTE OF STATE MANAGERMENT IN STREAM COMPUTING
12 . %DFNJURXQG ਂᦇؙᓒᘠݳጱຝ The Compute-storage Coupled Arch Data locality was important when network is bottleneck, but things have already changed Year 2003 2018 Network 100Mbps 10Gbps+
13 . %DFNJURXQG ԯᦇᓒդ҅ਂᦇؙᓒᘠݳ૪ᕪӧٚᭇଫ୨ᦇᓒጱᵱ Decoupling storage resource is necessary for elastic computing Ø ਂᦇؙᓒړᐶጱ۠ Trend to Decouple Decoupling makes it easier for hardware budget Compute and Storage Ø Decoupling makes better resource utilization Ø Decoupling makes better power saving
14 . %DFNJURXQG Take RocksDB StateBackend for example ፓ)ڹOLQNጱၞᦇᓒ 6WDWHᓕቘګ &XUUHQW6WDWH 0DQDJHPHQW ,Q $SDFKH )OLQN
15 . 2YHUYLHZ Take Niagara* StateBackend for example ᦇᓒਂړؙᐶጱ 6WDWHᓕቘګ Storage-Compute Decoupled Arch ,- ,
16 .,PSOHPHQWDWLRQ Ø Checkpoint created and restored more smoothly in seconds for DFS ਂᦇؙᓒړ ᐶጱսᅩ Advantages
17 . ,PSOHPHQWDWLRQ Ø BFO Checkpoint created and restored more smoothly in seconds for DFS D a D ਂᦇؙᓒړ I c ᐶጱսᅩ Restore backend in seconds when rescaling, avoid long consumed time of IO and DB scan in current Flink arch Advantages
18 . ,PSOHPHQWDWLRQ Ø Checkpoint created and restored more smoothly in seconds for DFS ਂᦇؙᓒړ ᐶጱսᅩ Advantages
19 . ,PSOHPHQWDWLRQ Ø m it lf nru Avoid write latency for decoupled storage arch DO - - fN if e dc A l s ਂᦇؙᓒړᐶጱੴᴴ n b ga - S eI pF Constraints and Solution n o f P v By means of async-arch of Niagara on Pangu*, data would be flushed to DFS asynchronously - , * * * ** *, *
20 . ,PSOHPHQWDWLRQ Ø ha I n Le i lo p Avoid read latency for decoupled storage arch B DAbd Lem i M f kB c Seg ਂᦇؙᓒړ By means of DADI*, a local fast second-level cache for LSM-like DB, we could guarantee the ᐶጱੴᴴ read latency nearly to traditional arch. Constraints and Solution Integrated Decoupled ** . . * * * -. - * * * * * -* . * * . .
21 .,PSOHPHQWDWLRQ Ø F Clean useless directories/files on DFS eagerly ਂᦇؙᓒړ ᐶጱੴᴴ F D M Constraints J and Solution Introduce checkpoint trash cleaner mechanism in JobManager
22 .We are recruiting ஙמ ᰓᰓ WeChat DingDing
23 .THANKS