- 快召唤伙伴们来围观吧
- 微博 QQ QQ空间 贴吧
- 文档嵌入链接
- 复制
- 微信扫一扫分享
- 已成功复制到剪贴板
利用人工智能构建自动优化查询器
展开查看详情
1 .Using&AI&to&Build&a&Self3 Driving&Query&Optimizer Shivnath Babu,,Adrian,Popescu #AI7SAIS
2 .Meet$the$speakers Shivnath Babu Adrian$Popescu • Cofounder*and*CTO*at*Unravel,*Adjunct* • Data*engineer*at*Unravel Professor*at*Duke*University • PhD*from*EPFL,*Switzerland • Focusing*on*ease>of>use*and*manageability* • 8+*years*of*experience*in*performance* of*data>intensive*systems monitoring*&*modeling*of*data*management* • Recipient*of*US*National*Science* systems Foundation*CAREER*Award,*three*IBM* • Focusing*on*tuning*and*optimization*of*Big* Faculty*Awards,*HP*Labs*Innovation* Data*apps Research*Award #AI7SAIS 2
3 .My#app#often#fails#with#Out#of# Memory… DATA#SCIENTIST #AI7SAIS 3
4 .How$can$I$make$it$more$reliable? DATA$SCIENTIST #AI7SAIS 4
5 .My#app#is#too#slow… DATA#ENGINEER #AI7SAIS 5
6 .I"need"to"make"it"faster… DATA"ENGINEER #AI7SAIS 6
7 .My#app#is#missing#SLA… DATA#PIPELINE#OWNER #AI7SAIS 7
8 .How$can$I$tune$my$app$to$guarantee$ SLAs? DATA$PIPELINE$OWNER #AI7SAIS 8
9 .This%rogue%app%is%wasting%resources% and%reducing%cluster%throughput OPERATIONS%TEAMS #AI7SAIS 9
10 .Can$this$app$use$less$resources$ while$finishing$on$time? OPERATIONS$TEAMS #AI7SAIS 10
11 .Current'approach:'Find'the'needle'in' the'hay'STACKS 1. Find"the"app."Review"Spark/YARN"UI"to"find"the"app 1"System 2. Review"metrics"on"web"UI 1"User 3. Review"stages"associated"with"the"app 1"App 4. Identify"executors"associated"with"the"stages" and"outliers 5. Deep"dive"into"“outlier”"stage 6. Identify"problematic"“executors” 7. Review"and"debug"container"logs 8. Rinse"&"repeat"across"other"executor/container"logs" to"identify"the"problem #AI7SAIS 11
12 .Now$imagine$the$problem$at$scale 10x 1$System Systems 100x 1$User Users 1000x 1 App Apps #AI7SAIS 12
13 .Imagine(a(new(world INPUTS 1. App(=(Spark(Query 2. Goal(=(Speedup “I#need#to#make#this#app#faster” #AI7SAIS
14 .Imagine(a(new(world APP(DURATION 90%( faster! TIME In#blink#of#an#eye,#user# As#user#finishes# User#comes#back#from# gets#recommendations#to# checking#email,#she# lunch.#A#verified#run#that# make#the#app#30%(faster has#a#verified#run# is#90%(faster that#is#60%(faster #AI7SAIS 14
15 . At#Unravel,#we#want#to#bring# the#new$world$to#you#today Introducing$Unravel$Sessions #AI7SAIS 15
16 .Demo #AI7SAIS 16
17 .Unravel(Sessions App SQL Program DAG Workload Goal Unravel( Efficiency Sessions( Reliability will(quickly( get(you( Speedup TIME there! SLA #AI7SAIS 17
18 .Demo #AI7SAIS 18
19 .How$Unravel$Sessions$work AI for$ Application$Performance$Management #AI7SAIS 19
20 .Unravel-Sessions-Architecture Recommendation Monitoring Algorithm Data App,Goal Probe-Algorithm Historic-Data & Xnext Probe-Data Orchestrator Cluster-Services-On=premises-and-Cloud #AI7SAIS 20
21 .spark.driver.cores 2 spark.executor.cores 10 PERFORMANCE … spark.sql.shuffle.partitions 300 spark.sql.autoBroadcastJoinThres 20MB hold … SKEW('orders',@'o_custId') true spark.catalog.cacheTable(“orders") true … We-represent-the-setting-as- X vector-X #AI7SAIS 21
22 .Given:&App&+&Goal PERFORMANCE • Goal:&Find&the&setting&of& X&that&best&meets&the&goal • Challenge:&Response& surface&y&=&ƒ(X)$is& unknown X #AI7SAIS 22
23 .Reinforcement'Learning Response'Surface'Methodology' #AI7SAIS 23 Credit:(http://people.csail.mit.edu/hongzi/
24 .Challenge:/Response/surface/ y/=/ƒ(X)/is/unknown Model&the&response&surface&as PERFORMANCE !t ! yˆ ( X ) = f ( X ) ! +Z ( X ) Here: !t ! f ( X )! is&a®ression&model Z(X ) is&the&residual&captured&as&a& Gaussian/Process The&Gaussian&Process&model&captures&the& uncertainty in&our¤t&knowledge&of&the& response&surface X #AI7SAIS 24
25 . Opportunity We#can#now#estimate#the#expected'improvement'EIP(X) from# doing#a'probe'at#any#setting#X PERFORMANCE p= y( X * ) EIP( X )= ! ( y( X ) " p ) pdf yˆ ( X ) ( p )dp * p = "# Improvement#at#any# Probability#density# setting#X#over#the#best# function#(uncertainty# performance#seen#so#far estimate) Gaussian#Process#model#helps#estimate#EIP(X) X #AI7SAIS 25
26 . Bootstrap 1 Get$initial$set$of$ monitoring$data$from$ history$or$via$ probes:$<X1,y1>,$ PERFORMANCE <X2,y2>,$…,$<Xn,yn>$ Probe-Algorithm 2 Select$next%probe Xnext based$on$all$ Until-the-- history$and$probe$data$ stopping- condition- available$so$far$to$ is- calculate$the$setting$ reached with$maximum%expected%% improvement%EIP(X) X #AI7SAIS 26
27 . Performance 8 6 y EIP(X) 4 2 ! ! ! 0 4 6 8 10 12 X x1 Xnext:"Do"next" This%approach% probe"here balances% Exploration%Vs. Exploration Exploitation Exploitation #AI7SAIS 27
28 . Data(Starved &(High(Uncertainty App,Goal Probe(Algorithm Xnext Data(Rich &(Low( Uncertainty #AI7SAIS 28 Credit:(https://discovery.rsm.nl/articles/detail/1309how9to9balance9exploration9and9exploitation9in9multinational9enterprises
29 .Unravel0Sessions0Architecture Recommendation Monitoring Algorithm Data App,Goal Probe-Algorithm Historic-Data & Xnext Probe-Data Orchestrator Cluster-Services-On=premises-and-Cloud #AI7SAIS 29