申请试用
HOT
登录
注册
 
构建大型可伸缩的机器学习算法框架
poppy
/
发布于
/
1792
人观看
在本次讲座中,我们将研究开发一种新的SCAP的过程。我们将从基础开始,考虑如何设计无监督学习技术的扩展并行实现。大部分的讨论将集中在需要了解的细节上,以便将算法设计转换为Spark上的高效并行实现。
展开查看详情

1 .Building parallel machine learning algorithms: scaling out and up William Benton
 willb@redhat.com • @willb

2 .Motivation

3 .Motivation

4 .Motivation

5 .Motivation

6 .Forecast Introducing our case study: self-organizing maps Parallel implementations for partitioned collections (in particular, RDDs) Beyond the RDD: data frames and ML pipelines Practical considerations and key takeaways

7 .Introducing our case study

8 .

9 .

10 .Training self-organizing maps

11 .Training self-organizing maps

12 .Training self-organizing maps

13 .Training self-organizing maps

14 .Training self-organizing maps while t < maxupdates: random.shuffle(examples) for ex in examples: t = t + 1 if t == maxupdates: break bestMatch = closest(somt, ex) for (unit, wt) in neighborhood(bestMatch, sigma(t)): somt+1[unit] = somt[unit] + (ex - somt[unit]) * alpha(t) * wt

15 .Training self-organizing maps process the training while t < maxupdates: set in random order random.shuffle(examples) for ex in examples: t = t + 1 if t == maxupdates: break bestMatch = closest(somt, ex) for (unit, wt) in neighborhood(bestMatch, sigma(t)): somt+1[unit] = somt[unit] + (ex - somt[unit]) * alpha(t) * wt

16 .Training self-organizing maps process the training while t < maxupdates: set in random order random.shuffle(examples) for ex in examples: the neighborhood size controls t = t + 1 how much of the map around if t == maxupdates: the BMU is affected break bestMatch = closest(somt, ex) for (unit, wt) in neighborhood(bestMatch, sigma(t)): somt+1[unit] = somt[unit] + (ex - somt[unit]) * alpha(t) * wt

17 .Training self-organizing maps process the training while t < maxupdates: set in random order the learning rate controls random.shuffle(examples) how much closer to the for ex in examples: example each unit gets the neighborhood size controls t = t + 1 how much of the map around if t == maxupdates: the BMU is affected break bestMatch = closest(somt, ex) for (unit, wt) in neighborhood(bestMatch, sigma(t)): somt+1[unit] = somt[unit] + (ex - somt[unit]) * alpha(t) * wt

18 .Parallel implementations for partitioned collections

19 .Historical aside: Amdahl’s Law 1 lim So = sp —> ∞ 1 - p

20 .What forces serial execution?

21 .What forces serial execution?

22 .What forces serial execution? state[t+1] = combine(state[t], x)

23 .What forces serial execution? state[t+1] = combine(state[t], x)

24 .What forces serial execution? f1: (T, T) => T f2: (T, U) => T

25 .What forces serial execution? f1: (T, T) => T f2: (T, U) => T

26 .What forces serial execution? f1: (T, T) => T f2: (T, U) => T

27 .How can we fix these? a⊕b=b⊕a (a ⊕ b) ⊕ c = a ⊕ (b ⊕ c)

28 .How can we fix these? a⊕b=b⊕a (a ⊕ b) ⊕ c = a ⊕ (b ⊕ c)

29 .How can we fix these? a⊕b=b⊕a (a ⊕ b) ⊕ c = a ⊕ (b ⊕ c)

3 点赞
0 收藏
0下载
相关文档
确认
3秒后跳转登录页面
去登陆