Apache Flink + Apache Beam: Expanding the horizons of Big Data


Posted by 青色的海牛, 6 years ago

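Over the past few months, the Apache Flink and Apache Beam open-source communities have been building a production-grade solution that lets Python run both batch and real-time streaming jobs. The work involves revising Beam's interfaces, building the corresponding Flink Runner code, and simplifying how Flink is distributed and deployed. On this foundation, users can feed the results of very large-scale data processing directly into TensorFlow pipelines, enabling deep learning and prediction on top of a real-time big data computing platform. This deck introduces a set of open-source libraries, developed by Google on top of the Beam Python SDK, that integrate very large-scale data preprocessing with TensorFlow.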


2. Rich history of collaboration
❏ Unified Batch & Streaming
❏ Comprehensive streaming semantics & correctness
❏ Streaming SQL (w/ Apache Calcite)
© 2017 Google Inc. All rights reserved.

3. Vertical Solutions via domain specific libraries & tools
❏ Vertical Solutions: Time Series, Machine Learning, Genomics, ...
❏ Horizontal Framework in multiple languages: Java, Python, ...
❏ Flexible Big-Data Platform for Batch & Streaming

6. Cross-language Portability Framework
Language agnostic abstractions are at the core of the Beam Model.
❏ SDKs: Language A SDK, Language B SDK, Language C SDK
❏ The Beam Model
❏ Runners: Runner 1, Runner 2, Runner 3

7. Prototype Flink Runner
❏ Works with Beam’s Python SDK
❏ Collaborators: Flink, Beam, Lyft, GetInData
❏ https://issues.apache.org/jira/browse/BEAM-2889
❏ For updates, please subscribe to Apache Flink and Apache Beam Blogs

13. ....you have to worry about so much more.
❏ ML Code
❏ Configuration
❏ Data Collection
❏ Data Verification
❏ Feature Engineering
❏ Machine Resource Management
❏ Analysis Tools
❏ Process Management Tools
❏ Serving Infrastructure
❏ Monitoring

18. Rich collection of pre-implemented transforms
❏ Scale to ...: tft.scale_to_z_score, ...
❏ Bag of Words / N-Grams: tf.string_split, tft.ngrams, tft.string_to_int
❏ Bucketization: tft.quantiles, tft.apply_buckets
❏ Feature Crosses: tf.string_join, tft.string_to_int
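To make the "Bag of Words / N-Grams" group concrete, here is a minimal pure-Python sketch of the idea behind it: split text into tokens, generate n-grams, then map each term to an integer id using a vocabulary computed over the whole dataset. It does not call tf.Transform; the helper names (`ngrams`, `build_vocabulary`, `to_ids`), the sample corpus, and the tie-breaking rule are assumptions made for illustration only.

```python
from collections import Counter
from typing import Dict, List


def ngrams(tokens: List[str], n_min: int, n_max: int, sep: str = " ") -> List[str]:
    """Return all contiguous n-grams with n_min <= n <= n_max, shortest first."""
    out = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            out.append(sep.join(tokens[i:i + n]))
    return out


def build_vocabulary(docs: List[List[str]]) -> Dict[str, int]:
    """Full pass over the data: ids by descending frequency.

    Ties are broken alphabetically (an assumption of this sketch).
    """
    counts = Counter(term for doc in docs for term in doc)
    ordered = sorted(counts, key=lambda term: (-counts[term], term))
    return {term: idx for idx, term in enumerate(ordered)}


def to_ids(terms: List[str], vocab: Dict[str, int], default: int = -1) -> List[int]:
    """Per-example step: look up each term, using `default` for unseen terms."""
    return [vocab.get(term, default) for term in terms]


# Hypothetical three-document corpus.
corpus = ["the quick fox", "the lazy dog", "the quick dog"]
docs = [ngrams(text.split(), 1, 2) for text in corpus]
vocab = build_vocabulary(docs)
encoded = [to_ids(doc, vocab) for doc in docs]
```

The split into a full-pass step (`build_vocabulary`) and a per-example step (`to_ids`) is the point of the sketch: the vocabulary needs the whole dataset, which is where a Beam-based batch or streaming job comes in.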

19. Rich collection of pre-implemented transforms
❏ Scale to ...: tft.scale_to_z_score, ...
❏ Bag of Words / N-Grams: tf.string_split, tft.ngrams, tft.string_to_int
❏ Bucketization: tft.quantiles, tft.apply_buckets
❏ Feature Crosses: tf.string_join, tft.string_to_int
❏ Apply another TensorFlow Model: tft.apply_saved_model
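The scaling and bucketization groups follow the same analyze-then-transform pattern: a statistic is computed over the full dataset, then applied to each value. The sketch below shows that pattern in plain Python rather than through the tf.Transform API. The function names only echo the ones on the slide; the use of population standard deviation, exact (not approximate) quantiles, and the sample values are all assumptions of this sketch.

```python
import bisect
import math
from typing import List, Sequence, Tuple


def analyze_mean_std(values: Sequence[float]) -> Tuple[float, float]:
    """Full pass: mean and population standard deviation."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return mean, math.sqrt(variance)


def scale_to_z_score(value: float, mean: float, std: float) -> float:
    """Per-value step: centre and scale; returns 0.0 when std is zero."""
    return (value - mean) / std if std > 0 else 0.0


def analyze_quantile_boundaries(values: Sequence[float], num_buckets: int) -> List[float]:
    """Full pass: exact bucket boundaries read off the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // num_buckets] for i in range(1, num_buckets)]


def apply_buckets(value: float, boundaries: Sequence[float]) -> int:
    """Per-value step: index of the bucket that `value` falls into."""
    return bisect.bisect_right(boundaries, value)


# Hypothetical feature column.
ages = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean, std = analyze_mean_std(ages)
z_scores = [scale_to_z_score(a, mean, std) for a in ages]

boundaries = analyze_quantile_boundaries(ages, num_buckets=4)
buckets = [apply_buckets(a, boundaries) for a in ages]
```

At serving time only the per-value functions and the stored statistics (`mean`, `std`, `boundaries`) are needed, which is why keeping the two phases separate matters for consistency between training and serving.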

23. Analyzing model mistakes by subgroup
ROC Curve, plotted for: All groups, Group A, Group B
❏ y-axis: Sensitivity (True Positive Rate)
❏ x-axis: 1 − Specificity (False Positive Rate)
Learn more at ml-fairness.com
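A minimal sketch of what a per-subgroup ROC comparison involves: for each group (and for all examples together), compute the true positive rate and false positive rate at a set of score thresholds. This is plain Python, not the TensorFlow Model Analysis API; the function names, the `"All groups"` key, and the toy examples are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence, Tuple


def rates_at_threshold(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Return (true positive rate, false positive rate) at one threshold."""
    tp = fp = fn = tn = 0
    for label, score in zip(labels, scores):
        predicted_positive = score >= threshold
        if label == 1 and predicted_positive:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


def roc_by_group(
    examples: Iterable[Tuple[str, int, float]], thresholds: Sequence[float]
) -> Dict[str, List[Tuple[float, float]]]:
    """Group (group, label, score) rows and compute ROC points per group."""
    by_group = defaultdict(list)
    for group, label, score in examples:
        by_group[group].append((label, score))
        by_group["All groups"].append((label, score))
    curves = {}
    for group, rows in by_group.items():
        labels = [label for label, _ in rows]
        scores = [score for _, score in rows]
        curves[group] = [rates_at_threshold(labels, scores, t) for t in thresholds]
    return curves


# Toy data: (group, true label, model score).
examples = [
    ("Group A", 1, 0.9), ("Group A", 1, 0.8), ("Group A", 0, 0.3), ("Group A", 0, 0.1),
    ("Group B", 1, 0.6), ("Group B", 1, 0.4), ("Group B", 0, 0.7), ("Group B", 0, 0.2),
]
curves = roc_by_group(examples, thresholds=[0.5])
```

In this toy data the aggregate point looks reasonable while Group B is markedly worse than Group A at the same threshold, which is the kind of gap an overall metric can hide.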

25. github.com/tensorflow/model-analysis
https://medium.com/tensorflow/introducing-tensorflow-model-analysis-scaleable-sliced-and-full-pass-metrics-5cde7baf0b7b
