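Topic:
Spark on Zeppelin
Time:
May 21, 19:00
How to join:
Scan the QR code on the poster below to join the DingTalk group
or
open the livestream room at the scheduled time to watch directly (the same link serves the replay)
https://developer.aliyun.com/live/2871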
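About the speaker:
Zhang Jianfeng (章剑锋, alias 简锋; Jeff Zhang) is an open-source veteran and an Apache Member. He previously worked at Hortonworks and is now a Senior Technical Expert in Alibaba's Computing Platform Business Unit. He serves on the PMC of three open-source projects, Apache Tez, Livy, and Zeppelin, and is a Committer on Apache Pig.
About the livestream:
Apache Zeppelin is an interactive notebook for big data development that was built for Spark from the start. Compared with a traditional IDE, developing in a Zeppelin notebook has several advantages: no JAR to compile, simple environment setup, interactive development, visualization of results, and more. This livestream introduces the basic usage of Spark on Zeppelin and its application scenarios.
Alibaba's open-source big data EMR team founded the Apache Spark China Technical Community, which regularly organizes online and offline Spark events in China. Stay tuned.
You are invited to join the DingTalk group 阿里云E-MapReduce交流2群 (Alibaba Cloud E-MapReduce Discussion Group 2). Click to view details: https://qr.dingtalk.com/action/joingroup?code=v1,k1,cNBcqHn4TvG0iHpN3cSc1B86D1831SGMdvGu7PW+sm4=&_dt_no_comment=1&origin=11
You are invited to join the DingTalk group Apache Spark中国技术交流社区 (Apache Spark China Technical Exchange Community). Click to view details: https://qr.dingtalk.com/action/joingroup?code=v1,k1,X7S/0/QcrLMkK7QZ5sw2oTvoYW49u0g5dvGu7PW+sm4=&_dt_no_comment=1&origin=11
WeChat official account: Apache Spark技术交流社区 (Apache Spark Technical Exchange Community)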
1. Zeppelin / Spark
We are hiring (P7): jeffzhang.zjf@alibaba-inc.com
2. Spark on Zeppelin
Jeff Zhang
3. Who is Jeff Zhang
(Timeline figure: 2008, 2011, 2013, 2014, 2018)
4. Pain points of Big Data
● Hard to test
● Hard to move development to production
● Hard to integrate with downstream applications
● Hard to integrate with other tools
5. What is Apache Zeppelin
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala, Python and more.
6. Usage Scenarios
● Ad-hoc analytics
● ETL
● Build dashboards
● Machine learning
7. Zeppelin Architecture
(Architecture diagram. Labels: Client; WebSocket; Rest; Zeppelin Server; Notebook Manager; Interpreter Manager; Interpreter Launcher; RPC; Spark Interpreter; Jdbc Interpreter; Shell Interpreter)
8. Multiple language support
SparkContext / SparkSession
9. Multiple Spark version support
(Diagram: Zeppelin Server connected to Spark 1, Spark 2, and Spark 3)
10. Multiple mode support
● Local
● Standalone
● Yarn
● K8s
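As a rough illustration of the four modes above, the sketch below maps each one to the Spark master URL format it expects. The helper name `spark_master`, the placeholder host, and the fallback ports are assumptions made for this example and do not come from the talk. In Zeppelin the resulting value would typically be entered in the Spark interpreter setting; the exact property name depends on the Zeppelin release.

```python
# Hypothetical helper: maps the deployment modes listed on the slide to the
# Spark master URL each one expects. Host names and ports are placeholders.
def spark_master(mode, host="example-host", port=None):
    mode = mode.lower()
    if mode == "local":
        # Run Spark in-process, using as many worker threads as cores.
        return "local[*]"
    if mode == "standalone":
        # 7077 is the default port of a Spark standalone master.
        return f"spark://{host}:{port or 7077}"
    if mode == "yarn":
        # The cluster location is read from the Hadoop/YARN configuration.
        return "yarn"
    if mode == "k8s":
        # Points at the Kubernetes API server; the port is always written out.
        return f"k8s://https://{host}:{port or 443}"
    raise ValueError(f"unknown mode: {mode}")


for m in ("Local", "Standalone", "Yarn", "K8s"):
    print(m, "->", spark_master(m))
```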
11. Inline Visualization
12. Dynamic forms
● TextBox
● Select
● Checkbox
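Zeppelin's template-style dynamic forms are written inside the paragraph text, for example `${maxAge=30}` for a text box and `${marital=single,single|married|divorced}` for a select. The sketch below is a simplified stand-alone emulation of that substitution step, not Zeppelin's own implementation: the `render` function and the `FORM` pattern are hypothetical names, and checkbox forms and option display names are deliberately left out.

```python
import re

# Matches ${name}, ${name=default} and ${name=default,opt1|opt2}.
# Checkbox forms are not handled in this sketch.
FORM = re.compile(
    r"\$\{(?P<name>\w+)(?:=(?P<default>[^,}]*))?(?:,(?P<options>[^}]*))?\}"
)


def render(template, values=None):
    """Replace each form placeholder with the user's value or its default."""
    values = values or {}

    def fill(match):
        name = match.group("name")
        default = match.group("default") or ""
        options = match.group("options")
        chosen = values.get(name, default)
        # For a select form, reject values that are not among the options.
        if options is not None and chosen not in options.split("|"):
            raise ValueError(f"{chosen!r} is not an option for {name}")
        return chosen

    return FORM.sub(fill, template)


sql = ("select * from bank where age < ${maxAge=30} "
       "and marital = '${marital=single,single|married|divorced}'")
print(render(sql))
print(render(sql, {"maxAge": "40", "marital": "married"}))
```

Inside a real note the form values come from the widgets Zeppelin renders above the paragraph; here they are passed in as a plain dictionary.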
13. Other features
● Inline configuration
● Hive integration
● Multiple user support (impersonation)
● ZeppelinContext
14. Demo
● Configure & start Zeppelin
● Configure Spark interpreter
● Run Spark tutorial
● Rest API
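The Rest API item in the demo can be sketched with the Python standard library. The example below only builds the requests for two notebook endpoints, listing notes and running all paragraphs of a note, and leaves sending them as a separate step. The base URL (Zeppelin on `localhost:8080`, its default port), the function names, and the note ID `2ABC` are assumptions for illustration, and authentication is not covered.

```python
import json
import urllib.request

# Assumption: a Zeppelin server is listening locally on its default port.
BASE_URL = "http://localhost:8080"


def list_notes_request(base_url=BASE_URL):
    """Build the request that lists all notes."""
    return urllib.request.Request(f"{base_url}/api/notebook", method="GET")


def run_note_request(note_id, base_url=BASE_URL):
    """Build the request that runs every paragraph of one note."""
    return urllib.request.Request(
        f"{base_url}/api/notebook/job/{note_id}", data=b"", method="POST"
    )


def send(request):
    """Send a request and decode the JSON body (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# "2ABC" is a hypothetical note ID used only to show the resulting URL.
req = run_note_request("2ABC")
print(req.get_method(), req.full_url)
```

Keeping request construction apart from `send` makes it possible to inspect the URL and HTTP method without a server; against a live instance, `send(list_notes_request())` would return the decoded response.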
15. Pain points of Big Data
● Hard to test
● Hard to move development to production
● Hard to integrate with downstream applications
● Hard to integrate with other tools