Databricks with R: Deep Dive

In this presentation we’ll explain how to use the R programming language with Spark from a Databricks notebook using the SparkR package. We’ll discuss how to push data wrangling out to the Spark nodes for massive scale, and how to bring the results back to a single node so we can use open source R packages on them. We’ll demonstrate how to convert SQL tables into distributed R data frames and how to convert R data frames into SQL tables. We’ll also look at how to train predictive models on data distributed over the Spark nodes. Bring your popcorn. This is a fun and interesting presentation.
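
The round trip described above can be sketched in a few lines of SparkR. This is a minimal illustration rather than the presenter's own notebook: the built-in `faithful` dataset, the view name `faithful_view`, and the `waiting > 70` filter are all assumptions chosen for the example.

```r
library(SparkR)

# On Databricks a Spark session already exists; sparkR.session() returns it.
# Elsewhere this call starts a local session (assumes Spark is installed).
sparkR.session()

# Local R data frame -> distributed SparkDataFrame.
# faithful is a built-in R dataset, used here only for illustration.
sdf <- as.DataFrame(faithful)

# Expose the SparkDataFrame to SQL under a hypothetical view name.
createOrReplaceTempView(sdf, "faithful_view")

# SQL -> SparkDataFrame. The filter is executed by Spark, not by local R.
long_waits <- sql("SELECT eruptions, waiting FROM faithful_view WHERE waiting > 70")

# Bring the reduced result back to the driver as a plain R data.frame.
local_df <- collect(long_waits)

# From here any single-node R function or package can be used.
fit <- lm(eruptions ~ waiting, data = local_df)
```

A temporary view lives only for the current session. To persist the data as a table, SparkR also provides `saveAsTable()`, which assumes a metastore is available, as it is on Databricks. Because `collect()` pulls every row to the driver, it is usually best to filter or aggregate on Spark first.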
2. Azure Databricks with R: Deep Dive. Bryan Cafferky, Microsoft. Subscribe to my YouTube channel. Reach out on LinkedIn. #UnifiedAnalytics #SparkAISummit

4. APACHE SPARK: A unified, open source, parallel data processing framework for Big Data Analytics. Libraries: Spark SQL (interactive queries), Spark MLlib (machine learning), Spark Streaming / Spark Structured Streaming (stream processing), GraphX (graph computation). Engine: Spark Core Engine. Cluster managers: Standalone Scheduler, YARN, Mesos.

6. Azure Big Data

7. GENERAL SPARK CLUSTER ARCHITECTURE: Driver Program (SparkContext), Cluster Manager, Worker Nodes (three shown), Data Sources (HDFS, SQL, NoSQL, …)

9. https://databricks.com/blog/2016/08/03/developing-apache-spark-applications-in-net-using-mobius.html

12. https://www.slideshare.net/frodriguezolivera/apache-spark-streaming

13. https://www.slideshare.net/databricks/parallelizing-existing-r-packages-with-sparkr

14. https://databricks.com/blog/2016/12/28/10-things-i-wish-i-knew-before-using-apache-sparkr.html

16. https://spark.rstudio.com/ and https://github.com/rstudio/sparklyr/issues/502

17. https://www.slideshare.net/databricks/parallelizing-existing-r-packages-with-sparkr

18. https://spark.apache.org/docs/latest/sparkr.html#machine-learning

19. https://spark.apache.org/docs/2.2.0/api/R/index.html
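
Training a predictive model on distributed data, as promised in the abstract, might look like the following SparkR sketch. It is an illustration under stated assumptions, not the model shown in the talk: the `faithful` dataset, the 80/20 split, the seed, and the formula `eruptions ~ waiting` are all arbitrary choices for the example.

```r
library(SparkR)

# Reuses the existing session on Databricks, or starts a local one.
sparkR.session()

# Distribute a small built-in dataset (illustrative only).
sdf <- as.DataFrame(faithful)

# Split into training and test sets; weights and seed are arbitrary.
splits <- randomSplit(sdf, c(0.8, 0.2), seed = 42)
train <- splits[[1]]
test <- splits[[2]]

# Fit a Gaussian GLM (linear regression) with Spark MLlib via SparkR.
model <- spark.glm(train, eruptions ~ waiting, family = "gaussian")
summary(model)

# Score the held-out rows; the result is a SparkDataFrame
# with an added "prediction" column.
preds <- predict(model, test)
head(select(preds, "eruptions", "waiting", "prediction"))
```

The fitting and scoring run on the Spark workers. Only the six rows requested by `head()` are returned to the driver.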
