Introducing .NET Bindings for Apache Spark

We present a new, free, open-source framework aimed at making Spark accessible to millions of .NET developers. In this session we give a high-level overview of the .NET bindings for Spark effort, demonstrate key capabilities, and show how you can use the bindings and get involved. We also cover how to combine the .NET bindings for Spark with other .NET frameworks such as ML.NET to build end-to-end real-time analytics solutions. This will be a fun session with demos galore, so come join us as we get started on the .NET bindings for Spark journey!

2. Introducing .NET Bindings for Apache Spark. Rahul Potharaju, Terry Kim, Tyson Condie, Microsoft. #DotNetForSpark #UnifiedAnalytics #SparkAISummit

3. 30 hours into it… huge thanks! https://github.com/dotnet/spark 115,000 Twitter Impressions 521 .NET for Apache Spark OSS Announcement @Rohan’s Keynote in Spark+AI Summit 2019

5. The Big Picture: What is .NET?

6. .NET – A unified platform: DESKTOP, WEB, CLOUD, MOBILE, GAMING, IoT, AI; .NET STANDARD; LIBRARIES; INFRASTRUCTURE

7.
• C# is a simple, modern, object-oriented, and type-safe programming language
• Its roots in the C family of languages make C# immediately familiar to C, C++, Java, and JavaScript programmers
• F# is a cross-platform, open-source, functional programming language for .NET
• It also includes object-oriented and imperative programming
• Visual Basic is an approachable language with a simple syntax for building type-safe, object-oriented apps

8. .NET Open Source & Cross-Platform
• 750K .NET Core developers
• +1M new .NET developers in the last year

9. Companies embracing .NET… dot.net/customers

10. .NET Developers 💖 Apache Spark but…

12. .NET Developers 💖 Apache Spark but… Locked out from Big Data processing due to lack of .NET support in OSS Big Data solutions but…

13. .NET Developers 💖 Apache Spark but… Locked out from Big Data processing due to lack of .NET support in OSS Big Data solutions but… … a lot of Big Data-usable business logic (millions of lines of code) is written in .NET!

14. .NET Developers 💖 Apache Spark but… Locked out from Big Data processing due to lack of .NET support in OSS Big Data solutions but… … a lot of Big Data-usable business logic (millions of lines of code) is written in .NET! In a recently conducted .NET Developer survey (>1,000 developers), more than 70% expressed interest in Apache Spark!

15. Why Apache Spark should 💖 .NET Developers?

16. Why Apache Spark should 💖 .NET Developers? More people who learn Apache Spark

17. Why Apache Spark should 💖 .NET Developers? More people who learn Apache Spark = Solve harder challenges together

18. Why Apache Spark should 💖 .NET Developers? More people who learn Apache Spark = Solve harder challenges together = Make the world a better place!

19. Restating Our Intent… Goal: .NET for Apache Spark is aimed at providing .NET developers a first-class experience when working with Apache Spark. Non-Goal: Converting existing Scala/Python/Java Spark developers.

20. Team and Commitment: Who?

21. Microsoft is committed…

22. Microsoft is committed…
• Interop Layer for .NET (Scala-side)
• Potentially Optimizing Python and R interop layers

23. Microsoft is committed…
• C# (and F#) language extensions using .NET
• Performance benchmarking (Interop)
• Portability aspects (e.g., cross-platform .NET Standard)
• Tooling (e.g., Jupyter, Visual Studio, Visual Studio Code)
• Interop Layer for .NET (Scala-side)
• Potentially Optimizing Python and R interop layers

24. Microsoft is committed…
• Performance benchmarking (cluster)
• Production workloads
• C# (and F#) language extensions using .NET
• Performance benchmarking (Interop)
• Portability aspects (e.g., cross-platform .NET Standard)
• Tooling (e.g., Jupyter, Visual Studio, Visual Studio Code)
• Interop Layer for .NET (Scala-side)
• Potentially Optimizing Python and R interop layers

25. Microsoft is committed…
• Technical Documentation, Blogs and Articles
• End-to-end scenarios
• Performance benchmarking (cluster)
• Production workloads
• C# (and F#) language extensions using .NET
• Performance benchmarking (Interop)
• Portability aspects (e.g., cross-platform .NET Standard)
• Tooling (e.g., Jupyter, Visual Studio, Visual Studio Code)
• Interop Layer for .NET (Scala-side)
• Potentially Optimizing Python and R interop layers

26. … and developing in the open!

27. … and developing in the open!
.NET for Apache Spark was open sourced at Spark+AI Summit 2019
• Website: https://dot.net/spark
• GitHub: https://github.com/dotnet/spark

28. … and developing in the open!
.NET for Apache Spark was open sourced at Spark+AI Summit 2019
• Website: https://dot.net/spark
• GitHub: https://github.com/dotnet/spark
Spark Project Improvement Proposals:
• Interop Support for Spark Language Extensions: SPARK-26257
• .NET bindings for Apache Spark: SPARK-27006

29. … and developing in the open!
.NET for Apache Spark was open sourced at Spark+AI Summit 2019
• Website: https://dot.net/spark
• GitHub: https://github.com/dotnet/spark
Spark Project Improvement Proposals:
• Interop Support for Spark Language Extensions: SPARK-26257
• .NET bindings for Apache Spark: SPARK-27006
Contributions to foundational OSS projects:
• Apache Arrow: ARROW-4997, ARROW-5019, ARROW-4839, ARROW-4502, ARROW-4737, ARROW-4543, ARROW-4435
• Pyrolite (Pickling Library): Improve pickling/unpickling performance, Add a Strong Name to Pyrolite