Introducing .NET Bindings for Apache Spark
2. Introducing .NET Bindings for Apache Spark. Rahul Potharaju, Terry Kim, Tyson Condie (Microsoft). #DotNetForSpark #UnifiedAnalytics #SparkAISummit
3. 30 hours into it… huge thanks! https://github.com/dotnet/spark 115,000 Twitter Impressions 521 .NET for Apache Spark OSS Announcement @Rohan’s Keynote in Spark+AI Summit 2019
5. The Big Picture: What is .NET?
6. .NET: A unified platform. Desktop, Web, Cloud, Mobile, Gaming, IoT, AI. .NET Standard, Libraries, Infrastructure.
7. • C# is a simple, modern, object-oriented, and type-safe programming language
• Its roots in the C family of languages make C# immediately familiar to C, C++, Java, and JavaScript programmers
• F# is a cross-platform, open-source, functional programming language for .NET
• It also includes object-oriented and imperative programming
• Visual Basic is an approachable language with a simple syntax for building type-safe, object-oriented apps
8. .NET: Open Source & Cross-Platform. 750K .NET Core developers; +1M new .NET developers in the last year.
9. Companies embracing .NET… dot.net/customers
14. .NET Developers 💖 Apache Spark but…
• Locked out from Big Data processing due to lack of .NET support in OSS Big Data solutions, but…
• … a lot of Big Data-usable business logic (millions of lines of code) is written in .NET!
• In a recently conducted .NET Developer survey (> 1000 developers), more than 70% expressed interest in Apache Spark!
18. Why Apache Spark should 💖 .NET Developers? More people who learn Apache Spark = Solve harder challenges together = Make the world a better place!
19. Restating Our Intent… Goal: .NET for Apache Spark is aimed at providing .NET developers a first-class experience when working with Apache Spark. Non-Goal: Converting existing Scala/Python/Java Spark developers.
20. Team and Commitment: Who?
25. Microsoft is committed…
• Technical Documentation, Blogs and Articles
• End-to-end scenarios
• Performance benchmarking (cluster)
• Production workloads
• C# (and F#) language extensions using .NET
• Performance benchmarking (Interop)
• Portability aspects (e.g., cross-platform .NET Standard)
• Tooling (e.g., Jupyter, Visual Studio, Visual Studio Code)
• Interop Layer for .NET (Scala-side)
• Potentially optimizing Python and R interop layers
29. … and developing in the open!
.NET for Apache Spark was open sourced at Spark+AI Summit 2019
• Website: https://dot.net/spark
• GitHub: https://github.com/dotnet/spark
Spark Project Improvement Proposals:
• Interop Support for Spark Language Extensions: SPARK-26257
• .NET bindings for Apache Spark: SPARK-27006
Contributions to foundational OSS projects:
• Apache Arrow: ARROW-4997, ARROW-5019, ARROW-4839, ARROW-4502, ARROW-4737, ARROW-4543, ARROW-4435
• Pyrolite (Pickling Library): Improve pickling/unpickling performance, Add a Strong Name to Pyrolite