Self-Service Apache Spark Structured Streaming Applications and Analytics

Spark Open Source Community / 8,366 views
Organizations are building a growing number of Apache Spark Structured Streaming applications for IoT analytics, real-time fraud detection, anomaly detection, and analysis of streaming data from devices, turbines, and similar sources. However, building these streaming applications and operationalizing them is challenging. There is a need for a self-serve platform on Spark Structured Streaming that lets many users quickly build, deploy, run, and monitor a variety of big data streaming use cases.

At Sparkflows, we built a Self-Service Platform for building Structured Streaming applications in minutes. A variety of users can log in from a browser and build and deploy these applications by dragging and dropping from a library of 200+ Processors. They can also build charts on the streaming data and perform streaming analytics.

In this talk, we will dive deeper into our journey. We started with a workflow editor and a workflow engine for building and running Structured Streaming jobs. We will explain how we built out the connectors to streaming sources for running in designer mode, and how we perform ML model scoring with real-time ingestion, streaming analytics, schema inference and propagation, and the display of results in continuously moving charts. We will go over how we built self-serve streaming data preparation, lookup, and analytics with SQL, Scala, Python, and other languages. Finally, we will discuss how we enabled deployment, operationalization, and monitoring of long-running Structured Streaming jobs. Our goal is to show how Spark can be used to enable very complex self-serve big data streaming applications and analytics for enterprises.