- 微博 QQ QQ空间 贴吧
Splice Machine’s use of Apache Spark and MLflow
收藏 0下载 2
Splice Machine is an ANSI-SQL Relational Database Management System (RDBMS) on Apache Spark. It has proven low-latency transactional processing (OLTP) as well as analytical processing (OLAP) at petabyte scale. It uses Spark for all analytical computations and leverages HBase for persistence. This talk highlights a new Native Spark Datasource – which enables seamless data movement between Spark Data Frames and Splice Machine tables without serialization and deserialization. This Spark Datasource makes machine learning libraries such as MLlib native to the Splice RDBMS . Splice Machine has now integrated MLflow into its data platform, creating a flexible Data Science Workbench with an RDBMS at its core. The transactional capabilities of Splice Machine integrated with the plethora of DataFrame-compatible libraries and MLflow capabilities manages a complete, real-time workflow of data-to-insights-to-action. In this presentation we will demonstrate Splice Machine’s Data Science Workbench and how it leverages Spark and MLflow to create powerful, full-cycle machine learning capabilities on an integrated platform, from transactional updates to data wrangling, experimentation, and deployment, and back again.