Rheos-SQL - Managed Streaming SQL Platform


1. Rheos-SQL: A Managed Streaming SQL Platform. Rheos, Nov. 2019

2. Outline ● Motivation ● Highlight ● Comparison ● Architecture ● Details ● Future Plan

4. Motivation
Limited stream processing support with SQL
● Flink, Beam, Spark Streaming, KSQL
● Time-varying relations, event time semantics and keyword extensions
Managed platform in production
● Full life-cycle management
● Sufficient flexibility to meet most use cases
● End-to-end monitoring

5. Highlight
● ANSI SQL + Extensions: Extend DDL, DQL and DML to perform robust stream processing
● SQL Plan + Configuration: Explain the SQL in a plan graph and configure it for performance tuning
● SQL-as-Stream: Run a streaming job with only SQL
● Managed UDF: Manage and integrate with users' jar packages
● RESTful Service + SDK: Provide both a RESTful service and an SDK for developers of all levels
● Portal + Monitoring: Manage the whole life cycle on the portal and monitor end-to-end metrics through dashboards

6. Streaming SQL vs. Traditional SQL
One SQL to Rule Them All [1]
● Watermark ● Window ● Join ● Trigger ● Aggregation ● Lookup table ● Function
[1] Begoli, Edmon, et al. "One SQL to Rule Them All: An Efficient and Syntactically Idiomatic Approach to Management of Streams and Tables." Proceedings of the 2019 International Conference on Management of Data. ACM, 2019.
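To make the Watermark, Window and Trigger extensions listed above concrete, here is a minimal Python sketch. It is not Rheos-SQL or Flink code. It assumes a watermark defined as the largest event time seen so far minus a fixed delay, tumbling event-time windows, and a trigger that fires a window once the watermark reaches the window end. The function name and parameters are hypothetical.

```python
def tumbling_window_counts(event_times_ms, window_ms, max_delay_ms):
    """Count events per tumbling event-time window.

    Assumptions for this sketch (not taken from the slides):
    - watermark = largest event time seen so far - max_delay_ms
    - a window [start, start + window_ms) fires once watermark >= its end
    - events for a window that has already fired are dropped as late
    """
    open_windows = {}            # window start -> running count
    fired = []                   # (window_start, count) in firing order
    dropped = 0                  # late events that missed their window
    watermark = float("-inf")

    for ts in event_times_ms:
        start = ts - ts % window_ms
        if start + window_ms <= watermark:
            dropped += 1         # the window for this event already fired
        else:
            open_windows[start] = open_windows.get(start, 0) + 1

        # Advance the watermark; it never moves backwards.
        watermark = max(watermark, ts - max_delay_ms)

        # Trigger: emit every open window whose end the watermark has passed.
        for s in sorted(open_windows):
            if s + window_ms <= watermark:
                fired.append((s, open_windows.pop(s)))

    return fired, dropped, open_windows


if __name__ == "__main__":
    events = [1000, 2500, 4200, 1500, 6100, 11000, 3000]
    print(tumbling_window_counts(events, window_ms=5000, max_delay_ms=1000))
```

The out-of-order event at 1500 still lands in its window because the watermark has not yet passed the window end, while the event at 3000 arrives after its window has fired and is counted as dropped.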

7. Comparison
Extensions and managed platform:
Feature            Rheos-SQL   Flink-SQL v1.9   Beam-SQL v2.15.0   Spark Streaming
Join Side Table    ✓           ✕                ✓                  ✕
Window             ✓           ✓                ✓                  ✕
Trigger            ✓           ✕                ✕                  ✕
UDF                ✓           ✓                ✓                  ✓
Managed Platform   ✓           ✕                ✕                  ✕
Rheos-SQL statement coverage:
● DDL: Source ✓, Side (Lookup) ✓, View ✓
● DQL: Queries ✓
● DML: Sink (Insert) ✓

8. Architecture
● Core Service Layer: RESTful Interface; Management Service; Core Resources (SQL Script, UDF Package, SQL Job); SQL Source Management; Packages Management
● SQL SDK Layer: Connects; Core SQL Modules; UDF Executor Provider
● Infrastructure Layer: Job Management; Quota Management; Monitoring; Streaming Platform; Infrastructure (K8S)

9. User Experience
● Domain User steps: Check in Source Code; Upload Customized Packages; Onboard Metadata; Verify & Config; Submit Job; Operation & Monitoring
● Components: External Repository (Git); Rheos SQL Service; Storage (Swift); Streaming Platform
● Streaming Platform interactions: Submit; Expose Job Status; Download Source; Download Packages

10. Details - Verify & Config
Plan annotations shown on the slide: Parallelism: 10, Parallelism: 6, Parallelism: 8

    CREATE TABLE SOURCE_TABLE (
      `currentRecord.item_id` ALIAS item_id VARCHAR,
      item_full_id AS item_id || '-suffix',
      `JSON_VALUE(item_properties, 'address.country')` ALIAS country VARCHAR,
      `timestamp` BIGINT,
      WATERMARK wk FOR timestamp AS withOffset(timestamp, 1000))
    WITH(
      type='kafka',
      topic='kafka.topic');

    CREATE TABLE LOOK_UP_TABLE (
      item_id VARCHAR,
      price INT)
    WITH(
      type='couchbase',
      nodes='localhost',
      bucketname='bucket');

    CREATE TABLE SINK_TABLE (
      item_id VARCHAR,
      item_full_id VARCHAR,
      country VARCHAR,
      timestamp BIGINT,
      price INT)
    WITH(
      type='ELASTICSEARCH');

    CREATE VIEW TEMP_VIEW as
    SELECT
      a.item_id as item_id,
      a.item_full_id as item_full_id,
      a.country as country,
      a.timestamp as timestamp,
      b.price as price
    FROM SOURCE_TABLE a
    JOIN LOOK_UP_TABLE b
    ON a.item_id = b.item_id
    GROUP BY a.item_id;

    INSERT INTO SINK_TABLE
    SELECT
      item_id,
      item_full_id,
      country,
      timestamp,
      price
    FROM TEMP_VIEW;
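The per-record effect of the source projection and the side (lookup) join in this script can be sketched in plain Python. This is an illustration under assumptions, not how Rheos-SQL executes the job: the lookup table is stood in for by an in-memory dict instead of Couchbase, the GROUP BY in TEMP_VIEW is ignored, and the helper names (`json_value`, `project_source`, `lookup_join`) are hypothetical.

```python
import json


def json_value(document, path):
    """Walk a dotted path such as 'address.country' through a JSON string.

    Returns None when any step of the path is missing.
    """
    node = json.loads(document)
    for part in path.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def project_source(raw):
    """Derive the SOURCE_TABLE columns from one raw record (assumed to be a dict)."""
    item_id = raw["currentRecord"]["item_id"]
    return {
        "item_id": item_id,
        "item_full_id": item_id + "-suffix",   # item_id || '-suffix'
        "country": json_value(raw["item_properties"], "address.country"),
        "timestamp": raw["timestamp"],
    }


def lookup_join(raw_records, price_by_item_id):
    """Inner join each source row with the lookup table on item_id."""
    for raw in raw_records:
        row = project_source(raw)
        price = price_by_item_id.get(row["item_id"])
        if price is None:
            continue                            # no match: inner join drops the row
        yield {**row, "price": price}


if __name__ == "__main__":
    records = [
        {"currentRecord": {"item_id": "i1"},
         "item_properties": '{"address": {"country": "US"}}',
         "timestamp": 1000},
    ]
    for out in lookup_join(records, {"i1": 25}):
        print(out)
```

Each yielded dict has the same five fields as SINK_TABLE, so it corresponds to one row that the final INSERT would write.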

11. Details - Managed UDF Packages
● User: jars
● Bootstrap: StreamExecutionEnvironment; StreamGraph; JobGraph; Download jars from External Storage
● Job Manager: Blob Server; local <path>/<jobId>/<BlobKey>
● Task Manager: Blob Client; Blob Cache; Download
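The local `<path>/<jobId>/<BlobKey>` layout suggests a download-once cache keyed by job and blob. The sketch below illustrates that idea only. It is not the Flink blob server API or the Rheos-SQL implementation, and `local_blob_path`, `fetch_jar` and the `download` callback are hypothetical names standing in for whatever fetches a jar from external storage.

```python
from pathlib import Path
import tempfile


def local_blob_path(cache_root, job_id, blob_key):
    """Build the cache location <cache_root>/<job_id>/<blob_key>."""
    return Path(cache_root) / job_id / blob_key


def fetch_jar(cache_root, job_id, blob_key, download):
    """Return (path, downloaded) for a jar, downloading it only on a cache miss.

    `download` is a hypothetical callable: blob_key -> bytes.
    """
    path = local_blob_path(cache_root, job_id, blob_key)
    if path.exists():
        return path, False                      # cache hit, nothing to fetch

    path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary name first so a partial download is never
    # mistaken for a complete jar, then move it into place.
    partial = path.with_name(path.name + ".part")
    partial.write_bytes(download(blob_key))
    partial.replace(path)
    return path, True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        fake_storage = {"udf-key": b"jar bytes"}
        print(fetch_jar(root, "job-1", "udf-key", fake_storage.__getitem__))
        print(fetch_jar(root, "job-1", "udf-key", fake_storage.__getitem__))
```

Scoping the directory by job id keeps the jars of one job separate from those of another, so cleaning up a finished job only needs to remove that job's directory.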

12. Details - Monitoring ● Source ● Side ● Sink

13. Future Plan ● Auto tuning ● Hot deploy ● SQL syntax optimization

14. Q&A

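The eBay China Center of Excellence (eBay CCOE), founded in 2004, is eBay's earliest and, to date, largest overseas R&D center, with nearly 800 engineers. It works mainly on cloud infrastructure, big-data intelligent analytics platforms, large-scale distributed computing, AI-based search engines, online advertising, and eBay's global payments.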