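Speaker: Li Dong, Technical Partner and Director of Ecosystem Partnership Technology, Kyligence
Abstract: Starting with v2.6.0, Apache Kylin provides a Data Source SDK that helps developers quickly connect Apache Kylin to new data sources. Through the JDBC interface, Apache Kylin can build Cubes from a new data source and push queries down to it, which meets enterprises' needs for self-service analytics on the data lake. This talk introduces the SDK's principles and best practices.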

1. Apache Kylin lidong@apache.org Kyligence Apache Kylin PMC info@kyligence.io

2. Apache Kylin v1.6. KYLIN-1726: Scalable streaming cubing. kylin.source.default=0

3. Apache Kylin v2.1. KYLIN-1351: RDBMS as Data Source. kylin.source.default=0, kylin.source.default=8

4. Apache Kylin v2.6. KYLIN-3552: Data Source SDK to ingest data from different JDBC sources. kylin.source.default=0, kylin.source.default=16

5. Data Source SDK [architecture diagram; surviving labels: Analytics, Impala, SDK Interface]

6. JDBC Engine: List Databases → Database info; List Tables → Table metadata; List Columns → Column info
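The three metadata calls on this slide map naturally onto the standard `java.sql.DatabaseMetaData` API. The following is a minimal sketch, not the SDK's actual code: the class and method names are hypothetical, and it assumes the source exposes databases as JDBC catalogs (some engines expose them as schemas instead).

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: hypothetical helper names, standard JDBC calls. */
final class JdbcMetadataSketch {

    /** Collects one string column from every row of a result set, then closes it. */
    static List<String> readColumn(ResultSet rs, String columnLabel) throws SQLException {
        List<String> values = new ArrayList<>();
        try (ResultSet r = rs) {
            while (r.next()) {
                values.add(r.getString(columnLabel));
            }
        }
        return values;
    }

    /** List Databases: assumes databases are exposed as JDBC catalogs. */
    static List<String> listDatabases(Connection conn) throws SQLException {
        return readColumn(conn.getMetaData().getCatalogs(), "TABLE_CAT");
    }

    /** List Tables: tables and views of one catalog. */
    static List<String> listTables(Connection conn, String database) throws SQLException {
        DatabaseMetaData md = conn.getMetaData();
        ResultSet rs = md.getTables(database, null, "%", new String[] {"TABLE", "VIEW"});
        return readColumn(rs, "TABLE_NAME");
    }

    /** List Columns: note that JDBC treats the table argument as a LIKE pattern. */
    static List<String> listColumns(Connection conn, String database, String table)
            throws SQLException {
        ResultSet rs = conn.getMetaData().getColumns(database, null, table, "%");
        return readColumn(rs, "COLUMN_NAME");
    }
}
```

Given an open `Connection conn`, `JdbcMetadataSketch.listTables(conn, "tpch")` would return the table names of a catalog named `tpch` (a made-up name for illustration).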

7. JDBC Cube
   0. Convert SQL according to dialect of JDBC engine
   1. Create tmp flat table in source DB
   2. Load into Hive using Sqoop
   3. Input to MR or Spark job
   4. Clean up tmp tables in Hive and source DB
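The ordering of these build steps can be sketched as a small driver. Everything here is hypothetical (the `BuildSteps` interface and its method names are not Apache Kylin interfaces). Running the cleanup in a `finally` block is an assumption about sensible behavior; the slide does not say what happens on failure.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: a hypothetical driver for the build steps on the slide. */
final class JdbcCubeBuildSketch {

    /** Hypothetical hooks, one per step; not Apache Kylin interfaces. */
    interface BuildSteps {
        String convertSql(String flatTableSql);        // step 0
        void createTmpFlatTable(String convertedSql);  // step 1
        void loadIntoHive();                           // step 2 (Sqoop on the slide)
        void runCubingJob();                           // step 3 (MR or Spark)
        void cleanUp();                                // step 4
    }

    /** Runs steps 0 to 3 in order; cleanup runs even if a step fails (assumption). */
    static void build(BuildSteps steps, String flatTableSql) {
        String converted = steps.convertSql(flatTableSql);
        try {
            steps.createTmpFlatTable(converted);
            steps.loadIntoHive();
            steps.runCubingJob();
        } finally {
            steps.cleanUp();
        }
    }

    /** Test double that only records which steps ran. */
    static final class Recorder implements BuildSteps {
        final List<String> log = new ArrayList<>();
        private final boolean failOnLoad;

        Recorder(boolean failOnLoad) {
            this.failOnLoad = failOnLoad;
        }

        public String convertSql(String sql) { log.add("convertSql"); return sql; }
        public void createTmpFlatTable(String sql) { log.add("createTmpFlatTable"); }
        public void loadIntoHive() {
            log.add("loadIntoHive");
            if (failOnLoad) {
                throw new IllegalStateException("simulated load failure");
            }
        }
        public void runCubingJob() { log.add("runCubingJob"); }
        public void cleanUp() { log.add("cleanUp"); }
    }

    public static void main(String[] args) {
        Recorder recorder = new Recorder(false);
        build(recorder, "select 1");
        System.out.println(recorder.log);
    }
}
```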

8. JDBC query. kylin.query.pushdown.runner-class-name=<impl-class-name> [diagram; surviving labels: SQL, Pushdown, Cube Access, Convert SQL, Any JDBC Engine]
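One way to read the labels on this slide is as a routing decision: answer from a Cube when possible, otherwise convert the SQL and push it down to the JDBC engine. The sketch below assumes that reading. The interfaces are hypothetical stand-ins, not the class configured through `kylin.query.pushdown.runner-class-name`.

```java
import java.util.Optional;
import java.util.function.UnaryOperator;

/** Illustrative only: hypothetical routing between Cube access and pushdown. */
final class PushdownRoutingSketch {

    /** Hypothetical: returns a result only when some Cube can answer the query. */
    interface CubeEngine {
        Optional<String> tryAnswer(String sql);
    }

    /** Hypothetical: executes already-converted SQL on the source JDBC engine. */
    interface PushdownRunner {
        String execute(String convertedSql);
    }

    static String query(String sql, CubeEngine cube,
                        UnaryOperator<String> sqlConverter, PushdownRunner runner) {
        Optional<String> fromCube = cube.tryAnswer(sql);
        if (fromCube.isPresent()) {
            return fromCube.get();                       // Cube Access path
        }
        // Pushdown path: convert to the source dialect first, then execute.
        return runner.execute(sqlConverter.apply(sql));
    }
}
```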

9. [Three bullets; text lost in extraction. The surviving fragments mention SQL.]

10. Data Source SDK – API: AbstractJdbcAdaptor
    // Load metadata
    public abstract List<String> listDatabases() throws SQLException;
    public abstract List<String> listTables(String database) throws SQLException;
    public abstract CachedRowSet getTable(String database, String table) throws SQLException;
    public abstract CachedRowSet getTableColumns(String database, String table) throws SQLException;
    // SQL conversion
    public abstract String fixSql(String sql);
    public abstract String[] buildSqlToCreateTable(String identity, Map<String, String> columnInfo);
    public abstract String[] buildSqlToCreateView(String viewName, String sql);
    public abstract String fixIdentifierCaseSensitve(String identifier);
    public abstract String toKylinTypeName(int sourceTypeId);
    public abstract String toSourceTypeName(String kylinTypeName);
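As an illustration of what a `toKylinTypeName(int sourceTypeId)` implementation might do, here is a standalone sketch that maps `java.sql.Types` constants to type names. It does not extend the real `AbstractJdbcAdaptor`. The lowercase names it returns are assumptions for illustration, and throwing on unmapped types is one possible choice.

```java
import java.sql.Types;

/** Illustrative only: a standalone type mapping, not the SDK's implementation. */
final class TypeMappingSketch {

    /** Maps a java.sql.Types constant to an assumed Kylin-side type name. */
    static String toKylinTypeName(int sourceTypeId) {
        switch (sourceTypeId) {
            case Types.TINYINT:
                return "tinyint";
            case Types.SMALLINT:
                return "smallint";
            case Types.INTEGER:
                return "integer";
            case Types.BIGINT:
                return "bigint";
            case Types.REAL:
            case Types.FLOAT:
                return "float";
            case Types.DOUBLE:
                return "double";
            case Types.DECIMAL:
            case Types.NUMERIC:
                return "decimal";
            case Types.CHAR:
                return "char";
            case Types.VARCHAR:
                return "varchar";
            case Types.DATE:
                return "date";
            case Types.TIMESTAMP:
                return "timestamp";
            case Types.BIT:
            case Types.BOOLEAN:
                return "boolean";
            default:
                // Fail loudly rather than guess a mapping for unknown types.
                throw new IllegalArgumentException("Unmapped JDBC type id: " + sourceTypeId);
        }
    }
}
```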

11. Data Source SDK – SQL
    select
      l_returnflag, o_orderstatus,
      sum(convert(l_quantity, double)) as sum_qty,
      sum(l_extendedprice) as sum_base_price
      …
    from v_lineitem inner join v_orders on l_orderkey = o_orderkey
    where l_shipdate <= '1998-09-16'
    group by l_returnflag, o_orderstatus
    Callouts on the slide: SUM; Convert / CAST (shown as CAST(l_quantity, decimal)); Filter; Join; Tables
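The Convert-to-CAST callout can be illustrated with a toy rewrite of the `convert(expr, type)` call from the query above. This is a regex sketch with a hypothetical method name and is not how the SDK converts SQL; a real converter would work on a parsed SQL tree. It emits standard `CAST(expr AS TYPE)` syntax, keeps the original type name, and only handles simple, non-nested arguments.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: rewrites simple convert(expr, type) calls to CAST(expr AS TYPE). */
final class ConvertToCastSketch {

    // Matches convert(<simple expression>, <type name>); nested calls are not handled.
    private static final Pattern CONVERT_CALL = Pattern.compile(
            "\\bconvert\\(\\s*([^(),]+?)\\s*,\\s*([A-Za-z_]+)\\s*\\)",
            Pattern.CASE_INSENSITIVE);

    static String rewriteConvert(String sql) {
        Matcher m = CONVERT_CALL.matcher(sql);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String cast = "CAST(" + m.group(1) + " AS "
                    + m.group(2).toUpperCase(Locale.ROOT) + ")";
            m.appendReplacement(out, Matcher.quoteReplacement(cast));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(rewriteConvert("sum(convert(l_quantity, double)) as sum_qty"));
    }
}
```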

12. Data Source SDK – <DATASOURCE_DEF NAME="kylin" ID="default"> [XML definition, reconstructed from font-garbled slide text]
    PROPERTY entries (NAME = VALUE): sql.default-converted-enabled = true; sql.allow-no-offset = true; sql.allow-fetch-no-rows = true; sql.allow-no-orderby-with-fetch = true; sql.keyword-default-escape = true; sql.keyword-default-uppercase = true; sql.case-sensitive = false; metadata.enable-cache = false
    TYPE_DEF entries (ID → EXPRESSION): Integer → INTEGER; Int → INT; Bigint → BIGINT; Tinyint → TINYINT; Smallint → SMALLINT; Short → SHORT; Long → LONG; Numeric → NUMERIC (parameterized); Decimal → DECIMAL (parameterized); Real → REAL; Date → DATE; Time → TIME; DateTime → DATETIME; TimeStamp → TIMESTAMP
    FUNCTION_DEF expressions (IDs not recoverable): MIN($0); MAX($0); CURRENT_DATE; CURRENT_TIMESTAMP; CAST($0 AS DATE); EXTRACT(DAY FROM $0); DAYOFYEAR($0); EXTRACT(MONTH FROM $0)

13. MySQL Adaptor

14. OLAP for Big Data [diagram; surviving label: Impala]

15. • More RDBMS / MPP DB / SQL on Hadoop [two further bullets not recoverable]

16. • http://kylin.apache.org/development/datasource_sdk.html • http://kylin.apache.org/blog/2019/01/16/introduce-data-source-sdk-v2.6.0

17. Our Contact. Address: [not recoverable]. Telephone: 021-61060928. E-mail: info@kyligence.io. Website: https://kyligence.io

18. THANK YOU

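Kyligence (上海跬智信息技术有限公司) was founded by the core team behind Apache Kylin, the first top-level open-source project of the Apache Software Foundation to come from China. It is a data technology company focused on big data analytics, and its mission is to accelerate customers' critical business decisions through insights from cutting-edge data technology.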
