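In this open class, Wu Yifei, a big data R&D engineer at Kyligence, introduces the SQL optimizer algorithms in Apache Calcite, covering Calcite's basic architecture and the flow and principles of SQL optimization. Topics include: Calcite's SQL processing flow, the role and use cases of the Optimizer, the basic principles of the Volcano optimizer, and a comparison with rule-based optimizers.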

Published by Kyligence on 2019/03/07


1. Apache Calcite SQL optimizer algorithms

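2. Agenda: 1. Calcite's SQL processing flow 2. Role and use cases of the Optimizer 3. Basic principles of the Volcano Optimizer 4. Comparison with rule-based optimizers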

3. Part 1: Calcite's SQL processing flow


5. Calcite SQL processing flow: SQL → SqlNode → SqlNode To RelNode (using the catalog) → Optimize

6. Part 2: Role and use cases of the Optimizer

7. Find the most efficient way to execute this query: select p_partkey as min_p_partkey, min(ps_supplycost) as min_ps_cost from tpch.part, tpch.partsupp where p_partkey = ps_partkey and p_name = 'EUROPE' group by p_partkey

8. The same query written two ways.
Comma join: select p_partkey as min_p_partkey, min(ps_supplycost) as min_ps_cost from tpch.part, tpch.partsupp where p_partkey = ps_partkey and p_name = 'EUROPE' group by p_partkey
Inner join: select p_partkey as min_p_partkey, min(ps_supplycost) as min_ps_cost from tpch.part inner join tpch.partsupp on p_partkey = ps_partkey where p_name = 'EUROPE' group by p_partkey
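To make the equivalence of the two forms concrete, here is a minimal sketch that runs both against SQLite from Python's standard library. This is not Calcite: the two tables are hypothetical miniature stand-ins for the TPC-H tables, only the columns the query touches are modeled, and the rows are made up for illustration.

```python
import sqlite3

# Hypothetical miniature stand-ins for tpch.part and tpch.partsupp.
# An attached in-memory database gives us the "tpch." schema prefix.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS tpch")
conn.execute("CREATE TABLE tpch.part (p_partkey INTEGER, p_name TEXT)")
conn.execute("CREATE TABLE tpch.partsupp (ps_partkey INTEGER, ps_supplycost REAL)")
conn.executemany("INSERT INTO tpch.part VALUES (?, ?)",
                 [(1, "EUROPE"), (2, "ASIA"), (3, "EUROPE")])
conn.executemany("INSERT INTO tpch.partsupp VALUES (?, ?)",
                 [(1, 10.0), (1, 7.5), (2, 3.0), (3, 9.0)])

# Form 1: comma join, with the join condition in the WHERE clause.
COMMA_JOIN = """
select p_partkey as min_p_partkey, min(ps_supplycost) as min_ps_cost
from tpch.part, tpch.partsupp
where p_partkey = ps_partkey and p_name = 'EUROPE'
group by p_partkey
"""

# Form 2: explicit inner join, with the join condition in the ON clause.
INNER_JOIN = """
select p_partkey as min_p_partkey, min(ps_supplycost) as min_ps_cost
from tpch.part inner join tpch.partsupp on p_partkey = ps_partkey
where p_name = 'EUROPE'
group by p_partkey
"""

def run(sql):
    # Sort so the comparison does not depend on row order.
    return sorted(conn.execute(sql).fetchall())

comma_rows = run(COMMA_JOIN)
inner_rows = run(INNER_JOIN)
print(comma_rows == inner_rows)
```

Both forms return the same rows; which physical plan is cheaper is the question the optimizer has to answer.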


11. Part 3: Basic principles of the Volcano Optimizer

12. RelSet, Traitset, RelOptRule, RelNode, Pattern


14. 1. 2. Apply Rule 3. 4. cost- 5. model

15. 1. select p_partkey as min_p_partkey, min(ps_supplycost) as min_ps_cost from tpch.part, tpch.partsupp where p_partkey = ps_partkey and p_name = 'EUROPE' group by p_partkey

16. 1. RelSet, RelSet BestCost
#Set0 LogicalAggregate(subset=[rel#33:Subset#5.ENUMERABLE.[]], group=[{0}], MIN_PS_COST=[MIN($1)])
#Set1 LogicalProject(subset=[rel#26:Subset#4.NONE.[]], MIN_P_PARTKEY=[$0], PS_SUPPLYCOST=[$7])
#Set2 LogicalFilter(subset=[rel#24:Subset#3.NONE.[]], condition=[=($0, $9)])
#Set3 LogicalJoin(subset=[rel#22:Subset#2.NONE.[]], condition=[true], joinType=[inner])
#Set4 TableScan(subset=[rel#19:Subset#0.NONE.[]], table=[[TPCH, PART]], fields=[[0, 1, 2, 3, 4]])
#Set5 TableScan(subset=[rel#20:Subset#1.NONE.[]], table=[[TPCH, PARTSUPP]], fields=[[0, 1, 2, 3, 4]])
2. Rule …. FilterJoinRule ….
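The listing above puts each operator of the initial plan in its own set. A minimal sketch of that registration step, assuming a simple pre-order walk that mirrors the #Set labels. The `RelNode` and `RelSet` classes and the `register_all` helper here are hypothetical Python stand-ins, not Calcite's Java classes or its actual registration order.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class RelNode:
    """Hypothetical stand-in for a relational operator in the plan tree."""
    kind: str
    inputs: List["RelNode"] = field(default_factory=list)

@dataclass
class RelSet:
    """Hypothetical stand-in: a set of logically equivalent operators."""
    set_id: int
    rels: List[RelNode] = field(default_factory=list)

def register_all(root: RelNode) -> List[RelSet]:
    """Give every operator its own set, numbering sets in pre-order."""
    sets: List[RelSet] = []

    def visit(node: RelNode) -> None:
        sets.append(RelSet(set_id=len(sets), rels=[node]))
        for child in node.inputs:
            visit(child)

    visit(root)
    return sets

# The initial logical plan of the example query, root first.
plan = RelNode("LogicalAggregate", [
    RelNode("LogicalProject", [
        RelNode("LogicalFilter", [
            RelNode("LogicalJoin", [
                RelNode("TableScan(PART)"),
                RelNode("TableScan(PARTSUPP)"),
            ]),
        ]),
    ]),
])

initial_sets = register_all(plan)
for s in initial_sets:
    print(f"#Set{s.set_id}: {s.rels[0].kind}")
```

At this point every set holds exactly one operator; applying rules is what adds equivalent alternatives to a set.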

17. Apply Rule
1. Rule FilterJoinRule
2. Pattern Rel NewJoinRel
3. newJoinRel Rel RelSet RelSet
#Set2 LogicalFilter(subset=[rel#24:Subset#3.NONE.[]], condition=[=($0, $9)])
newLogicalJoin(subset=[rel#34:Subset#2.NONE.[]], condition=[($0, $9)], joinType=[inner])
4. bestCost, Cost ParentSet cost Rule
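The rule-application steps above can be sketched in a few lines: match a Filter sitting directly on a Join, build a new Join that absorbs the filter condition, register it in the same set as the Filter, and update that set's best plan. Everything here is a hypothetical Python stand-in (the `Rel` and `RelSet` classes, `filter_join_rule`, and all cost numbers are made up for illustration), not Calcite's `FilterJoinRule` implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Rel:
    kind: str                              # e.g. "Filter", "Join", "TableScan"
    condition: str
    self_cost: float                       # made-up cost of this operator alone
    inputs: List["Rel"] = field(default_factory=list)

    def total_cost(self) -> float:
        return self.self_cost + sum(i.total_cost() for i in self.inputs)

@dataclass
class RelSet:
    rels: List[Rel] = field(default_factory=list)
    best: Optional[Rel] = None

    def register(self, rel: Rel) -> bool:
        """Add an equivalent rel; return True if it becomes the new best."""
        self.rels.append(rel)
        if self.best is None or rel.total_cost() < self.best.total_cost():
            self.best = rel
            return True
        return False

def filter_join_rule(rel: Rel) -> Optional[Rel]:
    """Pattern: a Filter directly on top of a Join.

    On a match, return a new Join that carries the filter condition.
    Returns None when the pattern does not match.
    """
    if rel.kind != "Filter" or len(rel.inputs) != 1:
        return None
    join = rel.inputs[0]
    if join.kind != "Join":
        return None
    condition = (rel.condition if join.condition == "true"
                 else f"AND({join.condition}, {rel.condition})")
    # Assumption: a join with a real condition is 10x cheaper than the
    # cross join it replaces. The factor is invented for this sketch.
    return Rel("Join", condition, join.self_cost / 10, join.inputs)

scan_part = Rel("TableScan", "", 100.0)
scan_partsupp = Rel("TableScan", "", 400.0)
cross_join = Rel("Join", "true", 40000.0, [scan_part, scan_partsupp])
filter_rel = Rel("Filter", "=($0, $9)", 40000.0, [cross_join])

set2 = RelSet()
set2.register(filter_rel)                  # the Filter is the only member so far
new_join = filter_join_rule(filter_rel)    # steps 1-2: match and transform
improved = set2.register(new_join)         # step 3: same set as the Filter
print(improved, set2.best.kind)            # step 4: best cost was updated
```

When `register` reports an improvement, a real planner would also propagate the cheaper cost to the parent sets and queue any rules that now match the new node; this sketch stops at the single set.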

18. 1. Rule 2. TargetCost cost 3. matc Rule

19. Cost-Model: 1. CPU 2. IO 3. RowCount
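A cost built from these three components can be sketched as a small value type. The `Cost` class below is hypothetical, and its comparison rule (row count first, then CPU, then IO) is an assumption chosen for illustration; it is not presented as the comparison Calcite itself uses.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cost:
    """Hypothetical three-component cost: rows processed, CPU, IO."""
    row_count: float
    cpu: float
    io: float

    def plus(self, other: "Cost") -> "Cost":
        # The cost of a plan is the sum of the costs of its operators.
        return Cost(self.row_count + other.row_count,
                    self.cpu + other.cpu,
                    self.io + other.io)

    def is_lt(self, other: "Cost") -> bool:
        # Assumption: compare lexicographically, row count first.
        return ((self.row_count, self.cpu, self.io)
                < (other.row_count, other.cpu, other.io))

# Made-up numbers: two join strategies over the same number of rows.
scan = Cost(row_count=1000, cpu=1000, io=50)
hash_join = Cost(row_count=1000, cpu=5000, io=0)
nested_loop_join = Cost(row_count=1000, cpu=1_000_000, io=0)

plan_a = scan.plus(hash_join)
plan_b = scan.plus(nested_loop_join)
print(plan_a.is_lt(plan_b))
```

Any total ordering over the three components would work for picking a best plan; weighting them into a single number is another common choice.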

20. Part 4: Comparison with rule-based optimizers

21. COST

22. Kyligence

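Kyligence (上海跬智信息技术有限公司) was founded by the core team of Apache Kylin, the first top-level open-source project of the Apache Software Foundation to originate from China. It is a data technology company focused on big data analytics, with a mission of accelerating users' key business decisions through insights from cutting-edge data technology.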
