Apache Kylin: A Powerful OLAP Tool for Big Data, Part 1
1. Apache Kylin: OLAP
Company: Kyligence. Speaker title: Senior R&D Engineer.
2. Agenda
• Apache Kylin Brief
• Kylin Core
• Use cases
• Kylin development history
• Kylin 3.0 roadmap
3. Apache Kylin: Leading OLAP-on-Hadoop Engine
• Top Level Project: Apache Kylin is the first Apache Top Level Project (TLP) from China.
• Community: an active user and developer community with diverse users.
• Recognition: InfoWorld Bossie Awards in 2015 and 2016.
• Technology: built with the leading big data technologies, Kylin can support massive data, high concurrency, and sub-second latency.
• Adoption: more than 1,000 deployments worldwide.
4. Kyligence = Kylin + Intelligence
Founded by the original Apache Kylin team:
• More than 50% of the Apache Kylin PMC
• Contributes 90% of the source code
• Builds a global open source community
Enterprise products powered by Kylin:
• Kyligence Enterprise: intelligent analytics platform
• Kyligence Cloud: analytics in the cloud
Areas shown on the slide: Professional Service, Enterprise Product, Automation & Monitoring, Cloud, Solution.
5. Kylin and Kyligence milestones
• 2014.11: Kylin open sourced; joins the Apache incubator
• 2015.11: Graduated to Apache Top Level Project
• 2016.3: Kyligence founded, with angel investment from Red Point Ventures
• 2016.8: Kyligence Enterprise announced
• 2016.9: Wins InfoWorld Bossie Award
• 2017.4: Series A financing, 8 million USD, led by Broadband Capital, Shunwei Capital, and Red Point
• 2017.5: Kyligence US branch founded
• 2017.12: Kyligence Cloud announced
• 2018.7: Series B financing, 15 million USD, led by Eight Roads Capital
• 2018.9: Apache Kylin v2.5 released
6. What is Apache Kylin
• 3 trillion records with query latency under 1 s at Toutiao, a top news feed app in China
• Cubes with 60+ dimensions at CPIC, a top-3 insurance company
• Interfaces: JDBC / ODBC / REST API
• BI integration: Tableau, Excel, Cognos, Superset, Redash, Qlik
• Stack shown in the diagram: BI (Visualization, Interactive Reporting, Dashboard); OLAP Engine; Hive / HDFS / Hadoop MapReduce; HBase/Parquet; Kafka
7. OLAP and OLAP Cube
• Online analytical processing, or OLAP, is an approach to answering multi-dimensional analytical (MDA) queries swiftly in computing. (Wikipedia)
• OLAP Cube is the core.
• An OLAP cube is a data structure optimized for very quick data analysis.
• Basic operations: Roll-up, Drill-down, Slice and dice, Pivot
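The basic operations listed above can be illustrated on a tiny in-memory fact table. This is only a sketch in plain Python: the rows, the dimension names (`year`, `region`, `product`), and the helper functions are hypothetical and are not taken from the slides or from Kylin.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, region, product, sales). Not from the slides.
FACTS = [
    (2017, "east", "tea", 10),
    (2017, "east", "milk", 5),
    (2017, "west", "tea", 7),
    (2018, "east", "tea", 3),
    (2018, "west", "milk", 8),
]
DIMS = ("year", "region", "product")


def aggregate(rows, keep):
    """Group rows by the dimensions named in `keep` and sum the sales measure."""
    idx = [DIMS.index(d) for d in keep]
    out = defaultdict(int)
    for row in rows:
        out[tuple(row[i] for i in idx)] += row[-1]
    return dict(out)


def slice_rows(rows, dim, value):
    """Slice: keep only the rows where one dimension has a fixed value."""
    i = DIMS.index(dim)
    return [r for r in rows if r[i] == value]


def pivot(table):
    """Pivot: swap the two axes of a two-dimensional aggregate."""
    return {(b, a): v for (a, b), v in table.items()}


# Drill-down view: sales by year and region.
detail = aggregate(FACTS, ("year", "region"))
# Roll-up: drop the region dimension and keep only year.
rolled_up = aggregate(FACTS, ("year",))
# Slice on region = "east", then view by year and product.
sliced = aggregate(slice_rows(FACTS, "region", "east"), ("year", "product"))
# Pivot the detail view so that region becomes the first axis.
pivoted = pivot(detail)
```

Roll-up and drill-down are the same aggregation run with fewer or more dimensions, which is why a cube that stores these aggregates in advance can serve both.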
8. Cube: balance between space and time
• Classification, aggregation, and sorting are performed in advance.
• A multi-dimensional model is turned into an OLAP cube.
9. Cube precomputation is the core concept in Kylin
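To make the idea of precomputation concrete, here is a minimal sketch that aggregates a small fact table for every combination of dimensions ahead of time (each such combination is what Kylin calls a cuboid). The data, the dimension names, and `build_cube` are hypothetical; this illustrates the principle only and is not Kylin's actual build algorithm.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical dimensions and rows: (flag, status, day, quantity).
DIMS = ("flag", "status", "day")
ROWS = [
    ("R", "F", 1, 10),
    ("R", "F", 2, 20),
    ("N", "O", 1, 5),
    ("N", "F", 2, 7),
]


def build_cube(rows, dims):
    """Precompute SUM(quantity) for every subset of dimensions (2^n cuboids)."""
    cube = {}
    for k in range(len(dims) + 1):
        for combo in combinations(range(len(dims)), k):
            table = defaultdict(int)
            for row in rows:
                table[tuple(row[i] for i in combo)] += row[-1]
            cube[tuple(dims[i] for i in combo)] = dict(table)
    return cube


CUBE = build_cube(ROWS, DIMS)

# At query time a GROUP BY becomes a dictionary lookup instead of a table scan.
by_flag = CUBE[("flag",)]
by_flag_status = CUBE[("flag", "status")]
grand_total = CUBE[()][()]
```

Three dimensions give 2^3 = 8 cuboids. The extra storage is the space spent to save query time.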
10. Apache Kylin is built with mainstream big data technologies
• Clients: data analysts, BI tools, web apps, and others, all querying with SQL
• Offline calculation: Extract, Load, Compute
• Online calculation: Optimize & Rewrite, Scan & Filter
11. SQL execution plan without Cube
Sample: check the relationship between order return flag and order status within a time range.

    select
      l_returnflag,
      o_orderstatus,
      sum(l_quantity) as sum_qty,
      sum(l_extendedprice) as sum_base_price
    from
      v_lineitem
      inner join v_orders on l_orderkey = o_orderkey
    where
      l_shipdate <= '1998-09-16'
    group by
      l_returnflag,
      o_orderstatus
    order by
      l_returnflag,
      o_orderstatus;

Execution plan: Tables → Join → Filter → Aggr → Sort
Without a cube, everything is calculated online. The query is CPU and I/O intensive, the cost is O(N), and the latency is high.
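The plan above (Join, Filter, Aggr, Sort) can be traced step by step on a few rows. The sketch below is plain Python with hypothetical sample data; only the column names are taken from the query. It shows why every query touches all N input rows when nothing is precomputed.

```python
from collections import defaultdict

# Hypothetical sample rows; column names follow the query on the slide.
LINEITEM = [
    {"l_orderkey": 1, "l_returnflag": "R", "l_quantity": 10,
     "l_extendedprice": 100.0, "l_shipdate": "1998-09-01"},
    {"l_orderkey": 1, "l_returnflag": "N", "l_quantity": 4,
     "l_extendedprice": 40.0, "l_shipdate": "1998-09-10"},
    {"l_orderkey": 2, "l_returnflag": "R", "l_quantity": 6,
     "l_extendedprice": 60.0, "l_shipdate": "1998-09-16"},
    {"l_orderkey": 2, "l_returnflag": "R", "l_quantity": 9,
     "l_extendedprice": 90.0, "l_shipdate": "1998-10-01"},
]
ORDERS = [
    {"o_orderkey": 1, "o_orderstatus": "F"},
    {"o_orderkey": 2, "o_orderstatus": "O"},
]


def run_without_cube(lineitem, orders, max_shipdate):
    """Evaluate the sample query from raw tables: join, filter, aggregate, sort."""
    # Join: inner join on l_orderkey = o_orderkey.
    status = {o["o_orderkey"]: o["o_orderstatus"] for o in orders}
    joined = [dict(l, o_orderstatus=status[l["l_orderkey"]])
              for l in lineitem if l["l_orderkey"] in status]
    # Filter: ISO dates compare correctly as strings.
    kept = [r for r in joined if r["l_shipdate"] <= max_shipdate]
    # Aggregate: sum quantity and price per (returnflag, orderstatus).
    groups = defaultdict(lambda: [0, 0.0])
    for r in kept:
        g = groups[(r["l_returnflag"], r["o_orderstatus"])]
        g[0] += r["l_quantity"]
        g[1] += r["l_extendedprice"]
    # Sort: order by returnflag, then orderstatus.
    return sorted((flag, st, q, p) for (flag, st), (q, p) in groups.items())


RESULT = run_without_cube(LINEITEM, ORDERS, "1998-09-16")
```

Each step loops over the full input, so the work grows linearly with the number of line items.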
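12. SQL execution plan with Cube
Online plan: Cube → Filter → Sort
Offline build: Tables → Join → Aggr → Cube
With precomputation, results come directly from the aggregated data (the cube) with an index. Much less CPU and I/O is needed and latency is small, because the table join and aggregation are completed offline.
Cost: the offline build is O(N); the online query is O(flag × status × days) = O(1), since the number of cube cells is bounded by the distinct dimension values and does not grow with N.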
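The split between offline build and online query can be sketched as follows. The joined rows and the two functions are hypothetical, and the cube here is a single table keyed by (flag, status, shipdate), a simplification assumed for illustration rather than Kylin's storage format.

```python
from collections import defaultdict

# Hypothetical rows, already joined:
# (l_returnflag, o_orderstatus, l_shipdate, l_quantity, l_extendedprice)
JOINED = [
    ("R", "F", "1998-09-01", 10, 100.0),
    ("R", "F", "1998-09-01", 2, 20.0),
    ("N", "F", "1998-09-10", 4, 40.0),
    ("R", "O", "1998-09-16", 6, 60.0),
    ("R", "O", "1998-10-01", 9, 90.0),
]


def build_cube(rows):
    """Offline step, O(N): aggregate once per (flag, status, shipdate) cell."""
    cube = defaultdict(lambda: [0, 0.0])
    for flag, status, day, qty, price in rows:
        cell = cube[(flag, status, day)]
        cell[0] += qty
        cell[1] += price
    return {key: tuple(cell) for key, cell in cube.items()}


def query_cube(cube, max_shipdate):
    """Online step: filter and sum cube cells only; raw rows are never read."""
    totals = defaultdict(lambda: [0, 0.0])
    for (flag, status, day), (qty, price) in cube.items():
        if day <= max_shipdate:
            total = totals[(flag, status)]
            total[0] += qty
            total[1] += price
    return sorted((f, s, q, p) for (f, s), (q, p) in totals.items())


CUBE = build_cube(JOINED)
RESULT = query_cube(CUBE, "1998-09-16")
```

The online loop runs over at most flag × status × days cells, however many raw rows were loaded, which is the bound the slide writes as O(1).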