1. HareQL: The Development of a Fast HBase Query Tool
Development of HBase Client and HareQL
Mon-Fong Mike Jiang (江孟峰), Kuan-Yu Hubert Fan-Chiang (范姜冠宇), Tienyu Rebecca Lin (林恬伃)

2. About Us
Providing IT solutions
• System development for big data solutions
• Smart manufacturing related services
• Financial data systems
• Telecommunication data systems
• We are a Cloudera-certified professional services team
Big data product since 2011
• Hare Data platform
• 2013 HSP Innovative Product Award
• 2014 Golden Award of the TOP 100 Innovative Products
• Cloudera Certified Technology (the only one in Taiwan)

3. What is Hare
• A NoSQL database built on HBase
• Runs SQL directly against HBase
• Provides a DBMS-like web UI
• Provides JDBC/ODBC drivers and a RESTful service

4. Why Hare?
Easy
• Click and start to use it
• Friendly user interface
• Get started with your big data quickly
Comfortable
• SQL language supported
• Data type management
• Multiple clusters in one client
Faster
• Quick access to the data in HBase
• Powerful query engine for better performance
Compatible
• Based on the Hadoop/HBase system
• Highly compatible with the ecosystem

5. Features
• Easy to use (web UI)
• Easy to install
• Friendly UI
• One client, many clusters (Connection Manager)
• Bulkload UI
• Meta Manager (Schema Manager)
• Relation between HBase tables and Hive tables
• HareQL (high-speed SQL queries in HBase)
• JDBC driver
• ODBC driver (Sentry not supported)
• RESTful services

6. Software Stack
[Diagram. Components shown: JDBC, ODBC (no Sentry) and Restful Service client access; HDFS Client and HBase Client; Solr Cloud (indexing); Spark; Monitor; Hare Core; Security (Sentry, Kerberos); Hive, HBase, Spark and Hadoop underneath.]

7. System Architecture

8.HareQL Hive: MapReduce We replace MapReduces in Hive to HBase coprocessors. We call the language “HareQL”. HareQL has some advantages as below.  Low- latency  Query HBase table directly  High performance


10. HareQL Architecture
• Using the Hive parser lets us support HiveQL.

11. From Hive to Hare
[Diagram comparing the two compilation flows.]
• Hive flow: Parser → AST → Semantic Analyzer → QB → Logical Plan Gen. → OP Tree → Logical Optimizer → OP Tree → Physical Plan Gen. → Task Tree → Physical Optimizer → Task Tree → Execution (Map, Reduce) → Result
• Hare flow: Hare Advance Parser → QB → HareSemanticAnalyzer → Task → Hare Plan Gen. → Coprocessor Execution Plan → Execution → Final Result

12. Metadata
• Anything that can be converted to an array of bytes can be stored in HBase. However, the bytes must be converted back using the correct type, or the data cannot be interpreted.
• We integrated the Hive metastore into the HBase client. We call the data type of an HBase column its "metadata".
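A small sketch of why the column type has to be recorded: the same four bytes decode to different values depending on the type assumed. The column names, the `column_meta` mapping, and the `decode_cell` helper are hypothetical and only illustrate the idea; integers are packed big-endian here, matching how HBase's `Bytes` utility encodes them.

```python
import struct

# Decoders keyed by type name; big-endian to match HBase's Bytes encoding of ints.
DECODERS = {
    "int": lambda raw: struct.unpack(">i", raw)[0],
    "string": lambda raw: raw.decode("utf-8"),
}

# Hypothetical per-column metadata: column name -> stored data type.
column_meta = {
    "cf:count": "int",
    "cf:name": "string",
}


def decode_cell(column, raw):
    """Look up the column's recorded type and decode the raw bytes with it."""
    return DECODERS[column_meta[column]](raw)


raw = b"ABCD"  # the same stored bytes in both cells

as_int = decode_cell("cf:count", raw)
as_text = decode_cell("cf:name", raw)
print(as_int, as_text)
```

Without the metadata, a client has no way to tell which of the two readings is the intended one.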

13. When to Get Metadata?
[Diagram. Stages shown: Advance Parser → QB → HareSemanticAnalyzer → Task Plan → Hare Coprocessor, with Hive Meta and Hare Meta as the metadata sources.]

14. Hare Restful Service
• Table manipulation
• Row manipulation
• Bulkload data
• Sending SQL
• Scanning
• Metadata manipulation

15. Semiconductor Application
[Diagram. Data sources shown: SQM System, SPC System, EDA System, PLM System, SCM System, MES System, FDC System, ERP System. Data types shown: structured data and unstructured data.]



18. Schema Design in HBase
• Design the row key according to the access requests, that is, according to how the data will be queried.
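As an illustration of designing a row key around the access request, the sketch below assumes a hypothetical query, "latest measurements for a given lot", that is not taken from the slides. The key is the lot ID followed by a reversed timestamp, so a lexicographic scan over the lot's prefix returns the newest rows first. The names `make_row_key` and `parse_ts`, the `|` separator, and the lot ID are assumptions.

```python
import struct

MAX_TS = 2**63 - 1  # largest signed 64-bit value, used to reverse timestamps


def make_row_key(lot_id, ts_millis):
    """Row key = lot ID, a separator, then the reversed timestamp (8 bytes, big-endian)."""
    return lot_id.encode("utf-8") + b"|" + struct.pack(">q", MAX_TS - ts_millis)


def parse_ts(row_key):
    """Recover the original timestamp from the last 8 bytes of the key."""
    return MAX_TS - struct.unpack(">q", row_key[-8:])[0]


# HBase stores rows sorted by key bytes; sorted() on bytes mimics that ordering.
keys = sorted(make_row_key("LOT42", ts) for ts in (1000, 3000, 2000))
ordered = [parse_ts(key) for key in keys]
print(ordered)
```

A different access request, such as "all lots processed in a given hour", would call for a different key layout, which is why the key is designed from the queries rather than from the data alone.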

19. Application – Yieldata
• Easy Selection
• Smart Filter
• Clear View of the Dataset

20. Yieldata – Root Cause Ranking

21. Thank you
• is-land Systems Inc.
• Company: www.is-land.com.tw
• Big Data: www.HareDB.com
• Email: service@haredb.com
• Address: 3F, No. 4, Zhanye 2nd Rd., Hsinchu Science Park (新竹科學園區展業二路4號3樓)