Cursors in Apache Phoenix


1. Cursors in Apache Phoenix. HBaseCon West 2017, June 12, 2017. Anirudha Jadhav (ajadhav2@bloomberg.net), Biju Nair (bnair10@bloomberg.net). © 2017 Bloomberg Finance L.P. All rights reserved.

2. Bloomberg: leading data and analytics provider for the financial industry

3. Bloomberg is a data company

4. Reality of working with data
• The data model changes over time
• Users querying the data model don’t necessarily change
• Alternate query patterns for the same dataset
• Data infrastructure usage needs to be simple

5. Apache Phoenix
• Recipes of best practices for using HBase, exposed through a familiar SQL-like grammar
• It is so much more than SQL
  o User-defined functions for push-down
  o Secondary indices
  o Statistics collection, optimizations based on heuristics
  o ORM libraries
  o JDBC and ODBC support with Query Servers
  o Integrations: Spark, Kafka, MapReduce and others

6. Extending Apache Phoenix
• A very active and helpful community
• Our ongoing work
  o Apache Calcite
  o Distributed tests and nightly performance build
  o Multi-DC replication
  o Deep paging with cursor implementation

7. HBase
[Architecture diagram: Application with HBase Client; ZooKeeper Quorum; HBase Master; three RegionServers, each on an HDFS DataNode]

8. Phoenix
[Architecture diagram: Application with Phoenix Client layered on the HBase Client; ZooKeeper Quorum; HBase Master; three RegionServers, each with Phoenix coprocessors and RPC endpoints, on HDFS DataNodes; SYSTEM.CATALOG and SYSTEM.STATS tables hosted on the RegionServers]
Sources:
https://www.slideshare.net/enissoz/apache-phoenix-past-present-and-future-of-sql-over-hbase
http://phoenix.apache.org/presentations/OC-HUG-2014-10-4x3.pdf

9. Phoenix Client
• Connection management: HBase Client
• Authentication: HBase Client
• SQL parsing: ANTLR4
• Query rewrite / optimization: hints, rules
• Query plan generation: rules
• Transaction management: Tephra

10. Phoenix query execution
Connection con = DriverManager.getConnection("jdbc:phoenix:zkquorum:2181:/hbase:principal:keytabfile");
…
PreparedStatement statement = con.prepareStatement("select * from TBL");
…
ResultSet rset = statement.executeQuery();
…
// next() returns a boolean: true while rows remain
while (rset.next()) …
rset.close();
…

11. Phoenix query execution
• getConnection
  o Connect to HBase
• prepareStatement
  o Parse SQL statement
  o Read/cache metadata
  o Validate SQL statement
• executeQuery
  o Create query plan
  o Optimize query plan
  o Create result iterator
  o Create Phoenix ResultSet
• close()
  o Close ResultSet

12. Phoenix Server
[Architecture diagram: Application with Phoenix Client and HBase Client sending metadata, write, and read requests to three RegionServers. The RegionServer hosting SYSTEM.CATALOG shows MetaDataEndPointImpl and MetaDataRegionObserver; the two RegionServers hosting USER_TABLE each show UngroupedAggregateRO, GroupedAggregateRO, and ScanRegionObserver; Indexer, Index, and ServerCachingEndpointImpl also appear]

13. Cursors
• To support row pagination
  o Should support forward and backward traversal
• Support required for select queries only
• Data needs to be consistent during traversal

14. Cursors
• DECLARE tCursor CURSOR FOR SELECT * FROM TBL
• OPEN tCursor
• FETCH NEXT 10 ROWS FROM tCursor
• FETCH PRIOR 5 ROWS FROM tCursor
• CLOSE tCursor

15. Implementation options
• PHOENIX-2606
• Use row value constructors
  o Requires query rewrite; complex
• Wrapper over available query ResultSets
  o Can leverage existing ResultSets, so relatively simple
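The row value constructor option can be illustrated with a minimal sketch. It assumes a table whose primary key columns are known; the class RvcPaging, the method nextPageQuery, and the column names PK1 and PK2 used below are hypothetical and are not part of Phoenix or of this talk. The sketch only builds the SQL text for a "next page" query that resumes after the last primary key values seen, which is the kind of query rewrite the slide describes as complex.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper for illustration only; not part of Phoenix. */
final class RvcPaging {
    private RvcPaging() {}

    /**
     * Builds a "next page" query that resumes after the last primary-key
     * values seen, using a row value constructor comparison. The caller
     * binds the last seen key values to the "?" placeholders, in order.
     */
    static String nextPageQuery(String table, List<String> pkColumns, int pageSize) {
        if (pkColumns.isEmpty()) {
            throw new IllegalArgumentException("at least one primary key column is required");
        }
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        String cols = String.join(", ", pkColumns);
        // One bind placeholder per primary key column.
        String binds = pkColumns.stream().map(c -> "?").collect(Collectors.joining(", "));
        return "SELECT * FROM " + table
                + " WHERE (" + cols + ") > (" + binds + ")"
                + " ORDER BY " + cols
                + " LIMIT " + pageSize;
    }
}
```

For a table TBL keyed on the assumed columns PK1 and PK2 with a page size of 10, the helper returns SELECT * FROM TBL WHERE (PK1, PK2) > (?, ?) ORDER BY PK1, PK2 LIMIT 10. Every page needs a rewritten query and the previous page's last key, which is the bookkeeping the wrapper option avoids.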

16. Cursor lifecycle
PreparedStatement statement = con.prepareStatement("DECLARE tCursor CURSOR FOR SELECT * FROM TBL");
statement.execute();
…
statement = con.prepareStatement("OPEN tCursor");
statement.execute();
statement = con.prepareStatement("FETCH NEXT FROM tCursor");
// executeQuery() returns the ResultSet; execute() only returns a boolean
ResultSet rset = statement.executeQuery();
while (rset.next()) …
statement = con.prepareStatement("CLOSE tCursor");
statement.execute();

17. Cursor lifecycle
• DECLARE CURSOR
  o Parse SQL statement
  o Create/optimize QueryPlan
  o Create CursorWrapper
• OPEN CURSOR
  o Set cursor status to open
• FETCH
  o Execute CursorFetchPlan
  o Create CursorResultIterator
  o Create Phoenix ResultSet
• CLOSE
  o Close cursor
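The lifecycle above can be sketched as a small in-memory state machine. This is a toy model, not the Phoenix implementation: the class InMemoryCursor and its methods are hypothetical, an in-memory list stands in for the query result, and the FETCH PRIOR semantics (rows before the current position, nearest first) are an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Toy cursor over an in-memory list; hypothetical, not Phoenix code. */
final class InMemoryCursor<T> {
    private enum State { DECLARED, OPEN, CLOSED }

    private final List<T> rows;            // stands in for the query's result set
    private State state = State.DECLARED;  // DECLARE happens at construction
    private int position = 0;              // index of the next row a forward fetch returns

    InMemoryCursor(List<T> rows) {
        this.rows = new ArrayList<>(rows); // private copy: later changes to the source are not seen
    }

    /** OPEN: only a freshly declared cursor may be opened. */
    void open() {
        if (state != State.DECLARED) {
            throw new IllegalStateException("cursor was already opened or closed");
        }
        state = State.OPEN;
    }

    /** FETCH NEXT n ROWS: returns up to n rows after the position and moves forward. */
    List<T> fetchNext(int n) {
        check(n);
        int end = Math.min(rows.size(), position + n);
        List<T> page = new ArrayList<>(rows.subList(position, end));
        position = end;
        return page;
    }

    /** FETCH PRIOR n ROWS: returns up to n rows before the position, nearest first, and moves back. */
    List<T> fetchPrior(int n) {
        check(n);
        int start = Math.max(0, position - n);
        List<T> page = new ArrayList<>(rows.subList(start, position));
        Collections.reverse(page);
        position = start;
        return page;
    }

    /** CLOSE: any later fetch fails. */
    void close() {
        state = State.CLOSED;
    }

    private void check(int n) {
        if (state != State.OPEN) {
            throw new IllegalStateException("cursor is not open");
        }
        if (n < 0) {
            throw new IllegalArgumentException("row count must not be negative");
        }
    }
}
```

Buffering the rows is what makes backward traversal cheap in this model; a wrapper over a forward-only ResultSet would need a comparable cache, which is where the cache sizing question on the next slide comes from.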

18. Cursor challenges
• Data consistency
  o Query start timestamp provides snapshot consistency
• Optimization
  o Use Scan object for non-aggregate queries
• Cache sizing
  o Dynamic sizing
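The snapshot consistency point can be illustrated with a toy model of timestamped cell versions. The class VersionedStore below is hypothetical and is not HBase or Phoenix code; it assumes every write carries a timestamp and that a read at a fixed snapshot timestamp returns the newest version written at or before it. Reading every page of a traversal at the query start timestamp then hides writes that arrive while the cursor is open.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Toy multi-version store; hypothetical, for illustration only. */
final class VersionedStore {
    // rowKey -> (write timestamp -> value), versions sorted by timestamp
    private final Map<String, TreeMap<Long, String>> cells = new HashMap<>();

    /** Records a new version of the row at the given timestamp. */
    void put(String rowKey, long timestamp, String value) {
        cells.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(timestamp, value);
    }

    /**
     * Returns the newest value written at or before snapshotTs,
     * or null if the row had no version at that point in time.
     */
    String get(String rowKey, long snapshotTs) {
        TreeMap<Long, String> versions = cells.get(rowKey);
        if (versions == null) {
            return null;
        }
        Map.Entry<Long, String> entry = versions.floorEntry(snapshotTs);
        return entry == null ? null : entry.getValue();
    }
}
```

In this model a cursor would capture one timestamp when its query starts and pass it to every read, so later pages agree with earlier ones even if rows are updated or inserted in between.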

19. Contributors
• Gabriel Jimenez (MIT)
• Anirudha Jadhav (Bloomberg)
• Biju Nair (Bloomberg)
• Ankit Singhal (Hortonworks)

20. Q&A. Thank you. Hardening Apache Phoenix @PhoenixCon, tomorrow 2 PM PT
