1 .Project Status HBaseConAsia2018, Beijing Michael Stack <stack@apache.org> Yu Li <liyu@apache.org>
2 .2.0.0
3 .Pervasive... ...distributed, scalable, big data store
4 . In a nutshell...
● ...15,409 commits made by 311 contributors
● ...representing 800,490 lines of code
● ...mostly written in Java
● ...has a well established, mature codebase
● ...maintained by a very large development team
● ...with stable Y-O-Y commits
● ...took an estimated 222 years of effort (COCOMO model)
● ...starting with its first commit in April, 2007 (>10 years old!)
Source: https://www.openhub.net/p/hbase
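For a rough sense of where a figure like "222 years of effort" comes from, here is a minimal sketch of the Basic COCOMO estimate applied to the line count quoted on the slide. It assumes the textbook "organic mode" coefficients (2.4 and 1.05); whether the source site uses exactly these values is an assumption, and the function name is purely illustrative.

```python
# Basic COCOMO, "organic" mode: effort in person-months = a * KLOC ** b.
# ASSUMPTION: a = 2.4 and b = 1.05 are the textbook organic-mode coefficients;
# the slide does not say which coefficients its source used.
A_ORGANIC = 2.4
B_ORGANIC = 1.05


def cocomo_person_years(lines_of_code: int) -> float:
    """Estimate effort in person-years from a raw line count (hypothetical helper)."""
    kloc = lines_of_code / 1000.0               # COCOMO works in thousands of lines
    person_months = A_ORGANIC * kloc ** B_ORGANIC
    return person_months / 12.0                 # convert months to years


hbase_loc = 800_490  # line count quoted on the slide
print(f"{cocomo_person_years(hbase_loc):.1f} person-years")
```

Under these assumptions the estimate lands within a few person-years of the quoted 222, which suggests the slide's figure is a straightforward function of code size rather than a measurement of actual time spent.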
5 . LOC Source https://www.openhub.net/p/hbase
6 .Issues
7 . Commits per month Source https://www.openhub.net/p/hbase
8 . Contributors Source https://www.openhub.net/p/hbase
9 . New PMC Chairperson!
“The HBase project represents solid, useful computer science that solves problems and runs businesses every day. To keep that going, we need to keep bringing new ideas and approaches into the project...we need to continue to attract people from all backgrounds and parts of the world. I'd love to see more women, more people of color, and even more worldwide diversity. I'd like to see more contributions from people not employed by big data platform companies.
“If we continue to strive for a diversity of ideas and experiences, we'll keep innovating so that HBase remains relevant for years to come.”
Misty Linville, Vice-President of the Apache HBase Project
10 . Our project
● Apache HBase is an Open Source Apache project.
● It’s what we want to make of it.
● No owners!
● Anyone can help!
● All welcome!
● The more, the merrier!
11 . Active branches
Active Branch   Latest Branch Release     Release Manager
branch-1.2      1.2.6.1 (EOL’d)           Sean Busbey
branch-1.3      1.3.2.1 (Yahoo)           Francis Liu
branch-1.4      1.4.6 (Current Stable)    Andrew Purtell
branch-1.5      Coming...                 Andrew Purtell
branch-2.0      2.0.1                     Michael Stack
branch-2.1      2.1.0                     Duo Zhang
branch-2.2      <null>                    <null>
branch-3        <null>                    <null>
12 . 2.0.0 hbase-2.0.0
13 . 2.0.0: Long time coming
● Branched four years ago
● Released end of April, 2018
● Took > 1 year to stabilize
  ○ hbase-2.0.0 released April 29th, 2018
  ○ hbase-2.0.0-beta2 released March 22nd, 2018
  ○ hbase-2.0.0-beta1 released January 16th, 2018
  ○ hbase-2.0.0-alpha4 released November 4th, 2017
  ○ hbase-2.0.0-alpha3 released September 17th, 2017
  ○ hbase-2.0.0-alpha2 released August 21st, 2017
  ○ hbase-2.0.0-alpha1 released June 22nd, 2017
● Multiple Release Managers
  ○ Matteo Bertozzi, Stephen Yuan Jiang, yours truly...
14 . Let's not do this again!
● Backed-up mountains of “Tech Debt”
  ○ Rotted Unit Tests
    ■ “...99/100 it was the test, not Apache Infra”
  ○ Performance regressions
    ■ Not out of the woods yet…
15 .2.0.0 S
16 . Goals: Compatibility
● Double-down on Semantic Versioning, semver
  ○ Adopted in hbase-1.0.0
  ○ MAJOR.MINOR.PATCH[-IDENTIFIER]
    ■ E.g. 2.0.0-alpha1
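The MAJOR.MINOR.PATCH[-IDENTIFIER] scheme above can be illustrated with a small parser. This is a minimal sketch, not anything shipped with HBase: the regular expression, the `Version` tuple, and the `same_major` helper are all hypothetical names introduced here, and the "same MAJOR means no intentional API break" check reflects semver's general rule rather than any HBase-specific guarantee.

```python
import re
from typing import NamedTuple, Optional

# MAJOR.MINOR.PATCH with an optional "-IDENTIFIER" suffix, e.g. 2.0.0-alpha1.
# ASSUMPTION: identifiers are limited to letters, digits, dots and hyphens.
VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?$")


class Version(NamedTuple):
    major: int
    minor: int
    patch: int
    identifier: Optional[str]  # None for a plain release such as 2.0.0


def parse_version(text: str) -> Version:
    """Split a version string into its semver components."""
    match = VERSION_RE.match(text)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH[-IDENTIFIER] version: {text!r}")
    major, minor, patch, identifier = match.groups()
    return Version(int(major), int(minor), int(patch), identifier)


def same_major(a: str, b: str) -> bool:
    """Under semver, API-breaking changes are reserved for a MAJOR bump."""
    return parse_version(a).major == parse_version(b).major


print(parse_version("2.0.0-alpha1"))
print(same_major("2.0.1", "2.1.0"))
```

A check like `same_major` only covers the API promise that semver makes; it says nothing about wire formats, on-disk formats, or dependencies.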
17 . Goals: Compatibility
● But…
  ○ Semantic Versioning is about API only
    ■ What about…
      ● Internal/External Interfaces
        ○ Where is the Client Interface when using Spark/MapReduce?
● It’s complicated...
18 . Goals: Compatibility
● From Hadoop… Yetus, annotations
  ○ InterfaceAudience.Public
    ■ Get/Put/Scan/Connection
  ○ InterfaceAudience.LimitedPrivate
    ■ Coprocessors, Replication, etc.
  ○ InterfaceAudience.Private
    ■ Internal only
● What about…
  ○ Source/Binary compatibility
  ○ Serializations
    ■ Wire
    ■ Formats in HDFS/ZooKeeper
  ○ Dependencies
● See refguide semver section
  ○ http://hbase.apache.org/book.html#hbase.versioning
19 . Goals: Compatibility
● Grey areas…
  ○ Coprocessors
    ■ Free access to HBase core
    ■ Change to HBase internals => broken Coprocessors
    ■ InterfaceAudience.LimitedPrivate
  ○ Published metrics/JMX
  ○ Protobufs
    ■ hbase-protocol/hbase-protocol-shaded
20 . Goals: Compatibility in 2.0.0
● We adhere to SemVer for DML in 2.x
  ○ Not for DDL
● hbase-1.x client can work against hbase-2.x cluster
  ○ Even 1.x Coprocessor Endpoints work on an hbase-2.x cluster
  ○ Read-only DDL/Admin of hbase-2.x from hbase-1.x client
  ○ Replication between hbase-1 and hbase-2 works
● Extensive curation of what is public/private
● Purged Guava/Protobuf from API
● Coprocessors
  ○ Revamped
● No Singularity! No downtime! Rolling upgrade from hbase-1!
  ○ Experimental! From 1.4.x to 2.1.x has been tested.
21 . Goals: Compatibility
● Still plenty to do
  ○ Ongoing effort...
  ○ 3.0.0!
22 . Goals: Others
● Scale
  ○ More Regions, bigger clusters
● Performance
  ○ Inline read/write but also macro-aspect: restart, assign, etc.
  ○ Better resource utilization
    ■ I/O, RAM
● Fix primary root of operational woes/bugs
  ○ Master Region Assignment
23 . Insides
● Scale
  ○ More Regions, bigger clusters
● Performance
  ○ Inline read/write but also macro-aspect: restart, assign, etc.
  ○ Better resource utilization
    ■ I/O, RAM
● Fix primary root of operational woes/bugs
  ○ Master Region Assignment
● Cleanup
  ○ Spark narrative
  ○ Interfaces
24 . Insides
● Currently >4500 issues resolved
  ○ ~3k exclusive to 2.0.0+
25 . Insides: Prerequisites
● JDK8 only
● Hadoop-2.7.7 minimum*
  ○ Works against the coming Hadoop-3.x
*Be wary of “...not stable / production ready” Hadoops
26 . Insides: Features
● New Master Core (A.K.A. AMv2)
● Off-heap Read/Write path
● In-memory Compaction (“Accordion”)
● And more...
27 . Insides: Assignment Manager
● New Master Core (A.K.A. AMv2)
  ○ Assignment Manager v1 (AMv1) root of many operational headaches
● Prompt assign of millions of Regions, faster startup, larger scale
● Scrutable/Standalone Testable
● One hbase:meta writer only, the Master
● No more intermediate state in ZK
  ○ At other end of an RPC...
  ○ Only final state published to hbase:meta
  ○ No more distributed state: some in Master memory, some in ZK, some in HDFS.
● New degree of Resilience
28 . Insides: Off-heap
● Smaller JVM heaps, less copying
  ○ But more accounting!
● Off-heap Read Path
  ○ HDFS=>BucketCache=>Outbound Socket ~latency OFF
    ■ Cache more
    ■ Less GC, less erratic
● Off-heap Write Path
  ○ RPC=>HDFS data kept off-heap
    ■ Async DFS WAL Client
● Off-heap
  ○ Socket Socket
  ○ Off-heap fragmentation anyone?
  ○ On by default?
29 . Insides: Off-heap (diagram: Before / After)