1. hosted by Kerberos-based Big Data Security Solution and Practice in Alibaba Cloud HBase Jiajia Li @ Intel Chao Guo @ Alibaba August 17, 2018
2. Content: 01 Hadoop Authentication Service; 02 Security Practice in ApsaraDB for HBase
3. 01 Hadoop Authentication Service. Jiajia Li, Intel
4. Content: 1.1 Background; 1.2 Introduction to HAS; 1.3 Outlook and Summary
5. 1.1 Background
6. Background: Motivations
• In January 2017, ransomware attacks hit poorly secured MongoDB instances ("Over 27000 MongoDB Databases Held For Ransom Within A Week").
• Cyber crooks then started targeting unprotected Hadoop clusters as well ("Hadoop, CouchDB Next Targets in Wave of Database Attacks").
• To date, large amounts of Hadoop cluster data are still exposed on the public network ("Insecure Hadoop Clusters Expose Over 5,000 Terabytes of Data").
7. Background: Multiple Ways to Attack an Insecure Hadoop Cluster
• Service: a process claims "I'm a DataNode" to the Hadoop* cluster.
• User: logs in to the Hadoop* cluster as hadoop and runs hadoop fs -rmr /
• HDFS with ACL, superuser: hadoop
8. Background: How to Secure a Hadoop Cluster
Authentication
• Kerberos is the approach adopted for Hadoop security
• MIT Kerberos, Azure AD, HAS
Authorization
• Apache Sentry (Cloudera), Apache Ranger (Hortonworks)
9. 1.2 Introduction to HAS
10. Introduction to HAS: Challenges for the Existing Solution
• It is hard to integrate enterprises' existing identity management systems with Kerberos.
• Over the past few years, multiple cloud providers have introduced Hadoop-as-a-Service, and more organizations consider the cloud a component of their Hadoop deployments.
• Java lacks a comprehensive Kerberos library. The Kerberos support in Java/JRE is:
  • Limited, lacking full encryption and checksum types
  • Hidden behind the GSSAPI/SASL layers
  • Slow to evolve
• Deploying Kerberos across a cluster is very difficult.
• Kerberos is essentially a protocol, or secure channel; it does not have to be that complex for most ordinary users, and the details can be hidden.
11. Introduction to HAS: HAS System Architecture
HAS is a secure and extensible authentication framework for addressing the problem of integrating enterprise identity with the Kerberos-centric Hadoop ecosystem.
[Figure: HAS system architecture, steps 1 to 6. Components: HAS Client, Token Authority, HAS KDC Server (Kerby), Kerberos Client, Hadoop. Enterprise identity sources: Active Directory, LDAP, RDBMS, OAuth, SSO (SAML), PKI, OTP. Labels: HAS Token, Enterprise Identity, Kerberos Token.]
12. Introduction to HAS: Key points of HAS implementation
• Hadoop services continue to use the original Kerberos authentication mechanism.
• Hadoop users can also continue to log in using a familiar authentication mode.
• In the new authentication mechanism, security administrators don't need to synchronize user account information to the Kerberos database.
• Based on the new authentication mechanism, the combination of your plugin and the existing authentication system can be customized and implemented.
13. Introduction to HAS: HAS protocol flow
14. Introduction to HAS: HAS plugins
15. 1.3 Outlook and Summary
16. Outlook and Summary
• The new authentication mechanism provided in HAS (Kerberos-based Token Authentication) supports most components in the Hadoop ecosystem and requires little or no change to those components.
• HAS is open source in a branch of the Apache Kerby project (https://github.com/apache/directory-kerby/tree/has-project).
• The merge of HAS into trunk is tracked in the master JIRA (https://issues.apache.org/jira/browse/DIRKRB-671).
• According to the community plan, the HAS feature will be released in Kerby 2.0.0, and Kerby 2.0.0 will be released in the near future.
17. 02 Security Practice in ApsaraDB for HBase. Chao Guo, Aliyun
18. Content: 2.1 Introduction to Apache HBase Security and Security of ApsaraDB for HBase; 2.2 ApsaraDB for HBase Optimization Based on HAS; 2.3 Outlook and Summary
19. 2.1 Introduction to Apache HBase Security and Security of ApsaraDB for HBase
20. Introduction to Apache HBase Security and Security of ApsaraDB for HBase: Introduction to Apache HBase Security
Apache HBase security includes:
  1. Access control labels based on the AccessController coprocessor;
  2. Secure HTTP (HTTPS) for the Web UI;
  3. Kerberos authentication for RPC.
[Figure labels: User, Auth, client, HTTPS, Web, Apache HBase, ACL, SASL RPC, HTTPS, HDFS (NN, DN), ZooKeeper]
21. Introduction to Apache HBase Security and Security of ApsaraDB for HBase: Introduction to ApsaraDB for HBase security
[Figure labels: public network user, classic network user, other user; whitelist; plugin authentication mechanism; VPC/Classic network isolation; ApsaraDB for HBase: Authentication (HAS), ACL, HTTPS, AUDIT, QUOTA, ENCRYPT]
ApsaraDB for HBase main functions:
• Network isolation and whitelist;
• HAS authentication;
• …
22. Introduction to Apache HBase Security and Security of ApsaraDB for HBase: Introduction to ApsaraDB for HBase Authentication
With plain Kerberos:
• Install MIT Kerberos locally;
• Set up the local krb5.conf with the Kerberos service address, realm, and so on for the HBase cluster;
• The user name must match the local system user name; if no such user exists, create one;
• Set the security configuration in hbase-site.xml, core-site.xml, and hdfs-site.xml, for example hadoop.security.authentication and dfs.namenode.kerberos.principal;
• Run kinit in password mode or keytab mode to obtain a valid user ticket;
• Access HBase through the HBase shell.
VS
With ApsaraDB for HBase:
• Set the HBase ZooKeeper address and the user's name and password;
• Access HBase through the HBase shell.
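The configuration step in the plain-Kerberos list can be illustrated with a minimal Python sketch that renders Hadoop-style *-site.xml content from a dictionary. The two property names come from the slide; the helper name render_site_xml, the realm EXAMPLE.COM, and the principal value are placeholders assumed for illustration, not values from the talk.

```python
import xml.etree.ElementTree as ET


def render_site_xml(properties):
    """Render a dict of properties as a Hadoop-style *-site.xml string.

    Hypothetical helper for illustration; it is not part of Hadoop or HBase.
    """
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Property names are the ones listed on the slide.
# The principal and realm below are assumed placeholder values.
core_site = {"hadoop.security.authentication": "kerberos"}
hdfs_site = {"dfs.namenode.kerberos.principal": "hdfs/_HOST@EXAMPLE.COM"}

core_xml = render_site_xml(core_site)
hdfs_xml = render_site_xml(hdfs_site)
print(core_xml)
print(hdfs_xml)
```

Even in this reduced form, every client needs the correct realm and principal values, which is part of the setup cost the two-step ApsaraDB for HBase flow is meant to remove.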
23. Introduction to Apache HBase Security and Security of ApsaraDB for HBase: User experience
User needs start from security, then low cost, then a good experience.
• Affinity: simplified configuration; easy to use
• Cost: reduce the cost of operation and maintenance; performance
• Security: network isolation / whitelist / authentication
24. 2.2 ApsaraDB for HBase Optimization Based on HAS
25. ApsaraDB for HBase Optimization Based on HAS: Basic Introduction
Why did we choose HAS?
• Kerberos is the only means to enable Hadoop security
• Compatible with existing security mechanisms: ACL, Kerberos, …
• Easy to deploy
• Easy to use for clients
• Good scalability
• Acceptable performance
• Low operating cost
• …
26. ApsaraDB for HBase Optimization Based on HAS: ApsaraDB for HBase's authentication improvement
• Security and practice: whitelist for hosts and other security measures
• Account password management: use a plugin for password and account management
• High availability backend: implement a ZooKeeper-like backend to store user name / password / whitelisted hosts / kdcconf
27. ApsaraDB for HBase Optimization Based on HAS: Security and practice
• HA for the HAS server
• Whitelist for host access
• Plugins use HTTPS (initialization/usage)
• Configurable salting algorithm
• Backward compatible with Kerberos and all HBase security features
• Ops tools: one-click deployment
28. ApsaraDB for HBase Optimization Based on HAS: Account password management
We have built some services on HAS, such as:
Aliyun client/server plugin mode
• The client can initialize with its user / password / hosts
• The client passes the user / password over HTTPS
• The server plugin verifies the credentials against salted values in storage
• The entire process is safe
• Configuration-free and easy to use
Other plugin mode:
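The "verify from storage with salt" step can be sketched as follows. The slides say only that the salting algorithm is configurable, so PBKDF2-HMAC-SHA256, the iteration count, and the function names below are assumptions chosen for illustration, not the ApsaraDB for HBase implementation.

```python
import hashlib
import hmac
import os

# Assumed work factor; the talk does not specify an algorithm or parameters.
ITERATIONS = 100_000


def hash_password(password, salt=None):
    """Return (salt, digest) for a password, generating a random salt if needed."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the salted digest and compare it in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


# At initialization the server stores only the salt and the digest.
salt, stored = hash_password("s3cret")

# At login the server plugin checks the password received over HTTPS.
print(verify_password("s3cret", salt, stored))
print(verify_password("wrong", salt, stored))
```

Storing only the salt and digest means the backend never holds the plaintext password, and the constant-time comparison avoids leaking how many leading bytes matched.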
29. ApsaraDB for HBase Optimization Based on HAS: High availability backend
• MySQL does not fit the cloud environment: a single MySQL instance is not HA, and a 2- or 3-node HA MySQL setup does not suit cloud deployment; the data is in key-value format, so ZooKeeper is enough.
• High availability for backend storage: at least 3 replicas, more available than MySQL.
• Fewer resources and enough performance: most values are smaller than 50 B, so it uses less memory than MySQL and still performs well enough.
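To make the key-value argument concrete, here is a minimal sketch of how the user name, salted password, and whitelisted hosts could be laid out as small values under path-like keys. KVBackend is an in-memory stand-in written for this example, not a ZooKeeper client, and the /has/users/... path layout is a hypothetical naming scheme rather than the one used in ApsaraDB for HBase.

```python
class KVBackend:
    """In-memory stand-in for a ZooKeeper-like key-value store (illustration only)."""

    def __init__(self):
        self._data = {}

    def put(self, path, value):
        self._data[path] = value

    def get(self, path):
        return self._data.get(path)

    def max_value_size(self):
        """Size in bytes of the largest stored value."""
        return max((len(v) for v in self._data.values()), default=0)


def host_allowed(backend, user, host):
    """A host is allowed if an entry for it exists under the user's whitelist."""
    return backend.get(f"/has/users/{user}/whitelist/{host}") is not None


backend = KVBackend()
# Hypothetical layout: one small value per key.
backend.put("/has/users/alice/salt", b"\x01" * 16)
backend.put("/has/users/alice/digest", b"\x02" * 32)
backend.put("/has/users/alice/whitelist/10.0.0.5", b"")

print(host_allowed(backend, "alice", "10.0.0.5"))
print(host_allowed(backend, "alice", "10.0.0.9"))
print(backend.max_value_size())
```

Every value in this layout stays under the 50 B figure quoted on the slide, which is the kind of workload a replicated key-value store handles without a relational database.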