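Building a Scalable Privacy-Preserving Machine Learning System with Analytics Zoo and Intel SGX, Part II

This talk explains how to build a scalable privacy-preserving machine learning system on Intel SGX using Analytics Zoo and Graphene-SGX. It covers secure and trusted Cluster Serving, privacy-preserving shared learning, and federated learning.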

Key points:

  1. Analytics-Zoo
  2. Intel SGX & Graphene-SGX
  3. Secured Cluster Serving
  4. Shared Machine Learning & Federated Learning

Part II PPML & Federated Learning (42’40”)

  • TEE
  • Intel SGX
  • Graphene-SGX
  • Analytics Zoo
  • Analytics Zoo Secured Cluster Serving on Graphene-SGX


1. Analytics Zoo and PPML (Dongjie Shi)

2. Agenda
  • TEE
  • Intel SGX
  • Graphene-SGX
  • Analytics Zoo
  • Analytics Zoo Secured Cluster Serving on Graphene-SGX
*Other names and brands may be claimed as the property of others.

3. Comparison of PPML Technologies

Measures       Good ---> bad
Security       HE, MPC, TEE, Clear Text
Latency        Clear Text, TEE, MPC, HE
I/O cost       Clear Text, TEE, MPC, HE

PPML Technologies    TEE                                 MPC                Clear Text                HE
Implementation       CPU HW instruction sets + SW SDK    Dominated by GPU   CPU/GPU optimization      Currently CPU-dominated, but with GPU acceleration
Computation Speed    10%~10X slower                      10~100X slower     1x (Baseline)             ~10000x slower

4. TEE: Trusted Execution Environment
  • TEE is a tamper-resistant processing environment that runs on a separation kernel.
  • Goals of TEE
    • Isolated Execution
      • TEE/Normal OS may be malicious
    • Secure Storage
      • Integrity, Confidentiality, Freshness
    • Remote Attestation
      • Determine the level of trust in the integrity of the attestator
    • Secure Provisioning
      • Remotely manage and update its data in a secure way
    • Trusted I/O Path
      • Protects authenticity, and optionally confidentiality, of communication between TEE and peripherals

5. Intel® SGX (Intel® Software Guard Extensions)
  • SGX protects selected code and data from disclosure or modification.
    • Enhances confidentiality and integrity
    • Low learning curve
    • Remote attestation and provisioning
    • Helps significantly reduce the attack surface
  • The primary SGX abstraction is an enclave: an isolated execution environment within the virtual address space of a process.

6. Graphene-SGX: A Practical Library OS for Unmodified Applications on SGX
  • Case Studies: Big Data, AI, Big Data + AI
  • Properties
    • Mature, widely deployed code
    • Complex with millions of lines of code
    • Written in C, Python, Java
    • Use OS services
  • Refactoring Challenges
    • Takes time and effort
    • May introduce bugs in mature code
    • Need a separate SDK for every language
    • Developer expertise to know how to refactor
    • No access to source code of 3rd party apps

7. Graphene-SGX: Objectives & Solution
  • Objectives
    • Develop a tool to secure an application that:
      • Provides isolation without any code modification
      • Supports policy-based security enforcement (Manifest, trusted, allowed)
      • Does not compromise performance
      • Provides transparent support for attestation and secure OS services
  • Solution
    • Graphene: A Library OS for running unmodified Linux applications inside SGX enclaves
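
The "Manifest, trusted, allowed" policy on this slide can be pictured with a small toy model. The sketch below is only an illustration under stated assumptions, not Graphene's implementation: the class name ManifestPolicy, its methods, and the exact-path matching are hypothetical. It assumes trusted files are pinned to a SHA-256 hash recorded ahead of time, while allowed files are accessible without verification.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


class ManifestPolicy:
    """Toy policy model (hypothetical, not Graphene code).

    trusted: mapping of path -> expected SHA-256 hex digest
    allowed: set of paths that may be opened without integrity checks
    """

    def __init__(self, trusted, allowed):
        self.trusted = dict(trusted)
        self.allowed = set(allowed)

    def check(self, path: str, content: bytes) -> str:
        # Trusted files must match the digest recorded in the policy.
        if path in self.trusted:
            if sha256_hex(content) == self.trusted[path]:
                return "trusted"
            return "rejected"
        # Allowed files are accessible but not integrity-protected.
        if path in self.allowed:
            return "allowed"
        # Anything not listed is refused.
        return "denied"


if __name__ == "__main__":
    lib = b"library bytes"
    policy = ManifestPolicy(
        trusted={"/lib/libexample.so": sha256_hex(lib)},  # hypothetical path
        allowed={"/tmp/scratch"},                         # hypothetical path
    )
    print(policy.check("/lib/libexample.so", lib))
    print(policy.check("/lib/libexample.so", b"tampered"))
    print(policy.check("/tmp/scratch", b"anything"))
    print(policy.check("/etc/passwd", b""))
```

The point of the split is that files whose contents matter for security are verified, while files that legitimately change at run time are merely permitted.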

8. Library OS Background
  • Approach was championed by several OS designs in the early 90's with a focus on performance
  • Runs a part of OS functionality as a library in the application
  • Communication to the host OS with a small fixed set of abstractions
  • Offers a secure alternative to the large traditional OS system call interface
[Diagram: apps with Bin/Lib running directly on a host OS through a large system call interface (300+), compared with apps with Bin/Lib and a Lib OS running on a host OS through a small system call interface (40+).]

9. Graphene-SGX Architecture

10. AI on Big Data

11. Integrated Big Data Analytics and AI: Seamless Scaling from Laptop to Distributed Big Data
  • Stages: prototype on laptop using sample data; experiment on clusters with history data; production deployment w/ distributed data pipeline
  • Easily prototype end-to-end pipelines that apply AI models to big data
  • "Zero" code change from laptop to distributed cluster
  • Seamlessly deployed on production Hadoop/K8s clusters
  • Automate the process of applying machine learning to big data

12. Analytics Zoo: Unified Data Analytics and AI Platform
  • Models & Algorithms: Recommendation, Time Series, Computer Vision, NLP
  • Automated ML Workflow: AutoML for Time Series, Automatic Cluster Serving
  • Integrated Analytics & AI Pipelines: Distributed TensorFlow & PyTorch on Spark, RayOnSpark, Spark Dataframes & ML Pipelines for DL, InferenceModel
  • Compute Environment: Laptop, K8s Cluster, Hadoop Cluster, Spark Cluster
  • DL Frameworks (TF/PyTorch/OpenVINO/…), Distributed Analytics (Spark/Flink/Ray/…), Python Libraries (Numpy/Pandas/sklearn/…)
  • Powered by oneAPI
  • https://github.com/intel-analytics/analytics-zoo

13. Analytics Zoo Cluster Serving
[Architecture diagram: HTTP requests reach an HTTP server over a network connection and enter an input queue for requests (R1 to R5); the model runs in Docker containers on a Hadoop*/YARN* (or K8S*) cluster; predictions (P1 to P5) go to an output queue for prediction results and return as HTTP responses. Other diagram label: simple Python script.]
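
The input-queue / output-queue pattern in this diagram can be sketched in a few lines of standard-library Python. This is only an in-process illustration: the real system uses Redis queues and a distributed cluster, and the function names serve and run_pipeline as well as the stand-in model are hypothetical.

```python
import queue
import threading


def serve(input_q, output_q, model):
    """Worker loop: take (request_id, payload) items and emit predictions."""
    while True:
        item = input_q.get()
        if item is None:  # sentinel: no more requests for this worker
            break
        request_id, payload = item
        output_q.put((request_id, model(payload)))


def run_pipeline(requests, model, workers=2):
    """Push requests through an input queue and collect results by request id."""
    input_q, output_q = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=serve, args=(input_q, output_q, model))
        for _ in range(workers)
    ]
    for t in threads:
        t.start()
    for item in requests:
        input_q.put(item)
    for _ in threads:
        input_q.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    results = {}
    while not output_q.empty():
        request_id, prediction = output_q.get()
        results[request_id] = prediction
    return results


if __name__ == "__main__":
    # A stand-in "model" that doubles its input.
    reqs = [("R1", 1), ("R2", 2), ("R3", 3)]
    print(run_pipeline(reqs, model=lambda x: x * 2))
```

Results are keyed by request id because workers may finish in any order, which mirrors why the diagram shows predictions P1 to P5 arriving out of sequence.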

14.
  1. Install and prepare Cluster Serving environment on a local node
  2. Launch the Cluster Serving service
  3. Distributed, real-time (streaming) inference

15. Secured Cluster Serving on Graphene-SGX
[Architecture diagram: an HTTP(S) Frontend, a TLS-enabled Redis, a Flink Client with the Cluster Serving jar, and a Flink Cluster (Flink Job Manager with Dispatcher and Resource Manager, plus Flink Task Managers), with encrypted model files, on Graphene-SGX. Numbered flows in the diagram:]
  1. HTTPS protocol for customers to send requests
  2, 4. Put and Get data from Redis
  3. TLS-enabled Redis
  5. Flink internal communication with TLS enabled
  6. Flink external communication with TLS enabled
  7. Encrypted model loading

16. Secured HTTP Frontend & Secured Redis
  • Secured HTTP Frontend
    • TLS HTTPS enabled REST requests
    • TLS enabled Jedis GET/PUT to Redis
  • TLS Enabled Redis
    • Rebuild Redis with BUILD_TLS=yes
    • HTTP Frontend TLS enabled Jedis GET/PUT to Redis
    • Flink Source/Sink TLS enabled Jedis GET/PUT to Redis
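
The slides use Jedis (Java) for the TLS connections to Redis. As a language-neutral illustration of what "TLS enabled" means on the client side, the sketch below builds a verifying TLS client context with Python's standard ssl module. The helper name make_tls_client_context and the TLS 1.2 floor are assumptions for illustration; they are not settings taken from the talk.

```python
import ssl


def make_tls_client_context(ca_file=None):
    """Build a client-side TLS context that verifies the server certificate.

    ca_file: optional path to a CA bundle (hypothetical); when None, the
    system default trust store is used.
    """
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    # Assumption for this sketch: refuse anything older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


if __name__ == "__main__":
    ctx = make_tls_client_context()
    print(ctx.verify_mode == ssl.CERT_REQUIRED, ctx.check_hostname)
```

A context like this would be handed to whichever client opens the socket to the TLS port of a Redis server built with BUILD_TLS=yes.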

17. Secured Flink Cluster
  • Flink SSL Setup and Internal and External Connectivity
    • Internal
      • nano flink-conf.yaml xxxxxxxx xxxxxxxx xxxxxxxx
    • External
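
The configuration shown on this slide is elided, so the following is not the speaker's setup. As a hedged sketch, it renders the kind of entries that Flink's documented SSL options use for internal and external (REST) connectivity. The function name flink_ssl_config, the keystore and truststore paths, and the password placeholder are all hypothetical.

```python
def flink_ssl_config(keystore, truststore, password):
    """Render flink-conf.yaml style lines enabling SSL for internal and REST traffic."""
    entries = {}
    for scope in ("internal", "rest"):
        prefix = f"security.ssl.{scope}"
        entries[f"{prefix}.enabled"] = "true"
        entries[f"{prefix}.keystore"] = keystore
        entries[f"{prefix}.truststore"] = truststore
        entries[f"{prefix}.keystore-password"] = password
        entries[f"{prefix}.truststore-password"] = password
        entries[f"{prefix}.key-password"] = password
    return "\n".join(f"{key}: {value}" for key, value in entries.items())


if __name__ == "__main__":
    # All values below are placeholders, not values from the talk.
    print(flink_ssl_config(
        keystore="/path/to/flink.keystore",
        truststore="/path/to/flink.truststore",
        password="<store-password>",
    ))
```

In practice internal and REST endpoints often use different keystores; one pair is reused here only to keep the sketch short.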

18. Encrypted Model loading
  • trait EncryptSupportive
    • def encryptWithAES256(content: String, secret: String, salt: String): String
    • def decryptWithAES256(content: String, secret: String, salt: String): String
    • def encryptFileWithAES256(filePath: String, secret: String, salt: String, outputFile: String, encoding: String = "UTF-8")
    • def decryptFileWithAES256(filePath: String, secret: String, salt: String): String
    • def decryptFileWithAES256(filePath: String, secret: String, salt: String, outputFile: String)
  • InferenceModel
    • def doLoadEncryptedOpenVINO(modelPath: String, weightPath: String, secret: String, salt: String, batchSize: Int = 0)
  • ClusterServing
    • secret/salt = jedis.hget(Conventions.MODEL_SECURED_KEY, Conventions.MODEL_SECURED_SECRET/…)
    • model.doLoadEncryptedOpenVINO(defPath, weightPath, secret, salt, coreNum)
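
The signatures above take a secret and a salt, but the slides do not say how the two are turned into an AES-256 key. One common approach, assumed here purely for illustration, is a password-based key derivation such as PBKDF2-HMAC-SHA256 producing 32 bytes. The function name derive_aes256_key and the iteration count are hypothetical, and the sketch stops at key derivation because AES itself is not in Python's standard library.

```python
import hashlib


def derive_aes256_key(secret: str, salt: str, iterations: int = 65536) -> bytes:
    """Derive a 32-byte (256-bit) key from a secret and salt with PBKDF2-HMAC-SHA256.

    Assumption: this derivation scheme and iteration count are illustrative,
    not confirmed details of EncryptSupportive.
    """
    return hashlib.pbkdf2_hmac(
        "sha256",
        secret.encode("utf-8"),
        salt.encode("utf-8"),
        iterations,
        dklen=32,
    )


if __name__ == "__main__":
    key = derive_aes256_key("example-secret", "example-salt")
    print(len(key))  # 32 bytes, the key size AES-256 requires
```

Because the derivation is deterministic, a serving process that fetches the same secret and salt can rebuild the key and decrypt the model inside the enclave.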

19. Run Executable on Graphene-SGX
[Workflow diagram. Recoverable labels: Manifest Template, Graphene Makefile, manifest, manifest.sgx (executable via loader.exec, trusted_children, Trusted Libs, Trusted Files), SGX_SIGNER_KEY, sig, token, pal_loader, Enclave Configure & measurement.]

20. Run Analytics Zoo Secured Cluster Serving on Graphene-SGX
  • Two executables (loader.exec)
    • loader.exec = file:redis-server
    • loader.exec = file:/bin/bash
  • Child enclaves: sgx.trusted_children.ls, sgx.trusted_children.cat, sgx.trusted_children.rm, …, sgx.trusted_children.java = file:/usr/bin/java
  • Trusted Files: sgx.trusted_files.keys_keystore_*** = file:/home/sgx/keys/***
  • Allowed Files: sgx.allowed_files.jvm = file:/usr/lib/jvm
  • Enlarge enclave size: sgx.enclave_size = 32G
  • Enlarge thread num: sgx.thread_num = 1024
  • Run the commands
    • SGX=1 ./pal_loader redis-server …
    • SGX=1 ./pal_loader bash.manifest -c "……….."
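
The manifest entries listed on this slide can be assembled programmatically. The sketch below is a minimal illustration, assuming the key = value layout shown on the slide. The helper render_manifest is hypothetical and not part of Graphene, and it only covers the keys named above.

```python
def render_manifest(executable, trusted_children, allowed_files,
                    enclave_size, thread_num):
    """Render manifest lines using the keys that appear on the slide."""
    lines = [f"loader.exec = file:{executable}"]
    # Child executables permitted to run in their own enclaves.
    for name, path in trusted_children.items():
        lines.append(f"sgx.trusted_children.{name} = file:{path}")
    # Paths accessible without integrity verification.
    for name, path in allowed_files.items():
        lines.append(f"sgx.allowed_files.{name} = file:{path}")
    lines.append(f"sgx.enclave_size = {enclave_size}")
    lines.append(f"sgx.thread_num = {thread_num}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Values taken from the slide's bash-based example.
    print(render_manifest(
        executable="/bin/bash",
        trusted_children={"java": "/usr/bin/java"},
        allowed_files={"jvm": "/usr/lib/jvm"},
        enclave_size="32G",
        thread_num=1024,
    ))
```

Generating the file this way keeps the two executables (redis-server and bash) on one template, with only the entry point and child list differing.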

21. Thanks!