2. Cases and Practices of Accelerating Privacy-Preserving Computing on Intel CPUs - Yu Wei (俞巍)


Intel AI Practice Day

Published 3 years ago


#Information Technology

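The 3rd Gen Intel Xeon Scalable processors bring new features such as SGX that improve computing performance, particularly for privacy scenarios, and so raise the efficiency of privacy-preserving computing. A trusted execution environment can substantially improve the performance of privacy-preserving computing. In addition, the AVX512 instruction set on this generation of Xeon Scalable processors can be used to accelerate homomorphic encryption algorithms. This talk presents several representative cases and practices from 2021 in which Intel and its partners accelerated privacy-preserving computing on 3rd Gen Intel Xeon Scalable processors, demonstrating the efficiency of this CPU generation for privacy-preserving computing.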

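Yu Wei (俞巍), Ph.D., is an AI architect in Intel's Data Center AI Sales and Technical Support group for Asia Pacific and China. Yu spent more than 10 years in the semiconductor industry on development and research in image processing, pattern recognition, and machine learning. After joining Intel in 2017, Yu has supported and participated in multiple AI research and deployment projects across industries including finance, healthcare, internet, and manufacturing.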


1 . Cases and Practices of Accelerating Privacy-Preserving Computing on Intel CPUs. Yu Wei (俞巍) | AI Architect, Intel

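2 . Legal notices
• For more information about performance and benchmark results, visit www.intel.com/benchmarks.
• Component performance is tested in specific tests on particular systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase.
• Intel technologies' features and benefits depend on system configuration and may require enabled hardware, software, or service activation. Performance varies depending on system configuration. No product or component can be absolutely secure. Check with your original equipment manufacturer or retailer, or learn more at intel.com.
• Predicted or simulated results are based on Intel internal analysis or architecture simulation or modeling and are provided for reference only. Any differences in system hardware, software, or configuration may affect your actual performance.
• Intel does not control or audit third-party data. You should review this content, consult other sources, and confirm whether the referenced data are accurate.
• Optimization notice: Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors as for Intel microprocessors. These optimizations include the SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel.
• Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Refer to the applicable product user and reference guides for more information about the specific instruction sets covered by this notice.
• All information provided here is subject to change without notice. Contact your Intel representative for the latest Intel product specifications and roadmaps.
• No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.
• The products described may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available on request.
• Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries. Other names and brands may be claimed as the property of others.
© Intel Corporation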

3 . A fundamental conflict: privacy vs. more data usage
• The growth of global data breaches: >5k, 30% (data breaches, 2021)
• The growth of global data: 63ZB, 60% (data volumes, next 5 years)
Source: IDC, Data Age 2025

4 . Agenda
• Background
• Confidential computing by TEE/SGX on ICX
• Confidential computing by HE on ICX
• Take away

5 . Background

6 . History and outlook of confidential computing in China
History and outlook:
• 2020: Year One of technology for confidential computing
• 2021: Year One of the commercial landing of confidential computing
• …
• 2025: Half of large companies to adopt privacy-enhancing computation (Gartner predicts)
China 2021 (data from CAICT):
• 40+ China companies
• ~60 products in China
• 5+ standardization organizations
• 2 new laws

7 . Confidential computing compared to AI
• Computing power. Confidential computing: important, mainly on CPU today. Artificial intelligence: important, heterogeneous computing.
• Algorithms. Confidential computing: important. Artificial intelligence: important.
• Data. Confidential computing: general data and less important. Artificial intelligence: specific data and important.

8 . List of Terms
▪ PPML: Privacy Preserving Machine Learning
▪ FL: Federated (Machine) Learning
▪ HFL: Horizontal Federated Learning
▪ VFL: Vertical Federated Learning
▪ TEE: Trusted Execution Environment
▪ HE: Homomorphic Encryption
▪ PHE: Partially Homomorphic Encryption
▪ MPC: Multi-Party Computation
▪ DP: Differential Privacy

9 . Trade-off in confidential computing: balancing privacy, accuracy, and efficiency
Efficiency vs Privacy:
• Does all data need to be encrypted? Choose HE or TEE?
Accuracy vs Privacy:
• Do all applications need 100% privacy?
o Send partial data?
o Dummy data instead of real data?
o Does differential privacy work?
Efficiency vs Accuracy:
• Do we need to use data from all parties? For toC applications, there could be >1000 parties.
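The accuracy-versus-privacy question about differential privacy can be made concrete with a small sketch. The following Python example is an illustration only and is not taken from the talk: it assumes a Laplace mechanism applied to a clipped mean, and the names `private_mean` and `mean_abs_error`, the clipping bounds, the data, and the epsilon values are all hypothetical. It shows that a smaller privacy budget epsilon (stronger privacy) gives a noisier, less accurate answer.

```python
import random


def private_mean(values, lower, upper, epsilon, rng):
    """Differentially private mean via the Laplace mechanism (illustrative)."""
    clipped = [min(max(v, lower), upper) for v in values]
    # Changing one record moves the clipped mean by at most this much.
    sensitivity = (upper - lower) / len(clipped)
    scale = sensitivity / epsilon
    # The difference of two exponential samples is Laplace(0, scale) noise.
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return sum(clipped) / len(clipped) + noise


def mean_abs_error(epsilon, trials=2000, seed=0):
    """Average absolute error of the private mean for a given epsilon."""
    rng = random.Random(seed)
    data = [rng.random() for _ in range(100)]  # hypothetical values in [0, 1)
    true_mean = sum(data) / len(data)
    errors = [
        abs(private_mean(data, 0.0, 1.0, epsilon, rng) - true_mean)
        for _ in range(trials)
    ]
    return sum(errors) / trials


if __name__ == "__main__":
    for eps in (0.1, 1.0, 10.0):
        print(f"epsilon={eps}: mean absolute error {mean_abs_error(eps):.5f}")
```

The error grows roughly in proportion to 1/epsilon, which is the balance the slide asks about: how much accuracy an application is willing to give up for a given level of privacy.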

10 . Overview of confidential computing/PPML
• Industries: FSI, HLS, CSP, Retail, MFG, …
• PPML applications: Federated Learning, AI model protection
• Privacy techniques: TEE, MPC, HE, DP
• Heterogeneous privacy computing platform: CPU, FPGA, GPU, ASIC

11 . Confidential computing by TEE/SGX

12 . Intel® SGX Enclave Page Cache (EPC) growth in the data center
Larger memory enclaves to support data center workloads (timeline 2018, 2019, 2020, 2021, in slide order):
• Intel® Xeon® E3: 128MB EPC
• Intel® SGX Card (uses 3 Xeon E3): 3 x 128MB EPC
• Intel® Xeon® E (Mehlow-Refresh, single socket): 256MB EPC
• 3rd Gen Intel Xeon Scalable Processors: 512GB EPC per socket (2 socket = up to 1TB EPC)
Improvements:
• Scalability: bigger enclaves, up to 1TB
• Performance: significantly improved
• Ease of deployment
• Ecosystem (slide text: protected offload from enclaves to HW accelerators, services)
For additional information see: www.intel.com/trustsgx

13 . Software stack from TEE/SGX to application: from SDK to Lib OS
• Applications (Python/C++/Java): TensorFlow, BigDL, PyTorch, OpenVINO™, Xgboost, …
• Acceleration libs: oneDNN/oneDAL/oneMKL
• SGX SDK path (porting to SGX): application source code is ported, compiled, and the ported code runs in the enclave.
• Lib OS path (Gramine): applications plus manifest files run in the enclave, in the TEE environment, through the SGX interface.

14 . SGX enabling work on ICX
• Timeline points: 2020 Jul, 2020 Dec, 2021 Jul, 2021 Oct, 2022 Feb
• Lib OS versions shown along the timeline: Graphene v1.1, commit 023e7c2, Graphene commit 6ff2f12, Graphene v1.2-RC1, Gramine v1.0, Gramine v1.1, Gramine master
• Frameworks enabled: TensorFlow 2.6, PyTorch 1.10.1, OpenVINO™ 2020.4, …
• Environment: Ubuntu 18.04, kernel >= 5.11.0-051100-generic (SGX driver in the Linux kernel)

15 . End to End PPML solution: BigDL

16 . Trusted FL of BigDL

17 . Bridge the gap between data application and privacy protection
BigDL PPML uses Intel SGX to build an E2E PPML and FL solution: maintain data security and break down data silos to provide richer data sources for AI.
Background: With the development of big data and AI technology, FSI is paying more and more attention to data privacy and security, while regulation is being strengthened. How to protect user privacy in data analysis and AI applications has become a hot concern in FSI. BigDL PPML implements privacy-preserving machine learning and federated learning based on Intel SGX, which can effectively fill the gap between privacy protection and data application, and help FSI customers break data silos and realize data empowerment.
Capabilities (Big Data Analytic, AI Application, Federated Learning):
• FSI needs to address privacy protection and data silos in Big Data and AI applications to meet increasingly stringent regulation and evolving business.
• BigDL PPML helps existing Big Data and AI applications run directly in an enclave for secure and trusted data analysis and machine learning.
• BigDL PPML provides federated learning capability to help FSI customers break down data silos and achieve joint modeling with user privacy protection.
Architecture (BigDL PPML privacy protection):
• BigDL PPML enables E2E privacy-preserving machine learning and federated learning with SGX and LibOS.
• LibOS based on SGX allows existing applications to run in an enclave without code changes.
• Hardware-based Intel® SGX protects data through an enclave in memory.
• Stack: large-scale data; App on LibOS in an enclave; BigDL PPML; Intel® Software Guard Extensions (Intel® SGX); 3rd generation Intel® Xeon® Scalable processors.

18 . AI inference protection on TEE
Secure biometric identification solution on SGX:
• Registration and recognition run in the enclave via libOS (Gramine)
• Data transmission and storage are all encrypted
• 1:N performance is at the same level as plaintext
• No need for multiple parties, and no issue with smooth algorithm upgrades
Diagram labels: Unionpay face processing system; bank face processing system; REE; TEE; face application; face registration system; face recognition system; confidential computing service; KMS; secure storage; encrypted, secure transmission; user module; bank biz service system.

19 . Ehualu data bank solution
Data bank roles, under government regulation:
• Data Requester: data acquisition, data realization
• Data Owner: data storage, data realization
• Ecosystem Service Provider: capability improvement, service realization
Charts: ML training efficiency with SGX; data query efficiency with SGX. Axes: efficiency (0% to 120%) against data volume (100k, 1M, 5M, 10M). Data labels shown: 100%, 100%, 83%, 81%.

20 . Intel® Xeon® SP powers the Fudata Avatar system
The Avatar system from Fudata is based on the 3rd Gen Intel® Xeon® SP. Combining Fudata's self-developed algorithms with hardware-oriented acceleration helps in scenarios such as precision marketing, intelligent risk control, and joint asset pricing.
• Helps Fudata Avatar boost federated learning training performance
• Helps the Avatar MPC module achieve coalition-resistance security
Diagram labels: guest and host with data X and Y; Local Data1, Local Data2, Local Data3; data shares; X1 Share, X2 Share, X3 Share; Mask Share; Enclave1, Enclave2, Enclave3.
Platform: Intel® Software Guard Extensions (Intel® SGX); 3rd generation Intel® Xeon® Scalable processors

21 .Confidential computing by HE

22 . Homomorphic Encryption: broad types
• Partially HE (PHE)
o Description: Has existed for many years. A type of encryption that allows either addition or multiplication (but not both) over encrypted data without ever decrypting it.
o Pros: Relatively simple and easy to implement.
o Cons: Supports only one operation (addition or multiplication).
• Leveled HE (LHE)
o Description: Similar to FHE but a little more restrictive, as it allows computation to be performed only up to a given depth over encrypted data.
o Pros: Supports both addition and multiplication and is more mature than PHE. Higher efficiency than FHE.
o Cons: Limited steps.
• Full HE (FHE)
o Description: First described by Craig Gentry in 2009. A type of encryption that allows generic computation to be performed on encrypted data without ever decrypting the data.
o Pros: Supports both addition and multiplication with no limit on steps.
o Cons: Low computation efficiency.
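The "either addition or multiplication, but not both" property of PHE can be seen in a few lines. This is a minimal sketch and not from the talk: it assumes textbook RSA as the multiplicative example, with tiny hard-coded primes and no padding, so it is insecure and for illustration only. The function names are hypothetical.

```python
# Toy textbook RSA: tiny primes, no padding. Illustration only, never secure.
P, Q = 61, 53
N = P * Q                          # public modulus
E = 17                             # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def encrypt(message):
    """Encrypt an integer 0 <= message < N with the public key."""
    return pow(message, E, N)


def decrypt(cipher):
    """Decrypt with the private exponent."""
    return pow(cipher, D, N)


def multiply_encrypted(cipher_a, cipher_b):
    """Multiply two plaintexts by multiplying their ciphertexts."""
    return cipher_a * cipher_b % N


if __name__ == "__main__":
    a, b = 7, 12
    product_cipher = multiply_encrypted(encrypt(a), encrypt(b))
    # The party doing the multiplication never sees a or b in the clear.
    assert decrypt(product_cipher) == a * b
    # Adding ciphertexts has no such meaning under this scheme:
    # decrypt((encrypt(a) + encrypt(b)) % N) is not a + b in general.
```

Additive schemes such as Paillier offer the mirror-image property: ciphertexts can be combined to add plaintexts, but not to multiply two encrypted values.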

23 . Homomorphic Encryption for Cross-Silo Federated Learning
1) Each client computes its local gradient updates.
2) Each client encrypts them with the public key.
3) Each client transfers the results to the aggregator, which adds them up and dispatches the aggregated result to all clients.
4) Each client then decrypts the aggregated gradients.
5) Each client uses the aggregated gradients to update its local model.
Chengliang Zhang et al. BatchCrypt: Efficient Homomorphic Encryption for Cross-Silo Federated Learning. In ATC 2020.
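The five steps above can be sketched with a toy Paillier implementation. This is an illustration under stated assumptions, not the BatchCrypt or FATE implementation: the key is built from two small, well-known Mersenne primes (far too small for real use), gradients are encoded as fixed-point integers with a hypothetical `SCALE`, randomness comes from a seeded non-cryptographic generator, and all function names are made up for the example. In a real deployment the clients hold the private values `LAM` and `MU`, while the aggregator only sees the public modulus and ciphertexts.

```python
import math
import random

# Toy key material: two Mersenne primes. Illustration only, never secure.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q                      # public key (generator g = N + 1 is implied)
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # private: lcm(P-1, Q-1)
MU = pow(LAM, -1, N)           # private: inverse of LAM modulo N
SCALE = 10**6                  # assumed fixed-point precision for gradients


def encrypt(value, rng):
    """Step 2: encrypt one gradient value with the public key."""
    m = round(value * SCALE) % N          # negatives wrap around modulo N
    while True:
        r = rng.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return (1 + m * N) * pow(r, N, N2) % N2


def add_encrypted(ciphers):
    """Step 3: the aggregator sums plaintexts by multiplying ciphertexts."""
    total = 1
    for c in ciphers:
        total = total * c % N2
    return total


def decrypt(cipher):
    """Step 4: a client decrypts with the private key."""
    m = (pow(cipher, LAM, N2) - 1) // N * MU % N
    if m > N // 2:                        # map back to a signed value
        m -= N
    return m / SCALE


def federated_round(local_gradients, rng):
    """Run steps 2 to 4 and return the averaged gradient vector."""
    encrypted = [[encrypt(g, rng) for g in grads] for grads in local_gradients]
    summed = [add_encrypted(column) for column in zip(*encrypted)]
    return [decrypt(c) / len(local_gradients) for c in summed]


if __name__ == "__main__":
    rng = random.Random(0)  # a real system would use a cryptographic RNG
    # Step 1: each of three clients has computed a local gradient.
    gradients = [[0.5, -1.25], [1.5, 0.25], [-0.5, 0.1]]
    averaged = federated_round(gradients, rng)
    # Step 5: each client applies the aggregated gradient to its model.
    weights = [1.0, 1.0]
    learning_rate = 0.1
    weights = [w - learning_rate * g for w, g in zip(weights, averaged)]
    print(averaged, weights)
```

Every `encrypt` call is dominated by the modular exponentiation `pow(r, N, N2)`, and `decrypt` by `pow(cipher, LAM, N2)`, which is why the cost of the HE layer tends to dominate a federated round.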

24 . PHE is widely used in FL, including FATE from WeBank. Overall FL performance is limited by HE performance.

25 . Using AVX512-IFMA (IPP-Crypto) to speed up PHE
• Accelerating the modular exponentiation operation in partially homomorphic encryption with the multi-buffer function
• Paillier performance is heavily affected by modular exponentiation
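The general idea behind a multi-buffer modular exponentiation can be sketched in plain Python. This is a conceptual illustration only: it does not use or reproduce the IPP-Crypto API, it gains no speed in Python, and the function names are hypothetical. It assumes several independent bases share one exponent and one modulus, which is the situation in Paillier encryption, where each ciphertext needs r^n mod n² with the same n. Because every lane then performs the same sequence of squarings and multiplications, a SIMD implementation can process the lanes side by side in vector registers.

```python
def modexp(base, exponent, modulus):
    """Single-buffer left-to-right square-and-multiply."""
    result = 1 % modulus
    base %= modulus
    for bit in bin(exponent)[2:]:
        result = result * result % modulus
        if bit == "1":
            result = result * base % modulus
    return result


def modexp_multi_buffer(bases, exponent, modulus):
    """Lockstep square-and-multiply over several bases (one lane per base).

    All lanes share the exponent and modulus, so they follow identical
    control flow; each list comprehension stands in for one vector operation.
    """
    lanes = [1 % modulus] * len(bases)
    reduced = [b % modulus for b in bases]
    for bit in bin(exponent)[2:]:
        lanes = [x * x % modulus for x in lanes]
        if bit == "1":
            lanes = [x * b % modulus for x, b in zip(lanes, reduced)]
    return lanes


if __name__ == "__main__":
    n = (2**31 - 1) * (2**61 - 1)        # toy Paillier modulus, not secure
    randomizers = [3, 5, 7, 11, 13, 17, 19, 23]
    batch = modexp_multi_buffer(randomizers, n, n * n)
    assert batch == [pow(r, n, n * n) for r in randomizers]
```

The check against Python's built-in `pow` confirms that the lockstep version computes the same results as eight separate exponentiations.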

26 . Accelerating Secure Computing for Federated Learning
• Federated learning is an important solution to privacy protection and data islands in big data and AI applications.
• IPP Crypto provides optimization for basic secure operations, including the widely used modular exponentiation.
• Optimization of modular exponentiation based on AVX512 largely improves e2e PHE performance.
• Result: partially homomorphic encryption for federated learning on ICX improves 4.7x.
The 3rd Gen Intel® Xeon® SP significantly improves support for security, and the new AVX512 improves the performance of confidential computing. IPP-Crypto makes full use of the AVX512 instructions in the 3rd Gen Intel® Xeon® SP to improve partially homomorphic encryption performance. WeBank FATE provides a commercial open-source solution for federated learning, and its PHE algorithm is accelerated by the 3rd Gen Intel® Xeon® SP.
Stack: Intel® IPP Crypto; Intel® AVX512; 3rd generation Intel® Xeon® Scalable processors

27 . HE acceleration: building the HE ecosystem
• Industries: Finance, Health, GOV, …
• HE solutions: HE ISV, Cloud, System Integrators
• Frameworks / DSLs: TensorFlow, PyTorch, Spark SQL
• HE libs: Intel Paillier, PALISADE, SEAL (MSFT), TFHE, HElib (IBM)
• Math libs: HEXL, MKL, IPP
• Compute: Xeon, FPGA, ASIC (timeline labels: Now, Future)

28 . Take away
➢ Efficiency is one of the key factors in confidential computing. ICX has SGX and AVX512 to improve it.
• SGX is a very efficient privacy protection technology. Efficiency is above 50% in all of our cases.
• SGX development can be simplified through the lib OS Gramine-SGX, which also supports complex applications.
• AVX512 can speed up HE, which is another major direction of confidential computing.
➢ 2021's achievements for the IA confidential computing ecosystem
• 1 BKM (AI+SGX)
• 2 software modules (BigDL PPML, IPP Crypto)
• 3 white papers (Unionpay, Ehualu, WeBank)
