LC3 - Alibaba Cloud Elastic Artificial Intelligence v1.2

圆圆 / 2444 views
We will introduce the recommended technical architecture for content recommendation scenarios on Alibaba Cloud, along with the performance optimization work we did for this scenario on large-scale distributed GPU VM nodes and the results we obtained. The requirement is to train on about 20 billion samples within an hour. The model has a high communication-to-computation ratio and is implemented in TensorFlow, which scales poorly across large numbers of distributed nodes, and especially poorly over a cloud virtual network. We optimized both communication and I/O, achieving a speedup of more than 14x over the original implementation on 64 GPU VMs, and ultimately trained on over 20 billion samples within an hour on 64 GPU VMs in Alibaba Cloud.
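To put the training target in perspective, the short Python sketch below works out the sustained throughput it implies. It uses only the figures quoted in the abstract (20 billion samples, one hour, 64 GPU VMs) and assumes, purely for illustration, that the work is spread evenly across the VMs; the function and variable names are hypothetical and not taken from the talk.

```python
# Back-of-the-envelope throughput implied by the abstract's figures.
TOTAL_SAMPLES = 20_000_000_000  # "about 20 billion samples"
TIME_BUDGET_S = 60 * 60         # "within an hour", in seconds
NUM_GPU_VMS = 64                # "64 GPU VMs"


def required_throughput(total_samples: int, seconds: int) -> float:
    """Return the average samples per second needed to finish in time."""
    return total_samples / seconds


# Aggregate rate the whole cluster must sustain.
cluster_rate = required_throughput(TOTAL_SAMPLES, TIME_BUDGET_S)

# Per-VM share, assuming an even split across VMs (an idealization).
per_vm_rate = cluster_rate / NUM_GPU_VMS

print(f"cluster: {cluster_rate:,.0f} samples/s")
print(f"per VM:  {per_vm_rate:,.0f} samples/s")
```

Under these assumptions the cluster has to sustain roughly 5.6 million samples per second, or about 87 thousand per VM, which suggests why both communication and I/O had to be optimized.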