
Listening at the Cocktail Party with Deep Neural Networks and TensorFlow

Posted by Spark开源社区 (Spark open source community)

Many people are remarkably good at focusing their attention on one person or one voice in a multi-speaker scenario while ‘muting’ other people and background noise. This is known as the cocktail party effect. For others, separating audio sources is a challenge.

In this presentation I will focus on solving this problem with deep neural networks and TensorFlow. I will share technical and implementation details with the audience and talk about the gains, pain points, and merits of the solutions as they relate to:

  • Preparing, transforming and augmenting relevant data for speech separation and noise removal.
  • Creating, training and optimizing various neural network architectures.
  • Hardware options for running networks on tiny devices.
  • The end goal: real-time speech separation on a small embedded platform.
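
As an illustration of the data augmentation step, one common way to build noisy training examples is to mix clean speech with noise at a chosen signal-to-noise ratio (SNR). The sketch below is not taken from the talk; the function names (`rms`, `mix_at_snr`), the 16 kHz sample rate, and the synthetic sine and Gaussian signals are assumptions made for illustration, and it uses only the Python standard library.

```python
import math
import random


def rms(signal):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def mix_at_snr(speech, noise, snr_db):
    """Mix noise into speech so the result has the requested SNR in dB.

    Both inputs are equal-length lists of float samples.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    noise_rms = rms(noise)
    if noise_rms == 0.0:
        return list(speech)  # silent noise: nothing to add
    # Scale the noise so that rms(speech) / rms(scaled noise) == 10 ** (snr_db / 20).
    gain = rms(speech) / (noise_rms * 10 ** (snr_db / 20))
    return [s + gain * n for s, n in zip(speech, noise)]


# Synthetic stand-ins for real recordings (hypothetical data).
rng = random.Random(0)
sample_rate = 16000  # assumed sample rate
speech = [math.sin(2 * math.pi * 220 * t / sample_rate) for t in range(sample_rate)]
noise = [rng.gauss(0.0, 1.0) for _ in range(sample_rate)]
mixture = mix_at_snr(speech, noise, snr_db=5.0)
```

Sweeping `snr_db` over a range of values, and drawing `noise` from different recordings, would yield many mixtures from a single clean utterance, with the clean signal kept as the training target.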

I will present a vision of future smart air pods, smart headsets and smart hearing aids that run deep neural networks.

Participants will gain insight into some of the latest advances and limitations in speech separation with deep neural networks on embedded devices, with regard to:

  • Data transformation and augmentation.
  • Deep neural network models for speech separation and for removing noise.
  • Training smaller and faster neural networks.
  • Creating a real-time speech separation pipeline.
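
To make the real-time pipeline item more concrete, a streaming system typically receives audio in small chunks and hands fixed-size, overlapping frames to the model. The sketch below shows only that buffering step and is an assumption about how such a pipeline might be organized, not the speaker's implementation; `FrameBuffer` and `separate_frame` are hypothetical names, and `separate_frame` is an identity placeholder standing in for a trained network.

```python
from collections import deque
from itertools import islice


class FrameBuffer:
    """Collect arbitrarily sized audio chunks and emit fixed-size overlapping frames."""

    def __init__(self, frame_len, hop_len):
        if not 0 < hop_len <= frame_len:
            raise ValueError("need 0 < hop_len <= frame_len")
        self.frame_len = frame_len
        self.hop_len = hop_len
        self._samples = deque()

    def push(self, chunk):
        """Add a chunk and return every complete frame that is now available."""
        self._samples.extend(chunk)
        frames = []
        while len(self._samples) >= self.frame_len:
            # Copy the next frame, then advance by one hop so frames overlap.
            frames.append(list(islice(self._samples, self.frame_len)))
            for _ in range(self.hop_len):
                self._samples.popleft()
        return frames


def separate_frame(frame):
    """Hypothetical placeholder for a trained model; returns the frame unchanged."""
    return frame


# Feed unevenly sized chunks, as an audio driver might deliver them.
buffer = FrameBuffer(frame_len=4, hop_len=2)
outputs = []
for chunk in ([0, 1, 2], [3, 4], [5, 6, 7]):
    for frame in buffer.push(chunk):
        outputs.append(separate_frame(frame))
```

In a full pipeline the processed frames would then be windowed and overlap-added back into a continuous output stream; the frame length and hop size set a lower bound on latency, which is one reason smaller and faster networks matter on embedded devices.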