Listening at the Cocktail Party with Deep Neural Networks and TensorFlow
Many people are remarkably good at focusing their attention on one person or one voice in a multi-speaker scenario while ‘muting’ other people and background noise. This is known as the cocktail party effect. For others, separating audio sources is a challenge.
In this presentation I will focus on solving this problem with deep neural networks and TensorFlow. I will share technical and implementation details with the audience, and talk about the gains, pain points, and merits of the solutions as they relate to:
- Preparing, transforming and augmenting relevant data for speech separation and noise removal.
- Creating, training and optimizing various neural network architectures.
- Hardware options for running networks on tiny devices.
- And the end goal: real-time speech separation on a small embedded platform.
I will present a vision of future smart air pods, smart headsets and smart hearing aids that will run deep neural networks.
Participants will gain insight into some of the latest advances and limitations in speech separation with deep neural networks on embedded devices with regard to:
- Data transformation and augmentation.
- Deep neural network models for speech separation and for removing noise.
- Training smaller and faster neural networks.
- Creating a real-time speech separation pipeline.
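
As a concrete illustration of the data transformation and augmentation item, one common way to build training data for noise removal is to mix clean speech with noise at a chosen signal-to-noise ratio (SNR). The sketch below is only an assumption about what such a step could look like, not the method used in the talk: the function name `mix_at_snr`, the synthetic sine-wave "speech", the random noise, and the 16 kHz sample rate are all hypothetical, and NumPy is used for brevity.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio is `snr_db`, then add it.

    Hypothetical helper for illustration only. Returns the noisy mixture
    and the scaled noise that was added.
    """
    speech = np.asarray(speech, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    # Repeat or trim the noise so it has the same length as the speech.
    noise = np.resize(noise, speech.shape)

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    if speech_power == 0 or noise_power == 0:
        raise ValueError("speech and noise must both be non-silent")

    # Target noise power is speech_power / 10^(snr_db / 10).
    gain = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    scaled_noise = gain * noise
    return speech + scaled_noise, scaled_noise


# Synthetic stand-ins for real recordings (assumptions, not real data).
rng = np.random.default_rng(0)
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
speech = np.sin(2 * np.pi * 220 * t)        # one second of a 220 Hz tone
noise = rng.normal(size=sample_rate // 2)   # half a second of white noise

mixture, scaled_noise = mix_at_snr(speech, noise, snr_db=5.0)
```

A training pair would then be (`mixture`, `speech`): the network sees the noisy mixture and is trained to recover the clean signal. Drawing `snr_db` at random for each example is a simple way to vary the difficulty of the data.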