TensorFlow Summit Extended

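Over the past few years, TensorFlow has helped push the state of the art in many subfields of machine learning.
The 2020 TensorFlow Dev Summit ended recently, and at the event we presented many product updates and releases.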

"Reading TinyML Together", Wang Yucheng (王玉成)
"Deep Learning-Based Off-Topic Detection", Zha Yefei (查叶飞)


1. Deep Off-Topic Approach in LAIX. 查叶飞 (Leo Zha), 2020.4.25. Confidential & Proprietary

2. About TFDevSummit 2020: Release TF2.2; emphasis on performance; Performance Profiler; stability in the core library; compatibility with the rest of the TF ecosystem; Keras Tuner; TF Hub; TensorBoard.Dev.

3. Outline: Introduction; Approach & Experiments; Insight; Future Work; About LAIX; Takeaways.

4. Introduction: Off-topic Detection Task. An example. Prompt: "What kind of flowers do you like?" On-topic: "I like iris and it has different meaning of it a wide is the white and um and the size of a as a ride is means the ride means love but I can not speak." Off-topic: "Sometimes I would like to invite my friends to my home and we can play the Chinese chess dishes this is my favorite games at what I was child."

5. Introduction: Off-topic Detection Task. Application: educational product; pre-module of an automatic assessment system; assessing whether the response is off-topic for the corresponding prompt. Motivation: making an automatic assessment system more reliable and more robust.

6. Introduction: Related work. Classification task in DL: RNN + attention-based model; Siamese CNN; deep CNN (Similarity Grid + Inception-v4). Weakness: the cold-start problem of off-topic questions; little attention to the vital on-topic false-alarm problem for a production system.

7. Approach: Insight. Regarding off-topic detection as a semantic matching problem. Semantic matching problem: Information Retrieval; Query-Document; Question Answering; Reading Comprehension.

8. Approach: Reading Comprehension. Classic models: BERT-based; QANet; R-Net.

9. Reading Comprehension Models: QANet; R-Net.

10. Gated Convolutional Bi-Attention based Model. Embedding Layer: pre-trained GloVe. Contextual Encoder Layer: extracting n-gram features; multi-convolutional layers; max-pooling layer.
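
The contextual encoder on this slide can be pictured with a small dependency-free sketch: a 1-D convolution slides a window of `width` words over the embedded sequence, so each filter responds to an n-gram, and max-pooling keeps the strongest response per filter. This is only an illustration. The function names, the ReLU activation, the single layer, and pooling over all positions are assumptions, not details given in the slides.

```python
def conv1d(seq, filters, width):
    """Apply each filter to every window of `width` consecutive word vectors.

    seq:     list of T word vectors, each a list of d floats
    filters: list of F weight lists, each of length width * d
    Returns T - width + 1 feature vectors of F values (ReLU applied, an assumption).
    """
    features = []
    for start in range(len(seq) - width + 1):
        # Flatten the n-gram window into one vector of width * d values.
        window = [x for vec in seq[start:start + width] for x in vec]
        features.append(
            [max(0.0, sum(w * x for w, x in zip(f, window))) for f in filters]
        )
    return features


def max_pool(features):
    """Keep the largest activation of each filter across all positions."""
    return [max(column) for column in zip(*features)]


# Toy input: four "words" with 1-dimensional embeddings, two bigram filters.
sequence = [[1.0], [2.0], [3.0], [-4.0]]
bigram_filters = [[1.0, 1.0], [1.0, -1.0]]
pooled = max_pool(conv1d(sequence, bigram_filters, width=2))
```

Stacking several such layers, as the slide's "multi-convolutional layers" suggests, would widen the span of words each feature can see.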

11. Gated Convolutional Bi-Attention based Model. Attention Layer: Prompt-to-Response; Response-to-Prompt. Relevance Layer: Gated Unit.
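
As a rough illustration of the two attention directions named on this slide, the sketch below lets every prompt word attend over the response words and every response word attend over the prompt words. Dot-product similarity, the helper names, and the toy vectors are assumptions made for the example; the slides do not state the similarity function, and the gated relevance layer is not modelled here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attend(queries, keys):
    """For each query, return attention weights over keys and the weighted sum of keys."""
    weights = [softmax([dot(q, k) for k in keys]) for q in queries]
    dim = len(keys[0])
    context = [
        [sum(w * k[d] for w, k in zip(row, keys)) for d in range(dim)]
        for row in weights
    ]
    return weights, context


def bi_attention(prompt, response):
    """Prompt-to-response and response-to-prompt attention (dot-product, an assumption)."""
    p2r_weights, p2r_context = attend(prompt, response)
    r2p_weights, r2p_context = attend(response, prompt)
    return p2r_weights, p2r_context, r2p_weights, r2p_context


# Toy encodings: 2 prompt words and 3 response words, 2 dimensions each.
prompt_vecs = [[1.0, 0.0], [0.0, 1.0]]
response_vecs = [[5.0, 0.0], [0.0, 5.0], [0.0, 0.0]]
p2r_w, p2r_c, r2p_w, r2p_c = bi_attention(prompt_vecs, response_vecs)
```

The weight matrices are what an attention visualization would plot: one row per attending word, one column per attended word.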

12. Gated Convolutional Bi-Attention based Model. Output Layer: Layer Norm + Dropout; Dense * 2 + Softmax; Cross-Entropy Loss.
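
The numeric pieces of the output layer can be sketched in a few lines. The version below covers layer normalization, softmax, and cross-entropy only; the learned gain and bias of layer norm, dropout, and the two dense layers are left out, and the `eps` value and function names are assumptions for the example.

```python
import math


def layer_norm(values, eps=1e-6):
    """Normalize one feature vector to zero mean and (roughly) unit variance."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(variance + eps) for v in values]


def softmax(logits):
    """Turn logits into class probabilities (numerically stable form)."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-probability assigned to the true class index."""
    return -math.log(probs[label])


# Two classes: index 0 = on-topic, index 1 = off-topic (an assumed ordering).
normed = layer_norm([1.0, 2.0, 3.0])
loss = cross_entropy(softmax([0.0, 0.0]), label=1)
```

With equal logits the model is undecided between the two classes, so the loss equals log 2.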

13. Experiments: Data. IELTS practice speaking test, 流利说®雅思 (Liulishuo IELTS). Train data = 1.12M (1.3K prompts). Test: seen = 33.6K (156 prompts), unseen = 10.1K (Prompts). Metric principle: both on unseen and seen prompts; based on on-topic recall > 99.9%. Average Off-topic Recall (AOR); Prompt Ratio over Recall0.3 (PRR3).
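
One way to read the two metrics is sketched below: pick a decision threshold that keeps on-topic recall at or above the target, compute off-topic recall per prompt, then report the mean (AOR) and the share of prompts whose recall exceeds 0.3 (PRR3). Several things here are assumptions rather than facts from the slides: the score is taken to be an off-topic probability, a single global threshold is used, and all function and field names are hypothetical.

```python
import math
from collections import defaultdict


def pick_threshold(on_topic_scores, min_on_topic_recall):
    """Smallest threshold t such that at least the target share of
    on-topic responses has score <= t (and is therefore kept as on-topic)."""
    ranked = sorted(on_topic_scores)
    keep = math.ceil(min_on_topic_recall * len(ranked))
    return ranked[max(keep, 1) - 1]


def off_topic_metrics(records, min_on_topic_recall=0.999, recall_bar=0.3):
    """records: iterable of (prompt_id, is_off_topic, off_topic_score)."""
    threshold = pick_threshold(
        [score for _, is_off, score in records if not is_off],
        min_on_topic_recall,
    )
    caught, total = defaultdict(int), defaultdict(int)
    for prompt_id, is_off, score in records:
        if is_off:
            total[prompt_id] += 1
            caught[prompt_id] += score > threshold  # flagged as off-topic
    recalls = [caught[p] / total[p] for p in total]
    aor = sum(recalls) / len(recalls)
    prr = sum(r > recall_bar for r in recalls) / len(recalls)
    return threshold, aor, prr


# Toy data with a relaxed recall target so the arithmetic is easy to follow.
toy_records = [
    ("a", False, 0.1), ("a", False, 0.2), ("b", False, 0.3), ("b", False, 0.9),
    ("a", True, 0.8), ("a", True, 0.95),
    ("b", True, 0.25), ("b", True, 0.2), ("b", True, 0.6), ("b", True, 0.1),
]
toy_threshold, toy_aor, toy_prr = off_topic_metrics(toy_records, min_on_topic_recall=0.75)
```

In the toy data, prompt "a" has both off-topic responses caught and prompt "b" has one of four, so the average recall is 0.625 and half of the prompts clear the 0.3 bar.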

14. Experiments: Results. Compared with the RNN-AM baseline, our model achieves impressive improvements on both the seen and unseen benchmarks. Ablation Studies.

15. Insight: Bi-Attention Visualization. Prompt: capturing “what” + “spare time”. On-topic: capturing “usually watch movies”, “shopping”. Off-topic: capturing “change name”, “satisfied” & “name”.

16. Insight: Semantic Matching Representation Visualization. The output vector of the relevance layer. On a clear-topic prompt: “Describe a special meal xxxx”. On a divergent prompt: “what do you do in your spare time”.

17. Data Augmentation. Trends of AOR (Average Off-topic Recall): to study the impact of training data size; the larger the training data, the better the performance. Negative sampling method: two negative samples for one response; the first one is chosen randomly as before; the second one consists of words shuffled from the first one.
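
The negative sampling step can be sketched as follows, assuming the negative samples are prompts paired with a response that was given to a different prompt. The slide does not say this explicitly, so treat the pairing, the whitespace tokenization, and the function name as assumptions.

```python
import random


def negative_prompts(own_prompt, all_prompts, rng):
    """Return two negative prompts for a response to `own_prompt`.

    The first is a randomly chosen different prompt; the second is the
    first with its words shuffled.
    """
    candidates = [p for p in all_prompts if p != own_prompt]
    first = rng.choice(candidates)
    words = first.split()
    rng.shuffle(words)
    second = " ".join(words)
    return first, second


prompts = [
    "What kind of flowers do you like?",
    "What do you do in your spare time?",
    "Describe a special meal.",
]
first_neg, second_neg = negative_prompts(prompts[0], prompts, random.Random(0))
```

Passing a seeded `random.Random` keeps the sampling reproducible. Note that a shuffle can occasionally return the words in their original order, which a production pipeline might want to filter out.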

18. Future work: focus on divergent prompts; off-topic in writing application; other scenarios.

19. About LAIX: Spoken & Writing Assessment; Essay Scorer; Grammar Error Correction; Word Recommendation; Argument Recommendation; Automated Course Content Generation; ASR & TTS; Dialogue Recommendation.

20. Takeaways: Off-topic task; Approach; Insight. Our paper: https://arxiv.org/abs/2004.09036

21. Thanks & QA