Deep Learning-Based Opinion Mining for Digital Currency Forecasting

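Application notes on bringing the AI ecosystem into the enterprise. Everyone at IBM has a phone, and everyone knows how to use their phone, yet IBM is not a phone company. How do we bring AI to the same standard of ubiquity, where everyone in a company has access to AI and knows how to use it, but the company is not an AI company? In this talk, we will break down the challenges a domain expert faces today in applying AI to real-world problems. We will discuss what domain experts need to overcome to get from "I know this type of model exists" to "I can tell an application developer how to apply this model to my domain."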

1.Deep Learning based Opinion Mining for Digital Currency Forecasting Joyesh Mishra Shibani Singh Intel Corporation #AISAIS15

2.Agenda • Current Trends • Workflow overview • Execution & Dataflow • Model Evaluation & Learnings • Further research • Key Takeaways

3.What this talk is not about! • It is not going to make you super rich. • It is not going to teach you about picking stocks or timing the market. • It is not making any predictions whatsoever. This talk is primarily for showcasing practical applications in Sentiment Analysis & NLP using Deep Learning Techniques.

4.Current Trends & Platforms • Sentiment Analysis moves organizations, businesses, people and countries • Maturity of large scale machine learning and democratization of deep learning techniques • State of AI in Sentiment Analysis

5.Dataset • Data Collection • Subreddit selection & Twitter hashtag filters

6.Workflow Overview • Statistical & rule-based techniques • Deep Learning (RNN, LSTM, GRU) (diagram; surviving labels: Natural Language Processing, Machine Learning, Deep Learning)

7.Workflow Overview (diagram; surviving label: LSTM)

8.Workflow Overview – Data Acquisition (diagram; surviving label: LSTM)

9.Workflow Overview – NLP (diagram; surviving label: LSTM)

10.Workflow Overview – Merged Dataset (diagram; surviving label: LSTM)

11. Workflow Overview – Sentiment Modeling NLP – Approach 1 • VADER [1]: “lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media” • Available with NLTK
[1] Hutto, C.J. & Gilbert, E.E. (2014). VADER: A Parsimonious Rule-based Model for Sentiment Analysis of Social Media Text. Eighth International Conference on Weblogs and Social Media (ICWSM-14). Ann Arbor, MI, June 2014.
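
To make "lexicon and rule-based" concrete, here is a minimal sketch of the idea. It is not VADER itself. The word list, the valence numbers, the negation and intensifier rules, and the squashing constant are all illustrative assumptions. In practice NLTK ships the real tool as `SentimentIntensityAnalyzer` in `nltk.sentiment.vader`.

```python
import re

# Hypothetical mini-lexicon (word -> valence); a real lexicon is far larger.
LEXICON = {"good": 1.9, "great": 3.1, "bullish": 2.0,
           "bad": -2.5, "crash": -2.8, "scam": -3.0}
NEGATIONS = {"not", "no", "never", "isn't", "don't"}
BOOSTERS = {"very": 0.3, "really": 0.3, "extremely": 0.5}


def score(text: str) -> float:
    """Return a sentiment score in (-1, 1) from lexicon lookups plus two rules."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = 0.0
    for i, tok in enumerate(tokens):
        if tok not in LEXICON:
            continue
        valence = LEXICON[tok]
        # Rule 1: an intensifier directly before the word strengthens it.
        if i > 0 and tokens[i - 1] in BOOSTERS:
            valence += BOOSTERS[tokens[i - 1]] * (1 if valence > 0 else -1)
        # Rule 2: a negation among the three preceding tokens flips the sign.
        if any(t in NEGATIONS for t in tokens[max(0, i - 3):i]):
            valence = -valence
        total += valence
    # Squash the unbounded sum into (-1, 1); the constant 15 is an assumption.
    return total / (total * total + 15) ** 0.5


if __name__ == "__main__":
    for comment in ["This coin is great", "not great", "very bad news, total scam"]:
        print(f"{comment!r}: {score(comment):+.3f}")
```

Because the scorer needs no training data, it can be applied to each Reddit comment or tweet directly, which is what makes a rule-based pass a cheap first iteration.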

12.Workflow Overview – Sentiment Modeling NLP – Approach 2 • Sentiment Treebank [1] • Stanford CoreNLP • Deep Recursive Model (RNTN) trained on Sentiment Treebank • Recursive Neural Tensor Networks have a tree structure with a neural net at each node
[1] Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank. Richard Socher, Alex Perelygin, Jean Y. Wu, Jason Chuang, Christopher D. Manning, Andrew Y. Ng and Christopher Potts. Stanford University, Stanford, CA 94305, USA.

13.Workflow Overview – Sentiment Modeling NLP – Approach 2 (Continued) • Label all data for financial Subreddit comments & tweets (excluding BTC) using Stanford NLP Treebank • Train an LSTM on the data above • Predict sentiments – using Trained LSTM Model
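
The label-then-train steps above rely on turning each comment into a fixed-length sequence of integer ids before it reaches an embedding layer. The sketch below shows one way that preparation could look. The helper names, the whitespace tokenizer, the left-padding choice, and the `teacher` callable (standing in for whatever model supplies the labels) are hypothetical and not taken from the talk.

```python
from collections import Counter

PAD, UNK = 0, 1  # reserved ids: padding and out-of-vocabulary words


def build_vocab(texts, max_words=10000):
    """Map the most frequent words to integer ids, starting at 2."""
    counts = Counter(tok for text in texts for tok in text.lower().split())
    return {word: i + 2 for i, (word, _) in enumerate(counts.most_common(max_words))}


def encode(text, vocab, max_len=8):
    """Convert text to exactly `max_len` ids, truncating or left-padding."""
    ids = [vocab.get(tok, UNK) for tok in text.lower().split()][:max_len]
    return [PAD] * (max_len - len(ids)) + ids


def make_training_set(texts, teacher, vocab, max_len=8):
    """Pair each encoded text with the label assigned by the teacher model."""
    return [(encode(text, vocab, max_len), teacher(text)) for text in texts]


if __name__ == "__main__":
    corpus = ["eth to the moon", "sell the dip now"]
    vocab = build_vocab(corpus)
    # Stand-in teacher: a real pipeline would call the pretrained labeler here.
    dummy_teacher = lambda text: 3 if "moon" in text else 1
    for ids, label in make_training_set(corpus, dummy_teacher, vocab):
        print(ids, label)
```

The resulting `(ids, label)` pairs are the shape of input an Embedding + LSTM classifier typically expects. The trained classifier can then score text the teacher never saw.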

14.Complete Pipeline • Embedding + LSTM for Sentiment Modeling • LSTM with Recurrent Dropout for Price Prediction and Time Series Modeling [1]
[1] A Theoretically Grounded Application of Dropout in Recurrent Neural Networks. Yarin Gal, Zoubin Ghahramani. arXiv:1512.05287v5 [stat.ML]
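
The defining idea in the cited dropout paper is that one dropout mask is sampled per sequence and reused at every time step, rather than resampled at each step. The sketch below illustrates that mask reuse on a tiny tanh RNN in plain Python. It is deliberately not an LSTM, and the weights, sizes, and function names are illustrative assumptions. Keras exposes the same idea through the `recurrent_dropout` argument of its `LSTM` layer.

```python
import math
import random


def sample_mask(size, rate, rng):
    """Inverted-dropout mask: 0 with probability `rate`, else 1 / (1 - rate)."""
    keep = 1.0 - rate
    return [0.0 if rng.random() < rate else 1.0 / keep for _ in range(size)]


def run_rnn(inputs, w_in, w_rec, rate, rng):
    """Run a tiny tanh RNN, reusing ONE recurrent mask for the whole sequence."""
    size = len(w_rec)
    hidden = [0.0] * size
    mask = sample_mask(size, rate, rng)  # sampled once, not once per step
    for x in inputs:
        dropped = [h * m for h, m in zip(hidden, mask)]
        hidden = [
            math.tanh(w_in[j] * x + sum(w_rec[j][k] * dropped[k] for k in range(size)))
            for j in range(size)
        ]
    return hidden


if __name__ == "__main__":
    rng = random.Random(0)
    w_in = [0.5, -0.3]
    w_rec = [[0.1, 0.2], [-0.2, 0.1]]
    print(run_rnn([0.2, 0.4, 0.1], w_in, w_rec, rate=0.5, rng=rng))
```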

15.Execution Iteration 1 - Spark, Keras (TF) & Spark-Tensorflow-Connector with Vader for Sentiment Analysis

16.Execution Iteration 2 - Spark, Keras (TF) & Spark-Tensorflow-Connector with Keras(TF/LSTM/Stanford Treebank) for Sentiment Analysis

17.Execution Combined Workflow with BigDL + Keras Style APIs using Spark (Proposed Work in Progress) • BigDL is a distributed deep learning library for Apache Spark • Efficient Scale out leveraging Spark and Spark ecosystem, Synchronous SGD, All-Reduce • Rich DL API/Layers Support (Native). Ability to re-use Keras Models (via Export/Import) • Provides Keras Style APIs in Python/Scala for users familiar with Keras (Based on Keras 1.2.2)

18.Model Evaluation & Learnings • Adding transaction volume per time interval improves the accuracy • Normalize all features to reduce any feature importance bias • Different window lengths could be used to experiment with multiple models (best observed look-back window ~ 22 hours) • Further fine-tuning of the layers could improve predictions.
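
The normalization and look-back points above can be sketched as follows. This assumes hourly rows of features (for example price, volume, and sentiment) and uses min-max scaling. The slides say only "normalize", so the scaling method and both function names are assumptions.

```python
def min_max_scale(column):
    """Rescale one feature column to [0, 1] so no feature dominates by magnitude."""
    lo, hi = min(column), max(column)
    span = hi - lo
    return [0.0 if span == 0 else (v - lo) / span for v in column]


def make_windows(rows, target, look_back=22):
    """Pair each run of `look_back` consecutive rows with the target that follows."""
    samples = []
    for end in range(look_back, len(rows)):
        samples.append((rows[end - look_back:end], target[end]))
    return samples


if __name__ == "__main__":
    # Illustrative hourly values only; not data from the talk.
    price = min_max_scale([100.0, 102.0, 101.0, 105.0, 107.0, 106.0])
    volume = min_max_scale([10.0, 30.0, 20.0, 50.0, 40.0, 45.0])
    rows = list(zip(price, volume))
    for window, next_price in make_windows(rows, price, look_back=3):
        print(len(window), round(next_price, 3))
```

Changing `look_back` and retraining is one straightforward way to run the window-length experiments mentioned above.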

19. Model Evaluation & Learnings • LSTM-based Sentiment + LSTM (value prediction): Mean Absolute Error 0.0084 (normalized data), Loss 0.0001 • Rule-based Sentiment (Vader) + LSTM (value prediction): Mean Absolute Error 0.0110 (normalized data), Loss 0.0002 • The workflow inclusive of deep learning based sentiment prediction slightly outperforms the one where a rule based approach is followed for sentiment evaluation.
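
For reference, mean absolute error is the average of the absolute differences between predicted and actual values. Because the values here are normalized, the error is in normalized units rather than currency units. The sketch below uses made-up numbers, not the talk's data. The slide does not say which loss function produced the "Loss" figures, so none is shown.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    # Illustrative normalized prices only.
    actual = [0.50, 0.70, 0.65]
    predicted = [0.48, 0.73, 0.66]
    print(round(mean_absolute_error(actual, predicted), 4))
```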

20.Further Research in Sentiment Analysis using LSTM and RecNN • Document Level Sentiment Classification • Word Embeddings → Dense Document Vectors → LSTM • Use Attention Mechanism and Non-Neural Classifiers (SVM) • Sentence Level Sentiment Classification • Subjectivity Classification • RNTN, TG-RNN, TE-RNN, DCNN, CharSCNN • Aspect Level Sentiment Classification • Aspect Extraction, Entity Extraction • AdaRNN, TD-LSTM/TC-LSTM • Emotional Analysis, Sarcasm Detection, Multi-lingual Sentiment Analysis • Multi-Modal (Combining Textual, Visual, Acoustic etc.) Citation: Deep Learning for Sentiment Analysis: A Survey, 2018, Lei Zhang, Shuai Wang, Bing Liu.

21.Key Takeaways • Deep Learning based NLP techniques are comparable to statistical and traditional models • Iterated training and model tuning could be achieved on large datasets due to maturity of DL frameworks being able to leverage each other (Keras, Tensorflow, Spark, BigDL) • Model specialization and domain based training is really helpful in improving learning

22.Libraries • Spark Tensorflow Connector • park/spark-tensorflow-connector • BigDL •

23.Thank You!