Deep learning based opinion mining for Bitcoin price prediction
1. Deep Learning based Opinion Mining for Digital Currency Forecasting
Joyesh Mishra, Shibani Singh
Intel Corporation
#AISAIS15
2. Agenda
• Current Trends
• Workflow overview
• Execution & Dataflow
• Model Evaluation & Learnings
• Further research
• Key Takeaways
3. What this talk is not about!
• It is not going to make you super rich.
• It is not going to teach you about picking stocks or timing the market.
• It is not making any predictions whatsoever.
This talk primarily showcases practical applications of Sentiment Analysis & NLP using Deep Learning techniques.
4. Current Trends & Platforms
• Sentiment Analysis moves organizations, businesses, people and countries
• Maturity of large scale machine learning and democratization of deep learning techniques
• State of AI in Sentiment Analysis
5. Dataset
• Data Collection
• Subreddit selection & Twitter hashtag filters
Sources:
[1] https://www.reddit.com/r/coolguides/comments/8fnj6g/financial_subreddits_guide/
[2] https://ritetag.com/best-hashtags-for/investing
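The subreddit and hashtag filtering on this slide could be sketched as follows. This is only an illustration under stated assumptions, not the authors' collection code: the record layout, the `ALLOWED_SUBREDDITS` and `TRACKED_HASHTAGS` sets, and both function names are hypothetical, since the talk does not list the actual filters.

```python
import re

# Hypothetical filter lists; the talk does not name the real subreddits or hashtags.
ALLOWED_SUBREDDITS = {"investing", "bitcoin"}
TRACKED_HASHTAGS = {"#bitcoin", "#btc"}

HASHTAG_RE = re.compile(r"#\w+")

def keep_reddit_comment(comment: dict) -> bool:
    """Keep a comment only if it comes from a selected subreddit."""
    return comment.get("subreddit", "").lower() in ALLOWED_SUBREDDITS

def keep_tweet(tweet: dict) -> bool:
    """Keep a tweet only if it carries at least one tracked hashtag."""
    tags = {tag.lower() for tag in HASHTAG_RE.findall(tweet.get("text", ""))}
    return bool(tags & TRACKED_HASHTAGS)

# Made-up sample records, used only to exercise the filters.
comments = [
    {"subreddit": "Bitcoin", "body": "Holding for the long run."},
    {"subreddit": "aww", "body": "Look at this puppy."},
]
tweets = [
    {"text": "Volume is picking up #BTC"},
    {"text": "Great lunch today #food"},
]

kept_comments = [c for c in comments if keep_reddit_comment(c)]
kept_tweets = [t for t in tweets if keep_tweet(t)]
```

Matching is case-insensitive on both sides, so `Bitcoin` and `#BTC` pass the filters.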
6. Workflow Overview
Natural Language Processing
• Statistical & rule based techniques
Machine Learning
• Deep Learning (RNN, LSTM, GRU)
[Diagram labels: Natural Language Processing, Deep Learning]
7. Workflow Overview [diagram: LSTM]
8. Workflow Overview – Data Acquisition [diagram: LSTM]
9. Workflow Overview – NLP [diagram: LSTM]
10. Workflow Overview – Merged Dataset [diagram: LSTM]
11. Workflow Overview – Sentiment Modeling NLP – Approach 1
• VADER [1]: “lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media”
• Available with NLTK
[1] Hutto, C.J. & Gilbert, E.E. (2014). VADER: A Parsimonious Rule-based Model for Sentiment Analysis of Social Media Text. Eighth International Conference on Weblogs and Social Media (ICWSM-14). Ann Arbor, MI, June 2014.
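To make "lexicon and rule-based" concrete, here is a toy scorer in plain Python. It is not VADER: the word valences, the negation and booster lists, and the name `rule_based_score` are all invented for illustration, and the real tool uses a much larger lexicon and more heuristics.

```python
# Toy lexicon and rules, invented for illustration only.
LEXICON = {"good": 1.9, "great": 3.1, "bad": -2.5, "crash": -2.0, "moon": 1.5}
NEGATIONS = {"not", "never", "no"}
BOOSTERS = {"very": 1.3, "really": 1.3}

def rule_based_score(text: str) -> float:
    """Sum word valences; the immediately preceding token can boost or negate a word."""
    tokens = [tok.strip(".,!?").lower() for tok in text.split()]
    total = 0.0
    for i, tok in enumerate(tokens):
        valence = LEXICON.get(tok)
        if valence is None:
            continue  # word carries no sentiment in this toy lexicon
        prev = tokens[i - 1] if i > 0 else ""
        if prev in BOOSTERS:
            valence *= BOOSTERS[prev]   # rule 1: intensifier scales the valence
        elif prev in NEGATIONS:
            valence = -valence          # rule 2: negation flips the sign
        total += valence
    return total

positive = rule_based_score("BTC looks good")
negated = rule_based_score("This is not good")
boosted = rule_based_score("A very bad crash")
```

In NLTK itself, VADER is exposed as `nltk.sentiment.vader.SentimentIntensityAnalyzer`, whose `polarity_scores(text)` method returns negative, neutral, positive and compound scores; it needs the `vader_lexicon` resource to be downloaded first.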
12. Workflow Overview – Sentiment Modeling NLP – Approach 2
Sentiment Treebank [1]
• Stanford CoreNLP
• Deep Recursive Model (RNTN) trained on Sentiment Treebank
Recursive Neural Tensor Networks have a tree structure with a neural net at each node.
[1] Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank. Richard Socher, Alex Perelygin, Jean Y. Wu, Jason Chuang, Christopher D. Manning, Andrew Y. Ng and Christopher Potts. Stanford University, Stanford, CA 94305, USA.
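For reference, the "neural net at each node" in an RNTN can be written as the composition function from the cited Socher et al. paper. The symbol names below are chosen here for illustration: $a, b \in \mathbb{R}^d$ are the two child vectors, $p$ is the parent vector, $V^{[1:d]} \in \mathbb{R}^{2d \times 2d \times d}$ is the tensor, $W \in \mathbb{R}^{d \times 2d}$ is the standard recursive-network weight matrix, $f$ is $\tanh$, and $W_s$ is the sentiment classification matrix applied at every node.

```latex
p = f\left(
      \begin{bmatrix} a \\ b \end{bmatrix}^{\top}
      V^{[1:d]}
      \begin{bmatrix} a \\ b \end{bmatrix}
      + W \begin{bmatrix} a \\ b \end{bmatrix}
    \right),
\qquad
y_p = \operatorname{softmax}\left( W_s \, p \right)
```

Each slice $V^{[i]}$ of the tensor contributes one coordinate of $p$, so the same parameters are reused at every node of the parse tree.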
13. Workflow Overview – Sentiment Modeling NLP – Approach 2 (Continued)
• Label all data for financial Subreddit comments & tweets (excluding BTC) using Stanford NLP Treebank
• Train an LSTM on the data above
• Predict sentiments using the trained LSTM model
14. Complete Pipeline
• Embedding + LSTM for Sentiment Modeling
• LSTM with Recurrent Dropout for Price Prediction and Time Series Modeling [1]
[1] A Theoretically Grounded Application of Dropout in Recurrent Neural Networks. Yarin Gal, Zoubin Ghahramani. arXiv:1512.05287v5 [stat.ML]
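The idea behind the recurrent dropout cited on this slide is that one dropout mask is sampled per sequence and reused at every time step, rather than resampled per step. The sketch below shows that on a plain tanh RNN, not an LSTM, to keep it short and dependency-free; the function names, weights and inputs are made up for illustration and are not the authors' model.

```python
import math
import random

def sample_mask(size: int, drop_rate: float, rng: random.Random) -> list:
    """Sample one inverted-dropout mask; drop_rate must be in [0, 1)."""
    keep = 1.0 - drop_rate
    return [(1.0 / keep) if rng.random() < keep else 0.0 for _ in range(size)]

def rnn_forward(inputs, w_in, w_rec, drop_rate, rng):
    """Run a tanh RNN over scalar inputs, reusing one recurrent mask per sequence."""
    size = len(w_in)
    mask = sample_mask(size, drop_rate, rng)  # sampled once, not once per step
    h = [0.0] * size
    for x in inputs:
        h_dropped = [m * v for m, v in zip(mask, h)]  # same mask at every step
        h = [
            math.tanh(w_in[j] * x + sum(w_rec[j][k] * h_dropped[k] for k in range(size)))
            for j in range(size)
        ]
    return h, mask

# Made-up weights and inputs, used only to exercise the function.
w_in = [0.5, -0.5, 0.25]
w_rec = [[0.1, 0.2, -0.1], [0.0, 0.3, 0.1], [-0.2, 0.1, 0.4]]
final_state, used_mask = rnn_forward([0.1, 0.2, 0.3], w_in, w_rec, 0.5, random.Random(0))
```

In Keras, the `LSTM` layer exposes this behaviour through its `recurrent_dropout` argument, alongside `dropout` for the input connections.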
15. Execution
Iteration 1 - Spark, Keras (TF) & Spark-Tensorflow-Connector with Vader for Sentiment Analysis
16. Execution
Iteration 2 - Spark, Keras (TF) & Spark-Tensorflow-Connector with Keras (TF/LSTM/Stanford Treebank) for Sentiment Analysis
17. Execution
Combined Workflow with BigDL + Keras Style APIs using Spark (Proposed Work in Progress)
• BigDL is a distributed deep learning library for Apache Spark
• Efficient scale-out leveraging Spark and the Spark ecosystem, synchronous SGD, All-Reduce
• Rich DL API/layers support (native). Ability to re-use Keras models (via export/import)
• Provides Keras-style APIs in Python/Scala for users familiar with Keras (based on Keras 1.2.2)
18. Model Evaluation & Learnings
• Adding transaction volume per time interval improves accuracy
• Normalize all features to reduce feature importance bias
• Different window lengths can be used to experiment with multiple models (best observed look-back window ~ 22 hours)
• Further fine-tuning of the layers could improve predictions
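The normalization and look-back window steps can be sketched in plain Python. This is a minimal illustration, assuming min-max scaling and hourly rows of price, transaction volume and sentiment; the talk does not say which scaler was used, and the series below are synthetic placeholders, not the authors' data.

```python
def min_max_scale(values):
    """Scale a list of numbers to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [0.0 if span == 0 else (v - lo) / span for v in values]

def make_windows(rows, targets, look_back):
    """Pair each run of `look_back` consecutive rows with the target that follows it."""
    samples = []
    for end in range(look_back, len(rows)):
        samples.append((rows[end - look_back:end], targets[end]))
    return samples

# Synthetic hourly series standing in for price, transaction volume and sentiment.
hours = 30
price = [100.0 + h for h in range(hours)]
volume = [10.0 * (h % 5) for h in range(hours)]
sentiment = [((h % 3) - 1) * 0.5 for h in range(hours)]

# Scale every feature to the same range so none dominates by magnitude alone.
scaled = [min_max_scale(col) for col in (price, volume, sentiment)]
rows = list(zip(*scaled))  # one (price, volume, sentiment) tuple per hour

# 22-hour look-back, matching the best window reported on the slide.
windows = make_windows(rows, scaled[0], look_back=22)
```

Each element of `windows` is a pair of 22 feature rows and the normalized price of the following hour, which is the shape a sequence model would consume.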
19. Model Evaluation & Learnings
LSTM based Sentiment + LSTM (value prediction): Mean Absolute Error 0.0084 (normalized data), Loss 0.0001
Rule Based Sentiment (Vader) + LSTM (value prediction): Mean Absolute Error 0.0110 (normalized data), Loss 0.0002
The workflow that includes deep learning based sentiment prediction slightly outperforms the one that uses a rule based approach for sentiment evaluation.
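Mean absolute error is the average of the absolute differences between predicted and actual values, here on the normalized scale. A minimal sketch follows; the sample numbers are made up and are not the authors' results.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Made-up normalized prices and predictions, for illustration only.
actual = [0.50, 0.52, 0.55, 0.53]
predicted = [0.51, 0.51, 0.56, 0.52]
mae = mean_absolute_error(actual, predicted)
```

Because the inputs are normalized, the error is expressed as a fraction of the scaled value range rather than in currency units.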
20. Further Research in Sentiment Analysis using LSTM and RecNN
• Document Level Sentiment Classification
  • Word Embeddings → Dense Document Vectors → LSTM
  • Use Attention Mechanism and Non-Neural Classifiers (SVM)
• Sentence Level Sentiment Classification
  • Subjectivity Classification
  • RNTN, TG-RNN, TE-RNN, DCNN, CharSCNN
• Aspect Level Sentiment Classification
  • Aspect Extraction, Entity Extraction
  • AdaRNN, TD-LSTM/TC-LSTM
• Emotional Analysis, Sarcasm Detection, Multi-lingual Sentiment Analysis
• Multi-Modal (Combining Textual, Visual, Acoustic etc.)
Citation: Deep Learning for Sentiment Analysis: A Survey, 2018, Lei Zhang, Shuai Wang, Bing Liu, http://arxiv.org/abs/1801.07883
21. Key Takeaways
• Deep Learning based NLP techniques are comparable to statistical and traditional models
• Iterative training and model tuning can be achieved on large datasets because mature DL frameworks are able to leverage each other (Keras, Tensorflow, Spark, BigDL)
• Model specialization and domain-based training are really helpful in improving learning
22. Libraries
• Spark Tensorflow Connector
  • https://github.com/tensorflow/ecosystem/tree/master/spark/spark-tensorflow-connector
• BigDL
  • https://github.com/intel-analytics/BigDL
23. Thank You!