Scalable Time Series Forecasting and Monitoring using Apache Spark and ElasticSearch
Adyen lets the companies that integrate with it accept payments from their customers using any payment method over any sales channel. We have designed and implemented a time series forecasting algorithm that predicts the volume for each integration with confidence, which lets us flag anomalies such as traffic drops or abnormally low traffic. We use Apache Spark as our computational engine, both to make this data available to the training process and to train over years of data in a scalable way. Prediction performance is benchmarked, and the models are served in production through custom real-time monitoring and alerting infrastructure that uses ElasticSearch as hot storage. With this state-of-the-art solution, Adyen knows whether a problem has occurred and can alert the operational teams in record time.
This presentation will cover the journey we took, with a focus on the mathematical concepts, the present time constraints, the prediction performance, and the architecture needed to make this happen. We’ll go over lessons learned, pitfalls, and best practices discovered in modeling time series datasets with Apache Spark. Data scientists will gain insight into applying effective, real-life seasonality modeling techniques. We’ll also share the approaches we used for sub-millisecond model serving, which should inspire data engineers who work on related problems.
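To make the idea of seasonality-based anomaly flagging concrete, here is a minimal sketch in plain Python. It is not the algorithm described in the talk, which the abstract does not specify; it assumes a simple per-slot seasonal profile (mean and standard deviation for each position in the season) and flags a traffic drop when the observed volume falls below the mean minus `k` standard deviations. The function names, the `period` and `k` parameters, and the sample data are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def fit_seasonal_profile(history, period):
    """Group evenly spaced counts by their position in the season and
    return {slot: (mean, sample standard deviation)}.

    Assumption: a fixed season length `period` (e.g. 168 for hourly data
    with weekly seasonality). This is an illustrative baseline only.
    """
    slots = defaultdict(list)
    for i, value in enumerate(history):
        slots[i % period].append(value)
    return {
        slot: (mean(values), stdev(values) if len(values) > 1 else 0.0)
        for slot, values in slots.items()
    }


def lower_bound(profile, t, period, k=3.0):
    """Lower edge of the expected band at time index t, clipped at zero."""
    slot_mean, slot_std = profile[t % period]
    return max(0.0, slot_mean - k * slot_std)


def is_traffic_drop(profile, t, observed, period, k=3.0):
    """True when the observed volume falls below the expected band."""
    return observed < lower_bound(profile, t, period, k)


# Hypothetical data: three seasons of four slots each.
history = [100, 200, 300, 200,
           110, 190, 310, 210,
           90, 210, 290, 190]
profile = fit_seasonal_profile(history, period=4)
```

With this profile, calling `is_traffic_drop(profile, 12, observed, period=4)` compares a new observation at time index 12 against the band learned for the first slot of the season. In a setup like the one the abstract describes, the fitting step would run as a batch job over years of data, while the comparison would be a cheap lookup at serving time.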