- 微博 QQ QQ空间 贴吧
Near Real-Time Netflix Recommendations Using Apache Spark Streaming
收藏 4下载 4
As a data driven company, we use Machine learning based algos and A/B tests to drive all of the content recommendations for our members. Traditionally, these recommendations are precomputed in a batch processing fashion, but such a model cannot react quickly based on member interactions, title interests, popularity etc. With an ever-growing Netflix catalog, finding the right content for our audience in near real-time would provide the best personalized experience. We’ll take a deep dive into our realtime Spark Streaming ecosystem at Netflix. Both it’s infrastructure and business use cases. On the infrastructure front, we will delve into scale challenges, state management, data persistence, resiliency considerations, metrics, operations and auto-remediation. We will talk about a few use cases that leverage real-time data for model training, such as providing the right personalized videos in a member’s Billboard and choosing the right personalized image soon after the launch of the show. We will also reflect on the lessons learnt while building such high volume infrastructure.