16/09 Netflix Recommendations Using Spark + Cassandra - Cassan

Netflix Recommendations Using Spark + Cassandra - Cassandra Summit 2016

1. Netflix Recommendations using Spark + Cassandra. Prasanna Padmanabhan, Roopa Tangirala

2. Turn on Netflix and the absolute best content for you would automatically start playing

3. Netflix Recommendations

4. Netflix Recommendations

5. Everything is a Recommendation: over 80% of what members watch comes from our recommendations. Recommendations are driven by machine learning algorithms. (Figure labels: Ranking, Rows)

6. Data Driven. Examples: Algorithmic Page Generation, Trending Now. Process: Offline Experiment using Historical Data -> (Success) Online A/B Testing -> (Success) Rollout Feature to ALL members. The diagram also shows a Fail outcome.

7. Offline Experimentation

8. Algorithmic Page Generation: personalizing the ordering of rows on the homepage

9. Algorithmic Page Generation. Drawbacks: diversity of the page; affinity for specific rows. (Figure panels: Without Algorithmic Page Generation, With Algorithmic Page Generation)

10. Algorithmic Page Generation: Production

11. Algorithmic Page Generation: Production, Variant 1

12. Algorithmic Page Generation: Production, Variant 1, Variant 2. Compared on: Row Distribution, TV/Movie Ratio

13. Algorithmic Page Generation: Production, Variant 1, Variant 2, Actual Plays. Evaluate the best variant based on the plays.
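The slide says the best variant is chosen "based on the plays" but does not give the metric. A minimal sketch of one way such a comparison could work, assuming a hypothetical position-discounted score (a played video counts for more when its row is nearer the top of the page). The names `page_score` and `best_variant`, the page representation, and the scoring rule are all illustrative assumptions, not the method used in the talk.

```python
from typing import Dict, List, Set

# A page is an ordered list of rows; each row is an ordered list of video ids.
Page = List[List[str]]


def page_score(page: Page, actual_plays: Set[str]) -> float:
    """Hypothetical metric: each played video adds 1 / (1 + row index)
    for the first row in which it appears."""
    score = 0.0
    seen: Set[str] = set()
    for row_index, row in enumerate(page):
        for video in row:
            if video in actual_plays and video not in seen:
                seen.add(video)
                score += 1.0 / (1 + row_index)
    return score


def best_variant(variants: Dict[str, Page], actual_plays: Set[str]) -> str:
    """Return the name of the variant whose page scores highest."""
    return max(variants, key=lambda name: page_score(variants[name], actual_plays))


# Toy data: the member actually played video "c".
variants = {
    "production": [["a", "b"], ["c", "d"]],
    "variant_1": [["c", "a"], ["b", "d"]],
    "variant_2": [["b"], ["d"], ["c"]],
}
winner = best_variant(variants, {"c"})
```

Here `variant_1` wins because it places the played video in the top row.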

17. Offline Experiment Architecture. Components shown: Member Selection; Data Snapshots (runs once a day) taken from the Viewing History Service, MyList Service, Ratings Service, …; Snapshot Forklift; S3; Snapshot Store; Generate Pages; Evaluate Metrics; A/B Test

18. Data Model - Requirements
    • Need for historical service data
    • Optimize for batch writes and point reads

19. Data Model. Column family: MYLIST. Row key: DATE_MEMBER_ID.
    Row key          Column    Value
    20161009_1001    MyList    BLOB
    20161009_1002    MyList    BLOB

20. Data Model. Column family: VIEWING-HISTORY. Row key: DATE_MEMBER_ID.
    Row key          Column         Value
    20161009_1001    ViewingData    BLOB
    20161009_1002    ViewingData    BLOB
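The DATE_MEMBER_ID row key makes each member's snapshot for a given day addressable with a single key lookup, which fits the "batch writes and point reads" requirement. A minimal sketch of that access pattern, using an in-memory dict as a stand-in for one column family. `SnapshotStore`, `batch_write`, and `point_read` are hypothetical names for illustration; nothing here is a Cassandra driver call.

```python
from typing import Dict, Iterable, Optional, Tuple


def row_key(date: str, member_id: int) -> str:
    """Build a DATE_MEMBER_ID key such as '20161009_1001'."""
    return f"{date}_{member_id}"


class SnapshotStore:
    """In-memory stand-in for one column family (hypothetical helper)."""

    def __init__(self) -> None:
        self._rows: Dict[str, bytes] = {}

    def batch_write(self, date: str, snapshots: Iterable[Tuple[int, bytes]]) -> int:
        """Write one day's snapshots for many members; return rows written."""
        written = 0
        for member_id, blob in snapshots:
            self._rows[row_key(date, member_id)] = blob
            written += 1
        return written

    def point_read(self, date: str, member_id: int) -> Optional[bytes]:
        """Fetch one member's snapshot for one day, or None if absent."""
        return self._rows.get(row_key(date, member_id))


# Usage: one daily batch, then a lookup by (date, member).
store = SnapshotStore()
rows_written = store.batch_write("20161009", [(1001, b"blob-a"), (1002, b"blob-b")])
snapshot = store.point_read("20161009", 1001)
```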

21. Data Model. Column family: VIEWING-HISTORY. Row key: DATE_MEMBERID_IDX.
    Row key            Column         Value
    20161009_1001_0    ViewingData    BLOB
    20161009_1001_1    ViewingData    BLOB
    20161009_1001_2    ViewingData    BLOB
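The DATE_MEMBERID_IDX key spreads one member's data for a day across several rows, each with an index suffix. The slide does not say why or how the split is made; a plausible reading is that large blobs are chunked to keep individual values small. A minimal sketch under that assumption, where the chunk size, the helper names `chunk_rows` and `read_blob`, and the "read until an index is missing" convention are all hypothetical:

```python
from typing import Dict, Iterator, Tuple

CHUNK_SIZE = 4  # bytes; deliberately tiny for the demo, a hypothetical value


def chunk_rows(date: str, member_id: int, blob: bytes,
               chunk_size: int = CHUNK_SIZE) -> Iterator[Tuple[str, bytes]]:
    """Yield (row_key, chunk) pairs keyed as DATE_MEMBERID_IDX."""
    for idx, start in enumerate(range(0, len(blob), chunk_size)):
        yield f"{date}_{member_id}_{idx}", blob[start:start + chunk_size]


def read_blob(store: Dict[str, bytes], date: str, member_id: int) -> bytes:
    """Reassemble a blob by reading chunk 0, 1, 2, ... until a key is missing."""
    parts = []
    idx = 0
    while True:
        key = f"{date}_{member_id}_{idx}"
        if key not in store:
            break
        parts.append(store[key])
        idx += 1
    return b"".join(parts)


# Usage: a dict stands in for the column family.
store = dict(chunk_rows("20161009", 1001, b"abcdefghij"))
restored = read_blob(store, "20161009", 1001)
```

With a 10-byte blob and 4-byte chunks this produces three rows, with the same key shape as the three rows on the slide.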

22. Online A/B Testing

23. Trending Now: videos that are trending and personalized for you

24. Trending Now: It’s 7 PM on a Monday

25. Trending Now: It’s 10 PM on a Saturday

26. Trending Now: Pokémon

27. Fast Feedback Loop

28. Trending Now - Data Infrastructure. Components shown: Impression Service (captures videos shown in view port); Viewing History Service (captures videos played by members); Ratings Service; ..; Compute Trends; Trends Store; UI; Model Training; Publish Models; Online Services

29. State Management in Cassandra
    Video                      Number of Plays
    Stranger Things            100
    Narcos                     200
    Orange is the new Black    300
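The table above is running state: a play count per video that has to survive between batches of incoming play events. A minimal sketch of folding a new batch into that state, with a plain dict standing in for the Cassandra table. The function name `apply_batch` and the event format (one video title per play) are assumptions for illustration; the slide does not show how the update is performed.

```python
from collections import Counter
from typing import Dict, Iterable


def apply_batch(state: Dict[str, int], played_videos: Iterable[str]) -> Dict[str, int]:
    """Add the play counts from one batch of events to the stored totals."""
    batch_counts = Counter(played_videos)
    for video, plays in batch_counts.items():
        state[video] = state.get(video, 0) + plays
    return state


# Starting state taken from the slide's table.
state = {
    "Stranger Things": 100,
    "Narcos": 200,
    "Orange is the new Black": 300,
}

# One hypothetical batch of play events.
apply_batch(state, ["Narcos", "Stranger Things", "Narcos"])
```

After this batch, Narcos has 202 plays, Stranger Things has 101, and Orange is the new Black is unchanged at 300.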