Theory Meets Reality: Large Scale Frequent Pattern Mining with Apache Spark in the Real World

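Salesforce Einstein is the artificial intelligence layer that delivers predictions and recommendations based on each customer's unique business processes and data. Einstein Journey Insights is one of the key products offered by Salesforce DMP. It helps marketers and publishers use AI to analyze billions of touchpoints across the consumer journey and discover the optimal paths to conversion, including insights into channels, messages, and events.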

1.

2.

3.

4.

5.

6.

7.

8.

9. [Chart: number of combinations (y-axis) vs. number of items in set (x-axis). 8 items: 256 combinations.]

10. [Chart: number of combinations (y-axis) vs. number of items in set (x-axis). 8 items: 256 combinations; 20 items: 1,048,576 combinations.]

11. [Chart: number of combinations (y-axis) vs. number of items in set (x-axis). 8 items: 256 combinations; 20 items: 1,048,576 combinations; 140,000 items: ???]
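The figures on slides 9 to 11 are consistent with counting every subset of an n-item set, that is 2^n combinations (an assumption inferred from 256 = 2^8 and 1,048,576 = 2^20; the slides do not state the formula). A minimal Python sketch under that assumption, with hypothetical function names, shows why the 140,000-item case is marked "???": the count is too large to write out, so the sketch reports only how many decimal digits it has.

```python
import math


def num_combinations(n_items: int) -> int:
    """Number of subsets (including the empty set) of a set with n_items elements."""
    return 2 ** n_items


def num_digits_of_combinations(n_items: int) -> int:
    """Decimal digits of 2**n_items, computed via logarithms instead of printing the integer."""
    return math.floor(n_items * math.log10(2)) + 1


for n in (8, 20):
    print(f"{n} items -> {num_combinations(n):,} combinations")

print(f"140,000 items -> a number with {num_digits_of_combinations(140_000):,} decimal digits")
```

Enumerating every combination is therefore out of the question at this scale, which is the motivation for the pruning and tree-based approaches on the later slides.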

12.

13. Theory Meets Reality: Large Scale Frequent Pattern Mining with Apache Spark in the Real World. Kexin Xie, Architect of Marketing Cloud Einstein; Wanderley Liu, Senior Data Science Engineer. Contact: kexin.xie@salesforce.com, @realstraw; wanderley.liu@salesforce.com

14. Marketing Cloud Einstein Journey Insights (GA). Learn how customers are actually interacting with your brand. Track the entire consumer journey: gather online and offline interactions to stitch together a complete view of the consumer. Discover the optimal path to conversion: use AI to analyze all journey permutations and automatically recommend the best channels, offers and sequences that lead to conversion.

15. What is Frequent Pattern Mining? [Image: Mine Shaft Mural Painting by Frank Wilson]

16.

17. Items: a, b, c, d, e

18. User: Items
u-1: a, b
u-2: b, c, d
u-3: a, c, d, e
u-4: a, d, e
u-5: a, b, c
u-6: a, b, c, d
u-7: a
u-8: a, b, c
u-9: a, b, d
u-10: b, c, e

19. The User/Items table from slide 18, plus a support table for single items.
Item support: a = 8, b = 7, c = 6, d = 5, e = 3
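The support column on slide 19 can be reproduced by counting, for each item, how many users have it. The sketch below is a plain-Python illustration using the slide-18 data; the names `TRANSACTIONS` and `item_support` are hypothetical and not taken from the talk.

```python
from collections import Counter

# User/Items table from slide 18.
TRANSACTIONS = {
    "u-1": ["a", "b"],
    "u-2": ["b", "c", "d"],
    "u-3": ["a", "c", "d", "e"],
    "u-4": ["a", "d", "e"],
    "u-5": ["a", "b", "c"],
    "u-6": ["a", "b", "c", "d"],
    "u-7": ["a"],
    "u-8": ["a", "b", "c"],
    "u-9": ["a", "b", "d"],
    "u-10": ["b", "c", "e"],
}


def item_support(transactions):
    """Count, for each item, the number of users whose item set contains it."""
    counts = Counter()
    for items in transactions.values():
        counts.update(set(items))  # set() so an item counts at most once per user
    return dict(counts)


support = item_support(TRANSACTIONS)
for item in sorted(support, key=lambda i: (-support[i], i)):
    print(item, support[item])
```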

20. The User/Items table from slide 18, with support tables for single items and for item pairs.
Item support: a = 8, b = 7, c = 6, d = 5, e = 3
Pair support: (a, b) = 5, (a, c) = 4, (a, d) = 4, (a, e) = 2, ...

21. The User/Items table from slide 18, with both support tables and a threshold: Min Support = 4.
Item support: a = 8, b = 7, c = 6, d = 5, e = 3
Pair support: (a, b) = 5, (a, c) = 4, (a, d) = 4, (a, e) = 2, ...

22. The User/Items table from slide 18 with Min Support = 4; the item and pair support tables each appear twice, side by side.
Item support: a = 8, b = 7, c = 6, d = 5, e = 3
Pair support: (a, b) = 5, (a, c) = 4, (a, d) = 4, (a, e) = 2, ...

23. Same layout as slide 22 (Min Support = 4), with the second copy of each table labeled.
L1 Patterns (item support): a = 8, b = 7, c = 6, d = 5, e = 3
L2 Patterns (pair support): (a, b) = 5, (a, c) = 4, (a, d) = 4, (a, e) = 2, ...
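Slides 20 to 23 can be read as: count the support of every single item and every pair, then keep the ones that reach Min Support. The following Python sketch does this by brute force over the slide-18 data. It assumes that "L1 Patterns" and "L2 Patterns" mean the frequent patterns of size 1 and size 2; the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Item sets of users u-1 .. u-10 from slide 18.
TRANSACTIONS = [
    ["a", "b"],
    ["b", "c", "d"],
    ["a", "c", "d", "e"],
    ["a", "d", "e"],
    ["a", "b", "c"],
    ["a", "b", "c", "d"],
    ["a"],
    ["a", "b", "c"],
    ["a", "b", "d"],
    ["b", "c", "e"],
]
MIN_SUPPORT = 4


def pattern_support(transactions, size):
    """Support of every pattern of the given size that occurs in at least one transaction."""
    counts = Counter()
    for items in transactions:
        for pattern in combinations(sorted(set(items)), size):
            counts[pattern] += 1
    return counts


def keep_frequent(counts, min_support):
    """Keep only the patterns whose support reaches the threshold."""
    return {pattern: n for pattern, n in counts.items() if n >= min_support}


l1 = keep_frequent(pattern_support(TRANSACTIONS, 1), MIN_SUPPORT)
l2 = keep_frequent(pattern_support(TRANSACTIONS, 2), MIN_SUPPORT)
print("L1:", l1)
print("L2:", l2)
```

With Min Support = 4, item e (support 3) and the pair (a, e) (support 2) fall below the threshold and are dropped.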

24. A-priori Principle: “All sub-patterns of a frequent pattern are frequent.” [Image: A Priori in Berkeley, CA]
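The principle on slide 24 can be checked empirically on the slide-18 data. The sketch below is a brute-force illustration, not the algorithm used in the talk: it enumerates every pattern, keeps the frequent ones, and verifies that each of their non-empty sub-patterns is also frequent. All names are hypothetical.

```python
from itertools import combinations

# Item sets of users u-1 .. u-10 from slide 18.
TRANSACTIONS = [
    {"a", "b"},
    {"b", "c", "d"},
    {"a", "c", "d", "e"},
    {"a", "d", "e"},
    {"a", "b", "c"},
    {"a", "b", "c", "d"},
    {"a"},
    {"a", "b", "c"},
    {"a", "b", "d"},
    {"b", "c", "e"},
]


def support(pattern, transactions):
    """Number of transactions that contain every item of the pattern."""
    return sum(1 for t in transactions if set(pattern) <= t)


def frequent_patterns(transactions, min_support):
    """Brute force: test every non-empty combination of items against the threshold."""
    items = sorted(set().union(*transactions))
    found = {}
    for size in range(1, len(items) + 1):
        for pattern in combinations(items, size):
            n = support(pattern, transactions)
            if n >= min_support:
                found[pattern] = n
    return found


def apriori_holds(frequent):
    """True if every non-empty proper sub-pattern of a frequent pattern is frequent."""
    return all(
        sub in frequent
        for pattern in frequent
        for size in range(1, len(pattern))
        for sub in combinations(pattern, size)
    )


frequent = frequent_patterns(TRANSACTIONS, min_support=4)
print(len(frequent), "frequent patterns; a-priori principle holds:", apriori_holds(frequent))
```

The property holds for any threshold because every transaction that contains a pattern also contains each of its sub-patterns, so a sub-pattern's support can never be lower than the pattern's.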

25. Min Support = 4.
Item support: a = 8, b = 7, c = 6, d = 5, e = 3
Pair support: (a, b) = ?, (a, c) = ?, (a, d) = ?, (a, e) = ?, ...

26. The tables from slide 25 shown twice, side by side: once with Min Support = 4 and once with Min Support = 6.
Item support (both copies): a = 8, b = 7, c = 6, d = 5, e = 3
Pair support (both copies): (a, b) = ?, (a, c) = ?, (a, d) = ?, (a, e) = ?, ...
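One way to read slides 25 and 26: by the a-priori principle on slide 24, a pair can only be frequent if both of its items are frequent, so pairs that contain an infrequent item never need to be counted. The sketch below is an illustration under that reading; `candidate_pairs` is a hypothetical name. It shows how raising the threshold from 4 to 6 shrinks the set of pairs whose "?" still has to be filled in.

```python
from itertools import combinations

# Single-item support from slide 25.
ITEM_SUPPORT = {"a": 8, "b": 7, "c": 6, "d": 5, "e": 3}


def candidate_pairs(item_support, min_support):
    """Pairs built only from items that are frequent on their own."""
    frequent_items = sorted(i for i, n in item_support.items() if n >= min_support)
    return list(combinations(frequent_items, 2))


all_pairs = list(combinations(sorted(ITEM_SUPPORT), 2))
print("without pruning:", len(all_pairs), "pairs")
for threshold in (4, 6):
    pairs = candidate_pairs(ITEM_SUPPORT, threshold)
    print(f"Min Support = {threshold}: {len(pairs)} candidate pairs {pairs}")
```

Candidates are not guaranteed to be frequent; they are only the pairs that still need a counting pass over the data.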

27. FP-Growth

28. [Figure: FP-tree with header table.]
Header table (item support): a = 8, b = 7, c = 6, d = 5, e = 3
Tree nodes under root: a: 8; b: 5, b: 2; c: 3, c: 2, c: 1; d: 1 (five nodes); e: 1 (three nodes)

29. [Figure: same FP-tree and header table as slide 28.]
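The node labels on slides 28 and 29 match what a standard FP-tree construction produces for the slide-18 data. The sketch below is a minimal plain-Python illustration of that construction, not the speakers' Spark implementation. It assumes that items within each transaction are ordered by descending support with ties broken alphabetically, and that no item is filtered out (the figure still contains e, whose support is 3). The class and function names are hypothetical.

```python
from collections import Counter

# Item sets of users u-1 .. u-10 from slide 18.
TRANSACTIONS = [
    ["a", "b"],
    ["b", "c", "d"],
    ["a", "c", "d", "e"],
    ["a", "d", "e"],
    ["a", "b", "c"],
    ["a", "b", "c", "d"],
    ["a"],
    ["a", "b", "c"],
    ["a", "b", "d"],
    ["b", "c", "e"],
]


class FPNode:
    """One tree node: an item, a count, and child nodes keyed by item."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions):
    """Insert each transaction as a path from the root, sharing common prefixes."""
    support = Counter(item for t in transactions for item in set(t))
    root = FPNode(None)
    header = {}  # item -> list of tree nodes carrying that item
    for t in transactions:
        node = root
        # Assumed ordering: most frequent item first, ties broken alphabetically.
        for item in sorted(set(t), key=lambda i: (-support[i], i)):
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
                header.setdefault(item, []).append(child)
            child.count += 1
            node = child
    return root, header, support


def render(node, depth=0):
    """Return the tree as indented 'item: count' lines."""
    lines = []
    for child in node.children.values():
        lines.append("  " * depth + f"{child.item}: {child.count}")
        lines.extend(render(child, depth + 1))
    return lines


root, header, support = build_fp_tree(TRANSACTIONS)
print("root")
for line in render(root, depth=1):
    print(line)
```

The header table links every item to all of its nodes, so summing the node counts for an item gives back its support (for example 5 + 2 = 7 for b).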