Theory Meets Reality: Large Scale Frequent Pattern Mining with Apache Spark in the Real World
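9 . [Chart: number of combinations vs. number of items in set. 8 items: 256 combinations]
10 . [Chart: number of combinations vs. number of items in set. 8 items: 256 combinations; 20 items: 1,048,576 combinations]
11 . [Chart: number of combinations vs. number of items in set. 8 items: 256; 20 items: 1,048,576; 140,000 items: ???]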
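The figures on these slides match the number of subsets of an n-item set, 2^n (2^8 = 256, 2^20 = 1,048,576). The following is a minimal sketch of that arithmetic, assuming this is the intended formula; the function names are illustrative and not from the talk. For 140,000 items the count is reported by its number of decimal digits, computed with logarithms, because the value itself is too large to print usefully.

```python
import math


def subset_count(n_items: int) -> int:
    """Number of subsets (including the empty set) of n_items distinct items."""
    return 2 ** n_items


def decimal_digits_of_subset_count(n_items: int) -> int:
    """Number of decimal digits in 2**n_items, computed via log10."""
    return math.floor(n_items * math.log10(2)) + 1


for n in (8, 20):
    print(f"{n} items -> {subset_count(n):,} combinations")

# For 140,000 items, report only the size of the number.
digits = decimal_digits_of_subset_count(140_000)
print(f"140,000 items -> a number with {digits:,} decimal digits")
```

The count for 140,000 items has more than 42,000 decimal digits, which is why enumerating every combination is not an option at that scale.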
13 . Theory Meets Reality: Large Scale Frequent Pattern Mining with Apache Spark in the Real World
Kexin Xie, Architect of Marketing Cloud Einstein (kexin.xie@salesforce.com, @realstraw)
Wanderley Liu, Senior Data Science Engineer (wanderley.liu@salesforce.com)
14 . Marketing Cloud Einstein Journey Insights (GA)
Learn how customers are actually interacting with your brand.
Track the entire consumer journey: gather online and offline interactions to stitch together a complete view of the consumer.
Discover the optimal path to conversion: use AI to analyze all journey permutations and automatically recommend the best channels, offers and sequences that lead to conversion.
15 . What Is Frequent Pattern Mining? [Image: Mine Shaft Mural Painting by Frank Wilson]
17 . Items: a, b, c, d, e
18 .
User   Items
u-1    a, b
u-2    b, c, d
u-3    a, c, d, e
u-4    a, d, e
u-5    a, b, c
u-6    a, b, c, d
u-7    a
u-8    a, b, c
u-9    a, b, d
u-10   b, c, e
19 . (transactions as on slide 18)
item   support
a      8
b      7
c      6
d      5
e      3
20 . (transactions as on slide 18)
item   support
a      8
b      7
c      6
d      5
e      3

item   support
a, b   5
a, c   4
a, d   4
a, e   2
...    ...
21 . Min Support = 4 (transactions as on slide 18)
item   support
a      8
b      7
c      6
d      5
e      3

item   support
a, b   5
a, c   4
a, d   4
a, e   2
...    ...
22 . Min Support = 4 (transactions as on slide 18; each support table is shown twice, side by side)
item   support     item   support
a      8           a      8
b      7           b      7
c      6           c      6
d      5           d      5
e      3           e      3

item   support     item   support
a, b   5           a, b   5
a, c   4           a, c   4
a, d   4           a, d   4
a, e   2           a, e   2
...    ...         ...    ...
23 . Min Support = 4 (transactions as on slide 18; each support table is shown twice, side by side)
item   support     item   support
a      8           a      8
b      7           b      7         L1 Patterns
c      6           c      6
d      5           d      5
e      3           e      3

item   support     item   support
a, b   5           a, b   5
a, c   4           a, c   4         L2 Patterns
a, d   4           a, d   4
a, e   2           a, e   2
...    ...         ...    ...
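The support tables above can be reproduced by counting, for each item or item pair, how many users' item sets contain it, then keeping only those that reach the minimum support. This is a brute-force sketch in plain Python, not the Spark implementation from the talk; `count_support` and `frequent` are illustrative names.

```python
from collections import Counter
from itertools import combinations

# Transactions from slide 18 (user -> set of items).
TRANSACTIONS = {
    "u-1": {"a", "b"},
    "u-2": {"b", "c", "d"},
    "u-3": {"a", "c", "d", "e"},
    "u-4": {"a", "d", "e"},
    "u-5": {"a", "b", "c"},
    "u-6": {"a", "b", "c", "d"},
    "u-7": {"a"},
    "u-8": {"a", "b", "c"},
    "u-9": {"a", "b", "d"},
    "u-10": {"b", "c", "e"},
}


def count_support(transactions, size):
    """Count how many transactions contain each item combination of `size`."""
    counts = Counter()
    for items in transactions.values():
        for combo in combinations(sorted(items), size):
            counts[combo] += 1
    return counts


def frequent(counts, min_support):
    """Keep only the patterns whose support reaches min_support."""
    return {p: s for p, s in counts.items() if s >= min_support}


MIN_SUPPORT = 4
l1_patterns = frequent(count_support(TRANSACTIONS, 1), MIN_SUPPORT)
l2_patterns = frequent(count_support(TRANSACTIONS, 2), MIN_SUPPORT)

print("L1:", sorted(l1_patterns.items()))
print("L2:", sorted(l2_patterns.items()))
```

With a minimum support of 4, item e (support 3) and the pair a, e (support 2) drop out, in line with the counts shown on the slides.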
24 . A-priori Principle: "All sub-patterns of a frequent pattern are frequent" [Image: A Priori in Berkeley, CA]
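The principle implies its contrapositive: a pattern with any infrequent sub-pattern cannot be frequent, so it never needs to be counted. Below is a minimal single-machine sketch of a level-wise Apriori search built on that idea, run on the ten transactions from slide 18. It is a textbook-style illustration, not the talk's implementation; the function names are assumptions.

```python
from itertools import combinations

# Transactions from slide 18, one set of items per user.
TRANSACTIONS = [
    {"a", "b"}, {"b", "c", "d"}, {"a", "c", "d", "e"}, {"a", "d", "e"},
    {"a", "b", "c"}, {"a", "b", "c", "d"}, {"a"}, {"a", "b", "c"},
    {"a", "b", "d"}, {"b", "c", "e"},
]


def support(pattern, transactions):
    """Number of transactions that contain every item of `pattern`."""
    return sum(1 for t in transactions if pattern <= t)


def apriori(transactions, min_support):
    """Return {frozenset pattern: support} for all frequent patterns."""
    items = {item for t in transactions for item in t}
    level = {}
    for item in items:
        pattern = frozenset([item])
        s = support(pattern, transactions)
        if s >= min_support:
            level[pattern] = s

    result = {}
    k = 1
    while level:
        result.update(level)
        # Join: merge frequent k-patterns into (k+1)-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k + 1}
        # Prune (A-priori principle): every k-item sub-pattern must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in level for sub in combinations(c, k))
        }
        # Count only the surviving candidates.
        level = {}
        for c in candidates:
            s = support(c, transactions)
            if s >= min_support:
                level[c] = s
        k += 1
    return result


patterns = apriori(TRANSACTIONS, min_support=4)
for pattern, s in sorted(patterns.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(pattern), s)
```

On this data the prune step discards {a, b, d} and {a, c, d} without counting them, because {b, d} and {c, d} each have support 3.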
25 . Min Support = 4
item   support
a      8
b      7
c      6
d      5
e      3

item   support
a, b   ?
a, c   ?
a, d   ?
a, e   ?
...    ...
26 . Min Support = 4 (left) and Min Support = 6 (right)
item   support     item   support
a      8           a      8
b      7           b      7
c      6           c      6
d      5           d      5
e      3           e      3

item   support     item   support
a, b   ?           a, b   ?
a, c   ?           a, c   ?
a, d   ?           a, d   ?
a, e   ?           a, e   ?
...    ...         ...    ...
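One way to read the question marks: by the A-priori principle, only pairs made entirely of frequent single items need their support counted, so the minimum support decides how many pairs are candidates at all. The sketch below assumes that reading; `candidate_pairs` is an illustrative name, and the single-item supports are copied from the slide.

```python
from itertools import combinations

# Single-item supports from the slide.
ITEM_SUPPORT = {"a": 8, "b": 7, "c": 6, "d": 5, "e": 3}


def candidate_pairs(item_support, min_support):
    """Pairs whose two items are both frequent; only these need counting."""
    frequent_items = sorted(
        item for item, s in item_support.items() if s >= min_support
    )
    return list(combinations(frequent_items, 2))


for min_support in (4, 6):
    pairs = candidate_pairs(ITEM_SUPPORT, min_support)
    print(f"Min Support = {min_support}: {len(pairs)} candidate pairs -> {pairs}")
```

With five items there are 10 possible pairs. A minimum support of 4 leaves 6 candidates (e is excluded), and a minimum support of 6 leaves 3.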
27 . FP-Growth
28 . Header Table
item   support
a      8
b      7
c      6
d      5
e      3

FP-tree (node: count; nesting follows from inserting the slide 18 transactions with items ordered by descending support)
root
  a: 8
    b: 5
      c: 3
        d: 1
      d: 1
    c: 1
      d: 1
        e: 1
    d: 1
      e: 1
  b: 2
    c: 2
      d: 1
      e: 1
29 . (Same header table and FP-tree as slide 28.)
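The node counts on this slide are consistent with an FP-tree built from the ten transactions on slide 18, with each transaction's items sorted by descending support before insertion. The following is a minimal single-machine sketch of that construction under those assumptions; the `Node` class and `build_fp_tree` are illustrative, not code from the talk or from Spark. No minimum support is applied by default, since the slide's tree still contains item e.

```python
from collections import Counter

# Transactions from slide 18.
TRANSACTIONS = [
    ["a", "b"], ["b", "c", "d"], ["a", "c", "d", "e"], ["a", "d", "e"],
    ["a", "b", "c"], ["a", "b", "c", "d"], ["a"], ["a", "b", "c"],
    ["a", "b", "d"], ["b", "c", "e"],
]


class Node:
    """One FP-tree node: an item, a count, and children keyed by item."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support=1):
    """Build an FP-tree; return (root, header table, item supports)."""
    support = Counter(item for t in transactions for item in set(t))
    # Rank items by descending support, breaking ties by item name.
    ordered = sorted(support, key=lambda item: (-support[item], item))
    rank = {item: position for position, item in enumerate(ordered)}

    root = Node(None)
    header = {}  # item -> list of tree nodes that carry this item
    for t in transactions:
        kept = [item for item in set(t) if support[item] >= min_support]
        node = root
        for item in sorted(kept, key=rank.get):
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header.setdefault(item, []).append(child)
            child.count += 1  # shared prefixes accumulate counts
            node = child
    return root, header, support


def dump(node, depth=0):
    """Render the tree as indented 'item: count' lines."""
    lines = []
    for child in node.children.values():
        lines.append("  " * depth + f"{child.item}: {child.count}")
        lines.extend(dump(child, depth + 1))
    return lines


root, header, support = build_fp_tree(TRANSACTIONS)
print("\n".join(dump(root)))
```

Summing the counts of all nodes listed under one item in the header table gives back that item's support, which is what lets FP-Growth mine patterns from the tree without rescanning the transactions.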