Detecting Fraud in Pharmacy Claims with Apache Spark
1. Pharmacy Claims – Fraud Detection
Giridharan Gurumoorthy – Kavi Global
Rajesh Inbasekaran – Kavi Global
#DS9SAIS
2. Outline
– Background
  • Fraud Waste and Abuse
  • Current Status and Challenges
– Project
  • Objective
  • Steps
– Implementation
– Initial Results
– Next steps
– Q&A
3. Background
• Healthcare spend: $3.35 Trillion
• Pharmacies: 67,000
• Prescriptions: 4 billion
• Physicians: 950K
• Further figures on the slide (labels not recoverable): $10K, 700, 11.9%, $6.5B
4. Background
• Physician: Prescribes
• Patient: Fills up
• Pharmacy: Submits Claim
• Insurance Company: Pays Claim
5. Fraud, Waste and Abuse
• 3% to 10% of healthcare spend
• Possible actors (alone or in collusion)
  – Patient
  – Physician
  – Pharmacy
6. Current Status and Challenges
• Transaction level checks are in place
• Actor level checks are primitive
• Rule based checks are based on historical fraud
  – Fraudsters are innovative
• False positives are expensive
7. Project Objective
• Identify anomalous / possible fraudulent actors
• Rank them
  – Prioritize investigations

What did we do?
• Identify and rank anomalous behavior
• Identify and rank anomalous relationships between actors
• Generate consolidated scores to rank
8. Steps
• Consolidated Score
• Relationship Scoring
• Behavior Scoring
• Data Summary
• Data Cleaning
9. Data Summary
• Categories: Patient; Physician; Pharmacy; Patient & Physician; Patient & Pharmacy; Physician & Pharmacy; Patient & Physician & Pharmacy
• Metrics: Patient Count; Physician Count; Pharmacy Count; Claim Volume; Cost Utilized; Quantity Dispensed; Drug Spread
• Timeframe: monthly (Jan to Dec) and quarterly (Q1 to Q4)
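The summary step can be pictured as a group-by over claim records, keyed by an actor category and a timeframe. The following is a minimal plain-Python sketch, not the authors' Spark job. The record fields (`patient`, `physician`, `pharmacy`, `drug`, `cost`, `quantity`, `month`), the function name `summarize`, and the reading of "Drug Spread" as the number of distinct drugs are all assumptions made for illustration.

```python
from collections import defaultdict


def summarize(claims, category=("patient",)):
    """Aggregate claims per (category key..., month).

    `category` names the actor fields to group by, e.g. ("patient",)
    or ("patient", "pharmacy"). All field names are hypothetical.
    """
    groups = defaultdict(list)
    for claim in claims:
        key = tuple(claim[field] for field in category) + (claim["month"],)
        groups[key].append(claim)

    summary = {}
    for key, rows in groups.items():
        summary[key] = {
            "claim_volume": len(rows),
            "cost": sum(r["cost"] for r in rows),
            "quantity": sum(r["quantity"] for r in rows),
            # Assumption: drug spread = number of distinct drugs claimed.
            "drug_spread": len({r["drug"] for r in rows}),
            "physician_count": len({r["physician"] for r in rows}),
            "pharmacy_count": len({r["pharmacy"] for r in rows}),
        }
    return summary


claims = [
    {"patient": "p1", "physician": "d1", "pharmacy": "rx1",
     "drug": "A", "cost": 10.0, "quantity": 30, "month": "Jan"},
    {"patient": "p1", "physician": "d2", "pharmacy": "rx1",
     "drug": "B", "cost": 5.0, "quantity": 10, "month": "Jan"},
    {"patient": "p2", "physician": "d1", "pharmacy": "rx2",
     "drug": "A", "cost": 7.5, "quantity": 20, "month": "Feb"},
]

per_patient = summarize(claims)
per_pair = summarize(claims, category=("patient", "pharmacy"))
```

Quarterly summaries would follow the same pattern with a quarter field in place of `month`.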
10. Anomalous behavior
• Grouping similar behavior
• Group Density
• Distance from Group Center
11. K-Means Model
12. Optimal 'K'
• Elbow Method
  – Run the clustering algorithm for different values of K
  – Calculate the within-cluster sum of squares (WSS) for each K and plot the curve
  – The location of the bend in the curve is taken as the optimal value of 'K'
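The elbow procedure above can be sketched in a few lines. This is a plain-Python illustration using Lloyd's algorithm, not the Spark model from the talk. The function names are hypothetical, and the deterministic farthest-first initialisation is an assumption chosen so the toy run is reproducible; a production K-means would normally use a randomised scheme.

```python
import math


def farthest_first(points, k):
    """Pick k starting centers: the first point, then repeatedly the
    point farthest from all centers chosen so far (an assumption made
    here for reproducibility)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    return centers


def kmeans(points, k, max_iters=100):
    """Lloyd's algorithm; returns the final list of centers."""
    centers = farthest_first(points, k)
    for _ in range(max_iters):
        # Assignment step: put every point in its nearest center's cluster.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its members.
        new_centers = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centers[c]
            for c, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def wss(points, centers):
    """Within-cluster sum of squares: squared distance to nearest center."""
    return sum(min(math.dist(p, c) ** 2 for c in centers) for p in points)


def elbow_curve(points, k_values):
    """WSS for each candidate K; the bend is then read off a plot."""
    return {k: wss(points, kmeans(points, k)) for k in k_values}


# Toy data: three tight groups of three points each.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
    (20.0, 0.0), (20.0, 1.0), (21.0, 0.0),
]
curve = elbow_curve(points, [1, 2, 3, 4, 5])
for k, value in curve.items():
    print(k, round(value, 3))
```

On this toy data the WSS drops sharply up to K = 3 and only slightly afterwards, which is the bend the slide refers to. The sketch leaves locating the bend to visual inspection, as the slide does.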
13. Anomaly Score
• Density Factor = Size of the cluster / Total size
• Distance Factor = Distance between the data point and the center of its cluster / Distance of the farthest point in the cluster from that center
• Score = Max of the two factors
• Actor Score = Weighted sum of the level-wise scores
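The formulas above translate fairly directly into code. The sketch below is plain Python with hypothetical function names, and it takes cluster labels and centers as given. The slides do not say whether the density ratio is used as stated or as its complement, so the `invert_density` flag is an assumption that lets small clusters score high; the default follows the slide literally. What counts as a "level" is also not spelled out, so `actor_score` simply takes a list of per-level scores and weights.

```python
import math
from collections import Counter, defaultdict


def anomaly_scores(points, labels, centers, invert_density=False):
    """Per-point score = max(density factor, distance factor).

    points  : list of coordinate tuples
    labels  : cluster id for each point
    centers : mapping from cluster id to center coordinates
    """
    total = len(points)
    sizes = Counter(labels)

    # Distance of every point to the center of its own cluster.
    dists = [math.dist(p, centers[c]) for p, c in zip(points, labels)]

    # Farthest member per cluster, used to scale distances into [0, 1].
    farthest = defaultdict(float)
    for d, c in zip(dists, labels):
        farthest[c] = max(farthest[c], d)

    scores = []
    for d, c in zip(dists, labels):
        density = sizes[c] / total
        if invert_density:
            # Assumption, not from the slides: small clusters score high.
            density = 1.0 - density
        distance = d / farthest[c] if farthest[c] > 0 else 0.0
        scores.append(max(density, distance))
    return scores


def actor_score(level_scores, weights):
    """Weighted sum of an actor's level-wise scores."""
    return sum(w * s for w, s in zip(weights, level_scores))


points = [(0.0, 0.0), (4.0, 0.0), (1.0, 0.0), (10.0, 10.0)]
labels = [0, 0, 0, 1]
centers = {0: (1.0, 0.0), 1: (10.0, 10.0)}

literal = anomaly_scores(points, labels, centers)
inverted = anomaly_scores(points, labels, centers, invert_density=True)
```

With the literal density ratio, members of large clusters score high regardless of distance; with the complement, the lone point in cluster 1 stands out instead. Which behaviour the authors intended is not recoverable from the slides.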
14. Example
Size = Cluster_Size_Factor, Dist = Cluster_Distance_Factor, G1 to G4 = Group 1 to Group 4

Patient Level Score card
Patient ID     G1 Size  G1 Dist  G2 Size  G2 Dist  G3 Size  G3 Dist  G4 Size  G4 Dist  Anomaly_Score
Patient - 1    1.0000   1.0000   0.9974   1.0000   0.9979   1.0000   0.9979   1.0000   0.9992
Patient - 2    1.0000   0.9985   0.9974   0.9997   0.9979   1.0000   0.9979   1.0000   0.9989
Patient - 3    0.9971   0.9997   0.9974   0.9997   0.9979   0.9998   0.9979   0.9998   0.9987
Patient - 4    0.9971   0.9990   0.9974   0.9997   0.9979   0.9998   0.9979   0.9998   0.9986
Patient - 5    0.9971   0.9972   0.9974   1.0000   0.9979   1.0000   0.9979   1.0000   0.9984
Patient - 6    0.9971   0.9962   0.9974   1.0000   0.9979   1.0000   0.9979   1.0000   0.9983

Physician Level Score card
Physician ID   G1 Size  G1 Dist  G2 Size  G2 Dist  G3 Size  G3 Dist  G4 Size  G4 Dist  Anomaly_Score
Physician - 1  0.9965   1.0000   0.9973   1.0000   0.9979   1.0000   0.9979   1.0000   1.0000
Physician - 2  0.9997   1.0000   0.9973   1.0000   0.9979   1.0000   0.9979   1.0000   1.0000
Physician - 3  0.9965   1.0000   0.9973   1.0000   0.9979   1.0000   0.9979   1.0000   1.0000
Physician - 4  0.9997   0.9998   0.9973   1.0000   0.9979   1.0000   0.9979   1.0000   1.0000
Physician - 5  0.9965   0.9998   0.9973   1.0000   0.9979   1.0000   0.9979   1.0000   0.9999
Physician - 6  0.9965   0.9998   0.9973   1.0000   0.9979   1.0000   0.9979   1.0000   0.9999

Pharmacy Level Score card
Pharmacy ID    G1 Size  G1 Dist  G2 Size  G2 Dist  G3 Size  G3 Dist  G4 Size  G4 Dist  Anomaly_Score
Pharmacy - 1   0.9776   1.0000   0.9974   1.0000   0.9973   1.0000   0.9979   1.0000   1.0000
Pharmacy - 2   0.9776   1.0000   0.9974   1.0000   0.9973   1.0000   0.9979   1.0000   1.0000
Pharmacy - 3   0.9776   1.0000   0.9974   1.0000   0.9973   1.0000   0.9979   1.0000   1.0000
Pharmacy - 4   0.9994   0.9998   0.9998   1.0000   0.9998   1.0000   0.9999   1.0000   1.0000
Pharmacy - 5   0.9776   0.9998   0.9974   1.0000   0.9973   1.0000   0.9979   1.0000   0.9999
Pharmacy - 6   0.9776   0.9998   0.9974   1.0000   0.9973   1.0000   0.9979   1.0000   0.9999
15. Anomalous Relationship
• Analyze
  – # of Connected Neighbors
  – # of Neighbors' Neighbors
16. GraphX
• Create a graph with multiple node types
  – Physician
  – Patient
  – Pharmacy
17. Calculate Neighbor Count
• First Level Neighbor
  – Compute the immediate neighbors' degree from each vertex
• Second Level Neighbor
  – Identify the degree of the neighbors' neighbors and bring it to the parent vertex
• Third Level Neighbor
  – Identify the degree of the second level neighbors' neighbors and bring it to the parent vertex
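One way to read the three neighbor levels above is as the number of distinct vertices first reached at one, two and three hops from a vertex. The sketch below takes that reading in plain Python rather than GraphX. The function names are hypothetical, and linking the patient, physician and pharmacy of each claim pairwise is an assumption about how the graph is built.

```python
from collections import defaultdict


def build_graph(claims):
    """Undirected adjacency sets from (patient, physician, pharmacy) claims.

    Assumption: every claim connects its three actors pairwise.
    """
    adj = defaultdict(set)
    for patient, physician, pharmacy in claims:
        pairs = ((patient, physician), (patient, pharmacy), (physician, pharmacy))
        for a, b in pairs:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def neighbor_counts(adj, source, max_hops=3):
    """Count of distinct vertices first reached at hop 1, 2, ..., max_hops."""
    seen = {source}
    frontier = [source]
    counts = []
    for _ in range(max_hops):
        reached = set()
        for vertex in frontier:
            reached.update(n for n in adj[vertex] if n not in seen)
        seen |= reached
        counts.append(len(reached))
        frontier = list(reached)
    return counts


claims = [
    ("pat1", "doc1", "rx1"),
    ("pat2", "doc1", "rx1"),
    ("pat3", "doc2", "rx1"),
    ("pat4", "doc2", "rx2"),
]
adj = build_graph(claims)
counts_by_vertex = {v: neighbor_counts(adj, v) for v in list(adj)}
print(counts_by_vertex["pat1"])  # [2, 3, 2]
print(counts_by_vertex["rx1"])   # [5, 2, 0]
```

In GraphX the same counts would be propagated with message passing between vertices; the breadth-first walk here is only meant to make the quantity being counted concrete.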
18. Anomaly Score
• Based on position in each count
• Consolidated score = Max of all the counts
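The slide does not define "position", so the sketch below assumes it means a percentile rank: the fraction of vertices whose count is less than or equal to the vertex's own. Each neighbor level is ranked separately and the relationship score is the maximum over levels. Function names and the input layout are hypothetical.

```python
from bisect import bisect_right


def percentile_ranks(values):
    """Map each key to the fraction of entries with value <= its own."""
    ordered = sorted(values.values())
    n = len(ordered)
    return {key: bisect_right(ordered, v) / n for key, v in values.items()}


def relationship_scores(counts_by_vertex):
    """counts_by_vertex maps vertex -> [level 1, level 2, level 3] counts.

    Returns the max percentile rank across levels for every vertex.
    """
    levels = len(next(iter(counts_by_vertex.values())))
    per_level = [
        percentile_ranks({v: c[level] for v, c in counts_by_vertex.items()})
        for level in range(levels)
    ]
    return {v: max(ranks[v] for ranks in per_level) for v in counts_by_vertex}


counts = {
    "a": [1, 5, 0],
    "b": [2, 3, 0],
    "c": [3, 1, 9],
    "d": [4, 2, 1],
}
scores = relationship_scores(counts)
print(scores)  # {'a': 1.0, 'b': 0.75, 'c': 1.0, 'd': 1.0}
```

Taking the maximum means a vertex is flagged if it is extreme at any single level, even when its other counts are ordinary.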
19. Consolidated Score
• Weighted sum of the Behavior Score and the Relationship Score, with configurable weights
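As a small illustration of the weighted sum, the sketch below combines two score dictionaries and ranks actors by the result. The function names, the default weights of 0.5 each, and the choice to treat a missing score as 0.0 are assumptions; the slides only say the weights are configurable.

```python
def consolidated_scores(behavior, relationship, w_behavior=0.5, w_relationship=0.5):
    """Weighted sum of behavior and relationship scores per actor.

    Assumption: an actor missing from one dictionary contributes 0.0 there.
    """
    actors = behavior.keys() | relationship.keys()
    return {
        actor: w_behavior * behavior.get(actor, 0.0)
        + w_relationship * relationship.get(actor, 0.0)
        for actor in actors
    }


def rank_actors(scores):
    """Actors ordered from highest to lowest consolidated score."""
    return sorted(scores, key=scores.get, reverse=True)


behavior = {"pharmacy-1": 1.0, "pharmacy-2": 0.5, "pharmacy-3": 0.25}
relationship = {"pharmacy-1": 0.5, "pharmacy-2": 1.0}

combined = consolidated_scores(behavior, relationship, w_behavior=0.75, w_relationship=0.25)
ranking = rank_actors(combined)
print(ranking)  # ['pharmacy-1', 'pharmacy-2', 'pharmacy-3']
```

Shifting weight toward the relationship score would move `pharmacy-2` ahead of `pharmacy-1`, which is the kind of tuning the configurable weights allow.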
20. Implementation – How it works
• Web Application for the use of Analysts
• Features
  – Upload Claim Data
  – Run models
  – View Results: actor-wise ranks
  – Action: Tag as False Positive or Initiate Case
  – Feedback: Input investigation status
21. Initial Results
• Many findings would have escaped rule based checks
• Initial investigation results show fewer false positives
• Ranking weights might need some tuning
22. Next steps
• Supervised techniques with investigation results
• Use additional data
  – Social media: Twitter, Facebook, review data
  – Address and property data: Zillow
23. Q&A
24. Contact
Rajesh Inbasekaran: Rajesh@kaviglobal.com
Giridharan Gurumoorthy: Giridharan@kaviglobal.com