Detecting Fraud in Pharmacy Claims with Apache Spark

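It is estimated that as many as 10% of the pharmacy claims submitted to health plans and insurance companies are fraudulent. A two-step process that combines unsupervised and supervised learning techniques can be used to identify fraud, waste, and abuse in the pharmaceutical industry effectively. The claims that pharmacies submit to insurers and health plans are a rich source of data for predicting fraud, waste, and abuse. Unsupervised techniques such as clustering, univariate and multivariate outlier analysis, link analysis, and simulated fraud signatures can detect anomalies in the data and thereby identify suspicious activity.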

1.Pharmacy Claims – Fraud Detection
Giridharan Gurumoorthy – Kavi Global
Rajesh Inbasekaran – Kavi Global
#DS9SAIS

2.Outline
– Background
  • Fraud Waste and Abuse
  • Current Status and Challenges
– Project
  • Objective
  • Steps
– Implementation
– Initial Results
– Next steps
– Q&A

3.Background (infographic)
• Healthcare spend: $3.35 Trillion
• Pharmacies: 67,000
• Prescriptions: 4 billion
• Physicians: 950K
• Further figures shown without recoverable labels: $10K, 700, 11.9%, $6.5B

4.Background
• Physician: Prescribes
• Patient: Fills up
• Pharmacy: Submits Claim
• Insurance Company: Pays Claim

5.Fraud Waste and Abuse
• 3% to 10% of healthcare spend
• Possible Actors (alone or in collusion)
  – Patient
  – Physician
  – Pharmacy

6.Current Status and Challenges
• Transaction level checks are in place
• Actor level checks are primitive
• Rule based checks are based on historical fraud
  – Fraudsters are innovative
• False positives are expensive

7.Project Objective
• Identify anomalous / possible fraudulent actors
• Rank them
  – Prioritize investigations
What did we do?
• Identify and rank anomalous behavior
• Identify and rank anomalous relationships between actors
• Generate consolidated scores to rank

8.Steps (diagram, listed top to bottom)
• Consolidated Score
• Relationship Scoring
• Behavior Scoring
• Data Summary
• Data Cleaning

9.Data Summary
• Categories
  – Patient
  – Physician
  – Pharmacy
  – Patient & Pharmacy
  – Patient & Physician
  – Physician & Pharmacy
  – Patient & Physician & Pharmacy
• Metrics
  – Claim Volume
  – Patient Count
  – Physician Count
  – Pharmacy Count
  – Cost Utilized
  – Quantity Dispensed
  – Drug Spread
• Timeframe
  – Monthly: Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
  – Quarterly: Q1 Q2 Q3 Q4
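
The category, metric, and timeframe breakdown on this slide amounts to a grouped aggregation. The following is a minimal sketch in plain Python rather than Spark, assuming a hypothetical claim row of the form (patient, physician, pharmacy, month, cost, quantity). The row layout, the `CLAIMS` sample, and the function names are illustrative assumptions and do not come from the deck. Drug Spread is left out because the sample rows carry no drug field.

```python
from collections import defaultdict

# Hypothetical claim rows: (patient, physician, pharmacy, month, cost, quantity)
CLAIMS = [
    ("pat1", "doc1", "rx1", 1, 120.0, 30),
    ("pat1", "doc2", "rx1", 2, 80.0, 60),
    ("pat2", "doc1", "rx2", 4, 45.5, 10),
    ("pat2", "doc1", "rx1", 4, 300.0, 90),
]

FIELD_INDEX = {"patient": 0, "physician": 1, "pharmacy": 2}


def quarter(month):
    """Map a month number (1-12) to a quarter label such as 'Q1'."""
    return "Q%d" % ((month - 1) // 3 + 1)


def summarize(claims, category, period=quarter):
    """Aggregate metrics per (category key..., timeframe).

    category is a tuple of actor fields, e.g. ("patient", "pharmacy").
    """
    groups = defaultdict(list)
    for row in claims:
        key = tuple(row[FIELD_INDEX[f]] for f in category) + (period(row[3]),)
        groups[key].append(row)

    summary = {}
    for key, rows in groups.items():
        summary[key] = {
            "claim_volume": len(rows),
            "patient_count": len({r[0] for r in rows}),
            "physician_count": len({r[1] for r in rows}),
            "pharmacy_count": len({r[2] for r in rows}),
            "cost": sum(r[4] for r in rows),
            "quantity_dispensed": sum(r[5] for r in rows),
        }
    return summary
```

Calling `summarize(CLAIMS, ("physician",))` gives one row per physician and quarter; passing `period=lambda m: m` would switch the timeframe to monthly.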

10.Anomalous behavior
• Grouping similar behavior
• Group Density
• Distance from Group Center

11.K Means Model

12.Optimal ‘K’
• Elbow Method
  – Run the clustering algorithm for different values of K
  – Calculate the WSS (within-cluster sum of squares) for each K and plot the curve
  – The location of the bend is the optimal value of ‘K’
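
The elbow steps above can be sketched in a few lines. This is a plain-Python illustration, not the Spark job behind the deck: `kmeans` is a small Lloyd's-algorithm stand-in with a deterministic farthest-point start, and the `elbow` helper picks the bend with a second-difference heuristic, which is an assumption on my part since the slide reads the bend off the plot. All function names and the sample points are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]


def kmeans(points, k, iters=50):
    """Lloyd's algorithm with a deterministic farthest-point initialisation."""
    centers = [tuple(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(tuple(far))
    for _ in range(iters):
        labels = assign(points, centers)
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the previous center if a cluster ends up empty.
            updated.append(tuple(sum(c) / len(members) for c in zip(*members))
                           if members else centers[j])
        if updated == centers:
            break
        centers = updated
    return centers, assign(points, centers)


def wss_curve(points, k_values):
    """Within-cluster sum of squares for each candidate K."""
    curve = {}
    for k in k_values:
        centers, labels = kmeans(points, k)
        curve[k] = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return curve


def elbow(curve):
    """Pick the K where the WSS curve bends most (largest second difference)."""
    ks = sorted(curve)
    bends = {ks[i]: (curve[ks[i - 1]] - curve[ks[i]]) - (curve[ks[i]] - curve[ks[i + 1]])
             for i in range(1, len(ks) - 1)}
    return max(bends, key=bends.get)


# Three well-separated groups of four points each (illustrative data).
POINTS = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (20, 0), (20, 1), (21, 0), (21, 1)]
```

With `curve = wss_curve(POINTS, range(1, 7))`, `elbow(curve)` selects the K at which adding another cluster stops reducing the WSS sharply.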

13.Anomaly Score
• Density Factor = Size of the cluster / Total Size
• Distance Factor = Distance between the data point and the center of its cluster / Distance from the center to the farthest point in the cluster
• Score = Max of the two factors
• Actor Score = Weighted sum of the level-wise scores
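
A minimal sketch of these formulas, assuming cluster labels and centers are already available from a clustering step. It follows the slide literally: the deck does not say whether the density factor is meant to be inverted so that small clusters score higher, so it is left as written. The function names, the `levels` keys, and the weights are hypothetical.

```python
import math


def anomaly_scores(points, labels, centers):
    """Per-point score = max(density factor, distance factor), as on the slide."""
    total = len(points)
    sizes = {}     # cluster label -> number of members
    farthest = {}  # cluster label -> largest member-to-center distance
    dists = []
    for p, lab in zip(points, labels):
        d = math.dist(p, centers[lab])
        dists.append(d)
        sizes[lab] = sizes.get(lab, 0) + 1
        farthest[lab] = max(farthest.get(lab, 0.0), d)

    scores = []
    for d, lab in zip(dists, labels):
        density_factor = sizes[lab] / total
        # A single-point cluster has no spread; treat its distance factor as 0.
        distance_factor = d / farthest[lab] if farthest[lab] > 0 else 0.0
        scores.append(max(density_factor, distance_factor))
    return scores


def actor_score(level_scores, weights):
    """Weighted sum of an actor's level-wise scores."""
    return sum(weights[level] * score for level, score in level_scores.items())
```

For example, `actor_score({"g1": 1.0, "g2": 0.5}, {"g1": 0.5, "g2": 0.5})` combines two level scores with equal weights.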

14.Example
Columns: Size = Cluster_Size_Factor, Dist = Cluster_Distance_Factor, for Groups 1 to 4 (G1 to G4).

Patient Level Score card
Patient ID     G1 Size  G1 Dist  G2 Size  G2 Dist  G3 Size  G3 Dist  G4 Size  G4 Dist  Anomaly_Score
Patient - 1    1.0000   1.0000   0.9974   1.0000   0.9979   1.0000   0.9979   1.0000   0.9992
Patient - 2    1.0000   0.9985   0.9974   0.9997   0.9979   1.0000   0.9979   1.0000   0.9989
Patient - 3    0.9971   0.9997   0.9974   0.9997   0.9979   0.9998   0.9979   0.9998   0.9987
Patient - 4    0.9971   0.9990   0.9974   0.9997   0.9979   0.9998   0.9979   0.9998   0.9986
Patient - 5    0.9971   0.9972   0.9974   1.0000   0.9979   1.0000   0.9979   1.0000   0.9984
Patient - 6    0.9971   0.9962   0.9974   1.0000   0.9979   1.0000   0.9979   1.0000   0.9983

Physician Level Score card
Physician ID   G1 Size  G1 Dist  G2 Size  G2 Dist  G3 Size  G3 Dist  G4 Size  G4 Dist  Anomaly_Score
Physician - 1  0.9965   1.0000   0.9973   1.0000   0.9979   1.0000   0.9979   1.0000   1.0000
Physician - 2  0.9997   1.0000   0.9973   1.0000   0.9979   1.0000   0.9979   1.0000   1.0000
Physician - 3  0.9965   1.0000   0.9973   1.0000   0.9979   1.0000   0.9979   1.0000   1.0000
Physician - 4  0.9997   0.9998   0.9973   1.0000   0.9979   1.0000   0.9979   1.0000   1.0000
Physician - 5  0.9965   0.9998   0.9973   1.0000   0.9979   1.0000   0.9979   1.0000   0.9999
Physician - 6  0.9965   0.9998   0.9973   1.0000   0.9979   1.0000   0.9979   1.0000   0.9999

Pharmacy Level Score card
Pharmacy ID    G1 Size  G1 Dist  G2 Size  G2 Dist  G3 Size  G3 Dist  G4 Size  G4 Dist  Anomaly_Score
Pharmacy - 1   0.9776   1.0000   0.9974   1.0000   0.9973   1.0000   0.9979   1.0000   1.0000
Pharmacy - 2   0.9776   1.0000   0.9974   1.0000   0.9973   1.0000   0.9979   1.0000   1.0000
Pharmacy - 3   0.9776   1.0000   0.9974   1.0000   0.9973   1.0000   0.9979   1.0000   1.0000
Pharmacy - 4   0.9994   0.9998   0.9998   1.0000   0.9998   1.0000   0.9999   1.0000   1.0000
Pharmacy - 5   0.9776   0.9998   0.9974   1.0000   0.9973   1.0000   0.9979   1.0000   0.9999
Pharmacy - 6   0.9776   0.9998   0.9974   1.0000   0.9973   1.0000   0.9979   1.0000   0.9999

15.Anomalous Relationship
• Analyze
  – # of Connected Neighbors
  – # of Neighbors’ Neighbors

16.GraphX
• Create a graph with multiple node types
  – Physician
  – Patient
  – Pharmacy

17.Calculate Neighbor Count
• First Level Neighbor
  – Compute the immediate neighbors’ degree for each vertex
• Second Level Neighbor
  – Identify the neighbors’ neighbors’ degree and bring it to the parent vertex
• Third Level Neighbor
  – Identify the second level neighbors’ neighbors’ degree and bring it to the parent vertex
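
One possible reading of these three levels is a count of the distinct vertices reachable in exactly one, two, and three hops. The sketch below implements that reading with a breadth-first expansion over a plain adjacency dictionary. It is not GraphX code, and both the interpretation and the names (`build_adjacency`, `neighbor_counts`, the sample edges) are assumptions made for illustration.

```python
def build_adjacency(edges):
    """Undirected adjacency sets from (vertex, vertex) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def neighbor_counts(adj, vertex, max_level=3):
    """Number of new vertices first reached at hop 1, 2, ..., max_level."""
    seen = {vertex}
    frontier = {vertex}
    counts = []
    for _ in range(max_level):
        reached = set()
        for v in frontier:
            reached.update(adj.get(v, ()))
        reached -= seen          # keep only vertices not seen at a lower level
        counts.append(len(reached))
        seen |= reached
        frontier = reached
    return counts


# Hypothetical patient/physician/pharmacy links derived from claims.
EDGES = [("pat1", "doc1"), ("pat1", "rx1"), ("pat2", "doc1"),
         ("pat2", "rx2"), ("pat3", "doc2"), ("pat3", "rx2")]
ADJ = build_adjacency(EDGES)
```

`neighbor_counts(ADJ, "pat1")` returns a list with one count per level, which can then be compared across actors of the same type.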

18.Anomaly Score
• Based on position in each count
• Consolidated score = Max of all the counts
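
The slide does not define "position", so the sketch below assumes a percentile-style rank: for each neighbor-count level, an actor's score is the share of actors whose count is less than or equal to its own, and the relationship score is the maximum across levels. The function names and the input layout are hypothetical.

```python
def position_scores(values):
    """Share of actors whose count is <= each value (1.0 for the largest)."""
    n = len(values)
    return [sum(1 for other in values if other <= v) / n for v in values]


def relationship_scores(counts_by_actor):
    """Max over levels of each actor's position within that level's counts.

    counts_by_actor maps actor id -> list of neighbor counts, one per level.
    """
    actors = list(counts_by_actor)
    n_levels = len(counts_by_actor[actors[0]])
    per_level = [position_scores([counts_by_actor[a][lvl] for a in actors])
                 for lvl in range(n_levels)]
    return {a: max(per_level[lvl][i] for lvl in range(n_levels))
            for i, a in enumerate(actors)}
```

Taking the maximum means an actor that stands out at any one level is ranked high, even if its other counts are ordinary.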

19.Consolidated Score
• Sum of Behavior Score and Relationship Score with configurable weightages
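
As a small illustration of this step, the sketch below combines the two scores per actor and sorts actors for investigation. The default weights of 0.5 each and the function name are placeholders, not values from the deck.

```python
def consolidated_ranking(behavior, relationship, w_behavior=0.5, w_relationship=0.5):
    """Weighted sum of the two scores per actor, highest first.

    Only actors present in both score dictionaries are ranked.
    """
    actors = behavior.keys() & relationship.keys()
    combined = {a: w_behavior * behavior[a] + w_relationship * relationship[a]
                for a in actors}
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)
```

Raising `w_relationship` moves actors with unusual network positions up the list; the deck notes later that these weightages may need tuning.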

20.Implementation – How it works?
• Web Application for the use of Analysts
• Features
  – Upload Claim Data
  – Run models
  – View Results: actor-wise ranks
  – Action: tag as False Positive or initiate a case
  – Feedback: input investigation status

21.Initial Results
• Many findings would have escaped rule-based checks
• Initial investigation results show fewer false positives
• Ranking weightages might need some tuning

22.Next steps
• Supervised techniques with investigation results
• Use additional data
  – Social media: Twitter, Facebook, review data
  – Address and property data: Zillow

23.Q&A

24.Contact
Rajesh Inbasekaran: Rajesh@kaviglobal.com
Giridharan Gurumoorthy: Giridharan@kaviglobal.com