Automating Loss Prevention Using NLP with FastAI on Azure Databricks
PetSmart, with over 1,600 stores in North America, is the largest specialty pet retailer of services and solutions for the lifetime needs of pets. The Advanced Analytics Group is a small team of highly business-oriented strategy and data science professionals that applies a range of data and modeling methodologies to generate breakthrough insights for business units throughout the company, delivering top- and bottom-line growth for PetSmart.
As a retailer, PetSmart has many years of transaction data, loss prevention store reports, customer feedback, labor schedules, supply chain data, and other data. Loss prevention deals with the reduction of preventable losses, whether from theft, fraud, vandalism, waste, abuse, incidents, accidents, or misconduct.
Store leaders at PetSmart locations submit free-text reports to the Loss Prevention team of investigators, and these reports must be prioritized for further resolution. Most reports are low priority and are filed as a matter of policy fulfillment, but some require further investigation by the team. Even so, the team must still read every report to filter out the low-priority ones before spending time on the higher-priority ones. The Advanced Analytics Group was asked whether we could help prioritize these reports automatically. A system trusted to prioritize reports without a human reading each one would need near-human performance. To achieve that level (96% accuracy), we used FastAI's ULMFiT NLP classifier.
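As a rough illustration of how a classifier's output could drive the prioritization described above, the sketch below splits reports into a review queue and a low-priority pile. It is not the team's implementation: the `Report` type, the `triage` function, the 0.5 threshold, and the `score_fn` callback are all hypothetical, and in practice the scores would come from the trained ULMFiT classifier.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Report:
    """A free-text store report (hypothetical structure)."""
    report_id: str
    text: str


ScoredReport = Tuple[Report, float]


def triage(
    reports: List[Report],
    score_fn: Callable[[str], float],
    threshold: float = 0.5,  # assumed cutoff, to be tuned on validation data
) -> Tuple[List[ScoredReport], List[ScoredReport]]:
    """Split reports into (needs_review, low_priority).

    score_fn is assumed to return the probability that a report
    is high priority, e.g. from a text classifier.
    """
    scored = [(report, score_fn(report.text)) for report in reports]
    # Highest-scoring reports first, so investigators see them first.
    needs_review = sorted(
        (pair for pair in scored if pair[1] >= threshold),
        key=lambda pair: pair[1],
        reverse=True,
    )
    low_priority = [pair for pair in scored if pair[1] < threshold]
    return needs_review, low_priority
```

Passing the scoring function in as a parameter keeps the routing logic independent of the model, so the same code could be exercised with a stand-in scorer before a trained classifier is available.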
FastAI is not natively supported on Azure Databricks, so setup required special configuration. Azure Databricks' newly released ML Beta and GPU clusters were instrumental in enabling the setup. Other challenges included extracting the data from a legacy reporting system. Without the flexibility Azure Databricks provides, iterating on, training, and eventually operationalizing the model would have taken much longer and cost more.