Insurance Fraud Detection: Leveraging Positive and Unlabeled Learning in Imbalanced Data
Insurance fraud data is highly imbalanced, and it can be difficult to define what constitutes fraud within the data. In addition, insurance claims are labeled only as a result of fraud investigations, and only highly suspicious claims are investigated. As a result, the collected data can be categorized as a positive and unlabeled (PU) dataset, where positively labeled data poin
