Eighty thousand Kindle users. Sixty-five million Tumblr users. What do they have in common? Both groups had their login credentials stolen by hackers. While these attacks didn't directly target financial accounts, the stolen information is likely being sold on the Dark Web and used to build larger profiles that will grant fraudsters access to sensitive financial accounts. Given the number of accounts potentially at stake from just these two breaches, it's no wonder that financial organizations and retailers are putting significant resources into uncovering aberrations in user behavior in order to stay ahead of the game. Fraud prevention solutions are critical to that success, and those who can accurately identify fraud patterns and build systems to predict and mitigate them are in the driver's seat.
What Does a Fraud Data Scientist Do?
Data science is a term heard everywhere these days. As with “Big Data,” ask 100 people what data science means and you will get 100 different definitions. Even though data scientists are among the most in-demand (and well-paid) professionals, it is still difficult to understand what a data scientist does.
The consensus among practitioners is that a data scientist is part developer, part statistician, with an MBA thrown in for good measure. The most common way to define this new profile is with Drew Conway’s Data Science Venn Diagram.
In that diagram, data science is the intersection of hacking skills, math and statistics knowledge, and substantive expertise. Knowledge from those fields feeds the pillars of data science (computing, statistics, mathematics and other quantitative disciplines), which combine to analyze data for better decision making. It is crucial to understand the importance of every pillar: to arrive at a successful solution, you need the skills to extract, process and model data, and to deploy a system that solves an actual business problem.
Data Science and Fraud Prevention
Now that I have defined what a data scientist does, let’s explore how the science can be applied to fraud prevention. Fraud is a huge problem. According to the European Central Bank, card fraud grew 19 percent between 2010 and 2015. Moreover, fraud has spread across digital channels, including mobile and online payments, creating new challenges as innovative fraud patterns emerge. Finding effective methods to mitigate fraud therefore remains a challenge.
An effective fraud data scientist must be able to tackle several issues specific to fraud. First, fraud is a cost-sensitive problem: the financial cost of missing a fraudulent transaction is different from the cost of wrongly blocking a legitimate one. Second, there is an enormous number of transactions to sort through, and since only a tiny portion of them are fraudulent, the data suffers from a huge class imbalance. Third, a real fraud detection system must respond in milliseconds so as not to slow down legitimate transactions. This requirement needs to be taken into account in the modeling process for the system to be successfully implemented.
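To make the cost-sensitivity and class-imbalance points concrete, here is a minimal Python sketch. It assumes a model has already produced a fraud probability for each transaction, and it uses a hypothetical flat cost of 5 for wrongly blocking a legitimate transaction. The Transaction class, REVIEW_COST and all of the numbers are illustrative assumptions, not figures from a real system. The cost-based rule blocks a transaction when the expected loss from allowing it exceeds the expected cost of blocking it, which is one common way to make cost-sensitive decisions, not the only one.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    amount: float      # value of the transaction
    fraud_prob: float  # model's estimated probability of fraud (assumed given)
    is_fraud: bool     # ground-truth label


# Hypothetical cost of wrongly blocking a legitimate transaction.
REVIEW_COST = 5.0


def block_by_threshold(tx, threshold=0.5):
    """Cost-blind rule: block only when fraud is more likely than not."""
    return tx.fraud_prob > threshold


def block_by_expected_cost(tx, review_cost=REVIEW_COST):
    """Cost-sensitive rule: block when allowing is the costlier option."""
    expected_cost_allow = tx.fraud_prob * tx.amount
    expected_cost_block = (1.0 - tx.fraud_prob) * review_cost
    return expected_cost_allow > expected_cost_block


def total_cost(transactions, decisions, review_cost=REVIEW_COST):
    """Money lost to missed fraud plus the cost of false alarms."""
    cost = 0.0
    for tx, blocked in zip(transactions, decisions):
        if tx.is_fraud and not blocked:
            cost += tx.amount      # missed fraud: lose the full amount
        elif not tx.is_fraud and blocked:
            cost += review_cost    # false alarm: pay the review cost
    return cost


def accuracy(transactions, decisions):
    """Share of transactions where the block decision matches the label."""
    correct = sum(tx.is_fraud == blocked
                  for tx, blocked in zip(transactions, decisions))
    return correct / len(transactions)


# Toy, heavily imbalanced data: 99 legitimate transactions and 1 fraud.
transactions = (
    [Transaction(20.0, 0.01, False) for _ in range(98)]
    + [Transaction(2000.0, 0.01, False)]   # large but legitimate
    + [Transaction(1000.0, 0.30, True)]    # the single fraud
)

threshold_decisions = [block_by_threshold(tx) for tx in transactions]
cost_decisions = [block_by_expected_cost(tx) for tx in transactions]

threshold_accuracy = accuracy(transactions, threshold_decisions)
cost_accuracy = accuracy(transactions, cost_decisions)
threshold_cost = total_cost(transactions, threshold_decisions)
cost_based_cost = total_cost(transactions, cost_decisions)

print("threshold rule:", threshold_accuracy, threshold_cost)
print("cost-based rule:", cost_accuracy, cost_based_cost)
```

On this toy data set both rules are 99 percent accurate, yet the fixed threshold loses the full 1,000 of the missed fraud while the cost-based rule pays only 5 for one false alarm. That is why accuracy alone is a poor yardstick when fraud is rare and errors carry unequal costs.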
Tackling this problem requires advanced knowledge of statistics and machine learning, domain insight, and a deep understanding of a fraudster’s behaviors and motivations. Computer architecture also matters when building a successful fraud detection algorithm: because response time is measured in milliseconds, an expert implementation is vital to stop fraud proactively. If you want to learn more about data science, see my recent presentation on Modern Data Science.