Deep learning for large-scale online fraud detection

Online fraud is often orchestrated by organized crime rings that use malicious user accounts to actively target modern online services for financial gain. Ting-Fang Yen shares a real-time, scalable fraud detection solution backed by deep learning and built on Spark and TensorFlow, and demonstrates how the system outperforms traditional solutions such as blacklists and machine learning.


Talk Title	Deep learning for large-scale online fraud detection
Speakers	Ting-Fang Yen (DataVisor)
Conference	Artificial Intelligence Conference
Conf Tag	Put AI to Work
Location	San Francisco, California
Date	September 5-7, 2018

Online fraud is often orchestrated by organized crime rings that use coordinated malicious user accounts, either newly created or obtained through account hijacking, to actively target modern online services for financial gain. Existing fraud solutions either rely on reputation lists to block known suspicious activities or require extensive feature engineering by human analysts for model training. These approaches neither adapt well to changing fraud patterns nor scale to large data volumes.

DataVisor analyzes activities from billions of accounts across global online services to detect fraud and abuse, giving the company unique insights into the online fraud landscape that allow it to tackle coordinated fraud attacks holistically. Ting-Fang Yen shares DataVisor’s real-time, scalable fraud detection solution, which is backed by deep learning and built on Spark and TensorFlow, and demonstrates how the system significantly outperforms traditional solutions such as blacklists and machine learning at terabyte scale.

The solution is one of the few production examples of deep learning models applied to security problems. It is based on digital information commonly collected by online services, including IP addresses, user-agent strings, email domains, and user nicknames. The general fraud detection framework can identify fraudulent activities in log data that contain all or a subset of this common digital information. Because it relies only on such common information, the model is agnostic to the specific application or service from which the data originate.
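To make the idea of a service-agnostic input concrete, here is a minimal Python sketch of how a log event carrying all or only some of the common fields might be turned into a fixed-length numeric vector that a model could consume. This is not DataVisor's method, which the abstract does not describe. The field names (`ip`, `user_agent`, `email_domain`, `nickname`), the hashing scheme, the tokenization choices, and the bucket count are all assumptions made for illustration.

```python
import hashlib
from typing import Dict, List, Optional

# Hypothetical field names: the abstract lists only the kinds of
# information used, not a schema.
FIELDS = ("ip", "user_agent", "email_domain", "nickname")
BUCKETS_PER_FIELD = 8  # kept tiny for illustration; an assumption


def _bucket(field: str, token: str) -> int:
    """Map a (field, token) pair to a stable bucket index."""
    digest = hashlib.sha256(f"{field}={token}".encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % BUCKETS_PER_FIELD


def tokens_for(field: str, value: str) -> List[str]:
    """Split a raw field value into tokens (illustrative choices only)."""
    if field == "ip":
        # For dotted IPv4, also emit the first three octets as a prefix token.
        parts = value.split(".")
        return [value, ".".join(parts[:3])] if len(parts) == 4 else [value]
    if field == "nickname":
        # Character trigrams, so similar nicknames share some buckets.
        lowered = value.lower()
        return [lowered[i:i + 3] for i in range(max(1, len(lowered) - 2))]
    return [value.lower()]


def featurize(event: Dict[str, Optional[str]]) -> List[float]:
    """Hash whichever fields are present into a fixed-length count vector."""
    vector = [0.0] * (len(FIELDS) * BUCKETS_PER_FIELD)
    for slot, field in enumerate(FIELDS):
        value = event.get(field)
        if not value:
            continue  # a missing field leaves its slice of the vector at zero
        for token in tokens_for(field, value):
            vector[slot * BUCKETS_PER_FIELD + _bucket(field, token)] += 1.0
    return vector


if __name__ == "__main__":
    # An event that carries only a subset of the common fields.
    partial = {"ip": "203.0.113.7", "nickname": "abcd"}
    print(featurize(partial))
```

Because each field owns a fixed slice of the vector, events from different services produce inputs of the same shape regardless of which fields they log, which is one simple way to realize the "all or a subset" property described above.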

Scaling the AI hierarchy of needs with TensorFlow, Spark, and Hops

Distributed TensorFlow on Hops

Distributed systems for stream processing: Apache Kafka and Spark Streaming

Improving user-merchant propensity modeling using neural collaborative filtering and wide and deep models on Spark BigDL at scale

Using Kubernetes to Offer Scalable Deep Learning on Alibaba Cloud

Machine learning at scale with Kubernetes