TY - JOUR
KW - banking transactions
KW - fraud detection
KW - diffusion models
KW - synthetic dataset generation
AU - Yurii Pushkarenko
AU - Volodymyr Zaslavskyi
AB - Detection of fraudulent transactions in payment and banking systems using credit cards is a significant challenge, primarily due to the limitations in accessing real-world data necessary for training models and developing algorithms to analyze transaction streams for accuracy. Real data related to contractual relationships between financial systems and their clients is confidential, which influences both the formation of the data recorded in transactions and the analysis of transaction flows to identify fraudulent activities. This paper explores the potential of using diffusion models to generate realistic synthetic transaction data aimed at improving the performance of fraud detection algorithms. Particular emphasis is placed on processing datasets that contain a mix of categorical (textual) and numerical attributes and exhibit a pronounced class imbalance between legitimate and fraudulent transactions. A comparison is presented between the effectiveness of traditional fraud detection methods on real transaction data and the proposed approach, which actively employs synthetic data generated using diffusion models. The results demonstrate significant improvements in the reliability of models in accurately detecting fraud, highlighting the potential of diffusion models as a powerful tool in the development of more effective fraud detection systems.
BT - Information & Security: An International Journal
DO - https://doi.org/10.11610/isij.5534
IS - 2
N2 - Detection of fraudulent transactions in payment and banking systems using credit cards is a significant challenge, primarily due to the limitations in accessing real-world data necessary for training models and developing algorithms to analyze transaction streams for accuracy. Real data related to contractual relationships between financial systems and their clients is confidential, which influences both the formation of the data recorded in transactions and the analysis of transaction flows to identify fraudulent activities. This paper explores the potential of using diffusion models to generate realistic synthetic transaction data aimed at improving the performance of fraud detection algorithms. Particular emphasis is placed on processing datasets that contain a mix of categorical (textual) and numerical attributes and exhibit a pronounced class imbalance between legitimate and fraudulent transactions. A comparison is presented between the effectiveness of traditional fraud detection methods on real transaction data and the proposed approach, which actively employs synthetic data generated using diffusion models. The results demonstrate significant improvements in the reliability of models in accurately detecting fraud, highlighting the potential of diffusion models as a powerful tool in the development of more effective fraud detection systems.
PY - 2024
SE - 185
SP - 185
EP - 198
T2 - Information & Security: An International Journal
TI - Synthetic Data Generation for Fraud Detection Using Diffusion Models
VL - 55
ER -