BCCC-DeFiFraudTrans-2025 is a large-scale, Ethereum-based benchmark designed explicitly for profiling fraudulent and legitimate DeFi transactions. It contains 1,026,867 annotated transaction samples spanning 2017 to 2024, drawn from 9,374 unique wallet addresses. The dataset integrates both wallet-level and transaction-level attributes, with 79 features extracted via the DeFiTransLyzer-V1.0 analyzer. These features are organized into categories, including gas usage, cumulative gas consumption, token transfers, nonce behavior, block identifiers, transaction status, and error rates, providing a fine-grained view of transaction dynamics. Unlike many prior datasets, BCCC-DeFiFraudTrans-2025 was curated to be balanced, feature-rich, and validated: fraud labels were assigned based on Etherscan tags and then cross-checked through anomaly detection, consistency verification, and duplicate removal. The dataset supports the evaluation of advanced fraud detection methods, including the detection of zero-day attacks, by eliminating reliance on prior wallet history and focusing purely on transaction-level behavior.
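A quick way to sanity-check properties such as class balance and the number of unique wallets is to tally them from the CSV. The sketch below is illustrative only: this page does not specify the file layout, so the column names `wallet_address` and `label`, the label values, and the inline sample rows are all assumptions made for the example, not the dataset's actual schema.

```python
import csv
import io
from collections import Counter


def summarize(rows, label_col="label", wallet_col="wallet_address"):
    """Return (label counts, number of unique wallets, number of rows).

    The default column names are hypothetical; adjust them to match
    the headers in the downloaded file.
    """
    labels = Counter()
    wallets = set()
    n_rows = 0
    for row in rows:
        labels[row[label_col]] += 1
        wallets.add(row[wallet_col])
        n_rows += 1
    return labels, len(wallets), n_rows


# Tiny synthetic stand-in for the real file (values are made up).
SAMPLE = """wallet_address,gas_used,nonce,label
0xaaa,21000,1,legitimate
0xaaa,52000,2,legitimate
0xbbb,90000,7,fraudulent
0xccc,21000,3,fraudulent
"""

labels, n_wallets, n_rows = summarize(csv.DictReader(io.StringIO(SAMPLE)))

# Share of each class; a balanced dataset has roughly equal shares.
balance = {name: count / n_rows for name, count in labels.items()}
print(labels, n_wallets, balance)
```

To run the same check on the downloaded data, pass `csv.DictReader(open(path, newline=""))` in place of the in-memory sample, where `path` points to the local CSV file. Streaming rows this way avoids loading the full set of roughly one million samples into memory at once.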
The dataset and its underlying principles are described in full in the following research paper:
"Advanced Genetic Algorithm and Penalty Fitness Function for Enhancing DeFi Security and Detecting Ethereum Fraud Transactions"
Arash Habibi Lashkari, Sepideh Hajihosseinkhani, Joshua Duarte, Isabella Lopez, Ziba Habibi Lashkari, Sergio Rios-Aguilar, Blockchain: Research and Applications, 2025
Download Dataset:
