Public

Vulnerable Smart Contracts (BCCC-VulSCs-2023)

The BCCC-VulSCs-2023 dataset is a collection of 36,670 Solidity smart contract (SC) samples for contract analysis, each with 70 feature columns. These features include the raw source code of the smart contract, a hashed version of the source code for secure referencing, and a binary label that marks a contract as secure (0) or […]

Tabular IoT Attack Dataset (CIC-BCCC-NRC TabularIoTAttack-2024)

The CIC-BCCC-NRC TabularIoTAttack-2024 dataset is a comprehensive collection of IoT network traffic data generated as part of an advanced effort to create a reliable source for training and testing AI-powered IoT cybersecurity models. This dataset is designed to address modern challenges in detecting and identifying IoT-specific cyberattacks, offering a rich and diverse set of labeled […]

Malicious DNS and Attacks (BCCC-CIC-Bell-DNS-2024)

Using ALFlowLyzer, we successfully generated an augmented dataset, "BCCC-CIC-Bell-DNS-2024," from two existing datasets: "CIC-Bell-DNS-2021" and "CIC-Bell-DNS-EXF-2021." ALFlowLyzer enabled the extraction of essential flows from raw network traffic data, resulting in CSV files that integrate DNS metadata and application layer features. This new dataset combines light and heavy data exfiltration traffic into six unique sub-categories, providing […]

Smart Contracts Vulnerabilities (BCCC-SCsVuls-2024)

The BCCC-SCsVuls-2024 dataset is a comprehensive resource for analyzing and detecting vulnerabilities in Solidity-based smart contracts, featuring 111,897 meticulously labeled samples across 11 vulnerabilities such as Re-entrancy (17,698), IntegerUO (16,740), DenialOfService (12,394), and Secure contracts (26,914). The dataset was curated from reputable sources like Smart Bugs, Ethereum SCs, and SmartScan-Dataset, ensuring diverse and representative vulnerability […]

Source Code Authorship Attribution (YU-SCAA-2022)

Source Code Authorship Attribution (SCAA) is a technique used to identify the actual author of source code within a corpus. Although it poses a privacy threat to open-source programmers, it is significantly helpful in developing forensic-based applications, such as ghostwriting detection, copyright dispute settlements, identifying authors of malicious applications using source code, and other code […]

SQL Injection Attack (BCCC-SFU-SQLInj-2023)

This dataset consists of 11,012 evasive or sophisticated malicious SQL queries. These queries are generated using a genetic algorithm applied to the Kaggle malicious SQL dataset. The goal of the genetic algorithm is to enhance the evasiveness and sophistication of the original malicious queries. The full research paper outlining the details of the dataset and […]

Intrusion Detection Dataset (BCCC-CIC-IDS2017)

Using NTLFlowLyzer, we generated the “BCCC-CIC-IDS2017” dataset by extracting key flows from the raw network traffic data of CIC-IDS2017, resulting in CSV files that integrate essential network and transport layer features. This new dataset offers a structured approach for analyzing intrusion detection, combining diverse traffic types into multiple sub-categories. The “BCCC-CIC-IDS2017” dataset enriches the depth and […]

Large-Scale Intrusion Detection Dataset (BCCC-CSE-CIC-IDS2018)

The BCCC-CSE-CIC-IDS2018 dataset is an enhanced version of CSE-CIC-IDS2018 with 46 million labeled records and 300 features, addressing key issues to improve data quality and reliability for behavioral profiling in IDS research. Labeling inconsistencies, particularly for DoS attacks, were corrected by aligning attack labels with attacker IPs instead of timestamps. NTLFlowLyzer, a new network traffic […]

Large-Scale Multisources Malware Analysis Dataset using Network Traffic and Memory (BCCC-Mal-NetMem-2025)

The BCCC-Mal-NetMem-2025 dataset comprises over 7.7 million labeled records from controlled experiments involving 15 malware categories and 32 individual malware samples. These categories include ransomware, Trojan downloaders, coin miners, remote access tools (RATs), spyware, backdoors, and worms. The data was collected by executing each malware sample in isolated Windows environments equipped with real-time network and memory […]

Encrypted Traffic Dataset (BCCC-DarkNet-2025)

BCCC-DarkNet-2025 is an augmented, research-driven dataset that supports encrypted traffic analysis and threat detection across anonymized communication networks. It integrates and extends two benchmark datasets, CIC-Darknet2020 and Darknet-Dataset-2020, selected for their robust coverage of encryption protocols and darknet-specific traffic behaviors. The dataset includes diverse encrypted traffic types like VPN, Tor, I2P, Freenet, and ZeroNet, with multi-class labeling and protocol-specific annotations. These […]