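Compiled by someone who once struggled to find datasets. I'm starting this post to collect publicly available datasets.
Hopefully I'll finish it someday (probably). This post will probably also be cross-posted to Zhihu.
1. Dataset collections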
Canadian Institute for Cybersecurity datasets
A collection maintained by the Canadian Institute for Cybersecurity, containing the following datasets:
- Android Malware dataset (InvesAndMal2019)
- DDoS dataset (CICDDoS2019)
- IPS/IDS dataset on AWS (CSE-CIC-IDS2018)
- IPS/IDS dataset (CICIDS2017)
- Android Malware dataset (CICAndMal2017)
- Android Adware dataset (CICAAGM2017)
- DoS dataset (application-layer) 2017
- VPN-nonVPN traffic dataset (ISCXVPN2016)
- Tor-nonTor dataset (ISCXTor2016)
- ISCX-URL dataset (ISCX-URL-2016)
- ISCX Android Botnet dataset 2015
- ISCX Botnet dataset 2014
- ISCX Android Validation dataset 2014
- ISCX IDS dataset 2012
- ISCX NSL-KDD dataset 2009
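Data Mining and Network Security Resources site (数据挖掘与网络安全资源网)
Contains the following datasets:
- [Intrusion detection] DARPA intrusion detection dataset
- [Intrusion detection] KDD Cup 99 dataset
- [Intrusion detection] NSL-KDD dataset
- [Hacker attack data] Honeynet dataset (11 months of Snort alert data, from April 2000 to February 2001, with roughly 60 to over 3,000 Snort alert records per month; the network consisted of 8 IP addresses connected to an ISP over ISDN)
- [Log data] Challenge 2013 dataset (two weeks of operational logs from the internal network of a fictitious multinational company. There are 3 log types: Netflow network traffic logs, Big Brother network health and status data, and intrusion prevention system logs. The Netflow and Big Brother logs cover weeks one and two; the intrusion prevention system logs cover week two only. Analyzing the logs can reveal anomalies in the network. The network has about 1,100 hosts and servers, the raw logs total close to 10 GB, and there are more than 90 million records)
- Malware datasets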
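With close to 10 GB of raw Challenge 2013 logs, loading everything into memory at once is usually not practical. Below is a minimal sketch of streaming a log file row by row and counting records per value of one column. It assumes the logs are CSV files with a header row; the column names (`timestamp`, `src_ip`, `protocol`) and the sample rows are hypothetical and made up for illustration, not taken from the actual dataset.

```python
import csv
import io
from collections import Counter


def count_by_column(lines, column):
    """Stream CSV rows and count how often each value of `column` occurs.

    `lines` is any iterable of text lines (an open file works), so only
    one row is held in memory at a time.
    """
    counts = Counter()
    for row in csv.DictReader(lines):
        counts[row[column]] += 1
    return counts


# Tiny made-up sample standing in for a real log file; the column names
# are hypothetical and do not come from the dataset documentation.
sample = io.StringIO(
    "timestamp,src_ip,protocol\n"
    "1364802600,10.0.0.1,tcp\n"
    "1364802601,10.0.0.2,udp\n"
    "1364802602,10.0.0.1,tcp\n"
)
print(count_by_column(sample, "protocol"))
```

For a real file, passing `open(path, newline="")` in place of `sample` should work the same way, as long as the column name matches the file's actual header.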
Vizsec
This site lists the following datasets:
- UGR’16: A New Dataset for the Evaluation of Cyclostationarity-Based Network IDSs
- Stanford Large Network Dataset Collection (SNAP)
- APTnotes
- Open Malware
- Shadow Server Malware Data site
- DARPA CGC (known vulnerabilities)
- DNS data
- SecRepo
- malware-traffic-analysis
- NETRESEC Data
- CTU Data
- Digital Corpora
- Impact
- Kyoto: Traffic Data from Kyoto University’s Honeypots.
- The Honeynet Project: Many different types of data for each of their challenges, including pcap, malware, logs.
- VAST Challenge 2013: Mini-challenge 3 is related to cybersecurity and includes network flow data, network status data (via big brother), and intrusion prevention system data.
- VAST Challenge 2012: This challenge has two mini-challenges, one related to situation awareness (metadata and periodic status reports from all computing equipment) and one to forensics (Firewall and IDS logs).
- VAST Challenge 2011: Mini-challenge 2 is related to Cybersecurity – Situational Awareness in Computer Networks (Firewall and IDS logs).
- DARPA Intrusion Detection Data: This data set has numerous issues that have been documented in the literature.
- ORNL Auto-labeled corpus: A corpus of automatically labeled text data in the cyber security domain.
- Industrial Control System (ICS) Cyber Attack Data Set: Data from MSU. The dataset is made up of tuples of timestamp, network protocol (MODBUS), and system information (measurements and settings), and attack attributes.
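The ICS entry above describes each record as a tuple of timestamp, network protocol, system information, and attack attributes. A rough sketch of how such a record could be modeled when loading the data is shown below. All field names, the convention that an empty `attack` dict means normal traffic, and the sample values are my own assumptions, not the dataset's actual schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class IcsRecord:
    """One record as described in the list above (field names are hypothetical)."""
    timestamp: datetime
    protocol: str       # network protocol, e.g. "MODBUS"
    system_info: dict   # measurements and settings
    attack: dict        # attack attributes; empty means normal traffic (an assumption)


def attack_records(records):
    """Return only the records whose attack attributes are non-empty."""
    return [r for r in records if r.attack]


# Made-up values purely for illustration.
records = [
    IcsRecord(
        datetime(2020, 1, 1, 0, 0, 0, tzinfo=timezone.utc),
        "MODBUS",
        {"pressure": 1.2, "setpoint": 1.0},
        {},
    ),
    IcsRecord(
        datetime(2020, 1, 1, 0, 0, 1, tzinfo=timezone.utc),
        "MODBUS",
        {"pressure": 9.9, "setpoint": 1.0},
        {"category": "injection"},
    ),
]
print(len(attack_records(records)))
```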
(Never mind, I'll write up the rest when I find the time.)
Source: 柠檬橘子百香果