
Diversevul

DiverseVul: A New Vulnerable Source Code Dataset for Deep Learning Based Vulnerability Detection (RAID 2023) https://surrealyz.github.io/files/pubs/raid23-diversevul.pdf



Yizheng Chen, Zhoujie Ding, Lamya Alowain, Xinyun Chen, David Wagner


Dataset

Our DiverseVul dataset can be downloaded from this URL: https://drive.google.com/file/d/12IWKhmLhq7qn5B_iXgn5YerOQtkH-6RG/view?usp=sharing
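
As a first sanity check after downloading, it can help to count how many records carry each label. The following is a minimal sketch, not the authors' tooling. It assumes the download is a JSON Lines file (one JSON object per line) and that each record has a label field named `target`; both the layout and the field name are assumptions, so adjust them to match the file you actually receive.

```python
import json
from collections import Counter


def count_labels(path, label_field="target"):
    """Count records per label value in a JSON Lines file.

    Assumption: one JSON object per line, with the label stored under
    `label_field`. Records without that field are counted under None.
    """
    counts = Counter()
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            counts[record.get(label_field)] += 1
    return counts
```

Calling `count_labels("diversevul.json")` (a hypothetical local file name) returns a `Counter` keyed by label value, which gives a quick view of the class balance before any training.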

The metadata of the dataset is available here: https://drive.google.com/file/d/19cJ7avNtsziaYkrrYuW7FeFdvgrxoNLc/view?usp=sharing

The metadata contains commit URLs and repository URLs for 7,512 commits in the DiverseVul dataset. Note that the metadata file is missing 3 commit URLs compared to the dataset above.
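
Because the metadata file does not cover every commit in the dataset, it may be useful to list the commits that have no metadata entry. The sketch below is illustrative only. It assumes both files are JSON Lines and that both identify commits through a shared field named `commit_id`; neither assumption is confirmed by this README, so change the `key` argument if the real files differ.

```python
import json


def iter_jsonl(path):
    """Yield one parsed JSON object per non-empty line of a file."""
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield json.loads(line)


def commits_missing_metadata(dataset_path, metadata_path, key="commit_id"):
    """Return sorted commit IDs present in the dataset but not in the metadata.

    Assumption: both files expose the commit identifier under `key`.
    """
    dataset_commits = {r[key] for r in iter_jsonl(dataset_path) if key in r}
    metadata_commits = {r[key] for r in iter_jsonl(metadata_path) if key in r}
    return sorted(dataset_commits - metadata_commits)
```

A set difference keeps the check simple and independent of record order in either file.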

The following spreadsheet contains the data for our label noise analysis experiment in Section 5: https://docs.google.com/spreadsheets/d/1Tns31RHeozRJF9e5Ie-Iw7nRIKJhrA2xvUUjTmFf5ec/edit?usp=sharing

The splits of merged datasets (including DiverseVul, Devign, ReVeal, BigVul, CrossVul, and CVEfixes) are available here: https://drive.google.com/drive/folders/1BeX33sgLOWLBnJ_vjcYitzz87F1kFZWi?usp=drive_link
