Loghub

A large collection of system log datasets for AI-driven log analytics [ISSRE'23]

<p align="center"> <a href="https://github.com/logpai"> <img src="https://cdn.jsdelivr.net/gh/logpai/logpai.github.io@master/img/logpai_logo.jpg" width="480"></a></p> <div> <a href="https://github.com/logpai/loghub/stargazers"><img src="http://bytecrank.com/nastyox/reporoster/php/stargazersSVG.php?user=logpai&repo=loghub" width="600"/></a> </div>

Loghub maintains a collection of system logs that are freely accessible for AI-driven log analytics research. Some of the logs are production data released by previous studies, while others were collected from real systems in our lab environment. Wherever possible, the logs are NOT sanitized, anonymized, or modified in any way. These log datasets are freely available for research or academic work.

🤗 We are proud to announce that the loghub datasets have reached a total of <a href="https://doi.org/10.5281/zenodo.1144100"><img src="https://img.shields.io/endpoint?&url=https://cdn.jsdelivr.net/gh/logpai/loghub@zenodo/downloads.json&labelColor=1AE&color=DDEEFF&style=flat&label=Downloads"></a> from more than 450 organizations in both industry and academia.

Logs currently available

🔗 Get raw logs via hyperlinks in the Download column.

| Dataset | Description | Labeled | Time Span | #Lines | Raw Size | Download |
| :---------------------------- | :--------| :--------: | --------: | ---------: | ------: | :------: |
|<tr><th colspan=7 align="center">:open_file_folder: Distributed systems</th></tr>|
| HDFS_v1 | Hadoop distributed file system log | :heavy_check_mark: | 38.7 hours | 11,175,629 | 1.47GiB | :link: |
| HDFS_v2 | Hadoop distributed file system log | | N.A. | 71,118,073 | 16.06GiB | :link: |
| HDFS_v3 | Instrumented HDFS trace log (TraceBench) | :heavy_check_mark: | N.A. | 14,778,079 | 2.96GiB | :link: |
| Hadoop | Hadoop mapreduce job log | :heavy_check_mark: (Check #56) | N.A. | 394,308 | 48.61MiB | :link: |
| Spark | Spark job log | | N.A. | 33,236,604 | 2.75GiB | :link: |
| Zookeeper | ZooKeeper service log | | 26.7 days | 74,380 | 9.95MiB | :link: |
| OpenStack | OpenStack infrastructure log | :heavy_check_mark: | N.A. | 207,820 | 58.61MiB | :link: |
|<tr><th colspan=7 align="center">:open_file_folder: Super computers</th></tr>|
| BGL | Blue Gene/L supercomputer log | :heavy_check_mark: | 214.7 days | 4,747,963 | 708.76MiB | :link: |
| HPC | High performance cluster log | | N.A. | 433,489 | 32.00MiB | :link: |
| Thunderbird | Thunderbird supercomputer log | :heavy_check_mark: | 244 days | 211,212,192 | 29.60GiB | :link: |
|<tr><th colspan=7 align="center">:open_file_folder: Operating systems</th></tr>|
| Windows | Windows event log | | 226.7 days | 114,608,388 | 26.09GiB | :link: |
| Linux | Linux system log | | 263.9 days | 25,567 | 2.25MiB | :link: |
| Mac | Mac OS log | | 7.0 days | 117,283 | 16.09MiB | :link: |
|<tr><th colspan=7 align="center">:open_file_folder: Mobile systems</th></tr>|
| Android_v1 | Android framework log | | N.A. | 1,555,005 | 183.37MiB | :link: |
| Android_v2 | Android framework log | | N.A. | 30,348,042 | 3.38GiB | :link: |
| HealthApp | Health app log | | 10.5 days | 253,395 | 22.44MiB | :link: |
|<tr><th colspan=7 align="center">:open_file_folder: Server applications</th></tr>|
| Apache | Apache web server error log | | 263.9 days | 56,481 | 4.90MiB | :link: |
| OpenSSH | OpenSSH server log | | 28.4 days | 655,146 | 70.02MiB | :link: |
|<tr><th colspan=7 align="center">:open_file_folder: Standalone software</th></tr>|
| Proxifier | Proxifier software log | | N.A. | 21,329 | 2.42MiB | :link: |
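
Several of the raw logs listed above are tens of GiB, so it can help to check a downloaded file against the #Lines and Raw Size columns without loading it into memory. The following is a minimal sketch, not part of loghub itself: `parse_size` and `count_lines` are hypothetical helper names, and the file path in the usage note is an assumed local location for a log you have already downloaded.

```python
import re
from pathlib import Path

# Binary unit multipliers matching the "Raw Size" column (MiB, GiB).
_UNITS = {"KiB": 1024, "MiB": 1024**2, "GiB": 1024**3}


def parse_size(text: str) -> int:
    """Convert a size string such as '1.47GiB' into a number of bytes."""
    match = re.fullmatch(r"\s*([\d.]+)\s*(KiB|MiB|GiB)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    value, unit = match.groups()
    return int(float(value) * _UNITS[unit])


def count_lines(path: Path, chunk_size: int = 1 << 20) -> int:
    """Count newline characters in a file by reading fixed-size chunks.

    A final line without a trailing newline is not counted.
    """
    total = 0
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            total += chunk.count(b"\n")
    return total
```

For example, `parse_size("1.47GiB") / 11_175_629` estimates roughly 141 bytes per line for HDFS_v1, and `count_lines(Path("HDFS.log"))` (an assumed file name) could be compared with the #Lines value in the table. Small differences are possible depending on how the final line is terminated.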

🔥 Citation

Please cite the following two papers if you use the loghub datasets in your research.

🌈 License

The datasets are freely available for research or academic work. For any usage or distribution of the datasets, please refer to the loghub repository URL https://github.com/logpai/loghub and cite the loghub paper where applicable.

🙋 Discussion

You are welcome to open a discussion here for any questions.
