Spell is a structured Streaming Parser for Event Logs using an LCS (longest common subsequence) based approach. Spell parses unstructured log messages into structured message templates and parameters in an online streaming fashion. We implemented Spell using Python with a standard interface for benchmarking purpose.
Read more information about Spell from the following paper:
- Min Du, Feifei Li. Spell: Streaming Parsing of System Event Logs, IEEE International Conference on Data Mining (ICDM), 2016.
Running
The code has been tested in the following enviornment:
- python 3.7.6
- regex 2022.3.2
- pandas 1.0.1
- numpy 1.18.1
- scipy 1.4.1
Run the following script to start the demo:
Run the following script to execute the benchmark:
Benchmark
Running the benchmark script on Loghub_2k datasets, you could obtain the following results.
| Dataset | F1_measure | Accuracy |
|---|---|---|
| HDFS | 1 | 1 |
| Hadoop | 0.920197 | 0.7775 |
| Spark | 0.991018 | 0.905 |
| Zookeeper | 0.999549 | 0.9635 |
| BGL | 0.956932 | 0.7865 |
| HPC | 0.986063 | 0.654 |
| Thunderbird | 0.994456 | 0.8435 |
| Windows | 0.999974 | 0.9885 |
| Linux | 0.936822 | 0.605 |
| Android | 0.992196 | 0.9185 |
| HealthApp | 0.886674 | 0.639 |
| Apache | 1 | 1 |
| Proxifier | 0.832044 | 0.5265 |
| OpenSSH | 0.918038 | 0.554 |
| OpenStack | 0.994108 | 0.764 |
| Mac | 0.963472 | 0.7565 |
Citation
🔭 If you use our logparser tools or benchmarking results in your publication, please kindly cite the following papers.
- [ICSE'19] Jieming Zhu, Shilin He, Jinyang Liu, Pinjia He, Qi Xie, Zibin Zheng, Michael R. Lyu. Tools and Benchmarks for Automated Log Parsing. International Conference on Software Engineering (ICSE), 2019.
- [DSN'16] Pinjia He, Jieming Zhu, Shilin He, Jian Li, Michael R. Lyu. An Evaluation Study on Log Parsing and Its Use in Log Mining. IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 2016.