SHISO

SHISO is a method for mining log formats and retrieving log types and parameters in an online manner. By building a structured tree from nodes generated from log messages, SHISO refines log formats continuously in real time. We implemented SHISO in Python with a standard interface for benchmarking purposes.
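The tree-building idea can be illustrated with a deliberately simplified sketch. This is not the logparser implementation. The names `Node`, `similarity`, `merge`, and `insert`, the token-match similarity, and the `max_children` and `threshold` defaults are all assumptions made for illustration, and the similarity measure here is only a stand-in for whatever the method itself uses.

```python
from dataclasses import dataclass, field
from typing import List

WILDCARD = "<*>"  # placeholder for a parameter position (illustrative choice)


@dataclass
class Node:
    """One tree node holding a log format as a list of tokens."""
    template: List[str]
    children: List["Node"] = field(default_factory=list)


def similarity(template: List[str], tokens: List[str]) -> float:
    """Fraction of positions that agree; a wildcard agrees with anything.

    Stand-in measure (assumption): formats of different length never match.
    """
    if len(template) != len(tokens):
        return 0.0
    if not tokens:
        return 1.0
    same = sum(1 for a, b in zip(template, tokens) if a == b or a == WILDCARD)
    return same / len(tokens)


def merge(template: List[str], tokens: List[str]) -> List[str]:
    """Keep tokens that agree and replace the rest with a wildcard."""
    return [a if a == b else WILDCARD for a, b in zip(template, tokens)]


def insert(root: Node, tokens: List[str],
           max_children: int = 4, threshold: float = 0.6) -> Node:
    """Place one tokenized message in the tree (assumes max_children >= 1)."""
    node = root
    while True:
        best = max(node.children,
                   key=lambda c: similarity(c.template, tokens),
                   default=None)
        # Similar enough to an existing format: refine that format in place.
        if best is not None and similarity(best.template, tokens) >= threshold:
            best.template = merge(best.template, tokens)
            return best
        # Otherwise start a new format here if the node still has room.
        if len(node.children) < max_children:
            child = Node(list(tokens))
            node.children.append(child)
            return child
        # Node is full: descend into the most similar child and retry.
        node = best


root = Node(template=[])
for line in [
    "Connection from 10.0.0.1 closed",
    "Connection from 10.0.0.2 closed",
    "Disk usage at 91 percent",
]:
    insert(root, line.split())

for child in root.children:
    print(" ".join(child.template))
```

Each message is processed once as it arrives, so the formats are refined incrementally rather than in a batch pass over the whole log.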

Read more information about SHISO from the following paper:

Running

The code has been tested in the following environment:

  • python 3.7.6
  • regex 2022.3.2
  • pandas 1.0.1
  • numpy 1.18.1
  • scipy 1.4.1
  • nltk 3.4.5

Run the following script to start the demo:

Run the following script to execute the benchmark:

Benchmark

Running the benchmark script on the Loghub_2k datasets yields the following results.

Dataset F1_measure Accuracy
HDFS 0.999984 0.9975
Hadoop 0.997513 0.867
Spark 0.991526 0.906
Zookeeper 0.993337 0.66
BGL 0.99445 0.711
HPC 0.541336 0.3245
Thunderbird 0.911185 0.576
Windows 0.912983 0.7005
Linux 0.975457 0.6715
Android 0.843701 0.585
HealthApp 0.842471 0.397
Apache 1 1
Proxifier 0.77964 0.5165
OpenSSH 0.997639 0.619
OpenStack 0.993697 0.7215
Mac 0.959845 0.595
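Several datasets show a high F1_measure next to a much lower Accuracy. The sketch below shows how that can happen, assuming the two columns follow the pairwise F-measure and grouping accuracy definitions commonly used in log parsing benchmarks. The function name `pairwise_f1_and_grouping_accuracy` and the toy labels are hypothetical and are not taken from the benchmark script.

```python
from collections import Counter
from typing import Hashable, Sequence, Tuple


def _pairs(n: int) -> int:
    """Number of unordered pairs that can be drawn from n items."""
    return n * (n - 1) // 2


def pairwise_f1_and_grouping_accuracy(
    truth: Sequence[Hashable], parsed: Sequence[Hashable]
) -> Tuple[float, float]:
    """Compare two groupings of the same messages (illustrative sketch)."""
    if len(truth) != len(parsed) or not truth:
        raise ValueError("label sequences must be non-empty and equally long")

    truth_sizes = Counter(truth)
    parsed_sizes = Counter(parsed)
    overlap = Counter(zip(truth, parsed))

    # Pairwise view: a pair counts as correct when both groupings
    # put its two messages in the same group.
    both = sum(_pairs(n) for n in overlap.values())
    parsed_pairs = sum(_pairs(n) for n in parsed_sizes.values())
    truth_pairs = sum(_pairs(n) for n in truth_sizes.values())
    precision = both / parsed_pairs if parsed_pairs else 0.0
    recall = both / truth_pairs if truth_pairs else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0

    # Grouping view: a message counts as correct only when its parsed
    # group holds exactly the same messages as its ground-truth group.
    correct = sum(
        n for (t, p), n in overlap.items()
        if n == truth_sizes[t] == parsed_sizes[p]
    )
    return f1, correct / len(truth)


# Toy example: one ground-truth group of four messages is split in two.
truth = ["A", "A", "A", "A", "B", "B"]
parsed = ["x", "x", "x", "y", "z", "z"]
f1, accuracy = pairwise_f1_and_grouping_accuracy(truth, parsed)
print(round(f1, 4), round(accuracy, 4))
```

Under these definitions, splitting a single large group leaves most message pairs correct but marks every message in that group as wrong, which pulls the grouping accuracy down far more than the F-measure.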

Citation

🔭 If you use our logparser tools or benchmarking results in your publication, please kindly cite the following papers.