GitHub - junshiguo/AMC: Code for KDD 2014 paper "Mining Topics in Documents: Standing on the Shoulders of Big Data"

AMC is an open-source Java package implementing the algorithm proposed in the paper (Chen and Liu, KDD 2014), created by Zhiyuan (Brett) Chen. For more details, please refer to this paper.

If you use this package, please cite the paper: Zhiyuan Chen and Bing Liu. Mining Topics in Documents: Standing on the Shoulders of Big Data. In Proceedings of KDD 2014, pages 1116-1125.

If you have any questions or bug reports, please send them to Zhiyuan (Brett) Chen (czyuanacm@gmail.com).

a. Change the current working directory to Src.

b. Build the package.

c. Increase the Java heap memory for Maven.

export MAVEN_OPTS=-Xmx1024m

d. Run the program.

mvn exec:java -Dexec.mainClass="launch.MainEntry"
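Steps a to d can be collected into one shell function, which may be convenient when re-running the experiments. This is only a sketch under stated assumptions: the function name amc_quickstart, its argument (the path to your AMC checkout), and the MVN override variable are hypothetical helpers, and "clean package" is an assumed Maven goal for step b, since the steps above do not show the exact build command. The heap setting and the main class are taken from steps c and d.

```shell
#!/usr/bin/env bash
# Sketch of steps a-d as one function. Assumptions (not from this README):
#   - the path to the AMC checkout is passed as the first argument;
#   - "clean package" is the Maven goal used to build the package (step b);
#   - MVN can be set to use a different Maven launcher than "mvn".
amc_quickstart() {
  local repo_root="${1:?usage: amc_quickstart <path-to-AMC-checkout>}"
  local mvn="${MVN:-mvn}"

  # Run in a subshell so the caller's working directory and environment
  # are left unchanged.
  (
    # a. Change the current working directory to Src.
    cd "$repo_root/Src" || exit 1

    # b. Build the package (assumed goal, see the note above).
    "$mvn" clean package || exit 1

    # c. Increase the Java heap memory for Maven.
    export MAVEN_OPTS=-Xmx1024m

    # d. Run the program.
    "$mvn" exec:java -Dexec.mainClass="launch.MainEntry"
  )
}
```

After sourcing the file that defines the function, call it with the location of your checkout, for example: amc_quickstart /path/to/AMC (the path is a placeholder). The function stops with a non-zero status if the Src directory is missing or the build fails.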

The output directory contains topic model results for LDA and AMC.

Under each model directory (LDA or AMC), there are results for the different datasets (small or big). The sub-folder "DomainModels" contains one folder per domain, and each domain folder holds the topic model results for that domain. Each domain folder contains 9 files (all can be opened with a text editor):