The master branch contains the sources for reproducing the results we reported at
the WMT21 Metrics workshop.
See ablation-study for evaluating the impact of each ensembled metric
on the result, xling for zero-shot cross-lingual metric evaluation,
multiling for evaluating the fit on multiple languages, test_judgements
for regenerating the submission, and docker-build for building a Docker image.
How to reproduce our results
Docker
To reproduce our results, you can run our miratmu/regemt Docker
image with the NVIDIA Container Toolkit:

    mkdir submit_dir
    chmod 777 submit_dir

    # test the installation on a data subsample before running the full evaluation process:
    docker run --rm --gpus all -v "$PWD"/submit_dir:/submit_dir miratmu/regemt --fast

    # simply run the evaluation on the full data sets:
    # this takes ~10hrs on Tesla T4, might take longer on CPU
    docker run --rm --gpus all -v "$PWD"/submit_dir:/submit_dir miratmu/regemt

The evaluation process writes correlation reports in .png and
.pdf format for each evaluated configuration to the submit_dir/
directory.
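
To check what a run produced, a small shell helper along the following lines may be handy. It is only a sketch: `list_reports` is a hypothetical function, not part of regemt, and it matches files purely by the .png and .pdf extensions mentioned above, since the exact report file names are not specified here.

```shell
# Hypothetical helper (not part of regemt): list report files under a directory.
# Matches only by extension (.png, .pdf); actual report names are not assumed.
list_reports() {
    find "${1:-submit_dir}" -type f \( -name '*.png' -o -name '*.pdf' \) | sort
}

# Usage: print the reports if the output directory exists.
if [ -d submit_dir ]; then
    list_reports submit_dir
fi
```

An empty listing after a completed run would suggest that the volume mount did not point at the directory you expected.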
Python
Alternatively, you can install our package using Python:

    git clone https://github.com/MIR-MU/regemt.git
    cd regemt
    chmod 777 submit_dir

    # install the dependencies
    conda create --name wmt_eval python=3.8
    conda activate wmt_eval
    pip install -r requirements.txt

    # test the installation on a data subsample before running the full evaluation process:
    python -m main --fast

    # simply run the evaluation on the full data sets:
    # this takes ~10hrs on Tesla T4, might take longer on CPU
    python -m main

The evaluation process writes correlation reports in .png and
.pdf format for each evaluated configuration to the regemt/
directory.
We're trying to keep it simple, but if you run into any trouble or have a question, don't hesitate to create an issue and we'll take a look!
Citing RegEmt
Text
ŠTEFÁNIK, Michal, Vít NOVOTNÝ and Petr SOJKA. Regressive Ensemble for Machine Translation Quality Evaluation. In Markus Freitag. Proceedings of EMNLP 2021 Sixth Conference on Machine Translation (WMT 21). ACL, 2021. 8 pp.
BibTeX

    @inproceedings{stefanik2021regressive,
      author = {\v{S}tef\'{a}nik, Michal and Novotn\'{y}, V\'{i}t and Sojka, Petr},
      title = {Regressive Ensemble for Machine Translation Quality Evaluation},
      booktitle = {Proceedings of {EMNLP} 2021 Sixth Conference on Machine Translation ({WMT} 21)},
      editor = {Markus Freitag},
      publisher = {ACL},
      numpages = {8},
      url = {https://arxiv.org/abs/2109.07242v1},
    }