🔨 Tweak translation script by YuriiMotov · Pull Request #15174 · fastapi/fastapi
1. Instruct the LLM to preserve the order of links from the original document.
In some languages (ja, ko) the word order of a sentence may be inverted in translation, which changes the order of the links. But our translation fixer tool relies on links appearing in the same order as in the original document.
Commit: 7c5797c
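To illustrate why link order matters, here is a minimal sketch of an order check. It is not the actual fixer tool: the names `LINK_RE`, `extract_link_urls` and `links_in_same_order` are hypothetical, the regex handles only simple inline links, and the URLs are placeholders.

```python
import re

# Hypothetical, simplified pattern for inline markdown links: [text](url "title").
# Assumption: the real tool may parse links differently.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)[^)]*\)")


def extract_link_urls(markdown: str) -> list[str]:
    """Return link targets in the order they appear in the text."""
    return LINK_RE.findall(markdown)


def links_in_same_order(original: str, translation: str) -> bool:
    """True when both texts contain the same link targets in the same order."""
    return extract_link_urls(original) == extract_link_urls(translation)


original = "Use [A](https://example.com/a) and [B](https://example.com/b)."
# Inverted sentence: the same links, but in a different order.
reordered = "[B](https://example.com/b) と [A](https://example.com/a) を使います。"
# Link order preserved, as the updated prompt asks for.
preserved = "[A](https://example.com/a) と [B](https://example.com/b) を使います。"

print(links_in_same_order(original, reordered))
print(links_in_same_order(original, preserved))
```

A tool that matches links by position would pair the wrong targets in the `reordered` case, which is why the prompt asks the LLM to keep the original order.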
2. On retry, pass the validation error and the result of the previous attempt.
If a translation fails validation, we just retry it with the same input. So a retry is kind of a lottery: we just hope the next attempt turns out better.
If we instead update the prompt to include the validation error text and pass the current result (instead of the initial translation), the results improve:
Before the change, a run often looked like this:
$ python scripts/translate.py translate-page --en-path docs/en/docs/index.md --language ja
Found existing translation: docs/ja/docs/index.md
Translating docs/en/docs/index.md to ja (日本語)
Running agent for docs/ja/docs/index.md (attempt 1/3)
Failed on attempt 1/3: Number of markdown links does not match the number in the original document (6 vs 40)
Running agent for docs/ja/docs/index.md (attempt 2/3)
Failed on attempt 2/3: Number of markdown links does not match the number in the original document (6 vs 40)
Running agent for docs/ja/docs/index.md (attempt 3/3)
Failed on attempt 3/3: Number of markdown links does not match the number in the original document (14 vs 40)
Translation failed for docs/ja/docs/index.md after 3 attempts
Saving translation to docs/ja/docs/index.md
(the result still doesn't pass validation)
After the change, it still needs several attempts, but the result improves with every attempt:
$ python scripts/translate.py translate-page --en-path docs/en/docs/index.md --language ja
Found existing translation: docs/ja/docs/index.md
Translating docs/en/docs/index.md to ja (日本語)
Running agent for docs/ja/docs/index.md (attempt 1/3)
Failed on attempt 1/3: Number of markdown links does not match the number in the original document (6 vs 40)
Running agent for docs/ja/docs/index.md (attempt 2/3)
Failed on attempt 2/3: Number of markdown links does not match the number in the original document (39 vs 40)
Running agent for docs/ja/docs/index.md (attempt 3/3)
Saving translation to docs/ja/docs/index.md
(the third attempt produced a valid result)
Commit: c7ca144
The diff is actually smaller than GitHub shows. I just moved the prompt creation logic out of translate_page and changed it slightly so that it includes the validation error from the last attempt and passes the result of the last attempt instead of the initial translation.
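The retry-with-feedback idea can be sketched roughly as follows. This is not the code from scripts/translate.py: `build_prompt`, `translate_with_feedback`, `fake_agent` and `count_links_validator` are hypothetical names, the prompt wording is made up, and the agent is a stand-in for the real LLM call.

```python
from typing import Callable, Optional


def build_prompt(source: str, previous: str, error: Optional[str]) -> str:
    """Assemble the prompt; on a retry, append the last validation error."""
    parts = [
        "Translate the document below. Keep links in the original order.",
        "### Original\n" + source,
        "### Previous translation\n" + previous,
    ]
    if error is not None:
        parts.append("### Validation error in the previous translation\n" + error)
    return "\n\n".join(parts)


def translate_with_feedback(
    source: str,
    existing_translation: str,
    run_agent: Callable[[str], str],
    validate: Callable[[str, str], Optional[str]],
    max_attempts: int = 3,
) -> tuple[str, bool]:
    """Retry, feeding each attempt the previous result and its validation error."""
    current = existing_translation
    error: Optional[str] = None
    for _ in range(max_attempts):
        current = run_agent(build_prompt(source, current, error))
        error = validate(source, current)
        if error is None:
            return current, True
    # Like the script, return the last result even if it never validated.
    return current, False


def count_links_validator(source: str, translation: str) -> Optional[str]:
    """Toy validator: compare the number of '](' link markers."""
    expected, actual = source.count("]("), translation.count("](")
    if expected != actual:
        return (
            "Number of markdown links does not match the number "
            f"in the original document ({actual} vs {expected})"
        )
    return None


def fake_agent(prompt: str) -> str:
    """Stand-in for the LLM: restores the missing link only when told about it."""
    if "Validation error" in prompt:
        return "[a](https://example.com/a) [b](https://example.com/b)"
    return "[a](https://example.com/a)"


source = "[a](https://example.com/a) and [b](https://example.com/b)"
result, ok = translate_with_feedback(source, "", fake_agent, count_links_validator)
print(ok, result)
```

The key difference from a plain retry is that `current` and `error` carry over between iterations, so each attempt starts from the previous output rather than from the same input.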