test: consolidate utf8 text fixtures in tests by joyeecheung · Pull Request #50732 · nodejs/node

@nodejs-github-bot added the needs-ci (PRs that need a full CI run) and test (issues and PRs related to the tests) labels on Nov 14, 2023

@joyeecheung

We previously used a text that appears to be an excerpt of
https://zh.wikipedia.org/wiki/%E5%8D%97%E8%B6%8A%E5%9B%BD
and can have copyright/license complications. It may
also include some geopolitical nuances. The text has been
repeated throughout the code base without much reuse.

This patch consolidates the fixtures by adding a common helper
string, `fixtures.utf8TestText`, which is identical to the copy
in test/fixtures/utf8_test_text.txt. It also updates the text
to a copy of 蘭亭集序. It was chosen because:

1. It's a well-known Chinese classical piece written in 353 CE
   and is therefore in the public domain. The string is copied from
   https://zh.wikisource.org/zh-hant/%E8%98%AD%E4%BA%AD%E9%9B%86%E5%BA%8F
   which contains a disclaimer of copyright for this reason.
2. The text is of suitable length for general UTF-8 string
   read/write tests (including punctuation, 389 code points and
   1167 bytes).
3. It is also commonly used as a reference text for Chinese text
   layout tests.
4. It's a timeless and harmless preface to a collection of poems,
   written by an uncontroversial figure who passed away >1600 years
   ago, and it contains no geopolitical nuances. Background and an
   English translation of this text can be found at
   https://en.wikipedia.org/wiki/Lantingji_Xu
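
To illustrate the kind of read/write check such a fixture supports, here is a minimal sketch. It is not the code from this patch: `describeUtf8` is a hypothetical helper, the temporary file only stands in for test/fixtures/utf8_test_text.txt, and the sample string is just the four-character title rather than the full fixture text. Note that 389 × 3 = 1167, which is consistent with every code point in the fixture encoding to three bytes in UTF-8.

```javascript
'use strict';
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper (not part of the patch): reports the number of
// Unicode code points and the UTF-8 encoded byte length of a string.
function describeUtf8(text) {
  return {
    // Array.from iterates by code point, not by UTF-16 code unit.
    codePoints: Array.from(text).length,
    bytes: Buffer.byteLength(text, 'utf8'),
  };
}

// Stand-in sample: only the title of the piece, NOT the real fixture text.
const sample = '蘭亭集序';
const stats = describeUtf8(sample);

// Round-trip the sample through a temporary file, the way a test might
// compare a helper string with the on-disk copy of the fixture.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'utf8-fixture-'));
const file = path.join(dir, 'utf8_test_text.txt');
fs.writeFileSync(file, sample, 'utf8');
const roundTripped = fs.readFileSync(file, 'utf8');
fs.rmSync(dir, { recursive: true, force: true });

console.log(stats, roundTripped === sample);
```

Counting with `Array.from` rather than `.length` matters in general because `.length` counts UTF-16 code units; for text made up only of Basic Multilingual Plane characters the two agree.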


targos pushed a commit that referenced this pull request

Dec 4, 2023

PR-URL: #50732
Reviewed-By: Yagiz Nizipli <yagiz.nizipli@sentry.io>

richardlau pushed a commit that referenced this pull request

Mar 25, 2024

PR-URL: #50732
Reviewed-By: Yagiz Nizipli <yagiz.nizipli@sentry.io>