src: support UTF-8 in compiled-in JS source files by bnoordhuis · Pull Request #11129 · nodejs/node

@nodejs-github-bot added build

Issues and PRs related to build files or the CI.

c++

Issues and PRs that require attention from people who are familiar with C++.

labels

Feb 2, 2017

aqrln

addaleax

bnoordhuis

addaleax

Detect it when source files in lib/ are not ASCII.  Decode them as UTF-8
and store them as UTF-16 in the binary so they can be used as external
string resources without non-ASCII characters getting mangled.

Fixes: nodejs#10673
PR-URL: nodejs#11129
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
The previous commit stores baked-in files with non-ASCII characters
as UTF-16.  Replace the \u2019 with a regular quote character so that
the files they're in can be stored as one-byte strings.  The UTF-16
functionality is still tested by the Unicode diagram in lib/timers.js.

PR-URL: nodejs#11129
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: James M Snell <jasnell@gmail.com>

italoacasas pushed a commit to italoacasas/node that referenced this pull request

Feb 14, 2017
Detect it when source files in lib/ are not ASCII.  Decode them as UTF-8
and store them as UTF-16 in the binary so they can be used as external
string resources without non-ASCII characters getting mangled.

Fixes: nodejs#10673
PR-URL: nodejs#11129
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: James M Snell <jasnell@gmail.com>

italoacasas pushed a commit to italoacasas/node that referenced this pull request

Feb 14, 2017
The previous commit stores baked-in files with non-ASCII characters
as UTF-16.  Replace the \u2019 with a regular quote character so that
the files they're in can be stored as one-byte strings.  The UTF-16
functionality is still tested by the Unicode diagram in lib/timers.js.

PR-URL: nodejs#11129
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: James M Snell <jasnell@gmail.com>

@aqrln aqrln mentioned this pull request

Feb 16, 2017

2 tasks

aqrln added a commit to aqrln/node that referenced this pull request

Feb 16, 2017
In order to allow using Unicode characters inside comments of built-in
JavaScript libraries without forcing them to be stored as UTF-16 data in
Node's binary, update the tooling to strip comments during build
process. All line breaks are preserved so that line numbers in stack
traces aren't broken.

Refs: nodejs#11129
Refs: nodejs#11371 (comment)

aqrln added a commit to aqrln/node that referenced this pull request

Feb 16, 2017
This test ensures that UTF-8 characters can be used in core JavaScript
modules built into Node's binary.

Refs: nodejs#11129

@aqrln aqrln mentioned this pull request

Feb 16, 2017

3 tasks

italoacasas pushed a commit that referenced this pull request

Feb 16, 2017
Notable changes:

* deps:
    * update V8 to 5.5 (Michaël Zasso) [#11029](#11029)
    * upgrade libuv to 1.11.0 (cjihrig) [#11094](#11094)
    * add node-inspect 1.10.2 (Jan Krems) [#10187](#10187)
* lib: build `node inspect` into `node` (Anna Henningsen) [#10187](#10187)
* crypto: Remove expired certs from CNNIC whitelist (Shigeki Ohtsu) [#9469](#9469)
* inspector: add --inspect-brk (Josh Gavant) [#11149](#11149)
* fs: allow WHATWG URL and file: URLs as paths (James M Snell) [#10739](#10739)
* src: support UTF-8 in compiled-in JS source files (Ben Noordhuis) [#11129](#11129)
* url: extend url.format to support WHATWG URL (James M Snell) [#10857](#10857)

PR-URL: #11185

italoacasas pushed a commit to italoacasas/node that referenced this pull request

Feb 21, 2017

italoacasas pushed a commit to italoacasas/node that referenced this pull request

Feb 22, 2017

imyller added a commit to imyller/meta-nodejs that referenced this pull request

Mar 2, 2017

jasnell pushed a commit that referenced this pull request

Mar 7, 2017
Detect it when source files in lib/ are not ASCII.  Decode them as UTF-8
and store them as UTF-16 in the binary so they can be used as external
string resources without non-ASCII characters getting mangled.

Fixes: #10673
PR-URL: #11129
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: James M Snell <jasnell@gmail.com>

MylesBorins pushed a commit that referenced this pull request

Mar 9, 2017
Detect it when source files in lib/ are not ASCII.  Decode them as UTF-8
and store them as UTF-16 in the binary so they can be used as external
string resources without non-ASCII characters getting mangled.

Fixes: #10673
PR-URL: #11129
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: James M Snell <jasnell@gmail.com>

jasnell pushed a commit that referenced this pull request

Apr 4, 2017
This test ensures that UTF-8 characters can be used in core JavaScript
modules built into Node's binary.

PR-URL: #11423
Ref: #11129
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Sakthipriyan Vairamani <thechargingvolcano@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>

italoacasas pushed a commit to italoacasas/node that referenced this pull request

Apr 10, 2017
This test ensures that UTF-8 characters can be used in core JavaScript
modules built into Node's binary.

PR-URL: nodejs#11423
Ref: nodejs#11129
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Sakthipriyan Vairamani <thechargingvolcano@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>