doc: improve Buffer's encoding documentation · nodejs/node@a57dc06

@@ -79,17 +79,21 @@ console.log(Buffer.from('fhqwhgads', 'utf16le'));
 // Prints: <Buffer 66 00 68 00 71 00 77 00 68 00 67 00 61 00 64 00 73 00>
 ```
 
+Node.js buffers accept all case variations of encoding strings that they
+receive. For example, UTF-8 can be specified as `'utf8'`, `'UTF8'` or `'uTf8'`.
+
 The character encodings currently supported by Node.js are the following:
 
-* `'utf8'`: Multi-byte encoded Unicode characters. Many web pages and other
-  document formats use [UTF-8][]. This is the default character encoding.
-  When decoding a `Buffer` into a string that does not exclusively contain
-  valid UTF-8 data, the Unicode replacement character `U+FFFD` � will be used
-  to represent those errors.
+* `'utf8'` (alias: `'utf-8'`): Multi-byte encoded Unicode characters. Many web
+  pages and other document formats use [UTF-8][]. This is the default character
+  encoding. When decoding a `Buffer` into a string that does not exclusively
+  contain valid UTF-8 data, the Unicode replacement character `U+FFFD` � will be
+  used to represent those errors.
 
-* `'utf16le'`: Multi-byte encoded Unicode characters. Unlike `'utf8'`, each
-  character in the string will be encoded using either 2 or 4 bytes.
-  Node.js only supports the [little-endian][endianness] variant of [UTF-16][].
+* `'utf16le'` (alias: `'utf-16le'`): Multi-byte encoded Unicode characters.
+  Unlike `'utf8'`, each character in the string will be encoded using either 2
+  or 4 bytes. Node.js only supports the [little-endian][endianness] variant of
+  [UTF-16][].
 
 * `'latin1'`: Latin-1 stands for [ISO-8859-1][]. This character encoding only
   supports the Unicode characters from `U+0000` to `U+00FF`. Each character is
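
Reviewer sketch (not part of the commit): the behaviour described in the added lines above can be checked with a short script. It assumes a Node.js runtime where `Buffer` is available as a global; the variable names and the sample strings are illustrative only.

```js
// Encoding names are matched case-insensitively, and 'utf-8' is an alias of
// 'utf8', so all four spellings should produce identical bytes.
const variants = ['utf8', 'UTF8', 'uTf8', 'utf-8'];
const encoded = variants.map(
  (name) => Buffer.from('h\u00e9llo', name).toString('hex'));
console.log(encoded.every((hex) => hex === encoded[0]));
// Prints: true

// In UTF-16LE, 'a' takes 2 bytes, while U+1F600 lies outside the Basic
// Multilingual Plane and takes 4 bytes (a surrogate pair).
const bmpLength = Buffer.byteLength('a', 'utf16le');
const astralLength = Buffer.byteLength('\u{1F600}', 'utf16le');
console.log(bmpLength, astralLength);
// Prints: 2 4

// The byte 0xff never occurs in valid UTF-8, so decoding replaces it
// with U+FFFD.
const decoded = Buffer.from([0x61, 0xff, 0x62]).toString('utf8');
console.log(decoded === 'a\ufffdb');
// Prints: true
```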

@@ -132,11 +136,11 @@ The following legacy character encodings are also supported:
 * `'binary'`: Alias for `'latin1'`. See [binary strings][] for more background
   on this topic. The name of this encoding can be very misleading, as all of the
   encodings listed here convert between strings and binary data. For converting
-  between strings and `Buffer`s, typically `'utf-8'` is the right choice.
+  between strings and `Buffer`s, typically `'utf8'` is the right choice.
 
-* `'ucs2'`: Alias of `'utf16le'`. UCS-2 used to refer to a variant of UTF-16
-  that did not support characters that had code points larger than U+FFFF.
-  In Node.js, these code points are always supported.
+* `'ucs2'`, `'ucs-2'`: Aliases of `'utf16le'`. UCS-2 used to refer to a variant
+  of UTF-16 that did not support characters that had code points larger than
+  U+FFFF. In Node.js, these code points are always supported.
 
 ```js
 Buffer.from('1ag', 'hex');
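
Reviewer sketch (not part of the commit): a rough check of the alias claims in this hunk, assuming a Node.js runtime with `Buffer` as a global. The sample text and variable names are illustrative only.

```js
// 'x' followed by U+1F600, a code point larger than U+FFFF.
const text = 'x\u{1F600}';
const utf16 = Buffer.from(text, 'utf16le');

// 'ucs2', 'ucs-2' and 'utf-16le' should all encode exactly like 'utf16le'.
const sameBytes = ['ucs2', 'ucs-2', 'utf-16le'].every(
  (name) => Buffer.from(text, name).equals(utf16));
console.log(sameBytes);
// Prints: true

// Decoding as 'ucs2' keeps the code point above U+FFFF intact.
const roundTrip = utf16.toString('ucs2') === text;
console.log(roundTrip);
// Prints: true

// 'binary' is an alias for 'latin1': U+00E9 becomes the single byte 0xe9.
const binaryMatchesLatin1 = Buffer.from('\u00e9', 'binary')
  .equals(Buffer.from('\u00e9', 'latin1'));
console.log(binaryMatchesLatin1);
// Prints: true
```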

@@ -900,7 +904,7 @@ Returns `true` if `encoding` is the name of a supported character encoding,
 or `false` otherwise.
 
 ```js
-console.log(Buffer.isEncoding('utf-8'));
+console.log(Buffer.isEncoding('utf8'));
 // Prints: true
 
 console.log(Buffer.isEncoding('hex'));
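
Reviewer sketch (not part of the commit): the example now uses the canonical `'utf8'` spelling, but the alias and case variants documented elsewhere in this commit should still be reported as supported by `Buffer.isEncoding()`. The list of names below is illustrative, and `Buffer` is assumed to be available as a global.

```js
// Canonical names, aliases and case variants, plus two names that are
// not encodings.
const names = ['utf8', 'utf-8', 'UTF8', 'ucs-2', 'hex', 'utf/8', ''];

// Map each name to the result of Buffer.isEncoding().
const results = Object.fromEntries(
  names.map((name) => [name, Buffer.isEncoding(name)]));

for (const [name, supported] of Object.entries(results)) {
  console.log(JSON.stringify(name), supported);
}
// Prints:
// "utf8" true
// "utf-8" true
// "UTF8" true
// "ucs-2" true
// "hex" true
// "utf/8" false
// "" false
```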