htmlspecialchars returns latin1 from UTF-8
| Bug #20934 | htmlspecialchars returns latin1 from UTF-8 | | |
|---|---|---|---|
| Submitted: | 2002-12-11 06:58 UTC | Modified: | 2002-12-12 07:38 UTC |
| From: | renato at cria dot org dot br | Assigned: | |
| Status: | Closed | Package: | Strings related |
| PHP Version: | 4CVS-2002-12-11 (dev) | OS: | Red Hat Linux 8.0 |
| Private report: | No | CVE-ID: | None |
[2002-12-11 06:58 UTC] renato at cria dot org dot br
I used the script below for testing, calling it from MS Internet Explorer to see the XML output directly.
Called without parameters, it should produce a simple XML document showing a string in latin1.
Called with "?charset=utf8", the script correctly converts the string from latin1 to UTF-8, but after "htmlspecialchars" is applied the string is back in latin1 and the XML becomes invalid. (Comment out the "htmlspecialchars" line after the character conversion and the XML shows up in UTF-8 without problems.)
<?php
// "\xE3" is the latin1 (ISO-8859-1) byte for "ã", so the string is latin1
// regardless of the encoding this file is saved in.
$string = "Hello from S\xE3o Paulo";
$charset = isset($_GET["charset"]) ? $_GET["charset"] : "latin1";
if ($charset == "utf8")
{
    $charset_code = "UTF-8";
    // Convert latin1 to UTF-8, then escape with an explicit UTF-8 charset.
    $show_string = mb_convert_encoding($string, "UTF-8", "ISO-8859-1");
    $show_string = htmlspecialchars($show_string, ENT_COMPAT, "UTF-8");
}
else
{
    $charset_code = "ISO-8859-1";
    $show_string = htmlspecialchars($string);
}
header("Content-type: text/xml");
echo "<?xml version='1.0' encoding='$charset_code' ?>\n";
?>
<test>
<?php print($show_string); ?>
</test>
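To check the behaviour without a browser, the byte-level effect of "htmlspecialchars" can be inspected from the command line. The following is only a minimal sketch, assuming the mbstring extension is loaded and a PHP version that provides "mb_check_encoding"; the sample string and variable names are made up for illustration and are not part of the original script.

```php
<?php
// Hypothetical sample text: latin1 bytes, where "\xE3" is "ã" in ISO-8859-1.
$latin1 = "S\xE3o Paulo & <friends>";

// Convert latin1 to UTF-8 (requires the mbstring extension).
$utf8 = mb_convert_encoding($latin1, "UTF-8", "ISO-8859-1");

// Escape with an explicit charset, as the script in this report does.
$escaped = htmlspecialchars($utf8, ENT_COMPAT, "UTF-8");

// In UTF-8, "ã" is the two bytes C3 A3; in latin1 it is the single byte E3.
// Dumping the bytes shows which form survived the escaping step.
echo bin2hex($utf8), "\n";
echo bin2hex($escaped), "\n";
echo mb_check_encoding($escaped, "UTF-8")
    ? "escaped string is valid UTF-8\n"
    : "escaped string is NOT valid UTF-8\n";
```

If the escaped dump contains "e3" where the converted dump contains "c3a3", the string was turned back into latin1 as described above; if "c3a3" is still present, the multibyte character passed through untouched.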
History
[2002-12-12 07:10 UTC] moriyoshi@php.net
[2002-12-12 07:19 UTC] wez@php.net
[2002-12-12 07:38 UTC] moriyoshi@php.net