splitText doesn't handle multibyte characters
| Bug #46335 | DOMText::splitText doesn't handle multibyte characters | | |
|---|---|---|---|
| Submitted: | 2008-10-17 17:10 UTC | Modified: | 2008-10-20 12:47 UTC |
| From: | sites at hubmed dot org | Assigned: | rrichards |
| Status: | Closed | Package: | DOM XML related |
| PHP Version: | 5.2.6 | OS: | OS X |
| Private report: | No | CVE-ID: | None |
[2008-10-17 17:10 UTC] sites at hubmed dot org
Description:
------------
Calling the DOMText method splitText() on a text node that contains multibyte characters splits the node at the wrong position. The offset appears to be counted in bytes rather than in characters.
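To illustrate why positions can disagree on such text, here is a minimal sketch comparing byte and character counts for a UTF-8 string. The sample string is made up for illustration and is not part of the report; it assumes the mbstring extension is available.

```php
<?php
// Illustrative string, not from the report: "é" is written as its two UTF-8 bytes.
$sample = "caf\xC3\xA9 au lait";

// strlen() counts bytes; mb_strlen() counts characters in the given encoding.
$byteCount = strlen($sample);
$charCount = mb_strlen($sample, 'UTF-8');

print "Bytes: $byteCount, characters: $charCount\n";
```

Any offset past the first multibyte character therefore points to a different place depending on which unit is being counted.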
Reproduce code:
---------------
$text = 'This is an ‘example’ of using DOM splitText';
$start = 30;
$length = 3;
$dom = new DOMDocument('1.0', 'UTF-8');
$node = $dom->createTextNode($text);
$dom->appendChild($node);
print "Text: $node->textContent\n";
print 'Expected (mb_substr): ' . mb_substr($text, $start, $length, 'UTF-8') . "\n";
$matched = $node->splitText($start);
$matched->splitText($length);
print "Actual (splitText): $matched->textContent\n";
Expected result:
----------------
Text: This is an ‘example’ of using DOM splitText
Expected (mb_substr): DOM
Actual (splitText): DOM
Actual result:
--------------
Text: This is an ‘example’ of using DOM splitText
Expected (mb_substr): DOM
Actual (splitText): ing
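The "ing" result is what a byte-based offset would produce. The sketch below does not call splitText() at all; it only compares substr() (bytes) with mb_substr() (characters) on the same string. It assumes the two quote characters around "example" are the curly quotes U+2018 and U+2019 (3 bytes each in UTF-8) and that the mbstring extension is loaded. The helper charOffsetToByteOffset() is a hypothetical name introduced here for illustration.

```php
<?php
// Assumption: the quotes around "example" are U+2018 and U+2019,
// written here as raw UTF-8 bytes (3 bytes each).
$text = "This is an \xE2\x80\x98example\xE2\x80\x99 of using DOM splitText";
$start = 30;
$length = 3;

// mb_substr() counts characters; substr() counts bytes.
$byCharacter = mb_substr($text, $start, $length, 'UTF-8');
$byByte = substr($text, $start, $length);

// Hypothetical helper: number of bytes occupied by the first $charOffset characters.
function charOffsetToByteOffset($text, $charOffset)
{
    return strlen(mb_substr($text, 0, $charOffset, 'UTF-8'));
}

$byteStart = charOffsetToByteOffset($text, $start);

print "Character offset $start: $byCharacter\n";
print "Byte offset $start: $byByte\n";
print "Character offset $start begins at byte $byteStart\n";
```

Under these assumptions the two quotes add four extra bytes ahead of "DOM", so a byte offset of 30 lands four characters early, inside "using".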
History
[2008-10-17 18:18 UTC] tularis@php.net
[2008-10-17 19:22 UTC] rrichards@php.net
[2008-10-20 12:47 UTC] rrichards@php.net