[PATCH] strip_tags() truncates rest of string with invalid attribute
| Bug #45599 | [PATCH] strip_tags() truncates rest of string with invalid attribute | | |
|---|---|---|---|
| Submitted: | 2008-07-22 23:37 UTC | Modified: | 2009-12-22 02:04 UTC |
| From: | david at grudl dot com | Assigned: | |
| Status: | Closed | Package: | Strings related |
| PHP Version: | 5.*, 6 | OS: | * |
| Private report: | No | CVE-ID: | None |
[2008-07-22 23:37 UTC] david at grudl dot com
Description:
------------
Problematic backslash in HTML attribute (bug exists since PHP 5.2.2)
Reproduce code:
---------------
1)
echo strip_tags('Hello <a href="any\\"> World');
2) this case is not HTML valid, but who cares...
echo strip_tags('Hello <a href=\"any"> World');
Expected result:
----------------
Hello World
(in both cases)
Actual result:
--------------
Hello
(in both cases)
[2008-07-30 04:42 UTC] jet at synth-tec dot com
I am having the same problem. If an attribute has an extra quote in it, strip_tags() cuts off all the text after it.
Example Input:
----------------
strip_tags(' text before link <a href="http://google.com"">google.com</a> text after link test 1 test 2 ')
Expected Output:
-----------------
text before link text after link test 1 test 2
Actual Output:
--------------
text before link
Note: I do not have this problem in PHP 5.0.4 or previous versions.
[2008-08-06 16:30 UTC] lbarnaud@php.net
[2008-08-06 16:52 UTC] david at grudl dot com
The \ character is allowed in a tag attribute, so strip_tags('Hello <a href="any\"> World') returning "Hello" (without "World") is a bug.
[2009-08-24 15:53 UTC] hradtke@php.net
PHP 5.x patch:
Index: ext/standard/string.c
===================================================================
--- ext/standard/string.c (revision 284189)
+++ ext/standard/string.c (working copy)
@@ -4367,7 +4367,7 @@
         tp = ((tp-tbuf) >= PHP_TAG_BUF_SIZE ? tbuf: tp);
         *(tp++) = c;
     }
-    if (state && p != buf && *(p-1) != '\\' && (!in_q || *p == in_q)) {
+    if (state && p != buf && (state == 1 || *(p-1) != '\\') && (!in_q || *p == in_q)) {
         if (in_q) {
             in_q = 0;
         } else {
Trunk patch:
Index: ext/standard/string.c
===================================================================
--- ext/standard/string.c (revision 284189)
+++ ext/standard/string.c (working copy)
@@ -6519,7 +6519,7 @@
         tp = ((tp-tbuf) >= UBYTES(PHP_TAG_BUF_SIZE) ? tbuf: tp);
         *(tp++) = ch;
     }
-    if (state && prev1 != 0x5C /*'\\'*/ && (!in_q || ch == in_q)) {
+    if (state && (state ==1 || prev1 != 0x5C /*'\\'*/) && (!in_q || ch == in_q)) {
         if (in_q) {
             in_q = 0;
         } else {
@@ -6763,7 +6763,7 @@
         tp = ((tp-tbuf) >= PHP_TAG_BUF_SIZE ? tbuf: tp);
         *(tp++) = c;
     }
-    if (state && p != buf && *(p-1) != '\\' && (!in_q || *p == in_q)) {
+    if (state && p != buf && (state ==1 || *(p-1) != '\\') && (!in_q || *p == in_q)) {
         if (in_q) {
             in_q = 0;
         } else {
Test case:
--TEST--
Bug #45599 (strip_tags() ignore backslash (\) character inside html tags)
--FILE--
<?php
echo strip_tags('Hello <a href="any\"> World') . "\n";
echo strip_tags('Hello <a href="any\\"> World') . "\n";
echo strip_tags('Hello <a href=\"any"> World');
?>
--EXPECT--
Hello World
Hello World
Hello World
[2009-12-22 02:04 UTC] iliaa@php.net
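Editorial illustration (not part of any comment in this thread): a minimal C sketch of why the extra `state == 1` test in the patch matters. It is not the code from ext/standard/string.c. It assumes that state 1 means "inside an HTML tag", models only that state plus plain text, and the names `strip_sketch`, `patched` and `run_report_cases` are hypothetical. The quote condition is copied from the patch in both its old and new form. With the old form, a quote preceded by a backslash is never counted, so the attribute value never closes (or never opens), the `>` is taken to be inside a quoted value, and everything after the tag is dropped.

```c
#include <assert.h>
#include <string.h>

/*
 * Simplified model of the quote tracking touched by the patch. NOT the code
 * from ext/standard/string.c: it knows only state 0 (plain text) and state 1
 * (inside an HTML tag) and ignores PHP tags, comments and allowed tags.
 * strip_sketch() and its `patched` flag are hypothetical names.
 * `out` must have room for strlen(buf) + 1 bytes.
 */
void strip_sketch(const char *buf, char *out, int patched)
{
    const char *p;
    char in_q = 0;  /* quote character currently open inside a tag, or 0 */
    int state = 0;  /* 0 = copying text, 1 = inside an HTML tag */

    for (p = buf; *p != '\0'; p++) {
        if (state == 0) {
            if (*p == '<') {
                state = 1;      /* tag opens, stop copying */
            } else {
                *out++ = *p;
            }
        } else if (*p == '"' || *p == '\'') {
            /* p != buf holds here because a '<' was already consumed. */
            int quote_counts = patched
                ? (state == 1 || *(p - 1) != '\\')  /* condition after the patch */
                : (*(p - 1) != '\\');               /* condition before the patch */
            if (quote_counts && (!in_q || *p == in_q)) {
                in_q = in_q ? 0 : *p;   /* open or close the quoted value */
            }
        } else if (*p == '>' && !in_q) {
            state = 0;          /* tag closes only outside a quoted value */
        }
    }
    *out = '\0';
}

/* Feeds the two inputs from the report through both variants. */
void run_report_cases(void)
{
    /* The strings PHP sees for the single-quoted literals
       'Hello <a href="any\\"> World' and 'Hello <a href=\"any"> World'. */
    const char *case1 = "Hello <a href=\"any\\\"> World";
    const char *case2 = "Hello <a href=\\\"any\"> World";
    char out[64];

    strip_sketch(case1, out, 0);
    assert(strcmp(out, "Hello ") == 0);         /* rest of the string is lost */
    strip_sketch(case2, out, 0);
    assert(strcmp(out, "Hello ") == 0);

    strip_sketch(case1, out, 1);
    assert(strcmp(out, "Hello  World") == 0);   /* text after the tag survives */
    strip_sketch(case2, out, 1);
    assert(strcmp(out, "Hello  World") == 0);
}
```

Calling `run_report_cases()` from a `main()` runs both reproduce cases through the old and the new condition. In this model the text after the tag keeps both of its surrounding spaces, so the patched result contains two spaces between "Hello" and "World".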