PHP :: Bug #21226 :: function parse_url() fails
| Bug #21226 | function parse_url() fails | | |
|---|---|---|---|
| Submitted: | 2002-12-27 18:35 UTC | Modified: | 2002-12-30 10:44 UTC |
| From: | dev at lechat dot org | Assigned: | iliaa |
| Status: | Closed | Package: | *URL Functions |
| PHP Version: | 4.3.0 | OS: | w2000 |
| Private report: | No | CVE-ID: | None |
[2002-12-27 18:47 UTC] dev at lechat dot org
[2002-12-27 18:49 UTC] edink@php.net
[2002-12-27 19:04 UTC] dev at lechat dot org
[2002-12-27 19:21 UTC] dev at lechat dot org
[2002-12-28 00:39 UTC] iliaa@php.net
[2002-12-28 09:10 UTC] dev at lechat dot org
[2002-12-30 02:03 UTC] jmcastagnetto@php.net
Reopening this bug. A closer look at RFC 2396 indicates that:

"... This "generic URI" syntax consists of a sequence of four main components:

    <scheme>://<authority><path>?<query>

...

    absoluteURI = scheme ":" ( hier_part | opaque_part )

URI that are hierarchical in nature use the slash "/" character for separating hierarchical components. ...

    hier_part = ( net_path | abs_path ) [ "?" query ]
    net_path = "//" authority [ abs_path ]
    abs_path = "/" path_segments

URI that do not make use of the slash "/" character for separating hierarchical components are considered opaque by the generic URI parser.

    opaque_part = uric_no_slash *uric
    uric_no_slash = unreserved | escaped | ";" | "?" | ":" | "@" | "&" | "=" | "+" | "$" | ","

..."

Later, section 3.3 of that RFC clarifies the syntax of the path component. Section 3.2 makes a similar clarification on what is considered a correct authority component.

Bottom line: the $url given by the bug reporter mostly conforms to a hierarchical URI, although it is not the usual case, since section 3.2, which deals w/ the authority component, states that:

"... The authority component is preceded by a double slash "//" and is terminated by the next slash "/", question-mark "?", or by the end of the URI. Within the authority component, the characters ";", ":", "@", "?", and "/" are reserved. ..."

And that is reinforced in the BNF syntax later in the RFC.

Not sure if all web servers will correctly interpret a URL w/o a path but w/ a query part immediately after the authority part, given that the "/" in the path is usually mapped internally by the server to wherever the physical files are in the filesystem.
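To make the section 3.2 rule quoted above concrete, here is a minimal sketch of how the authority component could be cut out of a URI. It is illustrative only: `split_authority()` is a hypothetical helper, not part of PHP and not how parse_url() is implemented, and the sample URL is an assumption, since the reporter's actual $url is not shown here.

```php
<?php
// Hypothetical helper (not a PHP built-in): splits a "scheme://..." URI
// following the RFC 2396 section 3.2 rule that the authority ends at the
// next "/", the next "?", or the end of the URI.
function split_authority($uri)
{
    $pos = strpos($uri, '://');
    if ($pos === false) {
        return null; // no "//" authority marker, so not a net_path URI
    }

    $scheme = substr($uri, 0, $pos);
    $rest   = (string) substr($uri, $pos + 3);

    // strcspn() returns the length of the leading run of characters
    // that contains neither "/" nor "?".
    $len = strcspn($rest, '/?');

    return array(
        'scheme'    => $scheme,
        'authority' => substr($rest, 0, $len),
        // Cast guards against older PHP versions returning false
        // when the offset equals the string length.
        'remainder' => (string) substr($rest, $len),
    );
}

// Assumed example: a query placed immediately after the authority, no path.
print_r(split_authority('http://user:passwd@www.example.com:8080?bar=1&boom=0'));
```

Under this rule the authority is `user:passwd@www.example.com:8080` and the remainder is `?bar=1&boom=0`, so the query is not swallowed into the host or port even though no path is present.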
The following code works as expected:

    $url = "http://user:passwd@www.example.com:8080/foo.php?bar=1&boom=0";
    print_r(parse_url($url));

Giving as output:

    Array
    (
        [scheme] => http
        [host] => www.example.com
        [port] => 8080
        [user] => user
        [pass] => passwd
        [path] => /foo.php
        [query] => bar=1&boom=0
    )

Tested w/ current CVS head on a RH Linux 6.1 machine:

    $ php_cvs -v
    PHP 4.4.0-dev (cli) (built: Dec 27 2002 14:00:56)
    Copyright (c) 1997-2002 The PHP Group
    Zend Engine v1.4.0, Copyright (c) 1998-2002 Zend Technologies

as well as 4.3.0 (on the same OS):

    $ php -v
    PHP 4.3.0 (cli) (built: Dec 29 2002 23:59:53)
    Copyright (c) 1997-2002 The PHP Group
    Zend Engine v1.3.0, Copyright (c) 1998-2002 Zend Technologies

[2002-12-30 02:31 UTC] pollita@php.net
[2002-12-30 04:48 UTC] gp at ccf dot fr
[2002-12-30 10:44 UTC] iliaa@php.net