-----Original Message----- From: Robin Johnson
Alex posted these fixes to libraries/sqlparser.lib.php3: @@ -171,6 +171,10 @@ if (!defined('PMA_SQP_LIB_INCLUDED')) { return $sql; }
// rabus: Convert all line feeds to Unix style
$sql = str_replace("\r\n", "\n", $sql);
$sql = str_replace("\r", "\n", $sql);
$len = $GLOBALS['PMA_strlen']($sql); if ($len == 0) { return array();
This changes newline contents for the query before any parsing is done. I think this should be avoided at this point, as it removes some newlines out of string escaped variables. Eg I might have SELECT 'foo\n\r\n\rbar'; # foo, bar, sepereated by TWO lines we should actually preserve the contents of strings as they are.
Hmm, my code would convert your "\n\r\n\r" into "\n\n\n"... But, where is a line feed style like "\n\r" used? I only know about these three ones:
Unix: "\n" Mac: "\r" Windows: "\r\n"
As I tested my code about the ANSI style comment variation, I noticed that your code in line 243 got problems with my Windows line feeds...
$pos = $GLOBALS['PMA_strpos']($sql, "\n", $count2);
imho, it would be annoying to handle all types of line feeds whenever we look for one. This is why I added these two lines at the beginning of the function.
If anybody can show me a testcase where we actually need this str_replace, then we can revise it, but I can't think of any, so in the name of speed, it should get removed.
Test your parser on a Windows or Mac client. You will see, without my code, it behaves a little different since it expects only Unix line feeds.
@@ -227,7 +231,7 @@ if (!defined('PMA_SQP_LIB_INCLUDED')) { // ANSI style -- if (($c == '#') || (($count2 + 1 < $len) && ($c == '/') && ($sql[$count2 + 1] == '*'))
|| (($count2 + 2 < $len) && ($c == '-') &&
($sql[$count2 + 1] == '-') && ($sql[$count2 + 2] == ' '))) {
|| (($count2 + 2 < $len) && ($c == '-') &&
- ($sql[$count2 + 1] == '-') && ereg("(\n|[space])",
$sql[$count2 + 2]))) { $count2++; $pos = 0; $type = 'bad';
Intend that it could match broken ANSI comments of '--\n' in addition to the standard '-- '. Should we support this as it is partially supported by MySQL? I say partially, because the MySQL commandline client strips out anything matching '--.*\n', but the actualy MySQL server if it is sent raw commands throws a syntax error on '--\n'.
I does NOT! I tested it myself here and MySQL server treated "--\n" like a comment. I tested this behavior on MySQL 3.23.51 (Linux x86) and 4.0.4-beta (WindowsNT)
I would say that we should follow the MySQL server in handling the input.
That's what I did. Without my code, the parser treated "--\n" like SQL code and tried to parse it...!
If we do keep it, lets change it to use only direct comparisions instead of regular expressions. Regex should be avoided inside the parser, as it makes code much slower.
OK, I'll change it since the regex used here is a very simple one.
Sorry for touching your holy parser code ;-)
Alexander