Hi List,
Just a question to ask the list.
Alex posted these fixes to libraries/sqlparser.lib.php3: @@ -171,6 +171,10 @@ if (!defined('PMA_SQP_LIB_INCLUDED')) { return $sql; }
+ // rabus: Convert all line feeds to Unix style + $sql = str_replace("\r\n", "\n", $sql); + $sql = str_replace("\r", "\n", $sql); + $len = $GLOBALS['PMA_strlen']($sql); if ($len == 0) { return array();
This changes newline contents for the query before any parsing is done. I think this should be avoided at this point, as it removes some newlines out of string escaped variables. Eg I might have SELECT 'foo\n\r\n\rbar'; # foo, bar, sepereated by TWO lines we should actually preserve the contents of strings as they are.
If anybody can show me a testcase where we actually need this str_replace, then we can revise it, but I can't think of any, so in the name of speed, it should get removed.
@@ -227,7 +231,7 @@ if (!defined('PMA_SQP_LIB_INCLUDED')) { // ANSI style -- if (($c == '#') || (($count2 + 1 < $len) && ($c == '/') && ($sql[$count2 + 1] == '*')) - || (($count2 + 2 < $len) && ($c == '-') && ($sql[$count2 + 1] == '-') && ($sql[$count2 + 2] == ' '))) { + || (($count2 + 2 < $len) && ($c == '-') && ($sql[$count2 + 1] == '-') && ereg("(\n|[space])", $sql[$count2 + 2]))) { $count2++; $pos = 0; $type = 'bad';
Intend that it could match broken ANSI comments of '--\n' in addition to the standard '-- '. Should we support this as it is partially supported by MySQL? I say partially, because the MySQL commandline client strips out anything matching '--.*\n', but the actualy MySQL server if it is sent raw commands throws a syntax error on '--\n'. I would say that we should follow the MySQL server in handling the input.
If we do keep it, lets change it to use only direct comparisions instead of regular expressions. Regex should be avoided inside the parser, as it makes code much slower.
-----Original Message----- From: Robin Johnson
Alex posted these fixes to libraries/sqlparser.lib.php3: @@ -171,6 +171,10 @@ if (!defined('PMA_SQP_LIB_INCLUDED')) { return $sql; }
// rabus: Convert all line feeds to Unix style
$sql = str_replace("\r\n", "\n", $sql);
$sql = str_replace("\r", "\n", $sql);
$len = $GLOBALS['PMA_strlen']($sql); if ($len == 0) { return array();
This changes newline contents for the query before any parsing is done. I think this should be avoided at this point, as it removes some newlines out of string escaped variables. Eg I might have SELECT 'foo\n\r\n\rbar'; # foo, bar, sepereated by TWO lines we should actually preserve the contents of strings as they are.
Hmm, my code would convert your "\n\r\n\r" into "\n\n\n"... But, where is a line feed style like "\n\r" used? I only know about these three ones:
Unix: "\n" Mac: "\r" Windows: "\r\n"
As I tested my code about the ANSI style comment variation, I noticed that your code in line 243 got problems with my Windows line feeds...
$pos = $GLOBALS['PMA_strpos']($sql, "\n", $count2);
imho, it would be annoying to handle all types of line feeds whenever we look for one. This is why I added these two lines at the beginning of the function.
If anybody can show me a testcase where we actually need this str_replace, then we can revise it, but I can't think of any, so in the name of speed, it should get removed.
Test your parser on a Windows or Mac client. You will see, without my code, it behaves a little different since it expects only Unix line feeds.
@@ -227,7 +231,7 @@ if (!defined('PMA_SQP_LIB_INCLUDED')) { // ANSI style -- if (($c == '#') || (($count2 + 1 < $len) && ($c == '/') && ($sql[$count2 + 1] == '*'))
|| (($count2 + 2 < $len) && ($c == '-') &&
($sql[$count2 + 1] == '-') && ($sql[$count2 + 2] == ' '))) {
|| (($count2 + 2 < $len) && ($c == '-') &&
- ($sql[$count2 + 1] == '-') && ereg("(\n|[space])",
$sql[$count2 + 2]))) { $count2++; $pos = 0; $type = 'bad';
Intend that it could match broken ANSI comments of '--\n' in addition to the standard '-- '. Should we support this as it is partially supported by MySQL? I say partially, because the MySQL commandline client strips out anything matching '--.*\n', but the actualy MySQL server if it is sent raw commands throws a syntax error on '--\n'.
I does NOT! I tested it myself here and MySQL server treated "--\n" like a comment. I tested this behavior on MySQL 3.23.51 (Linux x86) and 4.0.4-beta (WindowsNT)
I would say that we should follow the MySQL server in handling the input.
That's what I did. Without my code, the parser treated "--\n" like SQL code and tried to parse it...!
If we do keep it, lets change it to use only direct comparisions instead of regular expressions. Regex should be avoided inside the parser, as it makes code much slower.
OK, I'll change it since the regex used here is a very simple one.
Sorry for touching your holy parser code ;-)
Alexander
On Fri, Nov 01, 2002 at 09:28:36AM +0100, Rabus wrote:
Hmm, my code would convert your "\n\r\n\r" into "\n\n\n"... But, where is a line feed style like "\n\r" used? I only know about these three ones:
Unix: "\n" Mac: "\r" Windows: "\r\n"
PalmPilot: "\n\r"
Yes, I was sitting at a caf� downtown, drinking coffee, and testing out phpMyAdmin on my PalmPilot. Using Blazer as the web browser. Wireless rocks! (I just bought wireless 3 days ago, second hand off somebody, and i've had my Handspring Visor Edge for about a month now).
As I tested my code about the ANSI style comment variation, I noticed that your code in line 243 got problems with my Windows line feeds...
$pos = $GLOBALS['PMA_strpos']($sql, "\n", $count2);
imho, it would be annoying to handle all types of line feeds whenever we look for one. This is why I added these two lines at the beginning of the function.
Ok, for that bit we need it. However removing the \n or \r anything inside a string is expressly out. I actually have databases where I want that data preserved properly. So the only remaining solution is to write a little function that imitates strpos looking for a line end character.
Test your parser on a Windows or Mac client. You will see, without my code, it behaves a little different since it expects only Unix line feeds.
Windows is normally my main test platform. I don't have a mac at home.
@@ -227,7 +231,7 @@ if (!defined('PMA_SQP_LIB_INCLUDED')) { // ANSI style -- if (($c == '#') || (($count2 + 1 < $len) && ($c == '/') && ($sql[$count2 + 1] == '*'))
|| (($count2 + 2 < $len) && ($c == '-') &&
($sql[$count2 + 1] == '-') && ($sql[$count2 + 2] == ' '))) {
|| (($count2 + 2 < $len) && ($c == '-') &&
- ($sql[$count2 + 1] == '-') && ereg("(\n|[space])",
$sql[$count2 + 2]))) { $count2++; $pos = 0; $type = 'bad';
Intend that it could match broken ANSI comments of '--\n' in addition to the standard '-- '. Should we support this as it is partially supported by MySQL? I say partially, because the MySQL commandline client strips out anything matching '--.*\n', but the actualy MySQL server if it is sent raw commands throws a syntax error on '--\n'.
I does NOT! I tested it myself here and MySQL server treated "--\n" like a comment. I tested this behavior on MySQL 3.23.51 (Linux x86) and 4.0.4-beta (WindowsNT)
How did you test it? a raw mysql query, or using phpMyAdmin? I used a raw mysql query in PHP, and got this error: You have an error in your SQL syntax near '; -- SELECT 1' at line 1
Test code: <?php $con = mysql_connect('localhost','test','testing'); $res = mysql_query("SELECT 2; --\n SELECT 1;"); echo "E: ".mysql_errno()." ".mysql_error(). "\n"; ?>
Sorry for touching your holy parser code ;-)
LOL. It was something that I was allowed to work on while I was getting paid, my boss said sure, go and work on it, he didn't mind contributing back to Open Source. I'm busy researching more stuff myself. What I called parser originally, is actually a tokenizer. I'm trying to find a more efficent way to design it now. Unfortunetly, Donald Knuth is only due to write about tokenizers in 2010, and I haven't found many texts on it at all. For what it should do, my tokenizer is hideously slow. I was playing with debug code in MySQL, and found that for the whole MySQL server to tokenizer and analyze the query, it's approx 48x faster than my system on large queries, and about 8x faster on small queries.
It looks like they have some really good ideas for doing it fast in there, so I'm doing to see what I can get out of it in the next month or two.
-----Original Message----- From: Robin Johnson
On Fri, Nov 01, 2002 at 09:28:36AM +0100, Rabus wrote:
Hmm, my code would convert your "\n\r\n\r" into "\n\n\n"... But, where is a line feed style like "\n\r" used? I only know about these three ones:
Unix: "\n" Mac: "\r" Windows: "\r\n"
PalmPilot: "\n\r"
Argh!!! I'd suggest to burn all Palms... ;-) So, adding another line should magically solve the problem:
$sql = str_replace("\r\n", "\n", $sql); + $sql = str_replace("\n\r", "\n", $sql); $sql = str_replace("\r", "\n", $sql);
As I tested my code about the ANSI style comment variation, I noticed that your code in line 243 got problems with my Windows line feeds...
$pos = $GLOBALS['PMA_strpos']($sql, "\n", $count2);
imho, it would be annoying to handle all types of line feeds whenever we look for one. This is why I added these two lines at the beginning of the function.
Ok, for that bit we need it. However removing the \n or \r anything inside a string is expressly out. I actually have databases where I want that data preserved properly. So the only remaining solution is to write a little function that imitates strpos looking for a line end character.
Well, is the query passed by reference to the parser? My code should only change the query string the parser uses for ist work. The one that will be send to the server should not be affected, should it?
Intend that it could match broken ANSI comments of '--\n' in addition to the standard '-- '. Should we support this as it is partially supported by MySQL? I say partially, because the MySQL commandline client strips out anything matching '--.*\n', but the actualy MySQL server if it is sent raw commands throws a syntax error on '--\n'.
I does NOT! I tested it myself here and MySQL server treated "--\n" like a comment. I tested this behavior on MySQL 3.23.51 (Linux x86) and 4.0.4-beta (WindowsNT)
How did you test it? a raw mysql query, or using phpMyAdmin?
Both with the same result.
I used a raw mysql query in PHP, and got this error: You have an error in your SQL syntax near '; -- SELECT 1' at line 1
Test code:
<?php $con = mysql_connect('localhost','test','testing'); $res = mysql_query("SELECT 2; --\n SELECT 1;"); echo "E: ".mysql_errno()." ".mysql_error(). "\n"; ?>
Confirmed! Looks like we've found a MySQL bug here, because
mysql_query("SELECT 2; -- SELECT 1;");
throws an error, too, although it's not meant to! It appears to me as if MySQL only accepts ANSI comments before a real SQL command. Try this one:
mysql_query("--\n-- This is a comment\n--\nSELECT 'foo';");
This one works fine.
Alexander
List:
I read recently that MySQL only supports the -- comment syntax when the -- is followed by a space.
Hope this helps.
Jay Davis
-----Original Message----- From: Jay Davis
I read recently that MySQL only supports the -- comment syntax when the -- is followed by a space.
You're right, but outside of quoted areas, MySQL treats line feeds like spaces. This might be the reason why it accepts comments like "--\n".
Alexander