2014-10-12 21:06 GMT+02:00 Marc Delisle marc@infomarc.info:
On 2014-10-12 12:57, Hugues Peccatte wrote:
2014-10-05 20:36 GMT+02:00 Hugues Peccatte <hugues.peccatte@gmail.com>:
2014-10-04 9:01 GMT+02:00 Hugues Peccatte <hugues.peccatte@gmail.com>:

On 4 Oct 2014 03:22, "Madhura Jayaratne" <madhura.cj@gmail.com> wrote:
>
> On Sat, Oct 4, 2014 at 1:24 AM, Hugues Peccatte <hugues.peccatte@gmail.com> wrote:
>>
>> 2014-10-03 12:26 GMT+02:00 Marc Delisle <marc@infomarc.info>:
>>>
>>> Hi Hugues,
>>> I retested this morning on a laptop, importing a SQL file containing
>>> 10000 employees from the sample employees database. This is a small
>>> file (660 KB).
>>>
>>> Current master: 3 min 25 sec (and ends with "JSON.parse: unexpected
>>> character")
>>>
>>> Current Tithugues/stringFunctions_master: 2 min 10 sec (same JS error)
>>>
>>> Current QA_4_2: 0 min 5 sec
>>>
>>> There has been improvement, but we cannot release 4.3 with this import
>>> speed.
>>>
>>> --
>>> Marc Delisle | phpMyAdmin
>>
>> Hi,
>>
>> I agree… But I'm afraid this is linked to the multibyte functions…
>> Maybe we shouldn't use the multibyte functions everywhere…
>>
>> I'll still try to improve performance.
>>
>> Hugues.
>
> Indeed, I also think that we should use the mb_* functions only when
> necessary, and the choice to use them should be made on a case-by-case
> basis.
>
> --
> Thanks and Regards,
>
> Madhura Jayaratne

Hi,

I didn't push my commits, but that's what I've started: I replaced the mb_* calls with standard calls for configuration variables, reserved words, etc.

Hugues.

Hi,

Out of desperation, I'm trying another algorithm. Instead of buffering data until the SQL delimiter, I'll try to parse the file line by line. So, instead of parsing a 50000-character buffer 1000 times, I'll parse many 500-character buffers fewer than 10 times each. I hope this will be faster.

Hugues.
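To make the idea concrete, here is a rough sketch (not the actual phpMyAdmin import code; the function names and the simplistic delimiter handling are only illustrative) contrasting the two strategies:

```php
<?php
// Illustrative only: real SQL splitting must also skip delimiters inside
// quoted strings and comments; that is omitted here for brevity.

// Buffer-until-delimiter: the buffer keeps growing while a long statement
// arrives, and strpos() rescans it from offset 0 after every read.
function splitBuffered($handle, $delimiter = ';')
{
    $statements = [];
    $buffer = '';
    while (($chunk = fread($handle, 500)) !== false && $chunk !== '') {
        $buffer .= $chunk;
        while (($pos = strpos($buffer, $delimiter)) !== false) {
            $statements[] = trim(substr($buffer, 0, $pos));
            $buffer = substr($buffer, $pos + strlen($delimiter));
        }
    }
    return $statements;
}

// Line by line: each short line is appended once and is never rescanned as
// part of a huge buffer.
function splitByLine($handle, $delimiter = ';')
{
    $statements = [];
    $current = '';
    while (($line = fgets($handle)) !== false) {
        $current .= $line;
        if (substr(rtrim($line), -strlen($delimiter)) === $delimiter) {
            $statements[] = trim($current);
            $current = '';
        }
    }
    return $statements;
}
```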
Hi,
The new algorithm is done. There are still some checks to add, but it is usable with the file in this ticket: [1] You can find my modifications here: [2]
Marc, is it faster for you? It seems that I gained ~33% in time. We're still far from 5 seconds… Maybe I'll try to use the standard PHP functions to see the difference. If the standard PHP functions are really faster, I'll try to add an option to choose between the mb_* functions and the standard PHP functions, as you said.
https://github.com/Tithugues/phpmyadmin/tree/stringFunctions_useStandardFunc...
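Something like the following wrapper could carry that option; this is only a hypothetical sketch (the interface and class names are made up, not the actual phpMyAdmin classes):

```php
<?php
// Hypothetical sketch only: the point is to pick one implementation once
// instead of calling mb_* unconditionally in every hot path.

interface StringOps
{
    public function strlen($string);
    public function strpos($haystack, $needle, $offset = 0);
    public function substr($string, $start, $length = null);
}

class MbStringOps implements StringOps
{
    public function strlen($string) { return mb_strlen($string); }
    public function strpos($haystack, $needle, $offset = 0) { return mb_strpos($haystack, $needle, $offset); }
    public function substr($string, $start, $length = null) { return mb_substr($string, $start, $length); }
}

class NativeStringOps implements StringOps
{
    public function strlen($string) { return strlen($string); }
    public function strpos($haystack, $needle, $offset = 0) { return strpos($haystack, $needle, $offset); }
    public function substr($string, $start, $length = null) { return $length === null ? substr($string, $start) : substr($string, $start, $length); }
}

// Selected once, e.g. from a configuration directive or an extension check:
$str = extension_loaded('mbstring') ? new MbStringOps() : new NativeStringOps();
echo $str->strlen('héhé'), "\n"; // 4 with mbstring (UTF-8), 6 bytes without
```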
Hi Hugues, yes, it's faster. Under the same testing conditions, the import takes 1 min 20 sec.
-- Marc Delisle | phpMyAdmin
Thanks for your feedback. I'll try another improvement to make it faster.
Note to self:
* read X characters, but don't restart the search from 0 each time;
* handle escaped quotes with a lookbehind expression, something like `(?<!\\)(\\\\)*'` (see the sketch below).
Hugues.
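A quick sketch of what those two notes could look like (illustrative only; the variable names and the exact pattern are assumptions, not code from the branch):

```php
<?php
// 1) Keep a running offset and pass it to strpos() so the search resumes
//    where the previous one stopped instead of rescanning from 0.
$buffer = "INSERT INTO t VALUES ('a'); INSERT INTO t VALUES ('b');";
$offset = 0;
while (($pos = strpos($buffer, ';', $offset)) !== false) {
    echo trim(substr($buffer, $offset, $pos - $offset + 1)), "\n";
    $offset = $pos + 1; // next search starts right after the last match
}

// 2) Find the closing quote while skipping escaped ones: match a quote
//    preceded by an even number of backslashes, with a lookbehind ensuring
//    that backslash run is not itself preceded by another backslash.
$value = "it\\'s ok' AND 1"; // actual characters: it\'s ok' AND 1
if (preg_match("/(?<!\\\\)(?:\\\\\\\\)*'/", $value, $match, PREG_OFFSET_CAPTURE)) {
    // reports the quote after "ok", not the escaped one inside the string
    echo "closing quote at byte ", $match[0][1] + strlen($match[0][0]) - 1, "\n";
}
```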