2014-10-13 12:21 GMT+02:00 Hugues Peccatte <hugues.peccatte@gmail.com>:
2014-10-12 21:06 GMT+02:00 Marc Delisle <marc@infomarc.info>:
On 2014-10-12 12:57, Hugues Peccatte wrote:
2014-10-05 20:36 GMT+02:00 Hugues Peccatte <hugues.peccatte@gmail.com>:
2014-10-04 9:01 GMT+02:00 Hugues Peccatte <hugues.peccatte@gmail.com>:

On 4 Oct 2014 03:22, "Madhura Jayaratne" <madhura.cj@gmail.com> wrote:
>
> On Sat, Oct 4, 2014 at 1:24 AM, Hugues Peccatte <hugues.peccatte@gmail.com> wrote:
>>
>> 2014-10-03 12:26 GMT+02:00 Marc Delisle <marc@infomarc.info>:
>>>
>>> Hi Hugues,
>>> I retested this morning on a laptop, importing a SQL file containing
>>> 10000 employees from the sample employees database. This is a small file
>>> (660 KB).
>>>
>>> Current master: 3 min 25 sec (and ends with a "JSON.parse: unexpected
>>> character" error)
>>>
>>> Current Tithugues/stringFunctions_master: 2 min 10 sec (same JS error)
>>>
>>> Current QA_4_2: 0 min 5 sec
>>>
>>> There has been improvement, but we cannot release 4.3 with this import
>>> speed.
>>>
>>> --
>>> Marc Delisle | phpMyAdmin
>>
>> Hi,
>>
>> I agree… But I'm afraid this is linked to the multibyte functions…
>> Maybe we shouldn't use the multibyte functions everywhere…
>>
>> I'll still try to improve the performance.
>>
>> Hugues.
>
> Indeed, I also think that we should use the mb_* functions only when
> necessary, and the choice to use them should be made on a case-by-case
> basis.
>
> --
> Thanks and Regards,
>
> Madhura Jayaratne

Hi,

I didn't push my commits, but that's what I've started. I replaced the
mb_* calls with standard calls on configuration variables, reserved
words, etc.

Hugues.

Hi,

Out of desperation, I'm trying another algorithm. Instead of buffering
data until the SQL delimiter is found, I'll parse the file line by line.
That way, instead of parsing a 50,000-character buffer 1,000 times, I'll
parse many 500-character buffers fewer than 10 times each. I hope this
will be faster.

Hugues.
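To make the line-by-line idea concrete, here is a minimal sketch of that approach (hypothetical names, not the actual phpMyAdmin import plugin; the file name and `runStatement()` are placeholders):

```php
<?php
// Minimal sketch of the line-based approach described above; not the actual
// phpMyAdmin import code. The file name and runStatement() are placeholders.

function runStatement($sql)
{
    // Placeholder: this is where the statement would be sent to MySQL.
}

$handle = fopen('dump.sql', 'rb');
$statement = '';

while (($line = fgets($handle)) !== false) {
    $statement .= $line;
    // Naive end-of-statement test: the line ends with the delimiter. A real
    // parser must also track quoted strings and comments, where ';' does not
    // terminate the statement.
    if (substr(rtrim($line), -1) === ';') {
        runStatement($statement);
        $statement = '';
    }
}

fclose($handle);
```

The gain is that each short line is scanned a bounded number of times, instead of re-scanning one large, growing buffer on every pass.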
Hi,
The new algorithm is done. There are still some checks to add, but it is usable with the file in this ticket: [1]. You can find my modifications here: [2]
Marc, is it faster for you? It seems I gained ~33% in time. We're still far from 5 seconds… Maybe I'll try using the standard PHP functions to see the difference. If the standard PHP functions are really faster, I'll try to add an option to choose between the mb_* functions and the standard PHP functions, as you said.
[2] https://github.com/Tithugues/phpmyadmin/tree/stringFunctions_useStandardFunc...
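To see the difference in practice, a micro-benchmark along these lines could be used (an illustrative sketch only; the test data is made up, and the needle never occurs, so every call scans the whole buffer):

```php
<?php
// Illustrative micro-benchmark only; absolute timings depend on the machine,
// the PHP build, and the mbstring encoding in use.

$buffer = str_repeat("INSERT INTO t VALUES ('x');\n", 20000); // ~560 KB

$start = microtime(true);
for ($i = 0; $i < 1000; $i++) {
    strpos($buffer, 'DELIMITER'); // byte-based search, never matches
}
printf("strpos:    %.3f s\n", microtime(true) - $start);

$start = microtime(true);
for ($i = 0; $i < 1000; $i++) {
    mb_strpos($buffer, 'DELIMITER'); // character-based search, never matches
}
printf("mb_strpos: %.3f s\n", microtime(true) - $start);
```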
Hi Hugues,

Yes, it's faster. With the same testing conditions, the import takes 1 min 20 sec.
--
Marc Delisle | phpMyAdmin
Thanks for your feedback. I'll try yet another improvement to make it faster.
Note to self:
- read X characters but don't restart the search from 0 each time
- search for the closing (unescaped) quote with a lookbehind expression,
  something like `(?<!\\)(\\\\)*'` (see the sketch after this list)
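A rough sketch of both notes combined (hypothetical helper, not the committed code): remember how far the scan already got instead of restarting at offset 0, and use a negative lookbehind so escaped quotes are skipped.

```php
<?php
// Hypothetical sketch, not the final patch. In the pattern, (?<!\\) refuses
// a match starting right after a backslash, and (?:\\\\)* then consumes any
// even run of backslashes, so \' is skipped while \\' is found.

// Returns the byte offset of the next unescaped single quote at or after
// $offset, or false if there is none.
function findUnescapedQuote($buffer, $offset)
{
    $pattern = "/(?<!\\\\)(?:\\\\\\\\)*'/";
    if (preg_match($pattern, $buffer, $m, PREG_OFFSET_CAPTURE, $offset)) {
        // Offset of the quote itself, after any matched backslash pairs.
        return $m[0][1] + strlen($m[0][0]) - 1;
    }
    return false;
}

$sql = "INSERT INTO t VALUES ('it\\'s'), ('ok');";
$offset = 0; // resume point: never reset to 0
while (($pos = findUnescapedQuote($sql, $offset)) !== false) {
    echo "quote at byte $pos\n";
    $offset = $pos + 1; // continue right after this quote
}
```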
Hugues.
Hi,
As asked by Marc, I added an option to import by reading the file as a multibyte string or not. The default configuration won't read it as a multibyte string (because it's too slow…). It seems that the drag-and-drop import doesn't use the default configuration, so whatever you define as the default, it won't be used in this process. Should we create a ticket for this? I think it's possible to fetch it in JavaScript.
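For illustration, such a switch could be wired like this (the class names and the `ReadAsMultibytes` directive are assumptions, not necessarily what the actual patch uses):

```php
<?php
// Illustration only: hypothetical names, not necessarily the merged code.
// The parser asks an injected object for its string primitives, and the
// object is picked once from the (assumed) 'ReadAsMultibytes' setting.

interface StringOps
{
    public function length($s);
    public function pos($haystack, $needle, $offset = 0);
}

// Character-oriented: correct for any multibyte encoding, but slower.
class MultibyteStringOps implements StringOps
{
    public function length($s)                           { return mb_strlen($s); }
    public function pos($haystack, $needle, $offset = 0) { return mb_strpos($haystack, $needle, $offset); }
}

// Byte-oriented: fast, and sufficient for ASCII-safe input such as most dumps.
class NativeStringOps implements StringOps
{
    public function length($s)                           { return strlen($s); }
    public function pos($haystack, $needle, $offset = 0) { return strpos($haystack, $needle, $offset); }
}

// Default is the fast byte-based variant; multibyte reading is opt-in.
$cfg['ReadAsMultibytes'] = false; // assumed configuration directive
$stringOps = $cfg['ReadAsMultibytes']
    ? new MultibyteStringOps()
    : new NativeStringOps();
```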
Hugues.