2014-10-13 12:21 GMT+02:00 Hugues Peccatte <hugues.peccatte@gmail.com>:

2014-10-12 21:06 GMT+02:00 Marc Delisle <marc@infomarc.info>:
On 2014-10-12 12:57, Hugues Peccatte wrote:
> 2014-10-05 20:36 GMT+02:00 Hugues Peccatte <hugues.peccatte@gmail.com>:
>
> 2014-10-04 9:01 GMT+02:00 Hugues Peccatte <hugues.peccatte@gmail.com>:
>
> On 4 Oct 2014 03:22, "Madhura Jayaratne" <madhura.cj@gmail.com> wrote:
>
>
> >
> >
> >
> > On Sat, Oct 4, 2014 at 1:24 AM, Hugues Peccatte <hugues.peccatte@gmail.com> wrote:
> >>
> >> 2014-10-03 12:26 GMT+02:00 Marc Delisle <marc@infomarc.info>:

> >>>
> >>> Hi Hugues,
> >>> I retested this morning on a laptop, importing a SQL file containing
> >>> 10000 employees from the sample employees database. This is a small
> >>> file (660 KB).
> >>>
> >>> Current master: 3 min 25 sec (and ends with JSON.parse: unexpected
> >>> character)
> >>>
> >>> Current Tithugues/stringFunctions_master: 2 min 10 sec (same js error)
> >>>
> >>> Current QA_4_2: 0 min 5 sec
> >>>
> >>> There has been improvement, but we cannot release 4.3 with this import
> >>> speed.
> >>>
> >>> --
> >>> Marc Delisle | phpMyAdmin
> >>
> >>
> >> Hi,
> >>
> >> I agree… But I'm afraid this is linked to the multibyte functions…
> >> Maybe we shouldn't use the multibyte functions everywhere…
> >>
> >> I'll still try to improve performance.
> >>
> >> Hugues.
> >>
> >
> > Indeed, I also think that we should use mb_* functions only when
> > necessary, and the choice to use them should be made on a case-by-case
> > basis.
> >
> > --
> > Thanks and Regards,
> >
> > Madhura Jayaratne
>
> Hi,
>
> I haven't pushed my commits yet, but that's what I've started doing: I
> replaced the mb_* calls with standard calls for configuration
> variables, reserved words, etc.
>
> Hugues.
>
>
> Hi,
>
> Out of desperation, I'm trying another algorithm. Instead of buffering
> data until the SQL delimiter, I'll try to parse line by line.
> So instead of parsing a 50000-character buffer 1000 times, I'll parse
> many 500-character buffers fewer than 10 times each. I hope this will be
> faster.
>
> Hugues.
>
>
> Hi,
>
> The new algorithm is finished. There are still some checks to add, but it
> is usable with the file in this ticket: [1]
> You can find my modifications here: [2]
>
> Marc, is it faster for you?
> It seems that I gained ~33% in time. We're still far from 5 seconds…
> Maybe I'll try using standard PHP functions to see the difference. If
> the standard PHP functions are really faster, I'll try to add an option
> to choose between mb_* functions and standard PHP functions, as you said.
>
> [1] https://sourceforge.net/p/phpmyadmin/bugs/4536/
> [2]
> https://github.com/Tithugues/phpmyadmin/tree/stringFunctions_useStandardFunctions_master

Hi Hugues,
Yes, it's faster. Under the same test conditions, the import takes 1
min 20 sec.

--
Marc Delisle | phpMyAdmin

Thanks for your feedback.
I'll try another improvement to make it faster.

Note to myself:
* read X characters but don't restart the search from 0 each time
* handle escaped quotes with a lookbehind expression, something like `(?<!\\)(\\\\)*'` (a quote preceded by an even number of backslashes, i.e. not escaped)

Hugues.
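
For illustration only, the two notes above could be combined roughly as follows. This is a Python sketch, not the phpMyAdmin code: the class name `ResumableQuoteSearch` and its `feed` method are hypothetical, and the pattern assumes the intended construct is a negative lookbehind, `(?<!\\)`, so that it matches a quote preceded by an even number of backslashes.

```python
import re

# Assumed pattern: a single quote that is NOT escaped, i.e. preceded by an
# even number (possibly zero) of backslashes.
UNESCAPED_QUOTE = re.compile(r"(?<!\\)(?:\\\\)*'")


class ResumableQuoteSearch:
    """Hypothetical helper: accepts chunks and never rescans from offset 0."""

    def __init__(self):
        self.buffer = ""
        self.resume_at = 0  # offset where the next search starts

    def feed(self, chunk):
        """Append a chunk and return the offsets of newly found unescaped quotes."""
        self.buffer += chunk
        found = []
        while True:
            # search(string, pos) starts at pos, but the lookbehind can still
            # inspect the characters that precede pos.
            match = UNESCAPED_QUOTE.search(self.buffer, self.resume_at)
            if match is None:
                break
            found.append(match.end() - 1)  # offset of the quote itself
            self.resume_at = match.end()
        # Skip everything already scanned, except trailing backslashes:
        # they may escape a quote that arrives with the next chunk.
        end = len(self.buffer)
        while end > self.resume_at and self.buffer[end - 1] == "\\":
            end -= 1
        self.resume_at = end
        return found
```

Each call only examines the text from `resume_at` onward, so the total scanning work grows with the input size rather than with the number of chunks times the buffer size. The string concatenation is kept for brevity; a real implementation would probably also discard the part of the buffer that has already been consumed.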

Hi,

As Marc asked, I added an option to choose whether the import reads the data as a multibyte string or not.

By default, the data is not read as a multibyte string (because that is too slow…). It seems that the drag-and-drop (DnD) import doesn't use the default configuration, so whatever you define as the default won't be used in that process.

Should we create a ticket for this? I think it's possible to get the configuration value in JavaScript.

Hugues.
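
To make the trade-off behind such an option concrete, here is a small Python analogue, not the PHP implementation. The function `split_statements` and its `multibyte` flag are hypothetical names, and the splitting is deliberately naive (it ignores quotes and comments). It assumes UTF-8 input, where an ASCII delimiter byte can never occur inside a multibyte character, so both paths give the same result; for some other encodings the byte-level path would not be safe.

```python
def split_statements(raw, delimiter=";", multibyte=False, encoding="utf-8"):
    """Naive statement splitter with a byte-level and a character-level path.

    Hypothetical sketch: `multibyte` plays the role of the import option
    discussed above. Quotes and comments are ignored on purpose.
    """
    if multibyte:
        # Character-level path: decode first, then split on characters.
        parts = raw.decode(encoding).split(delimiter)
    else:
        # Byte-level path: split the raw bytes, decode each piece afterwards.
        pieces = raw.split(delimiter.encode("ascii"))
        parts = [piece.decode(encoding) for piece in pieces]
    return [part.strip() for part in parts if part.strip()]
```

For UTF-8 data both settings return the same statements, which is why working on raw bytes by default can be a reasonable choice when speed matters.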