2014-10-12 21:06 GMT+02:00 Marc Delisle marc@infomarc.info:
On 2014-10-12 12:57, Hugues Peccatte wrote:
2014-10-05 20:36 GMT+02:00 Hugues Peccatte <hugues.peccatte@gmail.com>:
2014-10-04 9:01 GMT+02:00 Hugues Peccatte <hugues.peccatte@gmail.com>:

On 4 Oct 2014 03:22, "Madhura Jayaratne" <madhura.cj@gmail.com> wrote:
>
> On Sat, Oct 4, 2014 at 1:24 AM, Hugues Peccatte <hugues.peccatte@gmail.com> wrote:
>>
>> 2014-10-03 12:26 GMT+02:00 Marc Delisle <marc@infomarc.info>:
>>>
>>> Hi Hugues,
>>> I retested this morning on a laptop, importing a SQL file containing
>>> 10000 employees from the sample employees database. This is a small
>>> file (660 KB).
>>>
>>> Current master: 3 min 25 sec (and ends with "JSON.parse: unexpected
>>> character")
>>>
>>> Current Tithugues/stringFunctions_master: 2 min 10 sec (same JS error)
>>>
>>> Current QA_4_2: 0 min 5 sec
>>>
>>> There has been improvement, but we cannot release 4.3 with this import
>>> speed.
>>>
>>> --
>>> Marc Delisle | phpMyAdmin
>>
>> Hi,
>>
>> I agree… But I'm afraid this is linked to the multibyte functions…
>> Maybe we shouldn't use the multibyte functions everywhere…
>>
>> I'll still try to improve performance.
>>
>> Hugues.
>
> Indeed, I also think that we should use the mb_* functions only when
> necessary, and the choice to use them should be made on a case-by-case
> basis.
>
> --
> Thanks and Regards,
>
> Madhura Jayaratne

Hi,

I didn't push my commits, but that's what I've started: I replaced the mb_* calls with standard calls for configuration variables, reserved words, etc.

Hugues.

Hi,

Out of desperation, I'm trying another algorithm. Instead of buffering data until the SQL delimiter, I'll try to parse the file line by line. So, instead of parsing a 50000-character buffer 1000 times, I'll parse many 500-character buffers fewer than 10 times each. I hope this will be faster.

Hugues.
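To make the idea concrete, here is a rough sketch (not the actual phpMyAdmin import code; the function names and the simplistic delimiter handling are only illustrative) contrasting the two strategies:

```php
<?php
// Illustrative only: real SQL splitting must also skip delimiters inside
// quoted strings and comments; that is omitted here for brevity.

// Buffer-until-delimiter: the buffer keeps growing while a long statement
// arrives, and strpos() rescans it from offset 0 after every read.
function splitBuffered($handle, $delimiter = ';')
{
    $statements = [];
    $buffer = '';
    while (($chunk = fread($handle, 500)) !== false && $chunk !== '') {
        $buffer .= $chunk;
        while (($pos = strpos($buffer, $delimiter)) !== false) {
            $statements[] = trim(substr($buffer, 0, $pos));
            $buffer = substr($buffer, $pos + strlen($delimiter));
        }
    }
    return $statements;
}

// Line by line: each short line is appended once and is never rescanned as
// part of a huge buffer.
function splitByLine($handle, $delimiter = ';')
{
    $statements = [];
    $current = '';
    while (($line = fgets($handle)) !== false) {
        $current .= $line;
        if (substr(rtrim($line), -strlen($delimiter)) === $delimiter) {
            $statements[] = trim($current);
            $current = '';
        }
    }
    return $statements;
}
```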
Hi,
The new algorithm is done. There are still some checks to add, but it is usable with the file in this ticket: [1] You can find my modifications here: [2]
Marc, is it faster for you? It seems that I gained ~33% in time. We're still far from 5 seconds… Maybe I'll try to use the standard PHP functions to see the difference. If the standard PHP functions are really faster, I'll try to add an option to choose between the mb_* functions and the standard PHP functions, as you said.
https://github.com/Tithugues/phpmyadmin/tree/stringFunctions_useStandardFunc...
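Something like the following wrapper could carry that option; this is only a hypothetical sketch (the interface and class names are made up, not the actual phpMyAdmin classes):

```php
<?php
// Hypothetical sketch only: the point is to pick one implementation once
// instead of calling mb_* unconditionally in every hot path.

interface StringOps
{
    public function strlen($string);
    public function strpos($haystack, $needle, $offset = 0);
    public function substr($string, $start, $length = null);
}

class MbStringOps implements StringOps
{
    public function strlen($string) { return mb_strlen($string); }
    public function strpos($haystack, $needle, $offset = 0) { return mb_strpos($haystack, $needle, $offset); }
    public function substr($string, $start, $length = null) { return mb_substr($string, $start, $length); }
}

class NativeStringOps implements StringOps
{
    public function strlen($string) { return strlen($string); }
    public function strpos($haystack, $needle, $offset = 0) { return strpos($haystack, $needle, $offset); }
    public function substr($string, $start, $length = null) { return $length === null ? substr($string, $start) : substr($string, $start, $length); }
}

// Selected once, e.g. from a configuration directive or an extension check:
$str = extension_loaded('mbstring') ? new MbStringOps() : new NativeStringOps();
echo $str->strlen('héhé'), "\n"; // 4 with mbstring (UTF-8), 6 bytes without
```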
Hi Hugues, yes, it's faster. Under the same testing conditions, the import takes 1 min 20 sec.
-- Marc Delisle | phpMyAdmin
Thanks for your feedback. I'll try another improvement to make it faster.
Note to self:
* read X characters, but don't restart the search from 0 each time;
* handle escaped quotes with a lookbehind expression, something like `(?<!\\)(\\\\)*'` (see the sketch below).
Hugues.
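A quick sketch of what those two notes could look like (illustrative only; the variable names and the exact pattern are assumptions, not code from the branch):

```php
<?php
// 1) Keep a running offset and pass it to strpos() so the search resumes
//    where the previous one stopped instead of rescanning from 0.
$buffer = "INSERT INTO t VALUES ('a'); INSERT INTO t VALUES ('b');";
$offset = 0;
while (($pos = strpos($buffer, ';', $offset)) !== false) {
    echo trim(substr($buffer, $offset, $pos - $offset + 1)), "\n";
    $offset = $pos + 1; // next search starts right after the last match
}

// 2) Find the closing quote while skipping escaped ones: match a quote
//    preceded by an even number of backslashes, with a lookbehind ensuring
//    that backslash run is not itself preceded by another backslash.
$value = "it\\'s ok' AND 1"; // actual characters: it\'s ok' AND 1
if (preg_match("/(?<!\\\\)(?:\\\\\\\\)*'/", $value, $match, PREG_OFFSET_CAPTURE)) {
    // reports the quote after "ok", not the escaped one inside the string
    echo "closing quote at byte ", $match[0][1] + strlen($match[0][0]) - 1, "\n";
}
```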