[rt-devel] [fsck.com #3201] Inefficient SQL generation
Paul Wagland
pwagland at allshare.nl
Thu Aug 7 04:46:11 EDT 2003
Hi Jesse,
The patch that I got from Robrt was actually for something different... The
patch that I got was for issue 2490 (Problem with custom field searching in
RT 3.0.2pre4). Although issue 3200 seems to supersede this (perhaps a link
between the two can be added?)
However, you got me to thinking, and so I went searching for the work that
Robrt had done, and came across issue 3201 (Use less joins for Custom
Fields) which you have already applied as change 229.
This does fix the problem! :-) However please continue reading the e-mail!
:-)
This also means that Issue 3287 should be (at least) linked to 3201, and
possibly marked as resolved as well.
However, one (minor) point is that the SQL that does get generated looks as
follows:
SELECT count(main.id)
FROM ((((((Tickets main
LEFT JOIN TicketCustomFieldValues as TicketCustomFieldValues_1
ON ( main.id = TicketCustomFieldValues_1.Ticket)
AND( (TicketCustomFieldValues_1.CustomField = '40')
OR (TicketCustomFieldValues_1.CustomField = '40')
OR (TicketCustomFieldValues_1.CustomField = '40')
OR (TicketCustomFieldValues_1.CustomField = '40')
OR (TicketCustomFieldValues_1.CustomField = '40')
OR (TicketCustomFieldValues_1.CustomField = '40')
OR (TicketCustomFieldValues_1.CustomField = '40')))
WHERE main.EffectiveId = main.id
AND main.Type = 'ticket'
AND ( ( TicketCustomFieldValues_1.Content = '49'
OR TicketCustomFieldValues_1.Content = '94'
OR TicketCustomFieldValues_1.Content = '97'
OR TicketCustomFieldValues_1.Content = '110'
OR TicketCustomFieldValues_1.Content = '71'
OR TicketCustomFieldValues_1.Content = '53' )
AND main.Queue = '5' );
This leads to only a very minor inefficiency as the postgres query optimiser
just removes the duplicates, but still it looks strange :-)
However, as mentioned in #3287, at least for postgres (I don't use mysql so
I cannot speak for it) if you can replace the left join with an inner join,
i.e. just by replacing the word 'LEFT' above with the word 'INNER'. In this
particular case we know that this is safe since both sides have to exist? I
have done that locally, and the difference is noticeable. I did this by
applying Robrt's patch and then changing line 729 to say 'inner' instead of
'left'. As mentioned before, I do not know what difference this will have on
MySQL, but on postgres, the difference is dramatic (the query is roughly
2500 times faster, 7ms instead of 1700ms)
I had a look at how to remove the repeated
(TicketCustomFieldValues_1.CustomField = '40') tests, and have also attached
a patch for that (I have also attached this patch to issue 3201).
Anyway, once again Jesse thanks for a great product, and for being
responsive to the various issues that we have thrown up.
Cheers,
Paul
> -----Original Message-----
> From: Jesse Vincent [mailto:jesse at bestpractical.com]
> Sent: Wednesday, August 06, 2003 19:13
> To: Paul Wagland
> Subject: Re: [rt-devel] Inefficient SQL generation
>
>
> This is even with robrt's latest searchbuilder patch that he sent you?
>
>
> On Wed, Aug 06, 2003 at 11:31:09AM +0200, Paul Wagland wrote:
> > Hi all,
> >
> > (Appologies for the HTML mail, but it made the formatting below much
> > easier...)
> >
> > Well, I haven't seen anything on the list for a little while,
> so I thought I
> > would chime in :-)
> >
> > When I am searching on multiple custom fields there is a
> serious performance
> > problem. The query is show on screen as follows:
> >
> > Current search criteria
> > Queue = Client Issues [delete]
> > CF.Client Issues.{Client Reference} = 94 [delete]
> > CF.Client Issues.{Client Reference} = 49 [delete]
> > CF.Client Issues.{Client Reference} = 97 [delete]
> > CF.Client Issues.{Client Reference} = 110 [delete]
> > CF.Client Issues.{Client Reference} = 71 [delete]
> > CF.Client Issues.{Client Reference} = 53 [delete]
> >
> >
> > The SQL that it generates (with seachbuilder 0.86 and 0.89_2) is:
> > (formatted to make it easier to read, and some extraneous
> brackets removed
> > for the same reason)
> >
> > SELECT count(main.id)
> > FROM ((((((Tickets main
> > LEFT JOIN TicketCustomFieldValues as TicketCustomFieldValues_6
> > ON ( main.id = TicketCustomFieldValues_6.Ticket)
> > AND( (TicketCustomFieldValues_6.CustomField = '40')))
> > LEFT JOIN TicketCustomFieldValues as TicketCustomFieldValues_3
> > ON ((TicketCustomFieldValues_3.CustomField = '40'))
> > AND( main.id = TicketCustomFieldValues_3.Ticket))
> > LEFT JOIN TicketCustomFieldValues as TicketCustomFieldValues_2
> > ON ( main.id = TicketCustomFieldValues_2.Ticket)
> > AND( (TicketCustomFieldValues_2.CustomField = '40')))
> > LEFT JOIN TicketCustomFieldValues as TicketCustomFieldValues_5
> > ON ((TicketCustomFieldValues_5.CustomField = '40'))
> > AND( main.id = TicketCustomFieldValues_5.Ticket))
> > LEFT JOIN TicketCustomFieldValues as TicketCustomFieldValues_4
> > ON ( main.id = TicketCustomFieldValues_4.Ticket)
> > AND( (TicketCustomFieldValues_4.CustomField = '40')))
> > LEFT JOIN TicketCustomFieldValues as TicketCustomFieldValues_1
> > ON ( main.id = TicketCustomFieldValues_1.Ticket)
> > AND( (TicketCustomFieldValues_1.CustomField = '40')))
> > WHERE main.EffectiveId = main.id
> > AND main.Type = 'ticket'
> > AND ( ( TicketCustomFieldValues_1.Content = '49'
> > OR TicketCustomFieldValues_2.Content = '94'
> > OR TicketCustomFieldValues_3.Content = '97'
> > OR TicketCustomFieldValues_4.Content = '110'
> > OR TicketCustomFieldValues_5.Content = '71'
> > OR TicketCustomFieldValues_6.Content = '53' )
> > AND main.Queue = '5' );
> >
> > When this gets thrown against our current (postgres) database,
> it takes 29
> > seconds to run. An almost identical query is then executed to
> actually get
> > the rows that were just counted. The problem here is clear to
> see: six left
> > joins of the same table.
> >
> > When this is simplified to the following:
> >
> > SELECT count(main.id)
> > FROM (Tickets main
> > LEFT JOIN TicketCustomFieldValues as TicketCustomFieldValues_1
> > ON ( main.id = TicketCustomFieldValues_6.Ticket)
> > AND(TicketCustomFieldValues_6.CustomField = '40'))
> > WHERE (main.EffectiveId = main.id)
> > AND main.Type = 'ticket'
> > AND ( ( TicketCustomFieldValues_1.Content = '49'
> > OR TicketCustomFieldValues_2.Content = '94'
> > OR TicketCustomFieldValues_3.Content = '97'
> > OR TicketCustomFieldValues_4.Content = '110'
> > OR TicketCustomFieldValues_5.Content = '71'
> > OR TicketCustomFieldValues_6.Content = '53' )
> > AND main.Queue = '5' );
> >
> > The query goes down to 6.7 seconds. Already a significant
> improvement. If we
> > can further simplify this to:
> >
> > SELECT count(main.id)
> > FROM (Tickets main
> > LEFT JOIN TicketCustomFieldValues as TicketCustomFieldValues_1
> > ON ( main.id = TicketCustomFieldValues_1.Ticket)
> > AND(TicketCustomFieldValues_1.CustomField = '40'))
> > WHERE (main.EffectiveId = main.id)
> > AND main.Type = 'ticket'
> > AND ( TicketCustomFieldValues_1.Content = '49'
> > OR TicketCustomFieldValues_1.Content = '94'
> > OR TicketCustomFieldValues_1.Content = '97'
> > OR TicketCustomFieldValues_1.Content = '110'
> > OR TicketCustomFieldValues_1.Content = '71'
> > OR TicketCustomFieldValues_1.Content = '53' )
> > AND main.Queue = '5';
> >
> > The execution time goes down to 4.8 seconds. Better still.
> However, if we
> > can simplify this to the following:
> >
> > SELECT count(main.id)
> > FROM Tickets main,
> > TicketCustomFieldValues as TicketCustomFieldValues_1
> > WHERE main.id = TicketCustomFieldValues_1.Ticket
> > AND TicketCustomFieldValues_1.CustomField = '40'
> > AND main.EffectiveId = main.id
> > AND main.Type = 'ticket'
> > AND ( TicketCustomFieldValues_1.Content = '49'
> > OR TicketCustomFieldValues_1.Content = '94'
> > OR TicketCustomFieldValues_1.Content = '97'
> > OR TicketCustomFieldValues_1.Content = '110'
> > OR TicketCustomFieldValues_1.Content = '71'
> > OR TicketCustomFieldValues_1.Content = '53' )
> > AND main.Queue = '5';
> >
> > Thebn we can get the excution times down to 6.4 milliseconds. This is
> > roughly 5000 times faster than the original query.
> >
> > That's the good news! :-) The bad news is that I don't know how
> to modify
> > SearchBuilder to produce this new optimised SQL.
> >
> > But, I do hope that this analysis can help someone else :-)
> >
> > If you want any further information, please mail me,
> >
> > Cheers,
> > Paul
> >
>
>
> --
> http://www.bestpractical.com/rt -- Trouble Ticketing. Free.
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: CF_orlimit_inner.patch
Type: application/octet-stream
Size: 1065 bytes
Desc: not available
Url : http://pallas.eruditorum.org/pipermail/rt-devel/attachments/20030807/824ed9df/CF_orlimit_inner.obj
More information about the Rt-devel
mailing list