[rt-users] Custom field queries are very slow after 3.8.8 upgrade (from 3.6.10)
Kenneth Marshall
ktm at rice.edu
Thu Jan 6 08:56:06 EST 2011
On Thu, Jan 06, 2011 at 01:50:19PM +0900, Georgi Georgiev wrote:
> Quoting Kenneth Marshall at 05/01/2011-08:14:10(-0600):
> > On Wed, Jan 05, 2011 at 07:31:37PM +0900, Georgi Georgiev wrote:
> > > I am looking for some advice on how to speed up some queries using
> > > custom fields that got unusably slow after an upgrade from the RT 3.6 series
> > > to 3.8.
> > >
> > > We are currently running RT 3.6.10 on CentOS 5.5 with a postgresql
> > > back end (used to be PgSQL 8.1 but we upgraded to 8.4 when CentOS 5.5
> > > came out). The RT package is from EPEL and everything works fine.
> > >
> > > When trying to upgrade to RT 3.8.8 some of our saved queries using
> > > custom fields get very, *very* slow. I tried this twelve months ago with
> > > RT 3.8.7 on CentOS 5.x and PgSQL 8.1 (whatever was "current" then), and
> > > also again today, on RHEL6 with PgSQL 8.4.
> > >
> > > The problematic queries look something like this:
> > >
> > > Queue = 'somequeue' AND Status = 'stalled' AND (
> > > 'CF.{MyCF}' = 'value1' OR
> > > 'CF.{MyCF}' = 'value2' OR
> > > 'CF.{MyCF}' = 'value3' OR
> > > 'CF.{MyCF}' = 'value4' OR
> > > 'CF.{MyCF}' = 'value5' OR
> > > 'CF.{MyCF}' = 'value6' OR
> > > 'CF.{MyCF}' = 'value7' )
> > >
> > > where the listed values are 3-4 characters long and their number varies.
> > >
> > > After the upgrade, and with logging of slow queries enabled, I saw these
> > > numbers for queries with 5, 6, 7, and 8 OR statements for the custom
> > > field in the postgresql logs:
> > > - query for 5 possible CF values - 0.6 seconds
> > > - query for 6 possible CF values - 6 seconds
> > > - query for 7 possible CF values - 65 seconds
> > > - query for 8 possible CF values - 681 seconds
> > >
> > > In comparison, the corresponding query in RT 3.6 executes in less than 2
> > > milliseconds.
> > >
> > > One thing I noticed is that the SQL generated by RT 3.6 and 3.8 is
> > > substantially different.
> > >
> > > In 3.6 there is a single SELECT on the ObjectCustomFieldValues table, so
> > > the generated SQL looks like this:
> > >
> > > SELECT DISTINCT main.*
> > > FROM Tickets main
> > > JOIN ObjectCustomFieldValues ObjectCustomFieldValues_1
> > > ON ( ObjectCustomFieldValues_1.CustomField = '1' )
> > > AND ( ObjectCustomFieldValues_1.ObjectType = 'RT::Ticket' )
> > > AND ( ObjectCustomFieldValues_1.Disabled = '0' )
> > > AND ( ObjectCustomFieldValues_1.ObjectId = main.id )
> > > WHERE (main.Status != 'deleted')
> > > AND (main.Queue = '11' AND main.Status = 'stalled' AND
> > > (
> > > ( ( ObjectCustomFieldValues_1.Content = 'value1' ) )
> > > OR ( ( ObjectCustomFieldValues_1.Content = 'value2' ) )
> > > ...
> > > )
> > > )
> > > AND (main.Type = 'ticket')
> > > AND (main.EffectiveId = main.id)
> > > ORDER BY main.id ASC
> > >
> > > In RT 3.8.8 the generated SQL looks like this:
> > >
> > > SELECT DISTINCT main.*
> > > FROM Tickets main
> > > LEFT JOIN ObjectCustomFieldValues ObjectCustomFieldValues_1
> > > ON ( ObjectCustomFieldValues_1.CustomField = '1' )
> > > AND ( ObjectCustomFieldValues_1.ObjectType = 'RT::Ticket' )
> > > AND ( ObjectCustomFieldValues_1.Disabled = '0' )
> > > AND ( ObjectCustomFieldValues_1.ObjectId = main.id )
> > > LEFT JOIN ObjectCustomFieldValues ObjectCustomFieldValues_2
> > > ON ( ObjectCustomFieldValues_2.CustomField = '1' )
> > > AND ( ObjectCustomFieldValues_2.ObjectType = 'RT::Ticket' )
> > > AND ( ObjectCustomFieldValues_2.Disabled = '0' )
> > > AND ( ObjectCustomFieldValues_2.ObjectId = main.id )
> > > ....
> > > WHERE (main.Status != 'deleted')
> > > AND (
> > > main.Queue = '11' AND main.Status = 'stalled' AND
> > > (
> > > ( ( ( ObjectCustomFieldValues_1.Content = 'value1' ) ) )
> > > OR ( ( ( ObjectCustomFieldValues_2.Content = 'value2' ) ) )
> > > ....
> > > )
> > > )
> > > AND (main.Type = 'ticket')
> > > AND (main.EffectiveId = main.id)
> > > ORDER BY main.id ASC
> > >
> > > I ran the queries through EXPLAIN ANALYZE and tried disabling nested
> > > loops, then hash joins (disabling whatever the planner was choosing,
> > > hoping it would pick something faster), but the query speed did not
> > > change considerably.
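> > >
> > > For reference, the experiments looked roughly like this (a sketch; the
> > > full 3.8.8 SELECT from above goes where the placeholder is, and the
> > > SETs only affect the current session):
> > >
> > > SET enable_nestloop = off;
> > > EXPLAIN ANALYZE <the slow 3.8.8 query from above>;
> > > RESET enable_nestloop;
> > >
> > > SET enable_hashjoin = off;
> > > EXPLAIN ANALYZE <the slow 3.8.8 query from above>;
> > > RESET enable_hashjoin;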
> > >
> > > Our database is not that big and easily fits in the machine's memory
> > > (about 400MB on disk), and we have fewer than 6000 tickets in total.
> > > The queries I tried return only 4 to 5 tickets in the end. The custom
> > > field I am referring to above is mandatory, so it is set on almost all
> > > tickets, usually with multiple values.
> > >
> > > Any advice?
> > >
> >
> > Hi Georgi,
> >
> > What are your postgresql.conf parameters?
> >
> > What are the EXPLAIN ANALYZE results for the fast query (3.6.x) and
> > the slow (3.8.8) query? That should also point out where an index may
> > help.
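> >
> > If it helps, something along these lines (a sketch, run in psql against
> > the RT database) will pull out both in one go:
> >
> > -- settings that differ from the compiled-in defaults
> > SELECT name, setting, source FROM pg_settings WHERE source <> 'default';
> >
> > -- plan plus actual timings; paste the full slow SELECT in place of the
> > -- placeholder
> > EXPLAIN ANALYZE <the slow 3.8.8 query>;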
>
> Thanks for the pointers, Kenneth.
>
> We actually haven't touched postgresql.conf at all. The defaults have
> worked fine for our small postgresql installation so far, and I wasn't
> sure where to look next - in RT or in postgres. Here is our
> postgresql.conf (the distro default); it is identical on RHEL6 and
> CentOS 5.5.
>
> max_connections = 100
> shared_buffers = 32MB
> logging_collector = on
> log_directory = 'pg_log'
> log_filename = 'postgresql-%a.log'
> log_truncate_on_rotation = on
> log_rotation_age = 1d
> log_rotation_size = 0
> datestyle = 'iso, mdy'
> lc_messages = 'en_US.UTF-8'
> lc_monetary = 'en_US.UTF-8'
> lc_numeric = 'en_US.UTF-8'
> lc_time = 'en_US.UTF-8'
> default_text_search_config = 'pg_catalog.english'
>
> I am also attaching the results and the query that I ran. This time I
> used bogus values (literally "value1", "value2") and the query ran much
> faster, but still not fast enough - 7 seconds for 7 values (going up to
> 45 seconds for 8 bogus values). It seems that indexes are indeed being
> used where appropriate.
>
> Am I missing something obvious?
>
Hi Georgi,
The default parameters for PostgreSQL are VERY conservative so that it
can run even on systems with minimal resources. I did not see any
details of your system's memory configuration, but you could set
shared_buffers a little higher. Also, for a mostly memory-resident DB
you should bring random_page_cost closer to seq_page_cost, and for a
fully memory-resident DB set both to something like 0.1. You should set
effective_cache_size to about 2/3 of your system's memory. Finally, to
give the planner more hash join options, bump work_mem up from the
default 1MB based on how much system memory you have. We set it to
128MB on an 8GB system, but even 32MB or 64MB can really help. This is
what you need to set:
shared_buffers = 64MB # or 128MB if you have the memory
work_mem = 16MB # or higher
effective_cache_size = 3GB # for a 4GB system, or about 2/3 of system memory
random_page_cost = 0.1 # for a fully cached DB; use 2 if less of it is cached
seq_page_cost = 0.1 # for a fully cached DB; use 1 if less of it is cached
These changes should let you take much better advantage of your
system's memory.
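If you want to try the planner-related settings before touching
postgresql.conf, you can also set them for a single psql session first,
along these lines (shared_buffers is the exception and needs a server
restart to change):

-- per-session experiment only; nothing here edits postgresql.conf
SET work_mem = '32MB';
SET random_page_cost = 0.1;
SET seq_page_cost = 0.1;
EXPLAIN ANALYZE <the slow 3.8.8 query>;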
Regards,
Ken