[rt-users] Slow ticket loading....

Michelle Sullivan michelle at sorbs.net
Tue Jun 24 06:43:18 EDT 2014


Michelle Sullivan wrote:
> Michelle Sullivan wrote:
>   
>> Alex Vandiver wrote:
>>> On 05/13/2014 05:39 AM, Michelle Sullivan via RT wrote:
>>>> [snip]
>>> This message was rejected from rt-users because it was over 1M of
>>> logfiles.  Please don't send large messages to the list; you're welcome
>>> to provide a link to complete logs for those interested in them, of course.
>> I actually didn't know the size (though I knew it might be bigger than
>> the list software accepts - I figured some intelligent person would
>> choose to deal with it or let it through ;-) )
>>
>>>> Here's the result of what RT4 does to the DB (note: RT 3.8 does *NOT*
>>>> suffer the same problems, as I am running that and am just trying to
>>>> upgrade.  Also note: RT 4.0.19 on Apache 2.2 with mod_perl 2, running
>>>> against some very fast PostgreSQL 8.4.10 DBs (a cluster of 4 servers);
>>>> round trip to them is around 25-50ms)... the query log for a *SINGLE*
>>>> ticket follows...
>>> This only occurs on the first request; DBIx::SearchBuilder calls
>>> DBI->column_info to determine the set of columns for each RT table, to
>>> be able to use them in the GROUP BY.  It then caches the information;
>>> none of the queries to pg_constraints should happen after the first request.
>> Yeah - I'm seeing this every time a ticket is loaded, so RT/DBIx is not
>> caching it... and that's probably because the IDs differ each time, so
>> even a pgpool/memcached setup can't cache it...
>>
>>> There are two fixes to this; one is to find a way to not call
>>> DBIx::SearchBuilder->Fields, which is what calls DBI->column_info, which
>>> causes the slow queries.  RT is currently using the set of columns so
>>> obtained to pass to GROUP BY to perform both a DISTINCT and ORDER BY; Pg
>>> 9.1 is smart enough to know that grouping by the primary key is enough
>>> to imply the remaining columns as well[1].  As such, the attached patch
>>> (which has only been briefly tested) may suffice.
>> I'm on 8.4 - will that work as well, or should I not bother? (I can't
>> upgrade to 9.1 yet, as much as I want to: the software on the system is
>> in the hands of the Operations team and controlled by Puppet, so 8.x ->
>> 9.x is impossible without corrupting everything, or without Operations'
>> input.)
>
> FYI it appears every new connection to the DB causes
> DBI->column_info to be called.  On the 'in test' system this happens a
> lot: a page load that opens a new connection takes 4+ minutes,
> regardless of the page.
>
> It's quite usable with and without pgpool (it's faster without) - except
> when a new connection is opened.
>

So, I have my individual ticket load times running at least 4 times
faster with the following patch to DBIx::SearchBuilder:

--- DBIx/SearchBuilder/Handle.pm.orig    2014-06-24 05:21:54.000000000 -0500
+++ DBIx/SearchBuilder/Handle.pm    2014-06-24 05:21:41.000000000 -0500
@@ -1429,14 +1429,16 @@
 sub Fields {
     my $self  = shift;
     my $table = shift;
+    my $lctn = lc $table;
 
-    unless ( keys %FIELDS_IN_TABLE ) {
-        my $sth = $self->dbh->column_info( undef, '', '%', '%' )
+    unless ( keys %FIELDS_IN_TABLE && exists $FIELDS_IN_TABLE{$lctn} ) {
+        my $sth = $self->dbh->column_info( undef, '', $table, '%' )
             or return ();
         my $info = $sth->fetchall_arrayref({});
         foreach my $e ( @$info ) {
-            push @{ $FIELDS_IN_TABLE{ lc $e->{'TABLE_NAME'} } ||= [] }, lc $e->{'COLUMN_NAME'};
+            push @{ $FIELDS_IN_TABLE{ $lctn } ||= [] }, lc $e->{'COLUMN_NAME'};
         }
+        $FIELDS_IN_TABLE{ $lctn } ||= []; # empty list so we don't call column_info() again.
     }
 
-    return @{ $FIELDS_IN_TABLE{ lc $table } || [] };
+    return @{ $FIELDS_IN_TABLE{ $lctn } || [] };


I don't know how much this will break (if anything), but the thousands
of queries drop to just a few when calling Fields().

What I know it will break: if you call Fields() with a table name that
doesn't match the case of the table in the DB (on the first call), it
will not return the table's columns - and it will keep failing even if
you subsequently call with the correct case, because the empty result
gets cached.  That's down to the use of "lc $table" (and, in the
original, "lc $e->{'TABLE_NAME'}"), which shouldn't be done at all: in
PostgreSQL you can have Table1, table1, and taBLE1 as distinct, valid
tables in the same DB and schema (though that's probably a bad idea).
Lookups should be case sensitive, so dropping all the 'lc's would be a
good idea in my opinion - if nothing else because two table names that
differ only in case, with different definitions, will leave Fields()
returning the wrong columns for at least one of them.
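The case-folding collision can be shown in miniature (Python for
illustration only; the two table names are invented examples of
case-distinct quoted PostgreSQL tables):

```python
# Two distinct tables that differ only in case, with different columns.
columns = {
    "Table1": ["id", "name"],
    "table1": ["id", "created", "updated"],
}

cache = {}

def cached_fields(table):
    key = table.lower()  # the lc() the patch (and the original) apply
    if key not in cache:
        cache[key] = columns.get(table, [])
    return cache[key]

first = cached_fields("Table1")   # populates cache["table1"]
second = cached_fields("table1")  # collides: gets Table1's columns
```

Both lookups land on the same cache slot, so one of the two tables is
guaranteed to get the other's column list.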

Jesse, thoughts?

-- 
Michelle Sullivan
http://www.mhix.org/



