[rt-users] RT 3.8.9 + PostgreSQL 8.4.7 - missing/bad indexes
Václav Ovsík
vaclav.ovsik at i.cz
Thu Apr 14 09:26:33 EDT 2011
Hi,
because of horrible performance of spreadsheet export of tickets I found
some badly created or missing indexes. I didn't investigate if other
RDBMS schemas has the same problem.
To discover problems I rewrote Results.tsv into command-line shape and
run it with DBI_TRACE=2 finally.
For every row of Tickets the following additional problematic queries are
executed:
########################################################################
SELECT * FROM Groups WHERE Instance = ? AND LOWER(Domain) = LOWER(?) AND LOWER(Type) = LOWER(?)
e.g. one my bind params: 106431 'RT::Ticket-Role' 'Requestor'
rt=# EXPLAIN ANALYZE SELECT * FROM Groups WHERE Instance = 106431 AND LOWER(Domain) = LOWER('RT::Ticket-Role') AND LOWER(Type) = LOWER('Requestor');
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on groups (cost=0.00..12925.34 rows=1 width=66) (actual time=64.672..64.904 rows=1 loops=1)
Filter: ((instance = 106431) AND (lower((domain)::text) = 'rt::ticket-role'::text) AND (lower((type)::text) = 'requestor'::text))
Total runtime: 64.936 ms
(3 rows)
Existing indexes:
CREATE UNIQUE INDEX Groups1 ON Groups (Domain,Instance,Type,id, Name);
CREATE INDEX Groups2 On Groups (Type, Instance, Domain);
Both indexes above are case sensitive. Either problematic query should be
case sensitive (without lower()) or a new index with lower() on fields should
be created:
CREATE INDEX Groups_zito1 ON Groups (lower(Type), Instance, lower(Domain));
rt=# EXPLAIN ANALYZE SELECT * FROM Groups WHERE Instance = 106431 AND LOWER(Domain) = LOWER('RT::Ticket-Role') AND LOWER(Type) = LOWER('Requestor');
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
Index Scan using groups_zito1 on groups (cost=0.01..8.38 rows=1 width=66) (actual time=0.084..0.084 rows=1 loops=1)
Index Cond: ((lower((type)::text) = 'requestor'::text) AND (instance = 106431) AND (lower((domain)::text) = 'rt::ticket-role'::text))
Total runtime: 0.108 ms
(3 rows)
########################################################################
SELECT main.* FROM ObjectCustomFieldValues main WHERE (main.Disabled = '0') AND (main.ObjectType = 'RT::Ticket') AND (main.ObjectId = '106431')
rt=# EXPLAIN ANALYZE SELECT main.* FROM ObjectCustomFieldValues main WHERE (main.Disabled = '0') AND (main.ObjectType = 'RT::Ticket') AND (main.ObjectId = '106431');
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Seq Scan on objectcustomfieldvalues main (cost=0.00..12298.75 rows=3 width=457) (actual time=142.497..142.887 rows=3 loops=1)
Filter: ((disabled = 0) AND ((objecttype)::text = 'RT::Ticket'::text) AND (objectid = 106431))
Total runtime: 142.924 ms
(3 rows)
Existing indexes:
CREATE INDEX ObjectCustomFieldValues1 ON ObjectCustomFieldValues (CustomField,ObjectType,ObjectId,Content);
CREATE INDEX ObjectCustomFieldValues2 ON ObjectCustomFieldValues (CustomField,ObjectType,ObjectId);
The above query selects based on fields ObjectType and ObjectId,
unfortunately ObjectCustomFieldValues2 contains additional field
CustomField. To speedup the search index with good selectivity
with only field ObjectId can suffice.
CREATE INDEX ObjectCustomFieldValues_zito1 ON ObjectCustomFieldValues (ObjectId);
rt=# EXPLAIN ANALYZE SELECT main.* FROM ObjectCustomFieldValues main WHERE (main.Disabled = '0') AND (main.ObjectType = 'RT::Ticket') AND (main.ObjectId = '106431');
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using objectcustomfieldvalues_zito1 on objectcustomfieldvalues main (cost=0.00..8.58 rows=3 width=457) (actual time=0.059..0.063 rows=3 loops=1)
Index Cond: (objectid = 106431)
Filter: ((disabled = 0) AND ((objecttype)::text = 'RT::Ticket'::text))
Total runtime: 0.090 ms
(4 rows)
or index with two corresponding fields:
CREATE INDEX ObjectCustomFieldValues_zito2 ON ObjectCustomFieldValues (ObjectType, ObjectId);
My planner uses always objectcustomfieldvalues_zito1, because I have
only one ObjectType value:
rt=# select distinct objecttype from objectcustomfieldvalues;
objecttype
------------
RT::Ticket
(1 row)
########################################################################
And performance benefit?
Original state, without a new indexes:
zito at rt2:~/migration$ time ./rt-search2 /opt/RT >/dev/null 2>&1
real 3m12.235s
user 0m11.713s
sys 0m0.576s
After creation of the indexes:
CREATE INDEX Groups_zito1 ON Groups (lower(Type), Instance, lower(Domain));
CREATE INDEX ObjectCustomFieldValues_zito1 ON ObjectCustomFieldValues (ObjectId);
zito at rt2:~/migration$ time ./rt-search2 /opt/RT >/dev/null 2>&1
real 0m14.171s
user 0m9.545s
sys 0m0.456s
That is speed up by a factor of circa 14 in my case.
A testing script rt-search2 attached (rewrite of Results.tsv). A TSQL
query at the start of the script should be changed to something
appropriate of course.
Can be new indexes added into schema.Pg dist file?
Thanks
Best Regards
--
Zito
-------------- next part --------------
#!/usr/bin/perl
use warnings;
no warnings 'once';
use strict;
use Data::Dumper;
my $Query = <<'EOF';
Status = 'resolved' AND LastUpdated > '1.1.2009' AND ( Owner != 'mvas' OR Owner != 'behounek' ) AND Created > '1.1.2006' AND Queue = 'support RZP' AND Content LIKE 'uir'
EOF
my $OrderBy = 'id';
my $Order = 'ASC';
MAIN: {
$| = 1;
my $prefix = $ARGV[0];
unshift @INC, ("$prefix/local/lib", "$prefix/lib");
require RT;
RT::LoadConfig();
RT::Init();
require RT::Tickets;
my $Tickets = RT::Tickets->new( $RT::SystemUser );
$Tickets->FromSQL( $Query );
if ( $OrderBy =~ /\|/ ) {
# Multiple Sorts
my @OrderBy = split /\|/, $OrderBy;
my @Order = split /\|/, $Order;
$Tickets->OrderByCols(
map { { FIELD => $OrderBy[$_], ORDER => $Order[$_] } }
( 0 .. $#OrderBy )
);
}
else {
$Tickets->OrderBy( FIELD => $OrderBy, ORDER => $Order );
}
my %cf_id_to_name;
my %cf_name_to_pos;
{
my $cfs = RT::SQL::PossibleCustomFields(
Query => $Query, CurrentUser => $RT::SystemUser,
);
while ( my $cf = $cfs->Next ) {
my $name = $cf->Name;
$cf_id_to_name{ $cf->id } = $name;
next if $cf_name_to_pos{ $name };
$cf_name_to_pos{ $name } =
(sort { $b <=> $a } values %cf_name_to_pos)[0] + 1;
}
}
my @attrs = qw(
id QueueObj->Name Subject Status
TimeEstimated TimeWorked TimeLeft
Priority FinalPriority
OwnerObj->Name
Requestors->MemberEmailAddressesAsString
Cc->MemberEmailAddressesAsString
AdminCc->MemberEmailAddressesAsString
DueObj->ISO ToldObj->ISO CreatedObj->ISO
ResolvedObj->ISO LastUpdatedObj->ISO LastUpdatedByObj->Name
);
{
my @header;
foreach my $attr (@attrs) {
my $label = $attr;
$label =~ s'Obj-.(?:AsString|Name|ISO)''g;
$label =~ s'-\>MemberEmailAddressesAsString''g;
push @header, $label;
}
$_ += @header - 1 foreach values %cf_name_to_pos;
foreach my $name ( sort { $cf_name_to_pos{$a} <=> $cf_name_to_pos{$b} } keys %cf_name_to_pos ) {
push @header, "CF-". $name;
}
print join("\t", @header);
print "\n";
}
my $i = 0;
while ( my $Ticket = $Tickets->Next()) {
my @row;
foreach my $attr (@attrs) {
my $value;
if ($attr =~ /(.*)->ISO$/ and $Ticket->$1->Unix <= 0) {
$value = '';
} else {
my $method = '$Ticket->'.$attr.'()';
$method =~ s/->ISO\(\)$/->ISO( Timezone => 'user' )/;
$value = eval $method;
if ($@) {die "Failed to find $attr - ". $@};
}
push @row, $value;
}
my $values = $Ticket->CustomFieldValues;
$values->OrderByCols; # don't sort them
while (my $value = $values->Next) {
my $pos = $cf_name_to_pos{ $cf_id_to_name{ $value->CustomField } };
next unless $pos;
$row[$pos] = '' unless defined $row[$pos];
$row[$pos] .= ', ' if $row[$pos];
$row[$pos] .= $value->Content;
}
# remove tabs from all field values, they screw up the tsv
for (@row) {
$_ = '' unless defined;
$_ =~ s/(?:\n|\r)//g;
$_ =~ s{\t}{ }g;
}
print join("\t", at row);
print "\n";
}
};
More information about the rt-users
mailing list