[rt-users] Slow queries building list of privileged users (Postgres)

Dominic Hargreaves dominic.hargreaves at oucs.ox.ac.uk
Fri Nov 20 06:41:50 EST 2009


I'm migrating from an RT 2 install to an RT 3.8 install with around
170 privileged users (and around 90,000 total users).

I've done some initial testing with RT 3.8.6 and have observed
that building the list of privileged users (in the "create new
ticket" (ticket owner), "display ticket" (reminder owner) and
"ticket search" (ticket owner) pages) takes around 20-30 seconds to run.

It performs two queries which are logged by my slow query logger:

LOG:  duration: 30570.089 ms  statement: SELECT DISTINCT main.* FROM Users main CROSS JOIN ACL ACL_4 JOIN Principals Principals_1  ON ( Principals_1.id = main.id ) JOIN CachedGroupMembers CachedGroupMembers_2  ON ( CachedGroupMembers_2.MemberId = Principals_1.id ) JOIN Groups Groups_3  ON ( Groups_3.id = CachedGroupMembers_2.GroupId )  WHERE (Principals_1.Disabled = '0') AND (ACL_4.PrincipalType = Groups_3.Type) AND (Principals_1.id != '1') AND (Principals_1.PrincipalType = 'User') AND (ACL_4.RightName = 'OwnTicket') AND (Groups_3.Domain = 'RT::System-Role') AND ((ACL_4.ObjectType = 'RT::Queue' AND ACL_4.ObjectId = 3) OR (ACL_4.ObjectType = 'RT::System'))  ORDER BY main.Name ASC 

LOG:  duration: 824.692 ms  statement: SELECT DISTINCT main.* FROM Users main CROSS JOIN ACL ACL_2 JOIN Principals Principals_1  ON ( Principals_1.id = main.id ) JOIN CachedGroupMembers CachedGroupMembers_3  ON ( CachedGroupMembers_3.MemberId = Principals_1.id )  WHERE (Principals_1.Disabled = '0') AND (ACL_2.PrincipalId = CachedGroupMembers_3.GroupId) AND (Principals_1.id != '1') AND (ACL_2.PrincipalType = 'Group') AND (Principals_1.PrincipalType = 'User') AND (ACL_2.RightName = 'OwnTicket') AND ((ACL_2.ObjectType = 'RT::Queue' AND ACL_2.ObjectId   = 3) OR (ACL_2.ObjectType = 'RT::System'))  ORDER BY main.Name ASC 

The first, on our database, returns no entries at all, whilst the
second returns the expected number of entries (the list of privileged
users).

Concentrating on this very slow one and asking the query planner
with EXPLAIN ANALYZE:

 Unique  (cost=147226.19..147269.59 rows=496 width=3382) (actual time=30888.166..30888.166 rows=0 loops=1)
   ->  Sort  (cost=147226.19..147227.43 rows=496 width=3382) (actual time=30888.164..30888.164 rows=0 loops=1)
         Sort Key: main.name, main.id, main.password, main.comments, main.signature, main.emailaddress, main.freeformcontactinfo, main.organization, main.realname, main.nickname, main.lang, main.emailencoding, main.webencoding, main.externalcontactinfoid, main.contactinfosystem, main.externalauthid, main.authsystem, main.gecos, main.homephone, main.workphone, main.mobilephone, main.pagerphone, main.address1, main.address2, main.city, main.state, main.zip, main.country, main.timezone, main.pgpkey, main.creator, main.created, main.lastupdatedby, main.lastupdated
         Sort Method:  quicksort  Memory: 25kB
         ->  Hash Join  (cost=76985.91..147203.98 rows=496 width=3382) (actual time=30888.088..30888.088 rows=0 loops=1)
               Hash Cond: ((groups_3.type)::text = (acl_4.principaltype)::text)
               ->  Hash Join  (cost=76933.22..147046.33 rows=3077 width=3388) (actual time=30887.657..30887.657 rows=0 loops=1)
                     Hash Cond: (cachedgroupmembers_2.groupid = groups_3.id)
                     ->  Hash Join  (cost=57544.03..126361.67 rows=126470 width=3386) (actual time=1509.645..29963.426 rows=1278161 loops=1)
                           Hash Cond: (cachedgroupmembers_2.memberid = main.id)
                           ->  Seq Scan on cachedgroupmembers cachedgroupmembers_2  (cost=0.00..54962.02 rows=3357602 width=8) (actual time=0.026..2268.758 rows=3357455 loops=1)
                           ->  Hash  (cost=57500.57..57500.57 rows=3477 width=3386) (actual time=1509.465..1509.465 rows=92315 loops=1)
                                 ->  Hash Join  (cost=52857.90..57500.57 rows=3477 width=3386) (actual time=993.708..1360.215 rows=92315 loops=1)
                                       Hash Cond: (main.id = principals_1.id)
                                       ->  Seq Scan on users main  (cost=0.00..3223.16 rows=92316 width=3382) (actual time=0.022..75.080 rows=92316 loops=1)
                                       ->  Hash  (cost=51835.42..51835.42 rows=81798 width=4) (actual time=993.584..993.584 rows=92316 loops=1)
                                             ->  Seq Scan on principals principals_1  (cost=0.00..51835.42 rows=81798 width=4) (actual time=0.036..914.611 rows=92316 loops=1)
                                                   Filter: ((id <> 1) AND (disabled = 0) AND ((principaltype)::text = 'User'::text))
                     ->  Hash  (cost=18759.68..18759.68 rows=50361 width=10) (actual time=0.092..0.092 rows=4 loops=1)
                           ->  Bitmap Heap Scan on groups groups_3  (cost=1907.17..18759.68 rows=50361 width=10) (actual time=0.078..0.081 rows=4 loops=1)
                                 Recheck Cond: ((domain)::text = 'RT::System-Role'::text)
                                 ->  Bitmap Index Scan on groups1  (cost=0.00..1894.58 rows=50361 width=0) (actual time=0.059..0.059 rows=4 loops=1)
                                       Index Cond: ((domain)::text = 'RT::System-Role'::text)
               ->  Hash  (cost=52.39..52.39 rows=24 width=6) (actual time=0.401..0.401 rows=97 loops=1)
                     ->  Bitmap Heap Scan on acl acl_4  (cost=8.77..52.39 rows=24 width=6) (actual time=0.228..0.303 rows=97 loops=1)
                           Recheck Cond: ((((rightname)::text = 'OwnTicket'::text) AND ((objecttype)::text = 'RT::Queue'::text) AND (objectid = 3)) OR (((rightname)::text = 'OwnTicket'::text) AND ((objecttype)::text = 'RT::System'::text)))
                           ->  BitmapOr  (cost=8.77..8.77 rows=24 width=0) (actual time=0.205..0.205 rows=0 loops=1)
                                 ->  Bitmap Index Scan on acl1  (cost=0.00..4.30 rows=4 width=0) (actual time=0.091..0.091 rows=2 loops=1)
                                       Index Cond: (((rightname)::text = 'OwnTicket'::text) AND ((objecttype)::text = 'RT::Queue'::text) AND (objectid = 3))
                                 ->  Bitmap Index Scan on acl1  (cost=0.00..4.45 rows=20 width=0) (actual time=0.113..0.113 rows=95 loops=1)
                                       Index Cond: (((rightname)::text = 'OwnTicket'::text) AND ((objecttype)::text = 'RT::System'::text))
 Total runtime: 30892.368 ms

Here most of the time is spent in the hash join for:

Hash Cond: (cachedgroupmembers_2.memberid = main.id)

and this join eliminates all the remaining rows.

Clearly for us, this first join is useless, but can anyone comment on
whether this is usual, and why it's so slow for us? 

We're running an essentially identical database configuration with
another instance of 3.8.6, which has around 10 privileged users and so
unsurprisingly doesn't have this problem, and the query planner is
different here:

 Unique  (cost=638.22..638.31 rows=1 width=1086) (actual time=0.355..0.355 rows=0 loops=1)
   ->  Sort  (cost=638.22..638.22 rows=1 width=1086) (actual time=0.353..0.353 rows=0 loops=1)
         Sort Key: main.name, main.id, main.password, main.comments, main.signature, main.emailaddress, main.freeformcontactinfo, main.organization, main.realname, main.nickname, main.lang, main.emailencoding, main.webencoding, main.externalcontactinfoid, main.contactinfosystem, main.externalauthid, main.authsystem, main.gecos, main.homephone, main.workphone, main.mobilephone, main.pagerphone, main.address1, main.address2, main.city, main.state, main.zip, main.country, main.timezone, main.pgpkey, main.creator, main.created, main.lastupdatedby, main.lastupdated
         Sort Method:  quicksort  Memory: 25kB
         ->  Nested Loop  (cost=0.00..638.21 rows=1 width=1086) (actual time=0.300..0.300 rows=0 loops=1)
               ->  Nested Loop  (cost=0.00..637.87 rows=1 width=8) (actual time=0.299..0.299 rows=0 loops=1)
                     ->  Nested Loop  (cost=0.00..635.76 rows=6 width=4) (actual time=0.298..0.298 rows=0 loops=1)
                           ->  Nested Loop  (cost=0.00..608.38 rows=3 width=4) (actual time=0.296..0.296 rows=0 loops=1)
                                 ->  Seq Scan on acl acl_4  (cost=0.00..16.40 rows=2 width=6) (actual time=0.031..0.219 rows=3 loops=1)
                                       Filter: (((rightname)::text = 'OwnTicket'::text) AND ((((objecttype)::text = 'RT::Queue'::text) AND (objectid = 3)) OR ((objecttype)::text = 'RT::System'::text)))
                                 ->  Index Scan using groups1 on groups groups_3  (cost=0.00..294.15 rows=147 width=10) (actual time=0.024..0.024 rows=0 loops=3)
                                       Index Cond: (((groups_3.domain)::text = 'RT::System-Role'::text) AND ((groups_3.type)::text = (acl_4.principaltype)::text))
                           ->  Index Scan using cachedgroupmembers3 on cachedgroupmembers cachedgroupmembers_2  (cost=0.00..9.06 rows=5 width=8) (never executed)
                                 Index Cond: (cachedgroupmembers_2.groupid = groups_3.id)
                     ->  Index Scan using principals_pkey on principals principals_1  (cost=0.00..0.34 rows=1 width=4) (never executed)
                           Index Cond: (principals_1.id = cachedgroupmembers_2.memberid)
                           Filter: ((principals_1.id <> 1) AND (principals_1.disabled = 0::smallint) AND ((principals_1.principaltype)::text = 'User'::text))
               ->  Index Scan using users_pkey on users main  (cost=0.00..0.33 rows=1 width=1086) (never executed)
                     Index Cond: (main.id = principals_1.id)
 Total runtime: 0.602 ms

cachedgroupmembers and users have the same indexes on both systems
(I'm not sure whether that's relevant).

Both systems are Debian lenny, RT 3.8.6, Postgres 8.3.

If anyone has any other advice about running RT 3.8 on postgres
I'd be interested too - I've added a couple of extra indexes:

CREATE INDEX Groups3 ON Groups (LOWER(Domain), LOWER(Type));
CREATE INDEX users5 ON users (LOWER(emailaddress));

which are missing from the default installation which have
cut down some other common slow queries (the former
is already in an RT ticket:

http://rt3.fsck.com/Ticket/Display.html?id=13056

)

Thanks,
Dominic.

-- 
Dominic Hargreaves, Systems Development and Support Team
Computing Services, University of Oxford



More information about the rt-users mailing list