[Rt-devel] Huge number of database queries
Jeff Davis
pgsql at j-davis.com
Tue Nov 20 19:39:07 EST 2007
On Mon, 2007-11-19 at 07:18 -0600, Jesse Vincent wrote:
> *nod* FWIW, enabling "log_statement" on 8.1 didn't do that for me.
>
PostgreSQL logging has many options, and many of them are
interdependent. I mentioned log_min_duration_statement=0 because that's
an easy way to make sure that the statements are logged.
> >
> > I'm more worried about the stability of performance than anything else.
> > With so many network round-trips to the database, I'm worried that any
> > minor slowdown would make RT unavailable. We don't notice any serious
> > problem right now, because our network ping time is ~0.2 ms, and our
> > processor usage on the database server is also low. Also, we have
> > essentially an empty database now, so what will the performance be like
> > as it grows? Will the number of queries per page load change as the
> > dataset changes?
>
>
> It shouldn't change significantly, but if you want to stress-test, the
> results can help us to improve RT.
We'll certainly provide our results.
> > It also just makes it harder to administer. If I turn on query logging
> > to diagnose a problem, the signal to noise ratio makes it almost
> > useless.
>
> I think that a little experience about what to grep out will go a long
> way toward that. We've had a fair amount of experience and luck tuning
> RT on Postgres with those logs.
>
Ok.
> >>> It's hard for me to imagine that we need 44 normal SELECT
> >>> statements to view a ticket.
> >> Can you pull out what they are?
> >
> > I have to scrub the data before putting it in a public place (yeah, I
> > know there's nothing sensitive, but I still have to look to be sure), so
> > this is just a little more detail to the summary:
> >
> > 016 parse
> > 001 "SELECT * FROM Users WHERE LOWER(Gecos) = LOWER($1)"
> > 002 "SELECT * FROM Users WHERE LOWER(Gecos) = LOWER($1)"
>
> LOWER(Gecos)? Really? Can you pull out the values there? I wouldn't
> expect to see that in normal operation unless you're playing external
> authenntication tricks.
We set WebExternalAuth and we unset WebFallbackToInternalAuth. I don't
think we do any tricks aside from that. We use a .htaccess file to
authenticate through LDAP.
> > 001 "SELECT * FROM Transactions WHERE id = $1"
>
> This seems a little surprsiing. That should get pulled in from a bigger
> prefetch of all txns related to that ticket
>
> > 009 other
> > 018 bind
> > 018 execute
> > 016 deallocate
>
> I talked about those in my last mail. I'd love to see a patch to
> SearchBuilder to reuse prepared statements, but I'm not sure it's a
> high-priority target.
>
I'll have to look into it to see how many statements can be effectively
reused.
> > 049 select
> > 005 ping test
> > 011 "SELECT ... from ACL, Groups, Principals, CachedGroupMembers ..."
>
> 11 ACL checks seems slightly high but not insanely high, since RT is
> checking queue, ticket and custom field ACLs. Though you should pay
> attention to Ruslan's recent mail about refactoring the ACL code to be a
> bit smarter and cut down on the number of queries.
Ok.
> > 004 "SELECT main.* FROM ( SELECT main.id FROM CustomFields ..."
>
> Your custom fields.
>
"customfield" and "customfieldvalues" are both empty tables
> > 004 "SELECT DISTINCT main.* FROM Users main JOIN Principals ..."
>
> Requestors, Ccs, AdminCcs and Owners.
>
> > 006 "SELECT main.* FROM Tickets main WHERE ..."
>
> Unsure
>
> > 006 "SELECT main.* FROM Links main WHERE ..."
>
> Six different kinds of links on tickets
>
> > 002 "...Users main CROSS JOIN ACL..."
> > 011 other
>
> What falls into "other"?
>
I'll send the logs to you off-list.
> *nod* Improving query count has generally taken a back-seat to improving
> overall query _performance_, since quite often there are one or two
> queries in that set that dwarf the run time of all the rest. That said,
Good performance and also stable/scalable performance that doesn't
drastically change when the conditions change are the main goals for me.
I'm not implying that RT does that currently, but that's where I'm
coming from.
> I'd love to make fewer queries and get the same results ;) Would you
> maybe be up for helping to improve things? Setting up a global
> prepared-query cache seems like it might be a good, well-contained
> project that would have a pretty direct impact on what's made your
> spidey-sense tingle.
I've been looking into it as I learn RT. The deadline for deploying RT
is approaching, but afterward when RT is in maintenance I may be able to
contribute (employer permitting).
> There's also interesting work to be done with caching and invalidating
> caches of collection search results, possibly using a tool like MemCache
> that I'd be happy to ramble about if somebody is interested in hacking
> on it.
That's always an interesting subject, but I'm cautious about introducing
more caching because it's a challenge to get it right. I'd hate to give
up real stability for hypothetical performance ;)
Regards,
Jeff Davis
More information about the Rt-devel
mailing list