[Rt-devel] Huge number of database queries
Jeff Davis
pgsql at j-davis.com
Sun Nov 18 19:34:36 EST 2007
On Sun, 2007-11-18 at 17:58 -0500, Jesse Vincent wrote:
> So, RT 3.6.6 should significantly improve the state of the static
> content. We've finally deployed CSS::Squish to force only a single CSS
> file for your RT instance and we've improved the browser hinting that
> should force caching so you only load it once unless you're forcing the
> browser's cache to wipe itself on reload. But do note that even 3.6.5
> should be caching those files.
>
Hmm... perhaps Ctrl-R forced it to re-request everything. I'm not quite
as concerned about this, because there's a workaround, but I'm glad to
hear it's being improved.
>
> > RT. However, a single dynamic page (e.g. ticket view) still generates
> > 119 queries, which still seems like a huge number to me.
>
> When I saw your blog post about this, I thought you were talking about
> _SQL_ queries.
What I mean, very specifically, is that when I set
"log_min_duration_statement = 0" in postgresql.conf, a single page load
generates 119 lines that contain "duration:" in the log. These are
primarily SQL queries, although I suppose that depends on your definition.
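Concretely, the counting setup is nothing more than this (the log file
path below is just a placeholder):

    # postgresql.conf: log every statement together with its duration
    log_min_duration_statement = 0

    # then count the statements logged during a single page load
    grep -c 'duration:' /path/to/postgresql.log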
> > Here is the breakdown for one dynamic page load (displaying a ticket):
>
> How many custom fields and transactions on the ticket?
No custom fields and 3 transactions, as far as I can tell. We have an
almost empty database.
> >
> > 119 queries total
> >   067 statements
> >     049 select
> >       005 ping test
> >       044 other
> >     016 deallocate
> >     002 set
> >   052 parse/bind/execute
> >
> > Is this expected? Is there a way to bring this to a more reasonable
> > number?
>
> Do you just find it "not-right" or are you seeing poor performance?
> Either way, I'd like to see what the queries are, so as to be able to
> give you a decent answer there.
>
I would say that the problem is more that it's "not right" than any
serious performance problem right now.
We haven't deployed it to production yet, but we will have a substantial
number of users, and most of them will be using RT constantly throughout
the workday. People will avoid using the ticketing system if every minor
update or action interrupts them with a delay (even a small delay), so
I'd like the system to be essentially instantaneous.
I'm more worried about the stability of performance than anything else.
With so many network round-trips to the database, I'm worried that any
minor slowdown would make RT unavailable. We don't notice any serious
problem right now, because our network ping time is ~0.2 ms, and our
processor usage on the database server is also low. Also, we have
essentially an empty database now, so what will the performance be like
as it grows? Will the number of queries per page load change as the
dataset changes?
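To put rough numbers on that concern (the 2 ms figure is only a
hypothetical worst case, not something we've measured):

    119 round trips x 0.2 ms  =  ~24 ms of network latency per page load
    119 round trips x 2.0 ms  =  ~238 ms per page load, before any
                                 actual query execution time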
It also makes the system harder to administer: if I turn on query logging
to diagnose a problem, the signal-to-noise ratio makes the log almost
useless.
>
> > It's hard for me to imagine that we need 44 normal SELECT
> > statements to view a ticket.
>
> Can you pull out what they are?
I have to scrub the data before putting it in a public place (yeah, I
know there's nothing sensitive, but I still have to look to be sure), so
for now here is just a little more detail on top of the summary:
016 parse
  001 "SELECT * FROM Users WHERE LOWER(Gecos) = LOWER($1)"
  002 "SELECT * FROM Users WHERE LOWER(Gecos) = LOWER($1)"
  001 "SELECT * FROM Tickets WHERE id = $1"
  002 "SELECT * FROM Queues WHERE id = $1"
  001 "SELECT * FROM Transactions WHERE id = $1"
  009 other
018 bind
018 execute
016 deallocate
049 select
  005 ping test
  011 "SELECT ... from ACL, Groups, Principals, CachedGroupMembers ..."
  004 "SELECT main.* FROM ( SELECT main.id FROM CustomFields ..."
  004 "SELECT DISTINCT main.* FROM Users main JOIN Principals ..."
  006 "SELECT main.* FROM Tickets main WHERE ..."
  006 "SELECT main.* FROM Links main WHERE ..."
  002 "...Users main CROSS JOIN ACL..."
  011 other
>
> > The 16 additional prepared statements, which are then immediately
> > DEALLOCATEd, are probably excessive as well.
>
> I suspect that the issue here stems from the fact that until _very_
> recently, DBD::mysql and DBD::Pg didn't support server side prepared
> statements - everything was emulated on the client. Putting together a
> cache at the ORM layer should be doable. But I've never seen those sorts
> of queries actually generate measurable overhead.
I'm not worried about optimizing away a few protocol round-trips that
are only generated by DBD::Pg. I'm worried about the big numbers. If you
count just the executes, and ignore parse/bind/deallocate (which is fine
by me), that's still 18+49 = 67 queries.
So I guess the numbers are: 119 round trips to the DB server, and 67
queries. It would be nice if we could get both of those numbers down to
something reasonable.
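For what it's worth, the client-side statement cache you mention is
roughly what DBI already exposes as prepare_cached; a minimal sketch,
assuming a plain DBI handle rather than whatever RT's ORM actually does
(connection parameters and the ticket id are just example values):

    use DBI;

    # example connection parameters, not our real ones
    my $dbh = DBI->connect('dbi:Pg:dbname=rt3', 'rt_user', 'rt_pass',
                           { RaiseError => 1, AutoCommit => 1 });

    # prepare_cached hands back the already-prepared handle when it sees
    # the same SQL string again, so repeated lookups skip the extra
    # parse/deallocate round trips
    my $ticket_id = 1;    # example value
    my $sth = $dbh->prepare_cached('SELECT * FROM Tickets WHERE id = ?');
    $sth->execute($ticket_id);
    my $row = $sth->fetchrow_hashref;
    $sth->finish;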
Regards,
Jeff Davis