[Rt-devel] Huge number of database queries

Jeff Davis pgsql at j-davis.com
Sun Nov 18 19:34:36 EST 2007


On Sun, 2007-11-18 at 17:58 -0500, Jesse Vincent wrote:
> So, RT 3.6.6 should significantly improve the state of the static
> content. We've finally deployed CSS::Squish to force only a single CSS
> file for your RT instance and we've improved the browser hinting that
> should force caching so you only load it once unless you're forcing the
> browser's cache to wipe itself on reload.  But do note that even 3.6.5
> should be caching those files.
> 

Hmm... perhaps Ctrl-R forced it to re-request everything. I'm not quite
as concerned about this, because there's a workaround, but I'm glad to
hear it's being improved.

> 
> > RT. However, a single dynamic page (e.g. ticket view) still generates
> > 119 queries, which still seems like a huge number to me.
> 
> When I saw your blog post about this, I thought you were talking about
> _SQL_ queries. 

What I mean, very specifically, is that when I set
"log_min_duration_statement = 0" in postgresql.conf, a single page load
generates 119 lines containing "duration:" in the log. These are
primarily SQL queries, although I suppose that depends on your
definition.
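(For the record, this is how I'm counting; the log path is just an
example, and note that a multi-line statement still produces a single
"duration:" line:)

```python
# Count statements logged with log_min_duration_statement = 0.
# The log path is an assumption; adjust for your installation.
def count_logged_statements(log_path):
    with open(log_path) as f:
        return sum(1 for line in f if "duration:" in line)
```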

> > Here is the breakdown for one dynamic page load (displaying a ticket):
> 
> How many custom fields and transactions on the ticket?

No custom fields and 3 transactions, as far as I can tell. We have an
almost empty database.

> > 
> > 119  queries total
> >   067  statements
> >     049  select
> >       005  ping test
> >       044  other
> >     016  deallocate
> >     002  set
> >   052  parse/bind/execute
> > 
> > Is this expected? Is there a way to bring this to a more reasonable
> > number? 
> 
> Do you just find it "not-right" or are you seeing poor performance?
> Either way, I'd like to see what the queries are, so as to be able to
> give you a decent answer there.
> 

I would say the problem right now is more that it's "not right" than
that it causes any serious performance problem.

We haven't deployed it to production yet, but we will have a substantial
number of users, and most of them will be using RT constantly throughout
the workday. People will avoid using a ticketing system if every minor
update or action interrupts them with a delay (even a small one), so I'd
like the system to feel essentially instantaneous.

I'm more worried about the stability of performance than anything else.
With so many network round-trips to the database, I'm worried that any
minor slowdown would make RT unavailable. We don't notice any serious
problem right now, because our network ping time is ~0.2 ms, and our
processor usage on the database server is also low. Also, we have
essentially an empty database now, so what will the performance be like
as it grows? Will the number of queries per page load change as the
dataset changes?
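To make the round-trip concern concrete, here's the back-of-envelope
arithmetic (only the 0.2 ms figure is measured; the larger RTTs are
hypothetical what-ifs):

```python
# Per-page DB overhead from network round trips alone, ignoring query
# execution time. 0.2 ms is our measured ping; the rest are what-ifs.
ROUND_TRIPS = 119
for rtt_ms in (0.2, 1.0, 5.0):
    print(f"RTT {rtt_ms} ms -> {ROUND_TRIPS * rtt_ms:.1f} ms per page load")
```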

It also just makes the system harder to administer: if I turn on query
logging to diagnose a problem, the signal-to-noise ratio makes the log
almost useless.

> 
> > It's hard for me to imagine that we need 44 normal SELECT
> > statements to view a ticket.
> 
> Can you pull out what they are? 

I have to scrub the data before putting it in a public place (yeah, I
know there's nothing sensitive, but I still have to look to be sure), so
this is just a little more detail to the summary:

016  parse
  001  "SELECT  * FROM Users WHERE LOWER(Gecos) = LOWER($1)"
  002  "SELECT  * FROM Users WHERE LOWER(Gecos) = LOWER($1)"
  001  "SELECT  * FROM Tickets WHERE id = $1"
  002  "SELECT  * FROM Queues WHERE id = $1"
  001  "SELECT  * FROM Transactions WHERE id = $1"
  009  other
018  bind
018  execute
016  deallocate
049  select
  005  ping test
  011  "SELECT ... from ACL, Groups, Principals, CachedGroupMembers ..."
  004  "SELECT main.* FROM ( SELECT main.id FROM CustomFields ..."
  004  "SELECT DISTINCT main.* FROM Users main JOIN Principals ..."
  006  "SELECT main.* FROM Tickets main  WHERE ..."
  006  "SELECT main.* FROM Links main  WHERE ..."
  002  "...Users main CROSS JOIN ACL..."
  011  other

> 
> > The 16 additional prepared statements,
> > which are then immediately DEALLOCATEd are probably excessive as well.
> 
> I suspect that the issue here stems from the fact that until _very_
> recently, DBD::mysql and DBD::Pg didn't support server side prepared
> statements - everything was emulated on the client. Putting together a
> cache at the ORM layer should be doable. But I've never seen those sorts
> of queries actually generate measurable overhead.

I'm not worried about optimizing away a few protocol round-trips that
are only generated by DBD::Pg. I'm worried about the big numbers. If you
count just the executes, and ignore parse/bind/deallocate (which is fine
by me), that's still 18+49 = 67 queries.

So I guess the numbers are: 119 round trips to the DB server, and 67
queries. It would be nice if we could get both of those numbers down to
something reasonable.

Regards,
	Jeff Davis



