[rt-users] ShowCustomFields extremely slow.
Michelle Sullivan
michelle at sorbs.net
Thu Jan 30 04:57:56 EST 2014
Alex Vandiver wrote:
> On Tue, 2014-01-28 at 03:30 +0100, Michelle Sullivan wrote:
>
>> Ok so temporarily I set min_duration to 0 and logged everything, the log
>> is here:
>>
>
> The situation can be made much clearer by limiting to the postgres
> process IDs which are responding to queries from RT, namely 15537 and
> 15481.
( parsed this an put it in http://www.mhix.org/ticketload-rt.log )
> The longest statement duration is 1559.273ms, which is a _parse_
> of a statement query, not even an execution. The next-longest query is
> DBD::Pg's query (which is only run once per process, FTR) at 02:11:09 to
> determine table information -- which takes nearly a full _second_ to
> return. On the largest Postgres installation which I have access to,
> this takes less than 100ms.
>
Well here's the thing...
Jan 28 02:11:09 to Jan 28 02:18:31 ... a full 7 minutes of being in
ShowCustomFields there are just 100's (like more than 1900 of them) of
statements like this:
Jan 28 02:18:30 m0040933 postgres[15537]: [7893-1] LOG: duration: 0.042
ms bind dbdpg_p88026_1968: SELECT consrc FROM pg_catalog.pg_constraint
WHERE contype = 'c' AND conrelid = 134792121 AND conkey = '{33}'
Jan 28 02:18:30 m0040933 postgres[15537]: [7894-1] LOG: duration: 0.021
ms execute dbdpg_p88026_1968: SELECT consrc FROM
pg_catalog.pg_constraint WHERE contype = 'c' AND conrelid = 134792121
AND conkey = '{33}'
Jan 28 02:18:31 m0040933 postgres[15537]: [7895-1] LOG: duration: 0.363
ms statement: DEALLOCATE dbdpg_p88026_1968
.. and no other queries... The start being immediately after:
Jan 28 02:11:09 m0040933 postgres[15537]: [218-2] SELECT
Jan 28 02:11:09 m0040933 postgres[15537]: [218-3]
NULL::text AS "TABLE_CAT"
Jan 28 02:11:09 m0040933 postgres[15537]: [218-4] ,
quote_ident(n.nspname) AS "TABLE_SCHEM"
Jan 28 02:11:09 m0040933 postgres[15537]: [218-5] ,
quote_ident(c.relname) AS "TABLE_NAME"
Jan 28 02:11:09 m0040933 postgres[15537]: [218-6] ,
quote_ident(a.attname) AS "COLUMN_NAME"
Jan 28 02:11:09 m0040933 postgres[15537]: [218-7] ,
a.atttypid AS "DATA_TYPE"
Jan 28 02:11:09 m0040933 postgres[15537]: [218-8] ,
pg_catalog.format_type(a.atttypid, NULL) AS "TYPE_NAME"
Jan 28 02:11:09 m0040933 postgres[15537]: [218-9] ,
a.attlen AS "COLUMN_SIZE"
Jan 28 02:11:09 m0040933 postgres[15537]: [218-10] ,
NULL::text AS "BUFFER_LENGTH"
Jan 28 02:11:09 m0040933 postgres[15537]: [218-11] ,
NULL::text AS "DECIMAL_DIGITS"
Jan 28 02:11:09 m0040933 postgres[15537]: [218-12] ,
NULL::text AS "NUM_PREC_RADIX"
Jan 28 02:11:09 m0040933 postgres[15537]: [218-13] ,
CASE a.attnotnull WHEN 't' THEN 0 ELSE 1 END AS "NULLABLE"
Jan 28 02:11:09 m0040933 postgres[15537]: [218-14] ,
pg_catalog.col_description(a.attrelid, a.attnum) AS "REMARKS"
Jan 28 02:11:09 m0040933 postgres[15537]: [218-15] ,
pg_catalog.pg_get_expr(af.adbin, af.adrelid) AS "COLUMN_DEF"
Jan 28 02:11:09 m0040933 postgres[15537]: [218-16] ,
NULL::text AS "SQL_DATA_TYPE"
Jan 28 02:11:09 m0040933 postgres[15537]: [218-17] ,
NULL::text AS "SQL_DATETIME_SUB"
Jan 28 02:11:09 m0040933 postgres[15537]: [218-18] ,
NULL::text AS "CHAR_OCTET_LENGTH"
Jan 28 02:11:09 m0040933 postgres[15537]: [218-19] ,
a.attnum AS "ORDINAL_POSITION"
Jan 28 02:11:09 m0040933 postgres[15537]: [218-20] ,
CASE a.attnotnull WHEN 't' THEN 'NO' ELSE 'YES' END AS "IS_NULLABLE"
Jan 28 02:11:09 m0040933 postgres[15537]: [218-21] ,
pg_catalog.format_type(a.atttypid, a.atttypmod) AS "pg_type"
Jan 28 02:11:09 m0040933 postgres[15537]: [218-22] ,
'?' AS "pg_constraint"
Jan 28 02:11:09 m0040933 postgres[15537]: [218-23] ,
n.nspname AS "pg_schema"
Jan 28 02:11:09 m0040933 postgres[15537]: [218-24] ,
c.relname AS "pg_table"
Jan 28 02:11:09 m0040933 postgres[15537]: [218-25] ,
a.attname AS "pg_column"
Jan 28 02:11:09 m0040933 postgres[15537]: [218-26] ,
a.attrelid AS "pg_attrelid"
Jan 28 02:11:09 m0040933 postgres[15537]: [218-27] ,
a.attnum AS "pg_attnum"
Jan 28 02:11:09 m0040933 postgres[15537]: [218-28] ,
a.atttypmod AS "pg_atttypmod"
Jan 28 02:11:09 m0040933 postgres[15537]: [218-29] ,
t.typtype AS "_pg_type_typtype"
Jan 28 02:11:09 m0040933 postgres[15537]: [218-30] ,
t.oid AS "_pg_type_oid"
Jan 28 02:11:09 m0040933 postgres[15537]: [218-31] FROM
Jan 28 02:11:09 m0040933 postgres[15537]: [218-32]
pg_catalog.pg_type t
Jan 28 02:11:09 m0040933 postgres[15537]: [218-33]
JOIN pg_catalog.pg_attribute a ON (t.oid = a.atttypid)
Jan 28 02:11:09 m0040933 postgres[15537]: [218-34]
JOIN pg_catalog.pg_class c ON (a.attrelid = c.oid)
Jan 28 02:11:09 m0040933 postgres[15537]: [218-35]
LEFT JOIN pg_catalog.pg_attrdef af ON (a.attnum = af.adnum AND
a.attrelid = af.adrelid)
Jan 28 02:11:09 m0040933 postgres[15537]: [218-36]
JOIN pg_catalog.pg_namespace n ON (n.oid = c.relnamespace)
Jan 28 02:11:09 m0040933 postgres[15537]: [218-37] WHERE
Jan 28 02:11:09 m0040933 postgres[15537]: [218-38]
a.attnum >= 0
Jan 28 02:11:09 m0040933 postgres[15537]: [218-39]
AND c.relkind IN ('r','v')
Jan 28 02:11:09 m0040933 postgres[15537]: [218-40]
Jan 28 02:11:09 m0040933 postgres[15537]: [218-41]
AND c.relname LIKE '%'
Jan 28 02:11:09 m0040933 postgres[15537]: [218-42]
AND a.attname LIKE '%'
Jan 28 02:11:09 m0040933 postgres[15537]: [218-43] ORDER
BY "TABLE_SCHEM", "TABLE_NAME", "ORDINAL_POSITION"
Jan 28 02:11:09 m0040933 postgres[15537]: [218-44]
Jan 28 02:11:09 m0040933 postgres[15537]: [219-1] LOG: duration: 0.665
ms statement: DEALLOCATE dbdpg_p88026_49
Which is a column_info() call in DBD::Pg
Which is called from Fields() in DBIx::Searchbuilder ..
Which I assume is being called in ShowCustomFields somewhere...
The problem is network latency between the front ends and the backend
DB... Running this on the same host returns in under a second.
My frontends are on completely different networks (and indeed different
data centers) from the database servers for security (mainly DDoS type)
reasons
Now I'd suspect something very broken possibly in column_info as
pg_catalog.pg_constraint only has 36 rows in the rt4 table, and all
those are the pk's..
The TABLE_CAT query returns 1921 rows, which corresponds exactly to the
number of pg_catalog.pg_constraint queries...
> I am not familiar with Bucardo as a replication strategy, so I can't
> comment on that.
I've CC'd Gregg as he's the current author for DBD::Pg, and he's also
the person behind Bucardo replication.
> However, I strongly suspect that the answer lies
> amongst your replication tool and network topology.
Replication tool - no. Network topology, maybe.
> The symptoms do not
> match any failure modes that I'm familiar with in RT. You may wish to
> try simplifying your database configuration (temporarily install Pg on
> one of the front-ends) to see if that resolves any of the problems.
>
Not really possible or practical - as it would appear the problem is
related to network latency (of around 20ms round-trip - the query itself
completes in 0.021ms - ~0.400ms if you include the bind and dealloc) and
the use of column_info() calls... is there a reason why ShowCustomFields
needs to call column_info()?
Regards,
--
Michelle Sullivan
http://www.mhix.org/
More information about the rt-users
mailing list