[Rt-devel] Postgres performance issues and some "odd" ways to get
IDs...
Matthew Sullivan
matthew at sorbs.net
Mon Aug 14 11:24:30 EDT 2006
Hey All,
Preface: I'm a complete newb and l4m3r when it comes to DBs... however I
do have some very good friends who work as DB programmers... I am also
designing a new DB and learning lots... I am also running RT with around
12m rows...originally on MySQL, now on Postgres....
Performance on MySQL was ok until we broken the 10m row mark now it
sucks big time...
Performance on Postgres .. not hot but very consistent and usable...
Performance on Postgres a week later (even with performing VACUUM FULL
ANALYSE) .. sucks - not as bad as MySQL but not good... seems there are
a number of improvements that can be made ... the first of which I
noticed when I remember the golden rule a colleague drummed into me when
he saw one of my tables like:
Table "public.users"
Column | Type |
Modifiers
-----------------------+-----------------------------+------------------------------------------------
id | integer | not null default
nextval('users_id_seq'::text)
.. he said "don't do it, at best it's bad practice, at worst it creates
race conditions"... when you try to do things like get the last insert
id..... so when I was trying to get to he bottom of the doggyness in the
RT DB here I noticed a SELECT following each INSERT and that it was
taking average around 15 seconds and upto 45 seconds and seemed to be
the only slow queries... needless to say I was surprised when I saw what
it was doing (eg: SELECT id from Transactions where oid = <blah>;) ...
further inspection lead me to DBIx::SearchBuilder::Handle::Pg::Insert()
... and the following:
sub Insert {
my $self = shift;
my $table = shift;
my $sth = $self->SUPER::Insert($table, @_ );
unless ($sth) {
return ($sth);
}
#Lets get the id of that row we just inserted
my $oid = $sth->{'pg_oid_status'};
my $sql = "SELECT id FROM $table WHERE oid = ?";
my @row = $self->FetchResult($sql, $oid);
# TODO: Propagate Class::ReturnValue up here.
unless ($row[0]) {
print STDERR "Can't find $table.id for OID $oid";
return(undef);
}
$self->{'id'} = $row[0];
return ($self->{'id'});
}
.. is there any reason why you're not SELECTing the next val from the
sequence and passing that to the insert and returning it .... that way
you could remove the whole reliance on the OIDs in Postgres and improve
large DB performance (substantially in my case)....?
I figure as you are naming the sequences as <tablename>_id_seq,
hardcoding it is as bad as hardcoding column the SELECT id FROM
$table.... and it seems to be the only place that RT relies on OIDs....
Something like
SELECT nextval AS id from nextval('$table_id_seq'::text);
will get the correct value of id _before_ the insert - discard if the
insert fails ... Postgres has a very large number for sequences I doubt
you'll run out.... Of course this code would be Pg specific as MySQL
still doesn't support sequences properly .. but that's another story.
Comments/flames welcome...
Regards,
Mat
PS: It's 1:15am here after 6 hours of database performance debugging,
apologies if the above offends, it is not meant to.
More information about the Rt-devel
mailing list