[rt-users] Making "make initdb" idempotent?

Fri Jun 21 00:52:46 EDT 2013

Hi all

I'm in the process of packing up various small bits of RT code for our
RT install into an extension that can be properly versioned and tracked.

There's just one frustrating barrier to this: RT's extension system
provides a "make initdb", but no way to make the application of the
initial data provided in etc/initialdata idempotent. If you run "make
initdb" twice, it duplicates everything, not just adding any new records.

Is this a problem anyone else has looked into? I'm considering teaching
"make initdb" to do a basic upsert: Lock the table in question in
EXCLUSIVE mode (PostgreSQL, allows SELECTs but nothing else in other
transactions), then do the usual portable two-command sequence:

UPDATE thetable
SET description = 'thedescription', ...
WHERE name = 'thename';

INSERT INTO thetable(name, description, ...)
SELECT 'thename', 'thedescription', ...
WHERE NOT EXISTS (SELECT 1 FROM thetable WHERE thetable.name = 'thename');

For other DBs, just do a "LOCK tablename;" (unless MySQL, etc, users
want to suggest an appropriate lock mode).

The problem with this is that the name isn't really meant to be a key,
for all sorts of reasons. It's user-visible, it's in natural language,
and it's translatable.

It'd be strongly preferable to add a new field to the initdb parameter
hash that provides a non-translatable internal text identifier for the
command of interest. Here's what I'm thinking of:

- Add a new nullable unique text column to scrips, scripactions, and
scripconditions named 'extension_identifier'.

- Have 'make initdb' check for a hash param 'ExtensionIdentifier' in
each row. If none is present, behave exactly as now, leaving the
extension_identifier column NULL.

- If the ExtensionIdentifier key is found, Do an upsert with table lock
as described above for each row when we 'make initdb', but instead of
using the 'name' field, discriminate based on the 'extension_identifier'
column.

This would allow extensions to provide a unique identifier for their
scrips, scrip actions, scrip conditions, etc, that if used consistently
would let them make the initdb command idempotent, so it can be used
when an extension is updated, not just first installed.

Guidance for extension authors would suggest that the extension's name
be used to prefix the extension identifier so as to avoid clashes.

No support for deleting obsolete entries would be provided; if an entry
vanishes from the initialdata but was installed from a previous version,
it would not be touched in the DB. That's let extensions continue to
provide support for old actions, conditions, etc without exposing them
to new users.

By using the extension name and a text identifier the need to have a
central identifier registry is avoided; CPAN's Perl namespace registry
already serves the required purpose.

Thoughts/opinions?

Would the RT team consider accepting a patch implementing the above as a
change to RTxInitDB ?

If so, how would you deal with the addition of the new column to each of
the tables? Do it as part of the upgrade script for the next RT release
that needs an upgrade script anyway, and if it isn't present, ignore the
ExtensionIdentifier field in the initialdata for backward compatibility?
-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services