[rt-users] PostgreSQL full-text index default type: GiST or GIN
mzagrabe at d.umn.edu
Thu Mar 17 18:47:50 EDT 2016
On Thu, Mar 17, 2016 at 3:29 PM, ktm at rice.edu <ktm at rice.edu> wrote:
> I am working on an upgrade from 3.8.x to 4.4.x and I noticed the
> following comment in the UPGRADING-4.2 file:
> The full-text indexing defaults for PostgreSQL have changed; GiST is now
> the suggested index, as well as storing data in a separate
> AttachmentsIndex table. Both changes improve lookup speed. For
> improved search performance, you may wish to drop existing C<tsvector>
> and C<GIN> indexes on C<Attachments>, and re-generate the index using
> I am curious about the benchmarks you used to make this determination?
> In particular, the GIN fastscan option can dramatically improve search
> performance in versions 9.4 and above. Here is a nice discussion:
> In addition, GIN indexes are much smaller in PostgreSQL 9.4 and above.
> So I am curious about the data that motivated the change to GiST.
Here is a correspondence between myself and a BP engineer you may find helpful:
On Fri, 2013-12-13 at 12:24 -0500, Matt Zagrabelny via RT wrote:
> I've enabled full-text searching on our 4.2 instance.
> How often are folks running the sbin/rt-fulltext-indexer to keep the
> index up-to-date?
Since the pg updates are incremental, they are safe to run rather
frequently. Depending on ticket volume, they've been run anywhere from
every minute to every hour. Unlike the Sphinx indexes for MySQL,
there's no need to hold back and run them daily. Indeed, we've
considered updating the Pg index in realtime as attachments are created
-- it's sufficiently fast.
> We are using Pg 9.1 on the backend with GIN indexing. We get ~100
> tickets created per day and I am not sure how many transactions are
> created per day.
> Any best practices or anecdotal data would be helpful. Also, how
> "strong" of a default recommendation is the GiST indexing? Am I being
> foolish for choosing GIN?
The GiST recommendation is very much a soft one. We're not aware of any
particular cases where it has been explicitly required, nor have we done
comprehensive comparisons between the two on RT's data at scale.
---end of email---
There is also some good documentation on the Pg site regarding the
> RT 4.4 and RTIR Training Sessions https://bestpractical.com/training
> * Washington DC - May 23 & 24, 2016
More information about the rt-users