[rt-users] '.' as delimiter/boundary breaks domain name searches

ktm at rice.edu ktm at rice.edu
Thu Oct 2 11:08:22 EDT 2014


On Thu, Oct 02, 2014 at 10:56:56AM -0400, Kevin Falcone wrote:
> On Wed, Oct 01, 2014 at 03:50:43PM -0400, Jeff Blaine wrote:
> > [ Similar, but unrelated to my other message from 10 minutes ago. ]
> > 
> > It appears any '.' is interpreted as a word boundary with
> > Pg full-text indexing turned on.
> > 
> > Is that known to be true, or am I wrong?
> > 
> > This breaks searches for FQDNs in ticket contents.
> > 
> > Searching for 'foobar' will hit foobar.org
> > 
> > Searching for 'foobar.org' will not hit 'foobar.org'
> 
> What FTS will match/return is dictated by your database and its
> configuration.
> 
> Have you reviewed the Postgres full text search documentation for your
> release of Pg?
> 
> http://www.postgresql.org/docs/8.4/static/textsearch.html
> 
> -kevin


Wow! PostgreSQL 8.4, four major releases back! I cannot be certain that
I am recalling this correctly, but the default parser in older versions
of PostgreSQL did have that behavior. I do not know when the change to
fix it was made. What do you get when you run the query below? Here is
what I get:

rt3=# select plainto_tsquery('rice.edu');
 plainto_tsquery 
-----------------
 'rice.edu'
(1 row)

I seem to recall that in the older version when I saw this issue, it
returned:

 plainto_tsquery
-----------------
 'rice' & 'edu'
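
If it helps to narrow this down, ts_debug shows the tokens the parser
produces for a string and the lexemes each one maps to. A minimal
sketch, assuming the 'english' configuration is the one in use
(substitute whatever your index was built with):

```sql
-- List each token the parser emits, its type, and the resulting lexemes.
-- 'english' is an assumption; use the configuration your FTS index uses.
SELECT alias, token, lexemes
  FROM ts_debug('english', 'rice.edu');
```

If the name comes back as one token, the parser is keeping it whole. If
'rice' and 'edu' show up as separate tokens, it is being split at the
'.'.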

You may be able to make a custom config for your text search using
the definitions from the current release. I just ended up searching
for 'rice' instead of 'rice.edu', for example.
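
For what it is worth, the mechanics of a custom config look roughly like
the sketch below. The name rt_hosts is made up for illustration, and I
am assuming the stock 'english' config as the starting point. Keep in
mind that a configuration only decides which dictionaries handle each
token type. It does not change how the parser splits the text, so
whether this helps depends on what the parser in your release emits.

```sql
-- Hypothetical configuration name; copy the stock English setup as a base.
CREATE TEXT SEARCH CONFIGURATION public.rt_hosts (COPY = pg_catalog.english);

-- Route host-type tokens through the 'simple' dictionary, which
-- lowercases them but does not stem them.
ALTER TEXT SEARCH CONFIGURATION public.rt_hosts
    ALTER MAPPING FOR host WITH simple;

-- Check what a query turns into under the new configuration.
SELECT plainto_tsquery('public.rt_hosts', 'rice.edu');
```

Pointing the RT full-text index at such a config is a separate step and
is not shown here.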

Regards,
Ken


