[rt-users] Pg FTS query works, RT search returns 0 results ( was: '.' as delimiter/boundary breaks domain name searches)

Jeff Blaine jblaine at kickflop.net
Mon Aug 31 09:39:19 EDT 2015


I'm reviving this one time in case anyone has further ideas.

* PostgreSQL 8.4.20 (RHEL 6.6) with FTS does the right
  thing when parsing an email address[1]

* An RT 4.2.12 search for the same string returns 0 results[2].
  I'm 98% certain the ticket *is* indexed though as other queries
  return it[2].

* The rt-fulltext-indexer script runs every 10 minutes and has
  no errors.

Jeff

Footnote 1:

rt4=# SELECT alias, description, token FROM ts_debug('foo at domain.com');
 alias |  description  |     token
-------+---------------+----------------
 email | Email address | foo at domain.com
(1 row)

rt4=# SELECT alias, description, token FROM ts_debug('<foo at domain.com>');
 alias |  description  |     token
-------+---------------+----------------
 blank | Space symbols | <
 email | Email address | foo at domain.com
 blank | Space symbols | >
(3 rows)

rt4=#

Footnote 2:

Content LIKE 'foo at domain.com'

0 results

Content LIKE 'domain.com'

0 results

Content LIKE 'domain'

0 results

Content LIKE 'another unique string in the ticket I am trying to hit'

1 result

On 10/2/2014 11:08 AM, ktm at rice.edu wrote:
> On Thu, Oct 02, 2014 at 10:56:56AM -0400, Kevin Falcone wrote:
>> On Wed, Oct 01, 2014 at 03:50:43PM -0400, Jeff Blaine wrote:
>>> [ Similar, but unrelated to my other message from 10 minutes ago. ]
>>>
>>> It appears any '.' is interpreted as a word boundary with
>>> Pg full-text indexing turned on.
>>>
>>> Is that known to be true, or am I wrong?
>>>
>>> This breaks searches for FQDNs names in ticket contents.
>>>
>>> Searching for 'foobar' will hit foobar.org
>>>
>>> Searching for 'foobar.org' will not hit 'foobar.org'
>>
>> What FTS will match/return is dictated by your database and its
>> configuration.
>>
>> Have you reviewed the Postgres full text search documentation for your
>> release of Pg?
>>
>> http://www.postgresql.org/docs/8.4/static/textsearch.html
>>
>> -kevin
> 
> 
> Wow! PostgreSQL 8.4, 4 major releases back! I cannot be certain that I
> am recalling this correctly, but the default parser in older versions
> of PostgreSQL did have that behavior. I do not know when they made the
> change to fix it. What do you get when you run:
> 
> rt3=# select plainto_tsquery('rice.edu');
>  plainto_tsquery 
> -----------------
>  'rice.edu'
> (1 row)
> 
> I seem to recall that in the older version when I saw this issue, it
> returned:
> 
>  plainto_tsquery
> -----------------
>  'rice' & 'edu'
> 
> You may be able to make a custom config for your text search using
> the definitions from the current release. I just ended up searching
> for 'rice' instead of 'rice.edu', for example.
> 
> Regards,
> Ken
> 

-- 
Jeff Blaine
kickflop.net
PGP/GnuPG Key ID: 0x0C8EDD02



More information about the rt-users mailing list