[rt-users] Full text indexing failure (invalid byte sequence for encoding "UTF8")

Ben Poliakoff benp at reed.edu
Fri Feb 1 20:03:45 EST 2013


We're currently running RT 4.0.5-3~bpo60+1 (from Debian backports) with
PostgreSQL 8.4.12-0squeeze1.

Recently I tried to enable full text search following the instructions
here:

    http://blog.bestpractical.com/2011/06/full-text-searching.html

...but ran into this error an hour into the initial "rt-fulltext-indexer
--all":

    [crit]: error: ERROR:  invalid byte sequence for encoding "UTF8": 0xfc
    HINT:  This error can also happen if the byte sequence does not
      match the encoding expected by the server, which is controlled by
      "client_encoding". at /usr/sbin/rt-fulltext-indexer-4 line 375.
      (/usr/share/request-tracker4/lib/RT.pm:351)

Subsequent runs of the same command end with the same error.

The encoding for the rt4 db has been set to UTF8 for as long as I can
recall.  I assume this relates to some data inserted into the db ages
ago when client_encoding was something other than UTF8, or under a previous
version of PostgreSQL which might have been less strict about validating input.
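
One way to test that assumption outside the database is sketched below. It
is only an illustration, not part of RT or PostgreSQL: the helper names
find_invalid_utf8 and repair_as_latin1 are hypothetical, and the Latin-1
fallback is an assumption about where a stray 0xfc might have come from
(0xfc is "ü" in Latin-1 and can never appear in valid UTF-8).

```python
def find_invalid_utf8(data: bytes):
    """Return a list of (offset, byte_value) for bytes that break UTF-8 decoding."""
    bad = []
    pos = 0
    while pos < len(data):
        try:
            # Try to decode the remainder; success means no more bad bytes.
            data[pos:].decode("utf-8")
            break
        except UnicodeDecodeError as exc:
            # exc.start is relative to the slice, so add the slice offset.
            offset = pos + exc.start
            bad.append((offset, data[offset]))
            pos = offset + 1  # resume scanning just past the bad byte
    return bad


def repair_as_latin1(data: bytes) -> str:
    """Decode as UTF-8 when valid; otherwise fall back to Latin-1.

    The Latin-1 fallback is an assumption about the original encoding,
    not something the error message itself confirms.
    """
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return data.decode("latin-1")


if __name__ == "__main__":
    sample = b"M\xfcller"  # "Müller" encoded as Latin-1
    for offset, value in find_invalid_utf8(sample):
        print(f"invalid byte 0x{value:02x} at offset {offset}")
    print(repair_as_latin1(sample))
```

Run against raw bytes exported from the suspect rows, a check like this
would show whether the offending values are legacy single-byte text before
any decision is made about rewriting them in place.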

There is a FAQ about 'invalid byte sequence for encoding' but I'm not
sure that this is the same issue.

Has anyone else been through this sort of issue?  Would it be better to take
the question to a PostgreSQL list?

Ben

-- 
________________________________________________________________________
pub   4096R/318B6A97 2009-05-11 Ben Poliakoff <benp at reed.edu>
 Primary key fingerprint: 3F23 EBC8 B73E 92B7 0A67  705A 8219 DCF0 318B 6A97