[Rt-commit] rt branch, 4.0/fulltext-search, updated. rt-4.0.0-236-gd99fb38
Alex Vandiver
alexmv at bestpractical.com
Mon May 16 19:29:15 EDT 2011
The branch, 4.0/fulltext-search has been updated
via d99fb385b2b46191d749b48be83f21c7cbf2fd7f (commit)
from 3f5c3b21406eec4ac9f468edff351521801ca1b3 (commit)
Summary of changes:
docs/full_text_indexing.pod | 11 +++++++++++
1 files changed, 11 insertions(+), 0 deletions(-)
- Log -----------------------------------------------------------------
commit d99fb385b2b46191d749b48be83f21c7cbf2fd7f
Author: Alex Vandiver <alexmv at bestpractical.com>
Date: Mon May 16 19:28:53 2011 -0400
Note unicode limitations of full-text searching solutions
diff --git a/docs/full_text_indexing.pod b/docs/full_text_indexing.pod
index 88d88ef..9fad5a5 100644
--- a/docs/full_text_indexing.pod
+++ b/docs/full_text_indexing.pod
@@ -2,6 +2,17 @@
Full text indexing in RT
+=head1 LIMITATIONS
+
+While all of the below solutions can search for Unicode characters, they
+are not otherwise Unicode aware, and do no case folding, normalization,
+or the like. That is, a string that contains C<U+0065 LATIN SMALL
+LETTER E> followed by C<U+0301 COMBINING ACUTE ACCENT> will not match a
+search for C<U+00E9 LATIN SMALL LETTER E WITH ACUTE>. They also only
+know how to tokenize C<latin-1>-ish languages where words are separated
+by whitespace or similar characters; as such, support for searching for
+Japanese and Chinese content is extremely limited.
+
=head1 POSTGRES
=head2 Creating and configuring the index
-----------------------------------------------------------------------
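The normalization limitation described in the LIMITATIONS text above can be illustrated with a short sketch. This is not RT code and not how any of the indexing backends work internally. It is a standalone Python example, with hypothetical helper names (naive_match, normalized_match), showing why a decomposed "e" plus combining acute accent does not match a precomposed "é" under a plain substring comparison, and how normalizing and case folding both sides would make them compare equal.

```python
import unicodedata

# U+0065 LATIN SMALL LETTER E followed by U+0301 COMBINING ACUTE ACCENT
decomposed = "e\u0301"
# U+00E9 LATIN SMALL LETTER E WITH ACUTE
precomposed = "\u00e9"


def naive_match(needle, haystack):
    """Plain code-point comparison: no normalization, no case folding."""
    return needle in haystack


def fold(text):
    """Normalize to NFC, case fold, then re-normalize (hypothetical helper)."""
    folded = unicodedata.normalize("NFC", text).casefold()
    return unicodedata.normalize("NFC", folded)


def normalized_match(needle, haystack):
    """Compare after applying the same folding to both sides."""
    return fold(needle) in fold(haystack)


if __name__ == "__main__":
    stored = "caf" + decomposed  # content as it might be stored
    print(naive_match(precomposed, stored))
    print(normalized_match(precomposed, stored))
    print(normalized_match("CAF" + precomposed.upper(), stored))
```

A search layer that behaves like naive_match reproduces the limitation in the documentation. Applying something like fold to both the indexed text and the query is one possible workaround, assuming the application controls both sides.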