[rt-devel] Fulltext Indexing
Christian Loos
cloos at netcologne.de
Tue May 19 05:05:44 EDT 2015
Hi RT developers,
First, thanks for the Fulltext Indexing improvements in RT 4.2.11.
My initial index creation dropped from an estimated 13 hours (I killed the
indexing after 2.5 hours and extrapolated the total time) to 35 minutes.
While playing around with fulltext indexing, I noticed that the attachments
of EmailRecord and CommentEmailRecord transactions are also indexed. These
contain largely redundant information, as each such attachment consists of
the content (Create, Correspond or Comment) plus the template text.
For example, in the default RT configuration with queue AdminCcs,
creating a ticket results in a Create transaction and two EmailRecord
transactions (one for the Requestor autoreply and one for the queue
AdminCcs). So the valuable information in the Create attachment is
indexed three times.
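To illustrate the duplication described above, here is a toy sketch in
Python. It is not RT code: the records, the field names and the helper
`indexable` are made up for illustration, and it only models the idea of
skipping attachments by their transaction type.

```python
# Toy model, not RT code: each attachment is tagged with the type of the
# transaction it belongs to (field names here are hypothetical).
SKIPPED_TYPES = {"EmailRecord", "CommentEmailRecord"}


def indexable(attachments, skipped_types=SKIPPED_TYPES):
    """Return only the attachments whose transaction type is not skipped."""
    return [a for a in attachments if a["txn_type"] not in skipped_types]


# One ticket creation in a default setup with queue AdminCcs:
# the same request text shows up in three attachments.
attachments = [
    {"id": 1, "txn_type": "Create",
     "content": "printer is broken"},
    {"id": 2, "txn_type": "EmailRecord",
     "content": "autoreply template + printer is broken"},
    {"id": 3, "txn_type": "EmailRecord",
     "content": "AdminCc template + printer is broken"},
]

kept = indexable(attachments)
print(len(attachments), "attachments,", len(kept), "worth indexing")
```

In this toy case only the Create attachment remains, so the request text
would be indexed once instead of three times.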
Attached is a patch that excludes attachments from EmailRecord and
CommentEmailRecord transactions. Here are the indexing results (using
MySQL 5.5; AttachmentsIndex is a MyISAM table; 1333901 text/plain and
text/html attachments):
RT 4.2.11:
time /opt/rt4/sbin/rt-setup-fulltext-index --index-type mysql --table AttachmentsIndex
36m39.340s
mysql -e 'select count(*) from rt4.AttachmentsIndex'
1333901
du -h /var/lib/mysql/rt4/AttachmentsIndex*
12K /var/lib/mysql/rt4/AttachmentsIndex.frm
1.3G /var/lib/mysql/rt4/AttachmentsIndex.MYD
653M /var/lib/mysql/rt4/AttachmentsIndex.MYI
RT 4.2.11 with the patch applied:
time /opt/rt4/sbin/rt-setup-fulltext-index --index-type mysql --table AttachmentsIndex
26m43.715s
mysql -e 'select count(*) from rt4.AttachmentsIndex'
867423
du -h /var/lib/mysql/rt4/AttachmentsIndex*
12K /var/lib/mysql/rt4/AttachmentsIndex.frm
877M /var/lib/mysql/rt4/AttachmentsIndex.MYD
399M /var/lib/mysql/rt4/AttachmentsIndex.MYI
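For a quick comparison of the two runs, the relative savings can be
worked out from the figures above. This is just a small Python sketch of
the arithmetic; the helper names `parse_time` and `reduction` are my own,
and the .MYD file is left out because `du -h` rounds the 1.3G value too
coarsely.

```python
def parse_time(text):
    """Convert a `time` value such as '36m39.340s' into seconds."""
    minutes, seconds = text.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


def reduction(before, after):
    """Percentage by which `after` is smaller than `before`."""
    return 100.0 * (before - after) / before


# Figures copied from the two runs above.
rows = reduction(1333901, 867423)                 # indexed rows
runtime = reduction(parse_time("36m39.340s"),
                    parse_time("26m43.715s"))     # indexing time
myi = reduction(653, 399)                         # .MYI size in MB

print(f"rows: {rows:.1f}%  runtime: {runtime:.1f}%  MYI: {myi:.1f}%")
```

That works out to roughly 35% fewer rows, 27% less indexing time and a
39% smaller .MYI file.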
I think these results make the patch worth considering for integration.
Thanks.
Chris
-------------- next part --------------
A non-text attachment was scrubbed...
Name: rt-fulltext-indexer.patch
Type: text/x-patch
Size: 816 bytes
Desc: not available
URL: <http://lists.bestpractical.com/pipermail/rt-devel/attachments/20150519/d8f50e88/attachment.bin>