[Rt-commit] rt branch, 4.4/fts-refactor-performance, repushed

Alex Vandiver alexmv at bestpractical.com
Thu Jan 8 16:17:12 EST 2015


The branch 4.4/fts-refactor-performance was deleted and repushed:
       was dff4059ab03803ee0990573ca9397b034ccc998a
       now f8e301225ed61b860b81344bfd584b114c016934

 1:  4660cf6 =  1:  2823ea5 Add full path to one rt-fulltext-indexer that lacks it
 2:  9655340 =  2:  37af06d Add additional clarification points about Sphinx on MySQL
 3:  e390dc6 =  3:  437c763 Drop sphinx xmlpipe2 output, which was unusable and undocumented
 4:  6ba220e =  4:  48e71ae Drop finalize and clean functions, which are now unused
 5:  8da4115 =  5:  f408dcf Rename Sphinx FTS search tests to "sphinx", not "mysql"
 6:  982ca07 !  6:  d8f3b0c Support native FTS on MySQL 5.6 and above
    @@ -58,6 +58,19 @@
     +C<cron>:
     +
     +    /opt/rt4/sbin/rt-fulltext-indexer --quiet
    ++
    ++=head3 Caveats
    ++
    ++Searching is done in "boolean mode."  As such, the TicketSQL query
    ++C<Content LIKE 'winter 2014'> will return tickets with transactions that
    ++contain I<either> word.  To find transactions which contain both (but
    ++not necessarily adjacent), use C<Content LIKE '+winter +2014'>.  To find
    ++transactions containing the precise phrase, use C<Content LIKE '"winter
    ++2014"'>.
    ++
    ++See the MySQL documentation, at
    ++L<http://dev.mysql.com/doc/refman/5.6/en/fulltext-boolean.html>, for a
    ++list of the full capabilities.
     +
     +
     +=head2 MySQL with Sphinx
    @@ -159,12 +172,11 @@
              }
     +        elsif ( $db_type eq 'mysql' and not $config->{Sphinx}) {
     +            my $dbh = $RT::Handle->dbh;
    -+            $value =~ s/["\\]+/ /g;
     +            $self->Limit(
     +                %rest,
     +                FUNCTION    => "MATCH($alias.Content)",
     +                OPERATOR    => 'AGAINST',
    -+                VALUE       => '("'. $dbh->quote($value) .'" IN BOOLEAN MODE)',
    ++                VALUE       => "(". $dbh->quote($value) ." IN BOOLEAN MODE)",
     +                QUOTEVALUE  => 0,
     +            );
     +            # As with Oracle, above, this forces the LEFT JOINs into
 7:  8e36778 !  7:  737acaf Using a separate MyISAM table, we can also support FTS on MySQL < 5.6
    @@ -27,21 +27,6 @@
     +table (which is InnoDB on versions of MySQL which support it), run:
      
          /opt/rt4/sbin/rt-setup-fulltext-index
    - 
    -@@
    - 
    -     /opt/rt4/sbin/rt-fulltext-indexer --quiet
    - 
    -+=head3 Caveats
    -+
    -+On versions of MySQL prior to 5.6, a MyISAM table is used.  This may
    -+cause poor performance, as the database server is likely tuned for
    -+InnoDB performance, not MyISAM performance.  Once the MySQL server is
    -+upgraded to version 5.6 or above, the extra table should be re-created
    -+as InnoDB by re-running the steps above.
    -+
    - 
    - =head2 MySQL with Sphinx
      
     
     diff --git a/sbin/rt-setup-fulltext-index.in b/sbin/rt-setup-fulltext-index.in
 8:  ed183c8 =  8:  7cb616a extract_text and extract_html are identical; inline them
 9:  0d40c43 =  9:  38deb6d Inline the differences between text/plain and text/html attachment lists
10:  86e69b6 = 10:  144a090 Stop skipping indexing of text/html within multipart/alternative
11:  d1076aa = 11:  920b09a Use the new, shorter, initialization form
12:  133c2b8 = 12:  d141515 Simplify and condense option parsing
13:  16c1297 = 13:  3311152 Documentation has moved out; update --help accordingly
14:  bf3f508 = 14:  4e4667d Remove AUTHOR section; it is unnecessary in core sbin files
15:  0750cf9 = 15:  34c4843 Skipping ACL checks yields a sizable performance increase
16:  a78cba2 = 16:  3e8c139 Index attachments in one pass through the database, not two
17:  e458362 = 17:  ca8668f Index attachments even on deleted tickets
18:  18250bd = 18:  bfa8a42 mysql and pg share the same last_indexed; unify the method
19:  eb95ffa = 19:  d7d1840 Replace the last use of goto_specific with explicit function calls
20:  f706f2b = 20:  f5a891d Simplify last_indexed
21:  1978743 = 21:  27da9ed Only call last_indexed once, as it may be heavy
22:  975d6f6 = 22:  abd8b88 Index even empty attachments
23:  9f41fc6 = 23:  24e8fed As last_indexed is based on the highest insert, there will never be an UPDATE needed
24:  5a8a4db = 24:  236ee0c Inversion of control of main indexing loops
25:  c569765 = 25:  c4bd561 Switch to preparing statements, rather than just setting strings
26:  440a429 = 26:  48460b3 INSERT DELAYED provides notable speed benefits on MyISAM
27:  ae4ab61 = 27:  dd4fbe4 Improve MySQL insert speed by batching inserts into one statement
28:  e1af270 = 28:  796c0e0 Testing finds 200 is a good default batch size
29:  05a16d1 = 29:  845b498 Indexing of "text" may fail for content with invalid UTF-8 byte sequences
30:  45bfd41 = 30:  8b7a0ba Refactor PostgreSQL's insert to also do bulk insertion
31:  4d21e31 = 31:  9bf8339 Perform PostgreSQL UPDATE statements inside of a database transaction
32:  d7bca97 = 32:  ae0d14c If a new table is used for indexing, grant rights on it
33:  1ac6246 = 33:  4af7673 Insert data to index before creating the index
34:  4045f7d = 34:  13b13f7 Switch the default Postgres index to GIN
35:  dff4059 = 35:  f8e3012 Default to storing the tsvector in a new table, to speed indexing
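
For anyone reading the new Caveats text in commit 6, the three query forms it
describes can be sketched in Perl as below. This is only an illustration: the
helper names (all_words, exact_phrase, content_clause) are hypothetical and
are not part of RT, and the sketch assumes the search words contain no single
quotes or backslashes.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of RT): require every word to be present
# by prefixing each one with the boolean-mode "+" operator.
sub all_words {
    my @words = @_;
    return join ' ', map { "+$_" } @words;
}

# Hypothetical helper (not part of RT): match the words as one exact
# phrase by wrapping them in double quotes.
sub exact_phrase {
    my @words = @_;
    return '"' . join( ' ', @words ) . '"';
}

# Hypothetical helper (not part of RT): wrap a search string in a
# TicketSQL Content clause.  Assumes the string has no single quotes
# or backslashes, so no escaping is attempted here.
sub content_clause {
    my ($search) = @_;
    return "Content LIKE '$search'";
}

my @words = qw(winter 2014);

# Bare words: matches transactions containing either word.
print content_clause( join ' ', @words ), "\n";

# "+" on each word: matches transactions containing both words.
print content_clause( all_words(@words) ), "\n";

# Double-quoted: matches transactions containing the exact phrase.
print content_clause( exact_phrase(@words) ), "\n";
```

The three printed clauses correspond, in order, to the "either word", "both
words", and "precise phrase" cases from the Caveats section.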


