[rt-users] Search results anomaly

Jeff Blaine jblaine at kickflop.net
Mon May 17 15:30:27 EDT 2010


On 5/17/2010 3:26 PM, Kenneth Marshall wrote:
> Hi Jeff,
>
> Those are just what I would have expected. I would take one
> of the results from the second set of 77 and figure out why
> it is not matching the first query. It does look more and more
> like a collation problem. What is the collation of an attachment
> that matches one query but not the other?

Thanks for the reply.

I'll gladly provide any info I can, but you're talking over
my head re: collation.  Can you provide some more info?  I'm
not a DB person.

> Ken
>
> On Mon, May 17, 2010 at 03:20:00PM -0400, Jeff Blaine wrote:
>> Here are the transactions as logged.  FWIW, I find it curious
>> that all of these search failures for us so far are with
>> search terms that have a '.' in the term...
>>
>> LOG:  duration: 35.815 ms  statement: EXECUTE<unnamed>   [PREPARE: SELECT
>> DISTINCT main.* FROM Tickets main
>>    JOIN Transactions Transactions_1
>>      ON ( Transactions_1.ObjectId = main.id )
>>    JOIN Attachments Attachments_2
>>      ON ( Attachments_2.TransactionId = Transactions_1.id )
>> WHERE (Transactions_1.ObjectType = 'RT::Ticket')
>>         AND (main.Status != 'deleted')
>>         AND ( ( Attachments_2.Content ILIKE '%members.linode%' ) AND
>> main.Owner = '66')
>>         AND (main.Type = 'ticket')
>>         AND (main.EffectiveId = main.id)
>> ORDER BY main.id ASC ]
>>
>> LOG:  duration: 38.362 ms  statement: EXECUTE<unnamed>   [PREPARE: SELECT
>> DISTINCT main.* FROM Tickets main
>>    JOIN Transactions Transactions_1
>>      ON ( Transactions_1.ObjectId = main.id )
>>    JOIN Attachments Attachments_2
>>      ON ( Attachments_2.TransactionId = Transactions_1.id )
>> WHERE (Transactions_1.ObjectType = 'RT::Ticket')
>>         AND (main.Status != 'deleted')
>>         AND ( ( Attachments_2.Content ILIKE '%linode%' ) AND main.Owner =
>> '66')
>>         AND (main.Type = 'ticket')
>>         AND (main.EffectiveId = main.id)
>> ORDER BY main.id ASC ]
>>
>> On 5/17/2010 3:00 PM, Kenneth Marshall wrote:
>>> And what queries actually hit the database backend for these
>>> two? If it is the same SQL query, then the problem is somewhere
>>> else. If the query differs, try running the query manually to
>>> see why it is giving unexpected results. Again as others have
>>> mentioned, collation/encoding could cause these types of differences.
>>>
>>> Cheers,
>>> Ken
>>>
>>> On Mon, May 17, 2010 at 02:48:49PM -0400, Jeff Blaine wrote:
>>>> Another example of this failing:
>>>>
>>>> rt list -t ticket "Content like members.linode AND Owner = jsmith"
>>>>
>>>> 14 results
>>>>
>>>> rt list -t ticket "Content like linode AND Owner = jsmith"
>>>>
>>>> 77 results, all of which have "members.linode" in the content.
>>>>
>>>> On 4/26/2010 5:32 PM, Jeff Blaine wrote:
>>>>> On 4/26/2010 5:04 PM, Kenneth Marshall wrote:
>>>>>> I am not certain what to do. For myself, I would turn on
>>>>>> SQL statement logging within the database. Set:
>>>>>>
>>>>>> log_min_duration_statement = 0
>>>>>>
>>>>>> in your postgresql.conf for the database and run both
>>>>>> queries in RT. Then turn it back done and pick apart the
>>>>>> query results to see if the problem is a logic problem or
>>>>>> possibly an encoding issue which results in "equal" items
>>>>>> not showing as "equal". Also, it looks like you are just
>>>>>
>>>>> These were sequential in the log as a result of my simple
>>>>> "Content matches foo.com" query via the web GUI. One is
>>>>> a COUNT and the other is for getting data, apparently.
>>>>>
>>>>> I formatted them here for easier reading.
>>>>>
>>>>> LOG: duration: 101.115 ms statement: EXECUTE<unnamed>   [PREPARE:
>>>>> SELECT COUNT(DISTINCT main.id) FROM Tickets main
>>>>> JOIN Transactions Transactions_1 ON ( Transactions_1.ObjectId = main.id
>>>>> )
>>>>> JOIN Attachments Attachments_2 ON ( Attachments_2.TransactionId =
>>>>> Transactions_1.id )
>>>>> WHERE (Transactions_1.ObjectType = 'RT::Ticket')
>>>>> AND (main.Status != 'deleted')
>>>>> AND ( ( Attachments_2.Content ILIKE '%foo.com%' ) )
>>>>> AND (main.Type = 'ticket')
>>>>> AND (main.EffectiveId = main.id)
>>>>> ]
>>>>>
>>>>> LOG: duration: 105.350 ms statement: EXECUTE<unnamed>   [PREPARE:
>>>>> SELECT DISTINCT main.* FROM Tickets main
>>>>> JOIN Transactions Transactions_1 ON ( Transactions_1.ObjectId = main.id
>>>>> )
>>>>> JOIN Attachments Attachments_2 ON ( Attachments_2.TransactionId =
>>>>> Transactions_1.id )
>>>>> WHERE (Transactions_1.ObjectType = 'RT::Ticket')
>>>>> AND (main.Status != 'deleted')
>>>>> AND ( ( Attachments_2.Content ILIKE '%foo.com%' ) )
>>>>> AND (main.Type = 'ticket')
>>>>> AND (main.EffectiveId = main.id)
>>>>> ORDER BY main.id ASC LIMIT 50
>>>>> ]
>>>>>
>>>>> What "both queries" are you referring to? The RT CLI and the
>>>>> RT web GUI both fail the same way with my 1 test query. I'm
>>>>> confused by what you mean there.
>>>>>
>>>>> Thanks for the ideas, Ken
>>>>>
>>>>>> starting with such a low ticket number, but if you plan
>>>>>> on using full content searches for any large-ish amount
>>>>>> of data you should really plan on using the full-text index
>>>>>> support for PostgreSQL/RT that is described in the wiki.
>>>>>> Speaking as one whose database was brought to its knees by
>>>>>> some creative data mining. :)
>>>>>>
>>>>>> Good luck and let us know what you find to be the cause.
>>>>>> Regards,
>>>>>>
>>>>>> Ken
>>>>>>
>>>>>>
>>>>>> On Mon, Apr 26, 2010 at 04:54:45PM -0400, Jeff Blaine wrote:
>>>>>>> On 4/26/2010 4:52 PM, Kenneth Marshall wrote:
>>>>>>>> Jeff,
>>>>>>>>
>>>>>>>> Are you using the Full-text index support from the wiki?
>>>>>>>
>>>>>>> Nope.
>>>>>>>
>>>>>>> Just RT 3.8.7 + RTFM 2.4.2. No add-ons/tweaks from the wiki.
>>>>>>>
>>>>>>>> There were some index bugs that may require you to re-index
>>>>>>>> to fix, specifically some rows were not reported correctly
>>>>>>>> as valid matches. I believe that the PostgreSQL release
>>>>>>>> notes mentioned that need. Maybe that is your problem. We
>>>>>>>> are using RT-3.8.5 and PostgreSQL 8.4.2 here with the full-text
>>>>>>>> support without an issue.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Ken
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Apr 26, 2010 at 03:00:39PM -0400, Jeff Blaine wrote:
>>>>>>>>> On 4/26/2010 2:19 PM, Kenneth Marshall wrote:
>>>>>>>>>> Well, that knocks out the ACL issue. Do you think that your
>>>>>>>>>> Mason cache is confused? Maybe stop RT, clear the cache, and
>>>>>>>>>> restart RT to see if that helps. What DB backend are you using
>>>>>>>>>> and which version of RT are you running?
>>>>>>>>>
>>>>>>>>> RT 3.8.7
>>>>>>>>> PostgreSQL as it comes with RHELv5 + updates from yum
>>>>>>>>>
>>>>>>>>> Clearing the Mason cache didn't help :|
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Ken
>>>>>>>>>>
>>>>>>>>>> On Mon, Apr 26, 2010 at 12:47:22PM -0400, Jeff Blaine wrote:
>>>>>>>>>>> On 4/26/2010 12:29 PM, Raed El-Hames wrote:
>>>>>>>>>>>> Jeff;
>>>>>>>>>>>>
>>>>>>>>>>>> Does your CLI user have permissions on the queue that ticket 39
>>>>>>>>>>>> is
>>>>>>>>>>>> in??
>>>>>>>>>>>> login to the web interface with the same cli user and see if you
>>>>>>>>>>>> can
>>>>>>>>>>>> view the ticket.
>>>>>>>>>>>
>>>>>>>>>>> Yes, it does.
>>>>>>>>>>>
>>>>>>>>>>> Again, however, this is not really a report about an anomaly in
>>>>>>>>>>> the RT CLI.
>>>>>>>>>>>
>>>>>>>>>>> The incorrect search results are returned via a web GUI search
>>>>>>>>>>> of "Content matches foo.com"
>>>>>>>>>>>
>>>>>>>>>>> Here, maybe this makes it more clear, showing the same problem
>>>>>>>>>>> when using the RT CLI:
>>>>>>>>>>>
>>>>>>>>>>> [root at rtsrv1 etc]# /apps/rt/bin/rt list "Content like foo.com"
>>>>>>>>>>> Query:Content like 'foo.com'
>>>>>>>>>>> Ticket Owner Queue Age Told Status Requestor Subject
>>>>>>>>>>> --------------------------------------------------------------------------------
>>>>>>>>>>>
>>>>>>>>>>> 23 mbs Incid 1 wk resolv enVision@ alert
>>>>>>>>>>> -NICAlert-Secur
>>>>>>>>>>> [root at rtsrv1 etc]#
>>>>>>>>>>>
>>>>>>>>>>> [root at rtsrv1 etc]# /apps/rt/bin/rt show 39 | grep foo.com
>>>>>>>>>>> foo.com blah blah... 1 line... not including in this email
>>>>>>>>>>> [root at rtsrv1 etc]#
>>>>>>>>>>>
>>>>>>>>>>> [root at rtsrv1 etc]# /apps/rt/bin/rt show 23 | grep foo.com
>>>>>>>>>>> foo.com blah blah... not including in this email
>>>>>>>>>>> foo.com matching lines 66 more times... not including in this
>>>>>>>>>>> email
>>>>>>>>>>> [root at rtsrv1 etc]#
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Regards;
>>>>>>>>>>>> Roy
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Jeff Blaine wrote:
>>>>>>>>>>>>> On 4/26/2010 11:50 AM, Kenneth Marshall wrote:
>>>>>>>>>>>>>> Hi Jeff,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> There is nothing here that indicates a problem. It looks
>>>>>>>>>>>>>> like an apples vs. oranges comparison by the time you include
>>>>>>>>>>>>>> the actual parameters of the search from the web interface
>>>>>>>>>>>>>> and the rt commandline interface and possible privilege and
>>>>>>>>>>>>>> ACL differences. You can use DB query logging to figure out
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think my original post is being misinterpreted. The 'rt'
>>>>>>>>>>>>> CLI commands aren't doing a search. They're just showing
>>>>>>>>>>>>> this list's readers that 'foo.com' does show up in each of
>>>>>>>>>>>>> the tickets when doing a simple 'rt show<ticket>'. It's
>>>>>>>>>>>>> not a comparison of "CLI search vs. web search".
>>>>>>>>>>>>>
>>>>>>>>>>>>>> what SQL is being used in the web search or the commandline
>>>>>>>>>>>>>> rt and compare the output piece-wise to put yourself at ease.
>>>>>>>>>>>>>> Maybe look at the individual components of each of the two
>>>>>>>>>>>>>> tickets, as well.
>>>>>>>>>>>>>
>>>>>>>>>>>>> When viewing the tickets using 'Full headers" and then
>>>>>>>>>>>>> "Ctrl-F" to examine every instance of 'foo.com' in each ticket
>>>>>>>>>>>>> shows that both tickets have the 'foo.com' in text/html parts
>>>>>>>>>>>>> (and only there).
>>>>>>>>>>>>>
>>>>>>>>>>>>> Ticket 23 has 67 of those parts and is returned when RT
>>>>>>>>>>>>> searching
>>>>>>>>>>>>> for 'foo.com'
>>>>>>>>>>>>>
>>>>>>>>>>>>> Ticket 39 has 1 of those parts and is not returned when RT
>>>>>>>>>>>>> searching
>>>>>>>>>>>>> for 'foo.com'
>>>>>>>>>>>>>
>>>>>>>>>>>>> By "DB query logging" do you mean Set($StatementLog, "DEBUG");
>>>>>>>>>>>>> or something?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for the reply, Ken
>>>>>>>>>>>>>
>>>>>>>>>>>>> Jeff
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>> Ken
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Apr 26, 2010 at 11:21:45AM -0400, Jeff Blaine wrote:
>>>>>>>>>>>>>>> Does anyone have any suggestions for how to go about
>>>>>>>>>>>>>>> figuring out what is wrong here?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 4/22/2010 2:09 PM, Jeff Blaine wrote:
>>>>>>>>>>>>>>>> RT 3.8.7
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> A search for 'Content matches foo.com' is returning some
>>>>>>>>>>>>>>>> tickets
>>>>>>>>>>>>>>>> and missing others that clearly have foo.com in the Content.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> [root at rtsrv1 bin]# ./rt show 39 | grep foo.com | wc -l
>>>>>>>>>>>>>>>>> 1
>>>>>>>>>>>>>>>>> [root at rtsrv1 bin]#
>>>>>>>>>>>>>>>>> [root at rtsrv1 bin]# ./rt show 23 | grep foo.com | wc -l
>>>>>>>>>>>>>>>>> 67
>>>>>>>>>>>>>>>>> [root at rtsrv1 bin]#
>>>>>>>>>>>>>>>> 23 shows up in the web search results.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 39 does not.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Any ideas?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Discover RT's hidden secrets with RT Essentials from O'Reilly
>>>>>>>>>>>>>>>> Media.
>>>>>>>>>>>>>>>> Buy a copy at http://rtbook.bestpractical.com
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Discover RT's hidden secrets with RT Essentials from O'Reilly
>>>>>>>>>>>>>>> Media.
>>>>>>>>>>>>>>> Buy a copy at http://rtbook.bestpractical.com
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Discover RT's hidden secrets with RT Essentials from O'Reilly
>>>>>>>>>>>>> Media.
>>>>>>>>>>>>> Buy a copy at http://rtbook.bestpractical.com
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>> Discover RT's hidden secrets with RT Essentials from O'Reilly Media.
>>>>> Buy a copy at http://rtbook.bestpractical.com
>>>>>
>>>>
>>>> Discover RT's hidden secrets with RT Essentials from O'Reilly Media.
>>>> Buy a copy at http://rtbook.bestpractical.com
>>>>
>>>
>>
>



More information about the rt-users mailing list