[rt-users] Search results anomaly

Kenneth Marshall ktm at rice.edu
Mon May 17 15:26:45 EDT 2010


Hi Jeff,

Those are just what I would have expected. I would take one
of the results from the second set of 77 and figure out why
it is not matching the first query. It does look more and more
like a collation problem. What is the collation of an attachment
that matches one query but not the other?

Ken

On Mon, May 17, 2010 at 03:20:00PM -0400, Jeff Blaine wrote:
> Here are the transactions as logged.  FWIW, I find it curious
> that all of these search failures for us so far are with
> search terms that have a '.' in the term...
>
> LOG:  duration: 35.815 ms  statement: EXECUTE <unnamed>  [PREPARE: SELECT 
> DISTINCT main.* FROM Tickets main
>   JOIN Transactions Transactions_1
>     ON ( Transactions_1.ObjectId = main.id )
>   JOIN Attachments Attachments_2
>     ON ( Attachments_2.TransactionId = Transactions_1.id )
> WHERE (Transactions_1.ObjectType = 'RT::Ticket')
>        AND (main.Status != 'deleted')
>        AND ( ( Attachments_2.Content ILIKE '%members.linode%' ) AND 
> main.Owner = '66')
>        AND (main.Type = 'ticket')
>        AND (main.EffectiveId = main.id)
> ORDER BY main.id ASC ]
>
> LOG:  duration: 38.362 ms  statement: EXECUTE <unnamed>  [PREPARE: SELECT 
> DISTINCT main.* FROM Tickets main
>   JOIN Transactions Transactions_1
>     ON ( Transactions_1.ObjectId = main.id )
>   JOIN Attachments Attachments_2
>     ON ( Attachments_2.TransactionId = Transactions_1.id )
> WHERE (Transactions_1.ObjectType = 'RT::Ticket')
>        AND (main.Status != 'deleted')
>        AND ( ( Attachments_2.Content ILIKE '%linode%' ) AND main.Owner = 
> '66')
>        AND (main.Type = 'ticket')
>        AND (main.EffectiveId = main.id)
> ORDER BY main.id ASC ]
>
> On 5/17/2010 3:00 PM, Kenneth Marshall wrote:
>> And what queries actually hit the database backend for these
>> two? If it is the same SQL query, then the problem is somewhere
>> else. If the query differs, try running the query manually to
>> see why it is giving unexpected results. Again as others have
>> mentioned, collation/encoding could cause these types of differences.
>>
>> Cheers,
>> Ken
>>
>> On Mon, May 17, 2010 at 02:48:49PM -0400, Jeff Blaine wrote:
>>> Another example of this failing:
>>>
>>> rt list -t ticket "Content like members.linode AND Owner = jsmith"
>>>
>>> 14 results
>>>
>>> rt list -t ticket "Content like linode AND Owner = jsmith"
>>>
>>> 77 results, all of which have "members.linode" in the content.
>>>
>>> On 4/26/2010 5:32 PM, Jeff Blaine wrote:
>>>> On 4/26/2010 5:04 PM, Kenneth Marshall wrote:
>>>>> I am not certain what to do. For myself, I would turn on
>>>>> SQL statement logging within the database. Set:
>>>>>
>>>>> log_min_duration_statement = 0
>>>>>
>>>>> in your postgresql.conf for the database and run both
>>>>> queries in RT. Then turn it back done and pick apart the
>>>>> query results to see if the problem is a logic problem or
>>>>> possibly an encoding issue which results in "equal" items
>>>>> not showing as "equal". Also, it looks like you are just
>>>>
>>>> These were sequential in the log as a result of my simple
>>>> "Content matches foo.com" query via the web GUI. One is
>>>> a COUNT and the other is for getting data, apparently.
>>>>
>>>> I formatted them here for easier reading.
>>>>
>>>> LOG: duration: 101.115 ms statement: EXECUTE<unnamed>  [PREPARE:
>>>> SELECT COUNT(DISTINCT main.id) FROM Tickets main
>>>> JOIN Transactions Transactions_1 ON ( Transactions_1.ObjectId = main.id 
>>>> )
>>>> JOIN Attachments Attachments_2 ON ( Attachments_2.TransactionId =
>>>> Transactions_1.id )
>>>> WHERE (Transactions_1.ObjectType = 'RT::Ticket')
>>>> AND (main.Status != 'deleted')
>>>> AND ( ( Attachments_2.Content ILIKE '%foo.com%' ) )
>>>> AND (main.Type = 'ticket')
>>>> AND (main.EffectiveId = main.id)
>>>> ]
>>>>
>>>> LOG: duration: 105.350 ms statement: EXECUTE<unnamed>  [PREPARE:
>>>> SELECT DISTINCT main.* FROM Tickets main
>>>> JOIN Transactions Transactions_1 ON ( Transactions_1.ObjectId = main.id 
>>>> )
>>>> JOIN Attachments Attachments_2 ON ( Attachments_2.TransactionId =
>>>> Transactions_1.id )
>>>> WHERE (Transactions_1.ObjectType = 'RT::Ticket')
>>>> AND (main.Status != 'deleted')
>>>> AND ( ( Attachments_2.Content ILIKE '%foo.com%' ) )
>>>> AND (main.Type = 'ticket')
>>>> AND (main.EffectiveId = main.id)
>>>> ORDER BY main.id ASC LIMIT 50
>>>> ]
>>>>
>>>> What "both queries" are you referring to? The RT CLI and the
>>>> RT web GUI both fail the same way with my 1 test query. I'm
>>>> confused by what you mean there.
>>>>
>>>> Thanks for the ideas, Ken
>>>>
>>>>> starting with such a low ticket number, but if you plan
>>>>> on using full content searches for any large-ish amount
>>>>> of data you should really plan on using the full-text index
>>>>> support for PostgreSQL/RT that is described in the wiki.
>>>>> Speaking as one whose database was brought to its knees by
>>>>> some creative data mining. :)
>>>>>
>>>>> Good luck and let us know what you find to be the cause.
>>>>> Regards,
>>>>>
>>>>> Ken
>>>>>
>>>>>
>>>>> On Mon, Apr 26, 2010 at 04:54:45PM -0400, Jeff Blaine wrote:
>>>>>> On 4/26/2010 4:52 PM, Kenneth Marshall wrote:
>>>>>>> Jeff,
>>>>>>>
>>>>>>> Are you using the Full-text index support from the wiki?
>>>>>>
>>>>>> Nope.
>>>>>>
>>>>>> Just RT 3.8.7 + RTFM 2.4.2. No add-ons/tweaks from the wiki.
>>>>>>
>>>>>>> There were some index bugs that may require you to re-index
>>>>>>> to fix, specifically some rows were not reported correctly
>>>>>>> as valid matches. I believe that the PostgreSQL release
>>>>>>> notes mentioned that need. Maybe that is your problem. We
>>>>>>> are using RT-3.8.5 and PostgreSQL 8.4.2 here with the full-text
>>>>>>> support without an issue.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Ken
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Apr 26, 2010 at 03:00:39PM -0400, Jeff Blaine wrote:
>>>>>>>> On 4/26/2010 2:19 PM, Kenneth Marshall wrote:
>>>>>>>>> Well, that knocks out the ACL issue. Do you think that your
>>>>>>>>> Mason cache is confused? Maybe stop RT, clear the cache, and
>>>>>>>>> restart RT to see if that helps. What DB backend are you using
>>>>>>>>> and which version of RT are you running?
>>>>>>>>
>>>>>>>> RT 3.8.7
>>>>>>>> PostgreSQL as it comes with RHELv5 + updates from yum
>>>>>>>>
>>>>>>>> Clearing the Mason cache didn't help :|
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Ken
>>>>>>>>>
>>>>>>>>> On Mon, Apr 26, 2010 at 12:47:22PM -0400, Jeff Blaine wrote:
>>>>>>>>>> On 4/26/2010 12:29 PM, Raed El-Hames wrote:
>>>>>>>>>>> Jeff;
>>>>>>>>>>>
>>>>>>>>>>> Does your CLI user have permissions on the queue that ticket 39 
>>>>>>>>>>> is
>>>>>>>>>>> in??
>>>>>>>>>>> login to the web interface with the same cli user and see if you
>>>>>>>>>>> can
>>>>>>>>>>> view the ticket.
>>>>>>>>>>
>>>>>>>>>> Yes, it does.
>>>>>>>>>>
>>>>>>>>>> Again, however, this is not really a report about an anomaly in
>>>>>>>>>> the RT CLI.
>>>>>>>>>>
>>>>>>>>>> The incorrect search results are returned via a web GUI search
>>>>>>>>>> of "Content matches foo.com"
>>>>>>>>>>
>>>>>>>>>> Here, maybe this makes it more clear, showing the same problem
>>>>>>>>>> when using the RT CLI:
>>>>>>>>>>
>>>>>>>>>> [root at rtsrv1 etc]# /apps/rt/bin/rt list "Content like foo.com"
>>>>>>>>>> Query:Content like 'foo.com'
>>>>>>>>>> Ticket Owner Queue Age Told Status Requestor Subject
>>>>>>>>>> --------------------------------------------------------------------------------
>>>>>>>>>>
>>>>>>>>>> 23 mbs Incid 1 wk resolv enVision@ alert
>>>>>>>>>> -NICAlert-Secur
>>>>>>>>>> [root at rtsrv1 etc]#
>>>>>>>>>>
>>>>>>>>>> [root at rtsrv1 etc]# /apps/rt/bin/rt show 39 | grep foo.com
>>>>>>>>>> foo.com blah blah... 1 line... not including in this email
>>>>>>>>>> [root at rtsrv1 etc]#
>>>>>>>>>>
>>>>>>>>>> [root at rtsrv1 etc]# /apps/rt/bin/rt show 23 | grep foo.com
>>>>>>>>>> foo.com blah blah... not including in this email
>>>>>>>>>> foo.com matching lines 66 more times... not including in this 
>>>>>>>>>> email
>>>>>>>>>> [root at rtsrv1 etc]#
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Regards;
>>>>>>>>>>> Roy
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Jeff Blaine wrote:
>>>>>>>>>>>> On 4/26/2010 11:50 AM, Kenneth Marshall wrote:
>>>>>>>>>>>>> Hi Jeff,
>>>>>>>>>>>>>
>>>>>>>>>>>>> There is nothing here that indicates a problem. It looks
>>>>>>>>>>>>> like an apples vs. oranges comparison by the time you include
>>>>>>>>>>>>> the actual parameters of the search from the web interface
>>>>>>>>>>>>> and the rt commandline interface and possible privilege and
>>>>>>>>>>>>> ACL differences. You can use DB query logging to figure out
>>>>>>>>>>>>
>>>>>>>>>>>> I think my original post is being misinterpreted. The 'rt'
>>>>>>>>>>>> CLI commands aren't doing a search. They're just showing
>>>>>>>>>>>> this list's readers that 'foo.com' does show up in each of
>>>>>>>>>>>> the tickets when doing a simple 'rt show<ticket>'. It's
>>>>>>>>>>>> not a comparison of "CLI search vs. web search".
>>>>>>>>>>>>
>>>>>>>>>>>>> what SQL is being used in the web search or the commandline
>>>>>>>>>>>>> rt and compare the output piece-wise to put yourself at ease.
>>>>>>>>>>>>> Maybe look at the individual components of each of the two
>>>>>>>>>>>>> tickets, as well.
>>>>>>>>>>>>
>>>>>>>>>>>> When viewing the tickets using 'Full headers" and then
>>>>>>>>>>>> "Ctrl-F" to examine every instance of 'foo.com' in each ticket
>>>>>>>>>>>> shows that both tickets have the 'foo.com' in text/html parts
>>>>>>>>>>>> (and only there).
>>>>>>>>>>>>
>>>>>>>>>>>> Ticket 23 has 67 of those parts and is returned when RT 
>>>>>>>>>>>> searching
>>>>>>>>>>>> for 'foo.com'
>>>>>>>>>>>>
>>>>>>>>>>>> Ticket 39 has 1 of those parts and is not returned when RT
>>>>>>>>>>>> searching
>>>>>>>>>>>> for 'foo.com'
>>>>>>>>>>>>
>>>>>>>>>>>> By "DB query logging" do you mean Set($StatementLog, "DEBUG");
>>>>>>>>>>>> or something?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for the reply, Ken
>>>>>>>>>>>>
>>>>>>>>>>>> Jeff
>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> Ken
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Apr 26, 2010 at 11:21:45AM -0400, Jeff Blaine wrote:
>>>>>>>>>>>>>> Does anyone have any suggestions for how to go about
>>>>>>>>>>>>>> figuring out what is wrong here?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 4/22/2010 2:09 PM, Jeff Blaine wrote:
>>>>>>>>>>>>>>> RT 3.8.7
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> A search for 'Content matches foo.com' is returning some
>>>>>>>>>>>>>>> tickets
>>>>>>>>>>>>>>> and missing others that clearly have foo.com in the Content.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [root at rtsrv1 bin]# ./rt show 39 | grep foo.com | wc -l
>>>>>>>>>>>>>>>> 1
>>>>>>>>>>>>>>>> [root at rtsrv1 bin]#
>>>>>>>>>>>>>>>> [root at rtsrv1 bin]# ./rt show 23 | grep foo.com | wc -l
>>>>>>>>>>>>>>>> 67
>>>>>>>>>>>>>>>> [root at rtsrv1 bin]#
>>>>>>>>>>>>>>> 23 shows up in the web search results.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 39 does not.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Any ideas?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Discover RT's hidden secrets with RT Essentials from O'Reilly
>>>>>>>>>>>>>>> Media.
>>>>>>>>>>>>>>> Buy a copy at http://rtbook.bestpractical.com
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Discover RT's hidden secrets with RT Essentials from O'Reilly
>>>>>>>>>>>>>> Media.
>>>>>>>>>>>>>> Buy a copy at http://rtbook.bestpractical.com
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Discover RT's hidden secrets with RT Essentials from O'Reilly
>>>>>>>>>>>> Media.
>>>>>>>>>>>> Buy a copy at http://rtbook.bestpractical.com
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>> Discover RT's hidden secrets with RT Essentials from O'Reilly Media.
>>>> Buy a copy at http://rtbook.bestpractical.com
>>>>
>>>
>>> Discover RT's hidden secrets with RT Essentials from O'Reilly Media.
>>> Buy a copy at http://rtbook.bestpractical.com
>>>
>>
>



More information about the rt-users mailing list