[rt-users] Search results anomaly

Kenneth Marshall ktm at rice.edu
Mon May 17 15:00:11 EDT 2010


And what queries actually hit the database backend for these
two? If it is the same SQL query, then the problem is somewhere
else. If the query differs, try running the query manually to
see why it is giving unexpected results. Again as others have
mentioned, collation/encoding could cause these types of differences.

Cheers,
Ken

On Mon, May 17, 2010 at 02:48:49PM -0400, Jeff Blaine wrote:
> Another example of this failing:
>
> rt list -t ticket "Content like members.linode AND Owner = jsmith"
>
> 14 results
>
> rt list -t ticket "Content like linode AND Owner = jsmith"
>
> 77 results, all of which have "members.linode" in the content.
>
> On 4/26/2010 5:32 PM, Jeff Blaine wrote:
>> On 4/26/2010 5:04 PM, Kenneth Marshall wrote:
>>> I am not certain what to do. For myself, I would turn on
>>> SQL statement logging within the database. Set:
>>>
>>> log_min_duration_statement = 0
>>>
>>> in your postgresql.conf for the database and run both
>>> queries in RT. Then turn it back done and pick apart the
>>> query results to see if the problem is a logic problem or
>>> possibly an encoding issue which results in "equal" items
>>> not showing as "equal". Also, it looks like you are just
>>
>> These were sequential in the log as a result of my simple
>> "Content matches foo.com" query via the web GUI. One is
>> a COUNT and the other is for getting data, apparently.
>>
>> I formatted them here for easier reading.
>>
>> LOG: duration: 101.115 ms statement: EXECUTE <unnamed> [PREPARE:
>> SELECT COUNT(DISTINCT main.id) FROM Tickets main
>> JOIN Transactions Transactions_1 ON ( Transactions_1.ObjectId = main.id )
>> JOIN Attachments Attachments_2 ON ( Attachments_2.TransactionId =
>> Transactions_1.id )
>> WHERE (Transactions_1.ObjectType = 'RT::Ticket')
>> AND (main.Status != 'deleted')
>> AND ( ( Attachments_2.Content ILIKE '%foo.com%' ) )
>> AND (main.Type = 'ticket')
>> AND (main.EffectiveId = main.id)
>> ]
>>
>> LOG: duration: 105.350 ms statement: EXECUTE <unnamed> [PREPARE:
>> SELECT DISTINCT main.* FROM Tickets main
>> JOIN Transactions Transactions_1 ON ( Transactions_1.ObjectId = main.id )
>> JOIN Attachments Attachments_2 ON ( Attachments_2.TransactionId =
>> Transactions_1.id )
>> WHERE (Transactions_1.ObjectType = 'RT::Ticket')
>> AND (main.Status != 'deleted')
>> AND ( ( Attachments_2.Content ILIKE '%foo.com%' ) )
>> AND (main.Type = 'ticket')
>> AND (main.EffectiveId = main.id)
>> ORDER BY main.id ASC LIMIT 50
>> ]
>>
>> What "both queries" are you referring to? The RT CLI and the
>> RT web GUI both fail the same way with my 1 test query. I'm
>> confused by what you mean there.
>>
>> Thanks for the ideas, Ken
>>
>>> starting with such a low ticket number, but if you plan
>>> on using full content searches for any large-ish amount
>>> of data you should really plan on using the full-text index
>>> support for PostgreSQL/RT that is described in the wiki.
>>> Speaking as one whose database was brought to its knees by
>>> some creative data mining. :)
>>>
>>> Good luck and let us know what you find to be the cause.
>>> Regards,
>>>
>>> Ken
>>>
>>>
>>> On Mon, Apr 26, 2010 at 04:54:45PM -0400, Jeff Blaine wrote:
>>>> On 4/26/2010 4:52 PM, Kenneth Marshall wrote:
>>>>> Jeff,
>>>>>
>>>>> Are you using the Full-text index support from the wiki?
>>>>
>>>> Nope.
>>>>
>>>> Just RT 3.8.7 + RTFM 2.4.2. No add-ons/tweaks from the wiki.
>>>>
>>>>> There were some index bugs that may require you to re-index
>>>>> to fix, specifically some rows were not reported correctly
>>>>> as valid matches. I believe that the PostgreSQL release
>>>>> notes mentioned that need. Maybe that is your problem. We
>>>>> are using RT-3.8.5 and PostgreSQL 8.4.2 here with the full-text
>>>>> support without an issue.
>>>>>
>>>>> Regards,
>>>>> Ken
>>>>>
>>>>>
>>>>> On Mon, Apr 26, 2010 at 03:00:39PM -0400, Jeff Blaine wrote:
>>>>>> On 4/26/2010 2:19 PM, Kenneth Marshall wrote:
>>>>>>> Well, that knocks out the ACL issue. Do you think that your
>>>>>>> Mason cache is confused? Maybe stop RT, clear the cache, and
>>>>>>> restart RT to see if that helps. What DB backend are you using
>>>>>>> and which version of RT are you running?
>>>>>>
>>>>>> RT 3.8.7
>>>>>> PostgreSQL as it comes with RHELv5 + updates from yum
>>>>>>
>>>>>> Clearing the Mason cache didn't help :|
>>>>>>
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Ken
>>>>>>>
>>>>>>> On Mon, Apr 26, 2010 at 12:47:22PM -0400, Jeff Blaine wrote:
>>>>>>>> On 4/26/2010 12:29 PM, Raed El-Hames wrote:
>>>>>>>>> Jeff;
>>>>>>>>>
>>>>>>>>> Does your CLI user have permissions on the queue that ticket 39 is
>>>>>>>>> in??
>>>>>>>>> login to the web interface with the same cli user and see if you
>>>>>>>>> can
>>>>>>>>> view the ticket.
>>>>>>>>
>>>>>>>> Yes, it does.
>>>>>>>>
>>>>>>>> Again, however, this is not really a report about an anomaly in
>>>>>>>> the RT CLI.
>>>>>>>>
>>>>>>>> The incorrect search results are returned via a web GUI search
>>>>>>>> of "Content matches foo.com"
>>>>>>>>
>>>>>>>> Here, maybe this makes it more clear, showing the same problem
>>>>>>>> when using the RT CLI:
>>>>>>>>
>>>>>>>> [root at rtsrv1 etc]# /apps/rt/bin/rt list "Content like foo.com"
>>>>>>>> Query:Content like 'foo.com'
>>>>>>>> Ticket Owner Queue Age Told Status Requestor Subject
>>>>>>>> --------------------------------------------------------------------------------
>>>>>>>>
>>>>>>>> 23 mbs Incid 1 wk resolv enVision@ alert
>>>>>>>> -NICAlert-Secur
>>>>>>>> [root at rtsrv1 etc]#
>>>>>>>>
>>>>>>>> [root at rtsrv1 etc]# /apps/rt/bin/rt show 39 | grep foo.com
>>>>>>>> foo.com blah blah... 1 line... not including in this email
>>>>>>>> [root at rtsrv1 etc]#
>>>>>>>>
>>>>>>>> [root at rtsrv1 etc]# /apps/rt/bin/rt show 23 | grep foo.com
>>>>>>>> foo.com blah blah... not including in this email
>>>>>>>> foo.com matching lines 66 more times... not including in this email
>>>>>>>> [root at rtsrv1 etc]#
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Regards;
>>>>>>>>> Roy
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Jeff Blaine wrote:
>>>>>>>>>> On 4/26/2010 11:50 AM, Kenneth Marshall wrote:
>>>>>>>>>>> Hi Jeff,
>>>>>>>>>>>
>>>>>>>>>>> There is nothing here that indicates a problem. It looks
>>>>>>>>>>> like an apples vs. oranges comparison by the time you include
>>>>>>>>>>> the actual parameters of the search from the web interface
>>>>>>>>>>> and the rt commandline interface and possible privilege and
>>>>>>>>>>> ACL differences. You can use DB query logging to figure out
>>>>>>>>>>
>>>>>>>>>> I think my original post is being misinterpreted. The 'rt'
>>>>>>>>>> CLI commands aren't doing a search. They're just showing
>>>>>>>>>> this list's readers that 'foo.com' does show up in each of
>>>>>>>>>> the tickets when doing a simple 'rt show<ticket>'. It's
>>>>>>>>>> not a comparison of "CLI search vs. web search".
>>>>>>>>>>
>>>>>>>>>>> what SQL is being used in the web search or the commandline
>>>>>>>>>>> rt and compare the output piece-wise to put yourself at ease.
>>>>>>>>>>> Maybe look at the individual components of each of the two
>>>>>>>>>>> tickets, as well.
>>>>>>>>>>
>>>>>>>>>> When viewing the tickets using 'Full headers" and then
>>>>>>>>>> "Ctrl-F" to examine every instance of 'foo.com' in each ticket
>>>>>>>>>> shows that both tickets have the 'foo.com' in text/html parts
>>>>>>>>>> (and only there).
>>>>>>>>>>
>>>>>>>>>> Ticket 23 has 67 of those parts and is returned when RT searching
>>>>>>>>>> for 'foo.com'
>>>>>>>>>>
>>>>>>>>>> Ticket 39 has 1 of those parts and is not returned when RT
>>>>>>>>>> searching
>>>>>>>>>> for 'foo.com'
>>>>>>>>>>
>>>>>>>>>> By "DB query logging" do you mean Set($StatementLog, "DEBUG");
>>>>>>>>>> or something?
>>>>>>>>>>
>>>>>>>>>> Thanks for the reply, Ken
>>>>>>>>>>
>>>>>>>>>> Jeff
>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Ken
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Apr 26, 2010 at 11:21:45AM -0400, Jeff Blaine wrote:
>>>>>>>>>>>> Does anyone have any suggestions for how to go about
>>>>>>>>>>>> figuring out what is wrong here?
>>>>>>>>>>>>
>>>>>>>>>>>> On 4/22/2010 2:09 PM, Jeff Blaine wrote:
>>>>>>>>>>>>> RT 3.8.7
>>>>>>>>>>>>>
>>>>>>>>>>>>> A search for 'Content matches foo.com' is returning some
>>>>>>>>>>>>> tickets
>>>>>>>>>>>>> and missing others that clearly have foo.com in the Content.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> [root at rtsrv1 bin]# ./rt show 39 | grep foo.com | wc -l
>>>>>>>>>>>>>> 1
>>>>>>>>>>>>>> [root at rtsrv1 bin]#
>>>>>>>>>>>>>> [root at rtsrv1 bin]# ./rt show 23 | grep foo.com | wc -l
>>>>>>>>>>>>>> 67
>>>>>>>>>>>>>> [root at rtsrv1 bin]#
>>>>>>>>>>>>> 23 shows up in the web search results.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 39 does not.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Any ideas?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Discover RT's hidden secrets with RT Essentials from O'Reilly
>>>>>>>>>>>>> Media.
>>>>>>>>>>>>> Buy a copy at http://rtbook.bestpractical.com
>>>>>>>>>>>>>
>>>>>>>>>>>> Discover RT's hidden secrets with RT Essentials from O'Reilly
>>>>>>>>>>>> Media.
>>>>>>>>>>>> Buy a copy at http://rtbook.bestpractical.com
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Discover RT's hidden secrets with RT Essentials from O'Reilly
>>>>>>>>>> Media.
>>>>>>>>>> Buy a copy at http://rtbook.bestpractical.com
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>> Discover RT's hidden secrets with RT Essentials from O'Reilly Media.
>> Buy a copy at http://rtbook.bestpractical.com
>>
>
> Discover RT's hidden secrets with RT Essentials from O'Reilly Media.
> Buy a copy at http://rtbook.bestpractical.com
>



More information about the rt-users mailing list