[rt-users] Search results anomaly
Jeff Blaine
jblaine at kickflop.net
Mon May 17 15:20:00 EDT 2010
Here are the transactions as logged. FWIW, I find it curious
that all of these search failures for us so far are with
search terms that have a '.' in the term...
LOG: duration: 35.815 ms statement: EXECUTE <unnamed> [PREPARE:
SELECT DISTINCT main.* FROM Tickets main
JOIN Transactions Transactions_1
ON ( Transactions_1.ObjectId = main.id )
JOIN Attachments Attachments_2
ON ( Attachments_2.TransactionId = Transactions_1.id )
WHERE (Transactions_1.ObjectType = 'RT::Ticket')
AND (main.Status != 'deleted')
AND ( ( Attachments_2.Content ILIKE '%members.linode%' ) AND
main.Owner = '66')
AND (main.Type = 'ticket')
AND (main.EffectiveId = main.id)
ORDER BY main.id ASC ]
LOG: duration: 38.362 ms statement: EXECUTE <unnamed> [PREPARE:
SELECT DISTINCT main.* FROM Tickets main
JOIN Transactions Transactions_1
ON ( Transactions_1.ObjectId = main.id )
JOIN Attachments Attachments_2
ON ( Attachments_2.TransactionId = Transactions_1.id )
WHERE (Transactions_1.ObjectType = 'RT::Ticket')
AND (main.Status != 'deleted')
AND ( ( Attachments_2.Content ILIKE '%linode%' ) AND main.Owner
= '66')
AND (main.Type = 'ticket')
AND (main.EffectiveId = main.id)
ORDER BY main.id ASC ]
On 5/17/2010 3:00 PM, Kenneth Marshall wrote:
> And what queries actually hit the database backend for these
> two? If it is the same SQL query, then the problem is somewhere
> else. If the query differs, try running the query manually to
> see why it is giving unexpected results. Again as others have
> mentioned, collation/encoding could cause these types of differences.
>
> Cheers,
> Ken
>
> On Mon, May 17, 2010 at 02:48:49PM -0400, Jeff Blaine wrote:
>> Another example of this failing:
>>
>> rt list -t ticket "Content like members.linode AND Owner = jsmith"
>>
>> 14 results
>>
>> rt list -t ticket "Content like linode AND Owner = jsmith"
>>
>> 77 results, all of which have "members.linode" in the content.
>>
>> On 4/26/2010 5:32 PM, Jeff Blaine wrote:
>>> On 4/26/2010 5:04 PM, Kenneth Marshall wrote:
>>>> I am not certain what to do. For myself, I would turn on
>>>> SQL statement logging within the database. Set:
>>>>
>>>> log_min_duration_statement = 0
>>>>
>>>> in your postgresql.conf for the database and run both
>>>> queries in RT. Then turn it back done and pick apart the
>>>> query results to see if the problem is a logic problem or
>>>> possibly an encoding issue which results in "equal" items
>>>> not showing as "equal". Also, it looks like you are just
>>>
>>> These were sequential in the log as a result of my simple
>>> "Content matches foo.com" query via the web GUI. One is
>>> a COUNT and the other is for getting data, apparently.
>>>
>>> I formatted them here for easier reading.
>>>
>>> LOG: duration: 101.115 ms statement: EXECUTE<unnamed> [PREPARE:
>>> SELECT COUNT(DISTINCT main.id) FROM Tickets main
>>> JOIN Transactions Transactions_1 ON ( Transactions_1.ObjectId = main.id )
>>> JOIN Attachments Attachments_2 ON ( Attachments_2.TransactionId =
>>> Transactions_1.id )
>>> WHERE (Transactions_1.ObjectType = 'RT::Ticket')
>>> AND (main.Status != 'deleted')
>>> AND ( ( Attachments_2.Content ILIKE '%foo.com%' ) )
>>> AND (main.Type = 'ticket')
>>> AND (main.EffectiveId = main.id)
>>> ]
>>>
>>> LOG: duration: 105.350 ms statement: EXECUTE<unnamed> [PREPARE:
>>> SELECT DISTINCT main.* FROM Tickets main
>>> JOIN Transactions Transactions_1 ON ( Transactions_1.ObjectId = main.id )
>>> JOIN Attachments Attachments_2 ON ( Attachments_2.TransactionId =
>>> Transactions_1.id )
>>> WHERE (Transactions_1.ObjectType = 'RT::Ticket')
>>> AND (main.Status != 'deleted')
>>> AND ( ( Attachments_2.Content ILIKE '%foo.com%' ) )
>>> AND (main.Type = 'ticket')
>>> AND (main.EffectiveId = main.id)
>>> ORDER BY main.id ASC LIMIT 50
>>> ]
>>>
>>> What "both queries" are you referring to? The RT CLI and the
>>> RT web GUI both fail the same way with my 1 test query. I'm
>>> confused by what you mean there.
>>>
>>> Thanks for the ideas, Ken
>>>
>>>> starting with such a low ticket number, but if you plan
>>>> on using full content searches for any large-ish amount
>>>> of data you should really plan on using the full-text index
>>>> support for PostgreSQL/RT that is described in the wiki.
>>>> Speaking as one whose database was brought to its knees by
>>>> some creative data mining. :)
>>>>
>>>> Good luck and let us know what you find to be the cause.
>>>> Regards,
>>>>
>>>> Ken
>>>>
>>>>
>>>> On Mon, Apr 26, 2010 at 04:54:45PM -0400, Jeff Blaine wrote:
>>>>> On 4/26/2010 4:52 PM, Kenneth Marshall wrote:
>>>>>> Jeff,
>>>>>>
>>>>>> Are you using the Full-text index support from the wiki?
>>>>>
>>>>> Nope.
>>>>>
>>>>> Just RT 3.8.7 + RTFM 2.4.2. No add-ons/tweaks from the wiki.
>>>>>
>>>>>> There were some index bugs that may require you to re-index
>>>>>> to fix, specifically some rows were not reported correctly
>>>>>> as valid matches. I believe that the PostgreSQL release
>>>>>> notes mentioned that need. Maybe that is your problem. We
>>>>>> are using RT-3.8.5 and PostgreSQL 8.4.2 here with the full-text
>>>>>> support without an issue.
>>>>>>
>>>>>> Regards,
>>>>>> Ken
>>>>>>
>>>>>>
>>>>>> On Mon, Apr 26, 2010 at 03:00:39PM -0400, Jeff Blaine wrote:
>>>>>>> On 4/26/2010 2:19 PM, Kenneth Marshall wrote:
>>>>>>>> Well, that knocks out the ACL issue. Do you think that your
>>>>>>>> Mason cache is confused? Maybe stop RT, clear the cache, and
>>>>>>>> restart RT to see if that helps. What DB backend are you using
>>>>>>>> and which version of RT are you running?
>>>>>>>
>>>>>>> RT 3.8.7
>>>>>>> PostgreSQL as it comes with RHELv5 + updates from yum
>>>>>>>
>>>>>>> Clearing the Mason cache didn't help :|
>>>>>>>
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Ken
>>>>>>>>
>>>>>>>> On Mon, Apr 26, 2010 at 12:47:22PM -0400, Jeff Blaine wrote:
>>>>>>>>> On 4/26/2010 12:29 PM, Raed El-Hames wrote:
>>>>>>>>>> Jeff;
>>>>>>>>>>
>>>>>>>>>> Does your CLI user have permissions on the queue that ticket 39 is
>>>>>>>>>> in??
>>>>>>>>>> login to the web interface with the same cli user and see if you
>>>>>>>>>> can
>>>>>>>>>> view the ticket.
>>>>>>>>>
>>>>>>>>> Yes, it does.
>>>>>>>>>
>>>>>>>>> Again, however, this is not really a report about an anomaly in
>>>>>>>>> the RT CLI.
>>>>>>>>>
>>>>>>>>> The incorrect search results are returned via a web GUI search
>>>>>>>>> of "Content matches foo.com"
>>>>>>>>>
>>>>>>>>> Here, maybe this makes it more clear, showing the same problem
>>>>>>>>> when using the RT CLI:
>>>>>>>>>
>>>>>>>>> [root at rtsrv1 etc]# /apps/rt/bin/rt list "Content like foo.com"
>>>>>>>>> Query:Content like 'foo.com'
>>>>>>>>> Ticket Owner Queue Age Told Status Requestor Subject
>>>>>>>>> --------------------------------------------------------------------------------
>>>>>>>>>
>>>>>>>>> 23 mbs Incid 1 wk resolv enVision@ alert
>>>>>>>>> -NICAlert-Secur
>>>>>>>>> [root at rtsrv1 etc]#
>>>>>>>>>
>>>>>>>>> [root at rtsrv1 etc]# /apps/rt/bin/rt show 39 | grep foo.com
>>>>>>>>> foo.com blah blah... 1 line... not including in this email
>>>>>>>>> [root at rtsrv1 etc]#
>>>>>>>>>
>>>>>>>>> [root at rtsrv1 etc]# /apps/rt/bin/rt show 23 | grep foo.com
>>>>>>>>> foo.com blah blah... not including in this email
>>>>>>>>> foo.com matching lines 66 more times... not including in this email
>>>>>>>>> [root at rtsrv1 etc]#
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Regards;
>>>>>>>>>> Roy
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Jeff Blaine wrote:
>>>>>>>>>>> On 4/26/2010 11:50 AM, Kenneth Marshall wrote:
>>>>>>>>>>>> Hi Jeff,
>>>>>>>>>>>>
>>>>>>>>>>>> There is nothing here that indicates a problem. It looks
>>>>>>>>>>>> like an apples vs. oranges comparison by the time you include
>>>>>>>>>>>> the actual parameters of the search from the web interface
>>>>>>>>>>>> and the rt commandline interface and possible privilege and
>>>>>>>>>>>> ACL differences. You can use DB query logging to figure out
>>>>>>>>>>>
>>>>>>>>>>> I think my original post is being misinterpreted. The 'rt'
>>>>>>>>>>> CLI commands aren't doing a search. They're just showing
>>>>>>>>>>> this list's readers that 'foo.com' does show up in each of
>>>>>>>>>>> the tickets when doing a simple 'rt show<ticket>'. It's
>>>>>>>>>>> not a comparison of "CLI search vs. web search".
>>>>>>>>>>>
>>>>>>>>>>>> what SQL is being used in the web search or the commandline
>>>>>>>>>>>> rt and compare the output piece-wise to put yourself at ease.
>>>>>>>>>>>> Maybe look at the individual components of each of the two
>>>>>>>>>>>> tickets, as well.
>>>>>>>>>>>
>>>>>>>>>>> When viewing the tickets using 'Full headers" and then
>>>>>>>>>>> "Ctrl-F" to examine every instance of 'foo.com' in each ticket
>>>>>>>>>>> shows that both tickets have the 'foo.com' in text/html parts
>>>>>>>>>>> (and only there).
>>>>>>>>>>>
>>>>>>>>>>> Ticket 23 has 67 of those parts and is returned when RT searching
>>>>>>>>>>> for 'foo.com'
>>>>>>>>>>>
>>>>>>>>>>> Ticket 39 has 1 of those parts and is not returned when RT
>>>>>>>>>>> searching
>>>>>>>>>>> for 'foo.com'
>>>>>>>>>>>
>>>>>>>>>>> By "DB query logging" do you mean Set($StatementLog, "DEBUG");
>>>>>>>>>>> or something?
>>>>>>>>>>>
>>>>>>>>>>> Thanks for the reply, Ken
>>>>>>>>>>>
>>>>>>>>>>> Jeff
>>>>>>>>>>>
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>> Ken
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Apr 26, 2010 at 11:21:45AM -0400, Jeff Blaine wrote:
>>>>>>>>>>>>> Does anyone have any suggestions for how to go about
>>>>>>>>>>>>> figuring out what is wrong here?
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 4/22/2010 2:09 PM, Jeff Blaine wrote:
>>>>>>>>>>>>>> RT 3.8.7
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> A search for 'Content matches foo.com' is returning some
>>>>>>>>>>>>>> tickets
>>>>>>>>>>>>>> and missing others that clearly have foo.com in the Content.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [root at rtsrv1 bin]# ./rt show 39 | grep foo.com | wc -l
>>>>>>>>>>>>>>> 1
>>>>>>>>>>>>>>> [root at rtsrv1 bin]#
>>>>>>>>>>>>>>> [root at rtsrv1 bin]# ./rt show 23 | grep foo.com | wc -l
>>>>>>>>>>>>>>> 67
>>>>>>>>>>>>>>> [root at rtsrv1 bin]#
>>>>>>>>>>>>>> 23 shows up in the web search results.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 39 does not.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Any ideas?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Discover RT's hidden secrets with RT Essentials from O'Reilly
>>>>>>>>>>>>>> Media.
>>>>>>>>>>>>>> Buy a copy at http://rtbook.bestpractical.com
>>>>>>>>>>>>>>
>>>>>>>>>>>>> Discover RT's hidden secrets with RT Essentials from O'Reilly
>>>>>>>>>>>>> Media.
>>>>>>>>>>>>> Buy a copy at http://rtbook.bestpractical.com
>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Discover RT's hidden secrets with RT Essentials from O'Reilly
>>>>>>>>>>> Media.
>>>>>>>>>>> Buy a copy at http://rtbook.bestpractical.com
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>> Discover RT's hidden secrets with RT Essentials from O'Reilly Media.
>>> Buy a copy at http://rtbook.bestpractical.com
>>>
>>
>> Discover RT's hidden secrets with RT Essentials from O'Reilly Media.
>> Buy a copy at http://rtbook.bestpractical.com
>>
>
More information about the rt-users
mailing list