[rt-users] RTx-shredder with a very large database

Luke Vanderfluit lvanderf at internode.com.au
Wed Mar 1 16:40:32 EST 2006


Todd Chapman wrote:

>On Wed, Mar 01, 2006 at 12:19:10PM +1030, Luke Vanderfluit wrote:
>  
>
>>Not sure whether web-interface invoked actions and RT API processed 
>>requests will be equal in performance to straigh SQL from perl script.
>>Would have to test both to give objective reply.
>>I have tried both API and direct sql, both _are_ slow.
>>
>>BTW.
>>The stuff that I put in my script comes from the WIKI.
>>http://wiki.bestpractical.com/index.cgi?DatabaseAdmin
>>
>>Kind regards.
>>Luke
>>
>>    
>>
>
>I would guess that it's the database work that is taking
>the most time. You are guessing that it is the script
>overhead. It would be better to benchmark Shredder before
>you do a bunch of potentially wasted work. Perhaps adding
>an index or two would speed things up...
>
>-Todd
>  
>
Bill R. Williams <brw at etsu.edu> Wrote:

How about (for massive cleanup)
*  Dump the rt3 database (mysqldump -A rt3 >rt3.sql) plus whatever
   other switches might be needed to include the IF/DROP CREATE
   directives.
*  Stop MySQL or at least the RT database.
*  Process that dump file with a perl script which drops the things
   that associate with the tickets you want to delete -- a filter to
   omit the stuff as mentioned in your notes (below).
   (massDelete.pl <rt3.sql >rt3clean.sql)
*  Start MySQL (if stopped)
*  mysql -u rt_user <rt3clean.sql
   which will recreate the entire DB without the stuff you deleted.

I know that this sounds like killing a fly with a shotgun, but it
*might* actually be the easiest (quickest?) way to go.  Of course the
downside is that you would want to stop your RT database unitl you
filtered your dumped .sql, cause we're talking DROP/CREATE every
table.  BUT it might not take that long to process.
On the upside, you DB would be nice, clean, and organized.

Just a thought.  I'd certainly be interested in your thoughts on the
method.  'Cause I am uncomfortable with data that I can't clean.

My response:
/~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is a good approach in theory, however the drawback here is how to 
delete stuff, for example the attachements in the dump file.
I have tried editing the dump file. Given that my dump file is 14Gig, I 
haven't found an editor that can buffer that size file.
I use vi, it can't do that, or maybe it can but I haven't found out how.
I've tried several other editors which claim to be able to do it, but 
none effectively can.
By far the largest proportion of data in the dump file is in the 
attachments table.
It would take some experimentation before a watertight script can be run 
over the dump file. Given the amount of time it takes to do that, it 
would be very cumbersome to test a script. Testing a smaller dump file 
may be an option but it wouldn't be wise to apply the outcomes of that 
to the larger file and _assume_ that all is well.

\_____________________________________________________________________

Kind regards.



-- 
Luke

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.bestpractical.com/pipermail/rt-users/attachments/20060302/9660573b/attachment.htm>


More information about the rt-users mailing list