[rt-users] RTx-shredder with a very large database

Luke Vanderfluit lvanderf at internode.com.au
Tue Feb 28 18:02:57 EST 2006


Hi.

]Jim Faulkner wrote:

>Hello,
>
>I run a large RT installation.  We have around 20 queues, 200-300 users, 
>and many more requestors.  I am currently running RT 3.0.12 on the 
>production RT, and RT 3.4.5 on a development machine.
>
>Unfortunately we get a lot of spam going into the RT queues.  The previous 
>administrator of RT set up a way of dealing with this -- before going into 
>RT all mail is checked with spamassassin, and if it reports a positive the 
>mail is put into an RT queue named "ZZSpam".  Unfortunately over the years 
>this "ZZSpam" queue has gotten extremely large, with about 250,000 
>tickets, and is probably taking up the bulk of the database because many 
>spams come with large attachments.
>
>Because this ZZSpam queue has become so unweildy, I'd like to clean out 
>all tickets from that queue, as well as all users, attachments, etc. that 
>are associated with those tickets.  I've tried the rtx-shredder program, 
>but it is extremely slow.  I issued this command on the devel RT, which 
>is a dual 2ghz Xeon  machine with 1 GB of RAM, and has no load on it:
>time /opt/rt3/sbin/rtx-shredder --force --plugin 
>'Tickets=queue,ZZSpam;limit,20'
>
>And it took 23 minutes and 17 seconds to complete.  Obviously if it takes 
>that long to delete 20 tickets, I cannot use the tool to clean out 250,000 
>tickets.
>
>Is there anything I can do to make rtx-shredder significantly faster?  If 
>not, does anyone have any advice as to how I should go about cleaning out 
>this massive queue?
>  
>
I'm afraid I have a similar story, no spam queue (spam is filtered 
before it reaches RT), but large amounts of redundant data.
I have a similar size RT installation.
I've installed rtx-shredder and tried that.
See my earlier post of 27Feb2006.

I thought that doing the deletes to mysql directly (that is, not using 
the RT api, might be faster) but I'm afraid that isn't necessarily so.

I have created a script that does this and it shows considerably long 
processing times.

What needs to be done  to delete a ticket is this:

/~~~~~~~~~~~~~~~~~~~~~~~~~~~
select * from Tickets where id = <ticketID>
<1>select id from Transactions where ObjectId = <ticketID>;
<4>select id from ObjectCustomFieldValues where ObjectId = <ticketID>;
<3>select id from Links where LocalTarget = <ticketID>;
<2>select id from Attachments where TransactionId = ( for each 
transactionid retrieved with <1> )

-delete from Attachments where id in ( for each Attachment retrieved 
with <2> )
-delete from Links where id = ( for each link in <3> )
-delete from ObjectCustomFieldValues where id in ( for each 
ObjectCustomFieldValue in <4> )
-delete from Transactions where id in ( for each Transaction in <1> )
-delete from Tickets where id = <ticketID>;

Additionally Stuff needs to be deleted from
principals, cachedgroupmembers, groupmembers, groups.

For each ticket:
-delete from principals where
    id=groups.id and groups.instance=tickets.id as int
-delete from cachedgroupmembers where groupid=groups.id
    and groups.instance=tickets.id
-delete from groupmembers where groupid=groups.id and
    groups.instance=tickets.id
-delete from groups where instance=tickets.id
\____________________________________________

There may be more that could be deleted, namely Users(requestors) who 
created the ticket etc.

I'm toying with the idea of a script that runs each night and cleans out 
1000 tickets or so, I have 90000 tickets to delete in first instance, 
after that a regular cleanup of resolved tickets.

Hope this contributes.

Kind regards.
Luke.

















>thanks for any help,
>Jim Faulkner
>
>_______________________________________________
>http://lists.bestpractical.com/cgi-bin/mailman/listinfo/rt-users
>
>Be sure to check out the RT Wiki at http://wiki.bestpractical.com
>
>Download a free sample chapter of RT Essentials from O'Reilly Media at http://rtbook.bestpractical.com
>
>WE'RE COMING TO YOUR TOWN SOON - RT Training in Amsterdam, Boston and
>San Francisco - Find out more at http://bestpractical.com/services/training.html
>  
>


-- 
Luke




More information about the rt-users mailing list