[rt-users] HTML stripper

Todd Chapman rt at chaka.net
Wed Mar 31 08:36:01 EST 2004


Why re-invent the wheel?

http://scifi.squawk.com/demime.html

Works wonderfully for us.

-Todd

On Tue, Mar 30, 2004 at 07:45:53PM -0500, Craig Schenk wrote:
> I've seen numerous posts here from people ($self->include) wishing that RT
> could take incoming HTML mail and strip it down to plain text. I wrote this
> Perl script to do that. It may not be the most elegant solution, but it works
> and can be used for RT, Majordomo, whatever.
> 
> Basically it takes mail from STDIN and spits out a new one to STDOUT, so you
> can put in your /etc/aliases:
> 
>      rt-queue: "| htmldump | rt-mailgate etc..."
> 
> If incoming mail is a straight text/html MIME type, the script will run it
> through lynx -dump (or you can use html2txt) to generate a text version. Since
> this may not be the prettiest formatting, a header is attached saying "this
> was generated from HTML automatically etc" and the original HTML email is
> preserved. The output of the script in this case will be a multipart MIME
> email which has the text part first and then the HTML as another MIME
> attachment, given the name "original.html" so it's obvious when viewed in the
> RT ticket.
> 
> If the incoming email is already multipart, any text parts and attachments are
> passed on unchanged. HTML parts are treated as above, with the exception that
> if the MIME header already has a filename for the HTML part, it won't get
> given the "original.html" name.
> 
> I'm sure it can be improved, but it seems to work well enough for what I need.
> _______________________________________________
> rt-users mailing list
> rt-users at lists.bestpractical.com
> http://lists.bestpractical.com/mailman/listinfo/rt-users
> 
> Have you read the FAQ? The RT FAQ Manager lives at http://fsck.com/rtfm
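
For anyone who wants to see the shape of the filter described in the quoted message, here is a rough sketch. It is not Craig's Perl script and not demime: it is Python, it uses a crude standard-library tag stripper as a stand-in for lynx -dump, and the names convert, html_to_text and NOTICE are made up for illustration. It only looks at top-level parts, copies only a few headers, and assumes the notice wording.

```python
from email import message_from_string, policy
from email.message import EmailMessage, MIMEPart
from html.parser import HTMLParser

# Assumed wording; the original script's header text is not given in the post.
NOTICE = "[This text was generated automatically from an HTML message.]\n\n"


class _TextExtractor(HTMLParser):
    """Collects the character data between tags and ignores the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_text(html):
    """Crude stand-in for `lynx -dump`: drop tags, collapse whitespace."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def convert(raw):
    """Return the message with each top-level HTML part preceded by a text version."""
    msg = message_from_string(raw, policy=policy.default)
    parts = list(msg.iter_parts()) if msg.is_multipart() else [msg]
    if not any(p.get_content_type() == "text/html" for p in parts):
        return raw  # nothing to convert, pass the mail through untouched

    out = EmailMessage()
    for name in ("From", "To", "Cc", "Subject"):  # simplification: a few headers only
        if msg[name] is not None:
            out[name] = msg[name]
    out["MIME-Version"] = "1.0"
    out.make_mixed()

    for part in parts:
        if part.get_content_type() != "text/html":
            out.attach(part)  # text parts and attachments go through unchanged
            continue
        html = part.get_content()
        text_part = MIMEPart(policy=policy.default)
        text_part.set_content(NOTICE + html_to_text(html))
        html_part = MIMEPart(policy=policy.default)
        # Keep an existing filename; otherwise use "original.html".
        html_part.set_content(html, subtype="html",
                              filename=part.get_filename() or "original.html")
        out.attach(text_part)
        out.attach(html_part)
    return out.as_string()
```

To use it as a pipe in /etc/aliases the way the quoted message shows, a wrapper script would write convert(sys.stdin.read()) to STDOUT. Nested multipart/alternative structures are not handled here.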


