[rt-users] HTML stripper

Jonas Liljegren jonas at rit.se
Sat Apr 17 12:56:32 EDT 2004


Craig Schenk wrote:
> I've seen numerous posts here from people ($self->include) wishing that RT
> could take incoming HTML mail and strip them down to plain text. I wrote this
> Perl script to do this, it may not be the most elegant solution but it works
> and can be used for RT, MajorDomo, whatever.
> 
> Basically it takes mail from STDIN and spits out a new one to STDOUT, so you
> can put in your /etc/aliases:
> 
>      rt-queue: "| htmldump | rt-mailgate etc..."

I worked with this code for several hours, trying to make it work.

It didn't encode the other parts (like images).  It didn't set a header 
to indicate that the mail was filtered.

It shouldn't convert HTML if there is a text alternative for the HTML.

And I prefer ISO-8859-1.


I looked at a couple of other alternatives (demime and stripmime) that 
didn't do what I wanted.

First of all, it should create a text/plain part from the HTML if none exists.
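
To make the wish list above concrete, here is a rough sketch of the behaviour I have in mind. It is not dumphtml and not Perl; it is Python, using only the standard library, so that it is self-contained. The header name X-HTML-Filtered and all function names are made up for illustration, and the HTML-to-text conversion is deliberately crude.

```python
from email import message_from_binary_file, policy
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and skips <script>/<style> content."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag in ("br", "p", "div"):
            self.chunks.append("\n")  # crude line breaks

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def html_to_text(html):
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip()


def add_plain_alternative(msg, charset="iso-8859-1"):
    """Add a text/plain alternative only when the mail has none.

    Other parts (images, attachments) are left untouched.
    """
    if msg.get_body(preferencelist=("plain",)) is not None:
        return msg  # a text alternative already exists: do nothing
    container = msg.get_body(preferencelist=("html",))
    if container is None:
        return msg  # no HTML body to convert
    text = html_to_text(container.get_content())
    # Replace characters that the target charset cannot represent.
    text = text.encode(charset, "replace").decode(charset)
    # Turns the HTML part into multipart/alternative and appends the text.
    container.add_alternative(text, subtype="plain", charset=charset)
    # Least preferred part first: text/plain, then text/html.
    container.set_payload(container.get_payload()[::-1])
    # Hypothetical header marking the mail as filtered.
    msg["X-HTML-Filtered"] = "yes"
    return msg


def filter_stream(src, dst):
    """Read one mail from binary stream src, write the result to dst."""
    msg = message_from_binary_file(src, policy=policy.default)
    dst.write(add_plain_alternative(msg).as_bytes())
```

To use it as a pipe filter like the one in the aliases line, one would call filter_stream(sys.stdin.buffer, sys.stdout.buffer) from a small wrapper script.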


Before I continue modifying dumphtml, are there any other alternatives?
Craig, do you have a later version of dumphtml?

-- 
jonas at rit.se   RIT AB   http://www.rit.se
Box 70, 428 21 Kållered Besök: G:a Riksvägen 36
Tel: +46 (0)31 751 8600  Fax: +46 (0)31 751 8609


