[rt-users] A solution for decoding iso-8859-1 subjects

Jesper Holm Olsen dunkel at diku.dk
Tue May 9 11:00:16 EDT 2000


> > > Wouldn't it be better to put the hack in mail/manipulate.pm?
> > 
> > I thought about this too, but I think this is a "clean" way to do it as:
> > 
> > 1) It must be up to a user's mail client to decode the quoted-printable.
> 
> Ok, that's a point.  Anyway, isn't it also a problem that RT won't
> identify the "[MyTag #3432]" string when the subject is encoded?

You actually have a very good point there. Therefore I spent some time
doing what you suggested, and now I simply strip the encoding in
lib/rt/ui/manipulate/manipulate.pm in 'sub parse_headers':

sub parse_headers {
    my ($content) = "@_";

    ($headers, $body) = split(/\n\n/, $content, 2);

    foreach $line (split(/\n/, $headers)) {

        # By Jesper Holm Olsen (dunkel at diku.dk) 05/05/2000
        # This decodes quoted-printable in the subject.
        use MIME::QuotedPrint;
        my $decoded = decode_qp($line);
        $decoded =~ s/=\?iso-8859-1\?Q\?(.*)\?=/$1/;
        $decoded =~ s/_/ /g;
        $line = $decoded;


I added a substitution for '_'. This has the side effect of turning
any literal '_' in the original subject into a space, but I can live with
that. I have not read the appropriate RFC or found a Perl module that could
strip the encoding, so for now I just use this hack.
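
For comparison, here is a minimal sketch of a narrower variant of the hack
above. It only touches text inside an "=?iso-8859-1?Q?...?=" encoded word, so
a literal '_' elsewhere in the line survives. The sub names decode_q_payload
and decode_q_words are hypothetical and not part of RT, and the sketch assumes
a single character set (iso-8859-1). It does not handle B (base64) encoded
words or the rule about dropping whitespace between adjacent encoded words.

```perl
use strict;
use warnings;

# Hypothetical helper: decode the payload of one Q-encoded word.
sub decode_q_payload {
    my ($text) = @_;
    # In the Q encoding used for headers, "_" stands for a space.
    $text =~ tr/_/ /;
    # "=XX" (two hex digits) stands for the byte with that value.
    $text =~ s/=([0-9A-Fa-f][0-9A-Fa-f])/chr(hex($1))/ge;
    return $text;
}

# Hypothetical helper: replace every iso-8859-1 Q-encoded word in a
# header line with its decoded text and leave everything else alone.
sub decode_q_words {
    my ($line) = @_;
    # [^?]* cannot run past the closing "?=", unlike a greedy (.*).
    $line =~ s/=\?iso-8859-1\?Q\?([^?]*)\?=/decode_q_payload($1)/gie;
    return $line;
}
```

Called on 'Subject: =?iso-8859-1?Q?[MyTag_#3432]_bl=E5b=E6r?= my_file', this
should decode the tag and the 8-bit characters while leaving "my_file" as it
is.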

This works fine so far, and replies now get stored in the right
ticket.
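
To illustrate why the decoding matters for ticket matching, here is a small
sketch. The sub ticket_id_from_subject and its pattern are assumptions made
for illustration, not RT's actual parsing code. It looks for a "[MyTag #3432]"
style tag and returns the ticket number.

```perl
use strict;
use warnings;

# Hypothetical helper: return the ticket number from a "[Tag #123]"
# marker in a subject line, or undef if no such marker is found.
sub ticket_id_from_subject {
    my ($subject, $tag) = @_;
    # \Q...\E quotes the tag so that it is matched literally.
    return $subject =~ /\[\Q$tag\E\s+#(\d+)\]/ ? $1 : undef;
}
```

With a decoded subject such as 'Re: [MyTag #3432] printer' this yields 3432.
With the still-encoded form '=?iso-8859-1?Q?Re:_[MyTag_#3432]_printer?=' the
space inside the tag is an underscore, so a pattern like this finds nothing.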

> 
> > 2) If you make a reply, you'd have to encode it once again (this wouldn't
> >    be too hard though, just use encode_qp from MIME::QuotedPrint.)
> 
> I'd say there is no use for it.  Ok, according to the RFC the subject
> should only contain 7-bit characters, but in practice there are never any
> problems with it - at least not when sticking to only one character set
> (typically ISO-8859-1 for Denmark).
> 
Also a good point :-)
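
Should re-encoding ever be wanted after all, a rough sketch could look like
the following. The sub name encode_q_word is hypothetical, and the sketch
assumes the subject is in iso-8859-1 and short enough to fit in one encoded
word. It does no line folding or splitting of long subjects.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap a subject in one iso-8859-1 Q-encoded word.
sub encode_q_word {
    my ($text) = @_;
    # Printable 7-bit subjects need no encoding at all.
    return $text unless $text =~ /[^\x20-\x7E]/;
    my $q = $text;
    # Encode everything except letters, digits and spaces as =XX.
    $q =~ s/([^A-Za-z0-9 ])/sprintf("=%02X", ord($1))/ge;
    # Spaces are written as "_" inside an encoded word.
    $q =~ tr/ /_/;
    return "=?iso-8859-1?Q?${q}?=";
}
```

The sketch is deliberately conservative and encodes more characters than
strictly necessary, so that '?', '=' and '_' in the original subject cannot
be mistaken for part of the encoding.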

> > Also I have not tested if MySQL supports european characters.
> 
> No problems storing 8 bit characters in MySQL.  I'm not sure if it has any
> logic for separating different character sets ... I guess it uses Locale
> to decide how to handle case insensitive searches.

Well, it seems to work fine - for me anyway :)


--
Jesper Holm Olsen,  Department of Computer Science and Department of 
                    Film and Media studies, University of Copenhagen
Email: dunkel at diku.dk * Homepage: http://www.diku.dk/students/dunkel






