[rt-users] utf8 in email headers is not decoded properly

Mathieu Longtin mathieu at closetwork.org
Mon Nov 15 17:09:59 EST 2010


Using RT 3.8.8 with Perl 5.8.8 and Oracle, I tracked down this
problem:

- Reply to an existing ticket using a mail client that encodes emails
in UTF-8
- Put a non-ASCII character in the subject
- The email gets recorded with garbage in the subject rather than the
properly encoded subject

In theory, you shouldn't have raw UTF-8 in the subject header (it should
be RFC 2047 encoded), but Outlook does it, and so does Gmail. I don't know
about other mail clients.

This only happens with existing tickets. Here's how.

When receiving the mail, MIME::Parser doesn't decode the UTF-8, no matter
what. So it's all raw UTF-8 bytes in the header data structure.

RT adds the "RT-Ticket-ID" field. This string is properly encoded and
flagged as utf8.

When saving into the database, RT calls $Attachment->head->as_string, which
basically does a join of all the headers.

When joining strings in Perl 5, if any one of them is flagged as UTF-8,
then the resulting string is flagged as UTF-8 too, even if it contains
binary data. I don't know why; that's just what I found out.
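
The flag behaviour described above can be reproduced outside RT. This is
only a minimal sketch: the header values are made up, and utf8::upgrade
stands in for whatever RT does to produce its flagged string. When a byte
string is joined with a flagged string, Perl upgrades the byte string and
treats each byte as one Latin-1 character, so the two bytes of "é" become
the two characters U+00C3 U+00A9.

```perl
use strict;
use warnings;

# Raw UTF-8 bytes for "café", as they would sit in an undecoded header.
# (Illustrative value, not taken from a real message.)
my $raw_subject = "Subject: caf\xC3\xA9\n";

# A header that carries the UTF-8 flag. utf8::upgrade is used here only
# to force the flag on; it is an assumption about how to imitate RT.
my $rt_header = "RT-Ticket-ID: 1\n";
utf8::upgrade($rt_header);

# Concatenation upgrades the byte string: every byte becomes a character.
my $head = $raw_subject . $rt_header;

printf "raw subject flagged: %s\n", utf8::is_utf8($raw_subject) ? "yes" : "no";
printf "joined head flagged: %s\n", utf8::is_utf8($head)        ? "yes" : "no";
printf "head holds U+00C3 U+00A9: %s\n",
    index($head, "\x{C3}\x{A9}") >= 0 ? "yes" : "no";
```

The joined string reports the flag as set while still holding the
undecoded byte values as characters, which is the state that a flag check
cannot tell apart from properly decoded text.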

So when RT does this:
utf8::decode( $head ) unless utf8::is_utf8( $head );

$head tests as being UTF-8, even though it isn't really, so it doesn't get
decoded.
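
One possible way around the skipped decode, sketched under assumptions:
decode each unflagged piece before joining, so the flag check is made per
piece rather than on the joined string. The helper name join_decoded and
the header values are hypothetical; this is not RT code and says nothing
about where in RT such a change would belong.

```perl
use strict;
use warnings;

# Hypothetical helper: decode every unflagged piece as UTF-8 *before*
# joining, so no raw bytes get silently upgraded by the concatenation.
sub join_decoded {
    my @parts = @_;    # copies, so the caller's strings are untouched
    for my $part (@parts) {
        next if utf8::is_utf8($part);    # already character data
        utf8::decode($part);             # left unchanged if not valid UTF-8
    }
    return join '', @parts;
}

my $raw_subject = "Subject: caf\xC3\xA9\n";    # undecoded UTF-8 bytes
my $rt_header   = "RT-Ticket-ID: 1\n";
utf8::upgrade($rt_header);                     # imitate a flagged header

# The sequence from the post: join first, then a decode guarded by the flag.
my $head = $raw_subject . $rt_header;
utf8::decode($head) unless utf8::is_utf8($head);    # skipped, flag is set

# The sketch: decode first, then join.
my $fixed = join_decoded($raw_subject, $rt_header);

printf "guarded decode length: %d\n", length $head;
printf "decode-first length:   %d\n", length $fixed;
```

The decode-first string is one character shorter, because the two bytes
of "é" have become the single character U+00E9.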

And that goes into the database, which is then used everywhere, and I see
this in my tickets:
test d'à cçênts

Now, where should I fix this?
- In Attachment_Overlay, force the decoding
- In RT::I18N, SetMIMEHeadToEncoding.

Thanks

--
Mathieu Longtin
1-514-803-8977