[rt-users] RT::I18N::DecodeMIMEWordsToEncoding and multiline headers

Thu Feb 5 09:59:47 EST 2004

Headers with folding whitespace are not decoded correctly: the leading
whitespace of a folded line may be removed while the line feed before it
is kept.  Our fix simply removes all line feeds from the output.

Test case:

Subject: =?ISO-8859-1?Q?Re=3A_=5BXXXXXX=23269=5D_=5BComment=5D_Frag?=
 =?ISO-8859-1?Q?e_zu_XXXXXX--xxxxxx_/_Xxxxx=FCxxxxxxxxxx?=

Without the fix, the following ends up in the database, which causes
SendEmail transactions to fail:

Subject: Re: [XXXXXX#269] [Comment] Frag
e zu XXXXXX--xxxxxx / Xxxxxüxxxxxxxxxx

With the fix, we get:

Subject: Re: [XXXXXX#269] [Comment] Frage zu XXXXXX--xxxxxx / Xxxxxüxxxxxxxxxx

Index: lib/RT/I18N.pm
===================================================================

--- lib/RT/I18N.pm	(revision 32)
+++ lib/RT/I18N.pm	(revision 33)
@@ -299,6 +299,10 @@
 	$str .= $prefix . $enc_str . $trailing;
     }
 
+    # We might have \n without trailing whitespace, which will result in
+    # invalid headers.
+    $str =~ s,\n,,gs;
+
     return ($str)
 }
 
This quick fix is too simplistic; it doesn't properly unfold lines.
The rules for this are rather complex and hardly anybody gets them
right, and in my current state of mind, I won't try to implement them in
a regexp. 8-)
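
For reference, here is a rough sketch of the two rules involved, assuming
they are applied to the raw header value before decoding rather than to
the decoder's output.  RFC 2822 unfolding drops a line break that is
followed by a space or tab; RFC 2047 additionally says that whitespace
separating two adjacent encoded-words is ignored.  The helper names
unfold_raw_header and decode_q_words are made up for illustration and are
not part of RT; only Q encoding is handled, and none of this has been
tested against RT itself.

```perl
use strict;
use warnings;
use Encode ();

# One RFC 2047 encoded-word: =?charset?encoding?encoded-text?=
my $EW = qr/=\?[^?\s]+\?[BbQq]\?[^?\s]*\?=/;

# Hypothetical helper (not RT's API): unfold a raw header value.
sub unfold_raw_header {
    my ($value) = @_;
    # RFC 2047: whitespace (including a fold) that only separates two
    # adjacent encoded-words is dropped entirely.
    $value =~ s/($EW)\s+(?=$EW)/$1/g;
    # RFC 2822: remove a line break that is followed by a space or tab,
    # but keep that whitespace character.
    $value =~ s/\r?\n(?=[ \t])//g;
    return $value;
}

# Hypothetical helper: decode Q-encoded words only.  B-encoded words are
# left untouched, and Encode::decode dies on an unknown charset.
sub decode_q_words {
    my ($value) = @_;
    $value =~ s{=\?([^?\s]+)\?[Qq]\?([^?\s]*)\?=}{
        my ($charset, $text) = ($1, $2);
        $text =~ tr/_/ /;                               # "_" stands for a space
        $text =~ s/=([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # =XX is a hex octet
        Encode::decode($charset, $text);
    }ge;
    return $value;
}
```

Applied to the test case above, decode_q_words(unfold_raw_header($raw))
joins "Frag" and "e" without a line feed or a stray space.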

Another option would be to properly fold subjects consisting of multiple
lines.  This would fix the SendEmail crash, but given that the decoder
is buggy anyway (it would introduce an improper space character after
"Frag" in the example above), this approach seems less than stellar.
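
To illustrate why re-folding is unattractive, here is a small sketch,
assuming the decoded subject contains a bare line feed as in the broken
output above.  The helper refold_header is hypothetical and not part of
RT.  It turns every bare line feed into a valid fold, which makes the
header legal again but leaves a space after "Frag" once a reader
unfolds it.

```perl
use strict;
use warnings;

# Hypothetical helper (not RT's API): indent every continuation line so
# that each embedded line feed becomes a valid fold.  A trailing line
# feed at the very end of the value is left alone.
sub refold_header {
    my ($value) = @_;
    $value =~ s/\n(?![ \t]|\z)/\n /g;
    return $value;
}

# The broken decoder output from the example, with a bare line feed.
my $decoded = "Re: [XXXXXX#269] [Comment] Frag\ne zu XXXXXX--xxxxxx";
my $folded  = refold_header($decoded);

# What a receiving mail client sees after RFC 2822 unfolding.
(my $seen_by_reader = $folded) =~ s/\n(?=[ \t])//g;
```

Here $seen_by_reader contains "Frag e zu" rather than "Frage zu", which
is the improper space mentioned above.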