[rt-devel] [PATCH] Preserve original charset + fix bogus UTF32LE guess on binary attachments

Autrijus Tang autrijus at autrijus.org
Fri Jun 27 07:05:13 EDT 2003


On Fri, Jun 27, 2003 at 01:29:33PM +0400, Dmitry Sivachenko wrote:
> Now RT sends text/html body as an attach, but it does not specify the
> encoding of that attach:
> 
> ------------=_1056705889-6534-1
> Content-Type: text/html
> Content-Disposition: inline
> Content-Transfer-Encoding: quoted-printable
> 
> The body of the original message was text/html; charset=windows-1251,
> but the information about charset is missing in the message RT sends to
> subscribers.

Indeed, that's a flaw in my OriginalEncoding implementation.

The patch below against lib/RT/I18N.pm should cure it, as well as
fix the bogus UTF32 guessing bug for binary attachments -- it now
only feeds the body of text/* entities to Encode::Guess, which is
IMHO the Right Thing to do.
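
To make the guessing change concrete, here is a minimal standalone sketch
of the idea. It is not the code in RT::I18N: guess_charset_for_part is a
hypothetical helper, and its plain-scalar arguments are stand-ins for what
the MIME head and body handle would provide. It assumes only the stock
Encode::Guess module with its default suspects.

```perl
use strict;
use warnings;
use Encode::Guess;    # exports guess_encoding(); default suspects: ascii, utf8

# Hypothetical helper for illustration only (not the code in RT::I18N).
# $mime_type, $head_text and $body_bytes are plain scalars standing in for
# the values RT would take from the MIME head and body handle.
sub guess_charset_for_part {
    my ( $mime_type, $head_text, $body_bytes ) = @_;

    # Always look at the header text; add the body only for text/* parts,
    # since NUL-heavy binary data tends to be mistaken for UTF-16/32.
    my $sample = $head_text;
    $sample .= $body_bytes if $mime_type =~ m{^text/}i;

    # guess_encoding() returns an encoding object on success
    # and a plain error string otherwise.
    my $guess = guess_encoding($sample);
    return ref($guess) ? $guess->name : undef;
}

# A PNG-like byte string (contains NULs) and a short UTF-8 text body.
my $png_like = "\x89PNG\r\n\x1a\n\x00\x00\x00\x0dIHDR";
print guess_charset_for_part( 'image/png', "Content-Type: image/png\n\n", $png_like )
  // 'undef', "\n";
print guess_charset_for_part( 'text/plain', "Content-Type: text/plain\n\n", "caf\xc3\xa9\n" )
  // 'undef', "\n";
```

With the body left out, the image part is judged on its ASCII header
alone, so the NUL bytes never reach the guesser; the text part is still
guessed from header plus body as before.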

Testing and feedback will be appreciated.

Thanks,
/Autrijus/


--- I18N.pm.orig	Fri Jun 27 18:43:29 2003
+++ I18N.pm	Fri Jun 27 18:53:43 2003
@@ -177,6 +177,10 @@
 	}
     }
 
+    # If this is a textual entity, we'd need to preserve its original encoding
+    $head->add( "X-RT-Original-Encoding" => $charset )
+	if $head->mime_attr('content-type.charset') or $head->mime_type =~ /^text/;
+
     return unless ( $head->mime_type =~ /^text\/plain$/i );
 
     my $body = $entity->bodyhandle;
@@ -210,7 +214,6 @@
         $head->mime_attr( "content-type" => 'text/plain' )
           unless ( $head->mime_attr("content-type") );
         $head->mime_attr( "content-type.charset" => $enc );
-        $head->add( "X-RT-Original-Encoding" => $charset );
         $entity->bodyhandle($new_body);
     }
 }
@@ -305,8 +308,14 @@
 	return $head->mime_attr("content-type.charset");
     }
 
-    my $body = $entity->bodyhandle or return;
-    return _GuessCharset( $head->as_string . $body->as_string );
+    if ( $head->mime_type =~ m{^text/}) {
+	my $body = $entity->bodyhandle or return;
+	return _GuessCharset( $head->as_string . $body->as_string );
+    }
+    else {
+	# potentially binary data -- don't guess the body
+	return _GuessCharset( $head->as_string );
+    }
 }
 
 # }}}