[Rt-devel] [BUG] OriginalContent can return string with utf8 flag on
Ruslan U. Zakirov
Ruslan.Zakirov at acronis.com
Tue Oct 12 06:24:44 EDT 2004
RT should not use Encode::from_to at all the way it does now, that is,
without a CHECK argument. Encode then uses the default, 0.
perldoc Encode:
If CHECK is 0, (en|de)code will put a substitution character in place of
a malformed character.
code example:
perl -e 'my $str = "\x{442}\x{435}\x{441}\x{442}"; require Encode;
Encode::_utf8_off($str); Encode::from_to($str, "utf8"=> "us-ascii");
print $str;'
The output is '????' instead of a croak. RT doesn't like to die, but the
current behaviour hides bugs. Now we have corrupted data in the DB.
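For comparison, here is a minimal sketch of the same conversion with and without a CHECK value. It assumes an Encode version whose from_to accepts CHECK as an optional fourth argument, and the variable names are only illustrative:

```perl
use strict;
use warnings;
use Encode ();

# Octets of the UTF-8 encoded Cyrillic word used in the one-liner.
my $octets = Encode::encode( 'UTF-8', "\x{442}\x{435}\x{441}\x{442}" );

# Default CHECK (0): unmappable characters are silently replaced.
my $lossy = $octets;
Encode::from_to( $lossy, 'utf8', 'us-ascii' );

# CHECK = FB_CROAK: the same conversion dies; eval keeps the caller alive
# and the error text stays available for logging.
my $strict = $octets;
my $ok     = eval {
    Encode::from_to( $strict, 'utf8', 'us-ascii', Encode::FB_CROAK() );
    1;
};
my $error = $ok ? '' : $@;

print "lossy result: $lossy\n";
print "strict conversion ", ( $ok ? "succeeded" : "failed: $error" ), "\n";
```

With a croaking CHECK the failure is visible to the caller, which can then decide to keep the original data instead of storing question marks.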
Code path in RT that triggers this bug:
sub RT::I18N::SetMIMEEntityToEncoding {
    ...
    eval {
        $RT::Logger->debug("Converting '$charset' to '$enc' for ".
            $head->mime_type . " - ". $head->get('subject'));
        # NOTE:: see the comments at the end of the sub.
        Encode::_utf8_off( $lines[$_] ) foreach ( 0 .. $#lines );
        Encode::from_to( $lines[$_], $charset => $enc ) for ( 0 .. $#lines );
    };
    ...
}
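One possible direction, only as a rough sketch and not RT code: convert a copy of each line with a croaking CHECK and keep the original octets when the conversion fails. The helper name convert_lines_strict is hypothetical, and the sketch again assumes that from_to accepts CHECK as its fourth argument:

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper, not part of RT: convert each line in place and
# return the indexes of the lines that could not be converted cleanly.
sub convert_lines_strict {
    my ( $lines, $charset, $enc ) = @_;
    my @failed;
    for my $i ( 0 .. $#{$lines} ) {
        my $copy = $lines->[$i];    # work on a copy, keep the original
        Encode::_utf8_off($copy);
        my $ok = eval {
            Encode::from_to( $copy, $charset, $enc, Encode::FB_CROAK() );
            1;
        };
        if ($ok) { $lines->[$i] = $copy }
        else     { push @failed, $i }
    }
    return @failed;
}

my @lines = (
    "plain ascii\n",
    Encode::encode( 'UTF-8', "\x{442}\x{435}\x{441}\x{442}\n" ),
);
my @failed = convert_lines_strict( \@lines, 'utf8', 'us-ascii' );
print "failed line indexes: @failed\n";
```

The caller can log the failed indexes or skip the conversion for the whole entity, so nothing is overwritten with substitution characters.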
Ruslan U. Zakirov wrote:
> Hello.
> sub Attachment::OriginalContent {
> ...
> if (!$enc || $enc eq '' || $enc eq 'utf8' || $enc eq 'utf-8') {
> # If we somehow fail to do the decode, at least push out the raw bits
> eval {return( Encode::decode_utf8($content))} || return ($content);
> }
> ...
> }
>
> decode_utf8 returns string instead of octets.
>
> perldoc Encode:
> $string = decode_utf8($octets [, CHECK]);
>
>
> Best regards. Ruslan.
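To illustrate the point quoted above, a small sketch (variable names are only illustrative) showing that decode_utf8 returns a character string with the utf8 flag on, and that encode_utf8 turns it back into octets:

```perl
use strict;
use warnings;
use Encode ();

my $octets = "\xd1\x82\xd0\xb5\xd1\x81\xd1\x82";    # UTF-8 bytes, flag off
my $string = Encode::decode_utf8($octets);          # characters, flag on
my $back   = Encode::encode_utf8($string);          # octets again, flag off

for my $item ( [ octets => $octets ], [ string => $string ], [ back => $back ] ) {
    my ( $name, $value ) = @$item;
    printf "%s: utf8 flag %s, length %d\n",
        $name, ( Encode::is_utf8($value) ? 'on' : 'off' ), length $value;
}
```

A caller that expects octets from OriginalContent would need the encode step, or the method would have to return the raw content for utf8 attachments.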