[Rt-commit] rt branch, 4.0/strict-decodelob-decoding, created. rt-4.0.18-120-g82403d5
Kevin Falcone
falcone at bestpractical.com
Mon Dec 16 17:38:05 EST 2013
The branch, 4.0/strict-decodelob-decoding has been created
at 82403d5aabad7bcc155fa89db301cd9bd1fa552d (commit)
- Log -----------------------------------------------------------------
commit 82403d5aabad7bcc155fa89db301cd9bd1fa552d
Author: Kevin Falcone <falcone at bestpractical.com>
Date: Mon Dec 16 16:14:22 2013 -0500
Instead of the flimsy utf8 encoding, use UTF-8 and fix bogus data.
Old versions of RT (especially those running on MySQL) were happy to pass
garbage into MySQL and it was stored there, lurking, waiting for you to
retrieve it. If you do retrieve it and then try to treat it like UTF-8
data (say, by passing it to another system that strictly handles UTF-8,
such as PostgreSQL), it will be rejected vigorously.
This converts from
Encode::decode('utf8','string');
which doesn't check the content and converts to perl's internal utf8, to
Encode::decode('UTF-8','string',Encode::FB_PERLQQ);
which validates the data as strict UTF-8 and will apply the FB_PERLQQ
fallback documented in the Encode docs under Handling Malformed Data.
This is similar to what we now do to all Web UI input in
RT::Interface::Web::DecodeArgs
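As a rough illustration of the behaviour described above, the sketch below decodes one well-formed and one malformed byte string with the strict encoding and the FB_PERLQQ fallback. The helper name decode_text_content and the sample strings are made up for this example and are not part of RT; only the Encode calls themselves come from the patch.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper that mirrors the patched line in _DecodeLOB.
sub decode_text_content {
    my ($content) = @_;

    # Strings already flagged as characters are left alone, as in the patch.
    return $content if Encode::is_utf8($content);

    # Strict UTF-8 decoding. FB_PERLQQ replaces each malformed octet with a
    # literal \xHH escape and leaves the source string unmodified.
    return Encode::decode( 'UTF-8', $content, Encode::FB_PERLQQ );
}

my $valid  = "caf\xC3\xA9";         # well-formed UTF-8 bytes
my $broken = "caf\xE9 au lait";     # lone Latin-1 octet 0xE9, not valid UTF-8

my $decoded_valid  = decode_text_content($valid);
my $decoded_broken = decode_text_content($broken);

printf "%d characters\n", length $decoded_valid;    # 4 characters
print "$decoded_broken\n";                          # caf\xE9 au lait
```

The malformed octet survives as visible ASCII text rather than being silently dropped or passed through, so the result should be safe to hand to a consumer that insists on valid UTF-8.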
diff --git a/lib/RT/Record.pm b/lib/RT/Record.pm
index 66a6d65..b498459 100644
--- a/lib/RT/Record.pm
+++ b/lib/RT/Record.pm
@@ -820,7 +820,7 @@ sub _DecodeLOB {
         return ( $self->loc( "Unknown ContentEncoding [_1]", $ContentEncoding ) );
     }
     if ( RT::I18N::IsTextualContentType($ContentType) ) {
-        $Content = Encode::decode_utf8($Content) unless Encode::is_utf8($Content);
+        $Content = Encode::decode('UTF-8',$Content,Encode::FB_PERLQQ) unless Encode::is_utf8($Content);
     }
     return ($Content);
 }
-----------------------------------------------------------------------