[Rt-devel] Encoding bug in ticket creation
Jesse Vincent
jesse at bestpractical.com
Mon Sep 1 17:53:33 EDT 2008
On Aug 26, 2008, at 2:51 AM, Otmar Lendl wrote:
> On 2008/08/15 11:08, Jesse Vincent <jesse at bestpractical.com> wrote:
>>>
>>> Aug 15 09:03:13 web1 RT: Attachment insert failed - ERROR: invalid
>>> byte sequence for encoding "UTF8": 0xf60a436f HINT: This error can
>>> also happen if the byte sequence does not match the encoding
>>> expected by the server, which is controlled by "client_encoding".
>>> (/opt/rt3/lib/RT/Attachment_Overlay.pm:153)
>
>>> I'm currently using RT 3.8.1rc5 with PostgreSQL 8.3.
>>>
>>
>> What version of DBD::Pg? What version of Perl?
>>
>> Does it happen if you submit the same thing by email, or if a custom
>> field contains an ö?
>
> FYI, I'm seeing the same error.
>
> Here it's RT 3.7.21 / one of the RTIR milestone releases. The rest is
> Debian stable, thus libdbd-pg-perl at 1.49-2, perl at 5.8.8-7etch3.
>
> In my case it's triggered by an incoming email, and I see the error in
> the postgres logfile as well:
>
> 2008-08-26 11:25:07 CEST ERROR: invalid byte sequence for encoding
> "UTF8": 0xe92064
> 2008-08-26 11:25:07 CEST HINT: This error can also happen if the
> byte sequence does not match the encoding expected by the server,
> which is controlled by "client_encoding".
>
> Drilling down on the email (It really pays to keep a copy of each
> incoming mail) I found it labeled as:
>
> Content-Type: text/plain; charset="utf-8"
> Content-Transfer-Encoding: quoted-printable
>
> There are a few signs of QP being applied, e.g.
>
> ---------------------------------------------------------------=20
>
> Now the kicker: the mail has a legal disclaimer appended (NOT
> attached!), which contains 8-bit characters. Checking the
> encoding shows that they are in latin1 and not utf-8.
>
> Great. In other words, the incoming mail is invalid. (Removing the
> disclaimer does indeed make RT happy.)
>
> Well, this is suboptimal. As a CERT, we can't just ignore slightly
> broken mails. Any ideas?
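
The failure quoted above can be reproduced outside of RT. What follows is
only a minimal sketch in Python (not RT's Perl code): it takes the three
bytes from the PostgreSQL log line (0xe92064), shows that they are not valid
UTF-8, and falls back to latin1. The function name try_decode and the choice
of latin1 as the fallback are assumptions made for illustration.

```python
def try_decode(payload: bytes, declared: str = "utf-8", fallback: str = "latin-1"):
    """Decode with the declared charset; fall back if the bytes are invalid.

    Returns (text, charset_actually_used). Hypothetical helper, not an RT API.
    """
    try:
        return payload.decode(declared), declared
    except UnicodeDecodeError:
        # latin-1 maps every byte to a code point, so this cannot fail,
        # but it is only a guess about what the sender meant.
        return payload.decode(fallback), fallback


# The byte sequence reported in the PostgreSQL log above.
sample = bytes.fromhex("e92064")
text, used = try_decode(sample)
print(repr(text), used)
```

In UTF-8, 0xe9 opens a three-byte sequence, so the following 0x20 (a plain
space) makes the sequence invalid. Read as latin1, the same bytes are an
"é" followed by a space and a "d".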
Perl's UTF-8 handling has a switch that replaces broken UTF-8 in text
with blanks. Alternatively, QP encoding could be forced before the write
to the DB on sensitive databases.
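
The two options just mentioned can be sketched roughly as follows. This is
Python used purely for illustration, not RT's own Perl code, and both
function names are hypothetical. Note that Python's "replace" error handler
substitutes U+FFFD rather than a blank; the idea is the same, and it loses
the original bytes. The QP variant keeps them, at the cost of storing
encoded content.

```python
import quopri


def replace_invalid_utf8(payload: bytes) -> str:
    """Option 1 (lossy): swap each undecodable sequence for U+FFFD."""
    return payload.decode("utf-8", errors="replace")


def qp_encode_if_invalid(payload: bytes):
    """Option 2 (lossless): store invalid UTF-8 as quoted-printable.

    Returns (bytes_to_store, content_encoding). Hypothetical helper.
    """
    try:
        payload.decode("utf-8")
        return payload, "none"
    except UnicodeDecodeError:
        # The QP output is pure ASCII, so a UTF-8 database accepts it.
        return quopri.encodestring(payload), "quoted-printable"


broken = bytes.fromhex("e92064")  # the bytes from the log above
print(repr(replace_invalid_utf8(broken)))
stored, encoding = qp_encode_if_invalid(broken)
print(stored, encoding)
```

With the second approach, whatever reads the record back has to undo the
encoding, for example with quopri.decodestring, before using the content.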
The routine you want to be looking at / fixing is lib/RT/Record.pm's
sub _EncodeLOB.
A patch would be most welcome.
-jesse
>
>
> /ol
> --
> -=- Otmar Lendl -- ol at bofh.priv.at -=-