[Rt-devel] Creating tickets with non-ascii subject (and/or postgresql)

Bram rtdevel at lists.wizbit.be
Mon Nov 30 09:04:11 EST 2009


When I create a ticket in the Web Interface and the Subject contains a  
non-ascii character: "a é b" (that is: a + LATIN SMALL LETTER E WITH  
ACUTE + b) then RT produces the error:

'Ticket could not be created due to an internal error'

Looking at the log file:

[Mon Nov 30 17:35:41 2009] [warning]: DBD::Pg::st execute failed:  
ERROR:  invalid byte sequence for encoding "UTF8": 0xe92062
HINT:  This error can also happen if the byte sequence does not match  
the encoding expected by the server, which is controlled by  
"client_encoding". at /usr/share/perl5/DBIx/SearchBuilder/Handle.pm
  line 532, <DATA> line 301.  
(/usr/share/perl5/DBIx/SearchBuilder/Handle.pm:532)
[Mon Nov 30 17:35:41 2009] [warning]: RT::Handle=HASH(0xb06ee58)  
couldn't execute the query 'INSERT INTO Attachments (Subject,
ContentType, Headers, Creator, MessageId, Parent, Created,
TransactionId) VALUES (?, ?, ?, ?, ?, ?, ?, ?)' at
/usr/share/perl5/DBIx/SearchBuilder/Handle.pm line 545
[....]


What happened:
The database in PostgreSQL is set to UTF8 and an insert was attempted
with the byte \xe9 (Latin-1) instead of \xc3\xa9 (UTF-8).
\xe9 followed by " b" (0xe92062) is not a valid UTF-8 sequence, so
PostgreSQL rejects it.
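
To make the byte-level difference concrete, here is a minimal sketch
using only the core Encode module. The helper names hex_bytes and
is_valid_utf8 are made up for this illustration and are not part of RT
or DBIx::SearchBuilder.

```perl
use strict;
use warnings;
use Encode ();

# The subject from the report, as a Perl character string.
my $subject = "a \x{e9} b";

# The same text in two encodings.
my $latin1 = Encode::encode('ISO-8859-1', $subject);   # 61 20 e9 20 62
my $utf8   = Encode::encode('UTF-8',      $subject);   # 61 20 c3 a9 20 62

# Hypothetical helper: render a byte string as space-separated hex.
sub hex_bytes {
    my ($octets) = @_;
    return join ' ', map { sprintf('%02x', ord($_)) } split //, $octets;
}

# Hypothetical helper: 1 if the octets are well-formed UTF-8, else 0.
sub is_valid_utf8 {
    my ($octets) = @_;
    my $copy = $octets;    # decode() with a CHECK value may modify its input
    return eval { Encode::decode('UTF-8', $copy, Encode::FB_CROAK); 1 } ? 1 : 0;
}

printf "latin1: %-18s valid UTF-8: %d\n", hex_bytes($latin1), is_valid_utf8($latin1);
printf "utf8:   %-18s valid UTF-8: %d\n", hex_bytes($utf8),   is_valid_utf8($utf8);
```

The Latin-1 form contains the bytes e9 20 62, which is exactly the
sequence 0xe92062 quoted in the PostgreSQL error above.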

What does work: creating the ticket with the subject: "a b" and then  
changing it to "a é b"

Digging in the code:

RT::Attachments_Overlay::Create contains:
     # MIME::Head doesn't support perl strings well and can return
     # octets which later will be double encoded in low-level code
     my $head = $Attachment->head->as_string;
     utf8::decode( $head );

Dumping the contents of the variable $head (using Devel::Peek) before  
and after the decode:

SV = PV(0xbf57fb0) at 0xa9fd9d8
   REFCNT = 1
   FLAGS = (PADMY,POK,pPOK,UTF8)
   PV = 0xbf6c938 "MIME-Version: 1.0\nX-Mailer: MIME-tools 5.427  
(Entity 5.427)\nSubject: a \303\251 b\nContent-Type: text/plain\n"\0  
[UTF8 "MIME-Version: 1.0\nX-Mailer: MIME-tools 5.427 (Entity
5.427)\nSubject: a \x{e9} b\nContent-Type: text/plain\n"]
   CUR = 101
   LEN = 104

SV = PV(0xbf57fb0) at 0xa9fd9d8
   REFCNT = 1
   FLAGS = (PADMY,POK,pPOK)
   PV = 0xbf6c938 "MIME-Version: 1.0\nX-Mailer: MIME-tools 5.427  
(Entity 5.427)\nSubject: a \351 b\nContent-Type: text/plain\n"\0
   CUR = 100
   LEN = 104
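
The two dumps can be reproduced outside RT with a short sketch. It
assumes that the header string arrives as a Perl character string with
the UTF8 flag already on (as in the first dump); the literal below is
only a stand-in for $Attachment->head->as_string. As far as I can tell,
utf8::decode first downgrades such a string to bytes and then refuses
to decode it, because the lone \xe9 byte is not valid UTF-8.

```perl
use strict;
use warnings;

# Stand-in for the header string: a character string with the UTF8
# flag on (assumption, matching the first Devel::Peek dump).
my $head = "Subject: a \x{e9} b\n";
utf8::upgrade($head);
my $flag_before = utf8::is_utf8($head) ? 1 : 0;

# The call made in Create. It returns false when the string cannot be
# decoded as UTF-8.
my $decode_ok  = utf8::decode($head) ? 1 : 0;
my $flag_after = utf8::is_utf8($head) ? 1 : 0;

printf "flag before=%d decode ok=%d flag after=%d\n",
    $flag_before, $decode_ok, $flag_after;
```

The characters in $head are unchanged by this, but the UTF8 flag is
gone and the internal buffer now holds the single byte \xe9, which
matches the second dump (CUR drops from 101 to 100).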



Looking in git to find some history of this:

commit 0e92634d782383f5c64bece63962f1eb361f96fb (2008-08-01)
added the code:
     my $head = $Attachment->head->as_string;
     utf8::decode( $head );


commit 0d14f4e5a6a36597300559b8efe1716989cae61d (2009-05-01)
modified the code to:
   my $head = Encode::decode_utf8($Attachment->head->as_string);


commit 6f1f370a28146902391a5aa0e6aca3e6027d9b9a (2009-08-26)
reverted the change and changed it back to:
     my $head = $Attachment->head->as_string;
     utf8::decode( $head );

The reason for doing this apparently was:
   http://rt3.fsck.com/Ticket/Display.html?id=13278


The result seems to be that tickets with a non-ASCII character in the
subject cannot be created when PostgreSQL is used and the database
encoding is UTF8.


Does anyone remember why all this is necessary?


Best regards,

Bram




