[rt-devel] Re: RT 2.1.56 (wrong charset)

Stefan Fischer info at debian.homeunix.net
Mon Feb 3 16:48:52 EST 2003


Hello Stan & Jesse,

Thank you for your investigations! Stan asked me for the Perl version on
my box:

hostname:~/rt-2-1-66# perl -v
This is perl, v5.6.1 built for i386-linux
hostname:~/rt-2-1-66# uname -a
Linux hostname 2.4.18 #5 Fri Jan 31 16:52:02 CET 2003 i686 unknown
Debian woody stable


The problem described further below is still present in 2.1.66.

Greetings Stefan!

>
> --- Jesse Vincent <jesse at bestpractical.com> wrote:
>> > > First is that the charset information should be
>> > > sent in the HTTP header,
>>
>> And, actually, it is:
>>
>> Content-Type: text/html; charset=utf-8
>
> Aha, the situation is more complicated: the strings that
> Stefan Fischer sent are neither Unicode nor ISO Latin-1.
>
> Unfortunately, I have no server at hand to check this quickly,
> but I suspect it went through these steps:
>
> 1)
> lib/RT/I18N/de.po is encoded in Latin-1.
>
> 2)
> Then it goes through lib/RT/I18N.pm and is presented as would-be Unicode.
> I'm not sure whether this stage really produces Unicode.
>
> 3)
> Then it goes through HTML::Entities (as requested by
> default_escape_flags => 'h'), and all non-ASCII characters are
> replaced with entities: &auml; for a-umlaut, etc.
> At this stage, the behaviour of HTML::Entities depends on the Perl
> version (Stefan, what's yours?).
>
> If it's 5.6, it treats each non-ASCII byte (remember, a non-ASCII
> character in UTF-8 takes two or more bytes) as a separate non-ASCII
> character, and produces one HTML entity per byte, i.e. two entities
> for each two-byte character.
>
> In 5.8, each non-ASCII Unicode character (two bytes) is
> replaced with a single HTML entity. In HTML::Entities, named entities
> are defined for Latin-1 symbols only. This means Cyrillic (Russian)
> characters would be replaced with (one or two?) numeric entities.
> Some browsers will survive that (if it's still one entity),
> but it's definitely the wrong way.
>
> The right way would be to avoid entity encoding entirely and
> send the plain text, with the correct charset in the HTTP header.
>
> With regards,
>
> Stan
>
>
>
>
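
The difference Stan describes in step 3 of the quoted message can be
illustrated with a small sketch. It is written in Python rather than Perl,
and encode_entities below is a hypothetical stand-in, not the HTML::Entities
function: it only assumes that a non-ASCII character gets a named entity when
one exists and a numeric entity otherwise. It compares a byte-by-byte view of
UTF-8 data (what Stan suspects for Perl 5.6) with a character-by-character
view (what he expects from 5.8).

```python
from html.entities import codepoint2name


def encode_entities(chars):
    """Hypothetical helper: replace every non-ASCII character with a named
    entity if HTML defines one, otherwise with a hex numeric entity."""
    out = []
    for ch in chars:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)
        elif cp in codepoint2name:
            out.append("&%s;" % codepoint2name[cp])
        else:
            out.append("&#x%X;" % cp)
    return "".join(out)


a_umlaut = "\u00e4"  # lower-case a-umlaut, two bytes in UTF-8 (0xC3 0xA4)

# Byte-oriented view: each UTF-8 byte is mistaken for one Latin-1 character.
per_byte = encode_entities(a_umlaut.encode("utf-8").decode("latin-1"))

# Character-oriented view: the umlaut is a single character.
per_char = encode_entities(a_umlaut)

# A Cyrillic letter has no named entity, so it falls back to a numeric one.
cyrillic = encode_entities("\u0416")

print(per_byte)  # &Atilde;&curren;
print(per_char)  # &auml;
print(cyrillic)  # &#x416;
```

In the byte-wise case one umlaut turns into two unrelated entities, which
would match pages that show two odd symbols per umlaut.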

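Stan's suggested fix, sending plain text with the correct charset instead of
entities, might look roughly like the following. Again this is a Python
sketch under assumptions, not RT code: render_utf8 is a hypothetical name,
and the only escaping it does is for the characters that are special in HTML
markup. The header value is the one Jesse quoted.

```python
from html import escape


def render_utf8(text):
    """Hypothetical helper: escape only HTML-special characters (&, <, >,
    quotes) and return the header line plus the body as UTF-8 bytes."""
    header = "Content-Type: text/html; charset=utf-8"
    body = escape(text, quote=True).encode("utf-8")
    return header, body


# Non-ASCII characters are left alone; only the angle brackets are escaped.
header, body = render_utf8("Gr\u00fc\u00dfe <RT>")
print(header)
print(body)
```

The umlaut and sharp s leave the function as raw UTF-8 bytes, so the browser
decodes them according to the charset announced in the header.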

-- 
Mit freundlichen Grüßen / Kind Regards
Stefan Fischer


