[rt-users] Different charsets problem

Jan Okrouhly okrouhly at civ.zcu.cz
Tue Feb 19 14:36:02 EST 2002

On Tue, 19 Feb 2002, Bruce Campbell wrote:

> On Mon, 18 Feb 2002, Jan Okrouhly wrote:
> > We have everyday serious problems with web ui - some characters
> > are displayed well (due to ugly hacked component with HTML header),
> > but when a user replies to message containing some accented characters
> > the system doesn't send/say anything or just say incorrect transaction
> > type etc. There are also problems with displaying Subjects/Bodies with
> > different charsets etc.
> Which particular charsets does your browser and apache server purport to
> know about?  You might find that the AddDefaultCharset (apache 1.3.12+)
> directive might help with your normal characterset, as RT does not add a
> charset by default (although you could put one in
> WebRT/html/Elements/Header).

AddDefaultCharset iso-8859-2 (it was set to On until today ;-) partly
helps with Netscape browsers (I think MSIE ignores it and uses the
Content-Type charset from my local/WebRT/html/Elements/Header).
But some problems will remain, because people here may use at least
iso-8859-1 (the M$ default, a mistake for Central Europe), iso-8859-2 (the
best bet here), or Win-1250, and UTF-8 too...
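
To illustrate why a single default charset cannot cover all of these clients, here is a minimal Python sketch (not RT code). The sample string and the helper name misread() are made up for illustration; the charset names are the ones listed above, with cp1250 standing in for Win-1250.

```python
# Hypothetical sample text; any Czech string containing š, ť or ž would do.
text = "žluťoučký kůň"

# The same text as it would arrive from clients using different charsets.
encoded = {cs: text.encode(cs) for cs in ("iso-8859-2", "cp1250", "utf-8")}


def misread(raw: bytes, assumed: str) -> str:
    """Decode bytes under an assumed charset, replacing undecodable bytes."""
    return raw.decode(assumed, errors="replace")


if __name__ == "__main__":
    for actual, raw in encoded.items():
        for assumed in ("iso-8859-1", "iso-8859-2", "cp1250", "utf-8"):
            shown = misread(raw, assumed)
            status = "ok" if shown == text else "garbled"
            print(f"sent as {actual:<10} read as {assumed:<10} -> {status}: {shown!r}")
```

Only the combinations where the assumed charset matches the real one come out intact. iso-8859-2 and cp1250 agree on most accented letters but differ on a few (for example ž and ť), which is why text can look almost right and still be damaged.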

Over time I'll test the behavior of more browsers on more platforms with
different encodings.
There was one interesting issue with HTML-encoded 'accented content':
the Ticket History seems to be OK, but during a reply (etc.) people see &#xxx
instead of the 'accented content'. So I think the problem is also on the
text area input side.
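
One way such a symptom can arise in general is sketched below in Python. This is only an illustration of numeric character references being escaped a second time; it is an assumption, not a diagnosis of what RT actually does. The stored string is a made-up example.

```python
import html

# Hypothetical stored body in which accented letters are kept as numeric
# character references (345 = ř, 237 = í, 353 = š).
stored = "P&#345;&#237;li&#353;"

# Emitted into a page as-is, a browser resolves the references and the
# reader sees the accented text.
as_displayed = html.unescape(stored)

# If the same string is HTML-escaped again before being placed in a
# <textarea>, each '&' becomes '&amp;' and the reader sees the literal
# '&#345;' sequences instead of the letters.
double_escaped = html.escape(stored)

if __name__ == "__main__":
    print(as_displayed)
    print(double_escaped)
```

If this is the mechanism, decoding the references before escaping the text for the form would keep the display and input sides consistent.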

> Handling multiple charsets is.. currently outside RT's abilities right now
> (see the charset comment in lib/RT/Interface/Email.pm), but Jesse has

I've looked at this (man MIME::Head /decode). This is just another
(maybe also important) problem with To, From, Subject etc. The current
behavior is just fine for me. The main problem is that the charset information
from Content-Type is not stored/used.
Content-Type: text/plain;
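
For comparison, decoding MIME-encoded header fields such as Subject can be sketched with Python's standard email.header module. This is an analogue of the header decoding discussed above, not the MIME::Head API; the function name and the sample Subject value are hypothetical.

```python
from email.header import decode_header, make_header


def decode_mime_header(value: str) -> str:
    """Turn an RFC 2047 encoded header value into a plain Unicode string."""
    return str(make_header(decode_header(value)))


if __name__ == "__main__":
    # Hypothetical Subject value: "Příliš" in iso-8859-2, Q-encoded.
    print(decode_mime_header("=?iso-8859-2?Q?P=F8=EDli=B9?="))
```

Each encoded word in a header names its own charset, so headers can be decoded without knowing anything about the charset of the body.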

I suppose the right behavior would be to re-encode all incoming plain text
into one internal encoding (UTF-8 would be the best). The
Attachments.ContentEncoding column could fit for that, but it would need a
BIG workaround ;-( (I think).
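
The re-encoding idea can be sketched in Python with the standard email package. This is a minimal sketch under stated assumptions, not RT's implementation: the function name body_as_utf8, the iso-8859-2 fallback and the sample message are all hypothetical.

```python
from email import message_from_bytes


def body_as_utf8(raw_message: bytes, fallback: str = "iso-8859-2") -> bytes:
    """Return the first text/plain body of a message re-encoded as UTF-8.

    The charset is taken from the part's Content-Type header; `fallback`
    (an assumed site default) is used when it is missing or unknown.
    """
    msg = message_from_bytes(raw_message)
    part = next((p for p in msg.walk() if p.get_content_type() == "text/plain"), None)
    if part is None:
        raise ValueError("message has no text/plain part")

    charset = part.get_content_charset() or fallback
    payload = part.get_payload(decode=True)  # undoes base64 / quoted-printable
    try:
        text = payload.decode(charset, errors="replace")
    except LookupError:  # charset label unknown to Python
        text = payload.decode(fallback, errors="replace")
    return text.encode("utf-8")


if __name__ == "__main__":
    # Hypothetical incoming mail: "Příliš" in iso-8859-2, quoted-printable.
    sample = (
        b"Content-Type: text/plain; charset=iso-8859-2\r\n"
        b"Content-Transfer-Encoding: quoted-printable\r\n"
        b"\r\n"
        b"P=F8=EDli=B9\r\n"
    )
    print(body_as_utf8(sample).decode("utf-8"))
```

Storing the original charset label next to the converted text would still allow replies to be sent back in the sender's encoding.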

> planned for it (see the SQL Users.{Lang,EmailEncoding,WebEncoding}
> columns).

Yes, I know that schema, but not the details of Jesse's plans (are they
somewhere on the web?). In my opinion .Lang will be usable, but one user often
has several different email and/or web encodings (in some heterogeneous/open

> Regards,

Thanks for the response

Jan Okrouhly
-----------------------------------------+-----okrouhly at civ.zcu.cz---
Laboratory for Computer Science             |    phone: (420 19) 7491588
University of West Bohemia                  | location: Univerzitni 22
Americka 42, 306 14 Pilsen, Czech Republic  |     room: UI404
------------------------------------------73!-de-OK1INC at OK0PPL.#BOH.CZE.EU-

