[rt-devel] Language detection bug
Andrew Snare
ASnare at allshare.nl
Fri Feb 7 05:23:48 EST 2003
At 01:28 PM 6/02/2003 -0500, Jesse Vincent wrote:
>On Thu, Feb 06, 2003 at 04:44:40PM +0100, Andrew Snare wrote:
> > At 08:48 PM 6/02/2003 +0800, Autrijus Tang wrote:
> > >This is because 'en-us' is provided, instead of 'en'. This is arguably
> > >wrong,
> > >since there's no US-specific things in that lexicon -- maybe just mv
> > >rt/lib/RT/I18N/en_us.po to en.po?
> >
> > While this fix may work, I think there's still a bug in the
> > language-matching. According to my reading of RFC2616, Section 14.4, if
> > 'en' is in my list, RT should be matching that against the 'en-us' that it
> > can supply.
>
>Which part of the language in that section? I'm not seeing it.
>FWIW, RT is using Locale::Maketext to do the parsing of the language
>tags. Switching from en-us to en seems to be at least _one_ of the right
>things to do. If we can make a case for anything else, I'm sure Sean
>would be happy to let us try to sell him on it.
I'm reading it again; it's a little ambiguous. The text we're discussing,
reformatted, is:
    A language-range matches a language-tag if:
      1) it exactly equals the tag; or if
      2) it exactly equals a prefix of the tag such that the
         first tag character following the prefix is "-".

    The special range "*", if present in the Accept-Language field,
    matches every tag not matched by any other range present in the
    Accept-Language field.
Definitions, restated to give context and also highlight any bad
assumptions[1] I'm making, are:
    language-range:
        One of the language tags in the client-supplied
        Accept-Language header. E.g. 'en' or 'en-au'.
    language-tag:
        The language tag of the available content on the server.
        E.g. 'en-us'.
        (NOTE: This is not explicitly defined, unfortunately, and I
        may be making an error in assuming this.)
The situation that has occurred is that people have 'en' in their
Accept-Language header, and the server has 'en-us' content available. It
seems to me that these should match, since 'en' is a prefix of 'en-us'.
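To make that reading concrete, here is a minimal sketch of the two-part
matching rule quoted above. It is written in Python purely for illustration;
it is not how Locale::Maketext or RT implements matching, the function name
range_matches_tag is hypothetical, and the "*" wildcard is deliberately left
out. It assumes tags compare case-insensitively, as RFC 2616 specifies for
language tags.

```python
def range_matches_tag(language_range: str, language_tag: str) -> bool:
    """Return True if a client language-range matches a server language-tag.

    Illustrative reading of RFC 2616, Section 14.4:
      1) the range exactly equals the tag, or
      2) the range is a prefix of the tag and the next character is "-".
    The "*" wildcard is not handled in this sketch.
    """
    rng = language_range.strip().lower()
    tag = language_tag.strip().lower()
    # Appending "-" to the range enforces the "followed by a hyphen" part
    # of the prefix rule, so 'en' does not match a tag such as 'eng'.
    return tag == rng or tag.startswith(rng + "-")
```

Under this reading, range_matches_tag('en', 'en-us') is true, while
range_matches_tag('en-us', 'en') is false.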
As you mention, switching from en-us to en is one of several possible
solutions. I'd argue against this switch, however, for the following reasons:
    1) If a user has 'en-us' on their list but not 'en', they won't get
       the content, because the prefix rule is one-way. Having 'en-us'
       without 'en' is apparently quite common.
    2) Although there might not be much that is specifically American in
       the translation, it will be American in subtle ways[2].
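The one-way nature of the prefix rule in point 1 can be shown with a small
sketch of server-side selection. Again this is Python for illustration only,
not RT or Locale::Maketext code; pick_language is a hypothetical helper, and
it ignores q-values and the "*" wildcard, simply taking the client's ranges
in the order given.

```python
from typing import Iterable, Optional


def pick_language(accepted_ranges: Iterable[str],
                  available_tags: Iterable[str]) -> Optional[str]:
    """Return the first available tag matched by any accepted range.

    Ranges are tried in the client's order; q-values and "*" are ignored.
    Matching follows the exact-or-hyphenated-prefix rule of RFC 2616,
    Section 14.4, compared case-insensitively.
    """
    tags = list(available_tags)
    for language_range in accepted_ranges:
        rng = language_range.strip().lower()
        for tag in tags:
            candidate = tag.strip().lower()
            if candidate == rng or candidate.startswith(rng + "-"):
                return tag
    return None  # nothing the server offers satisfies the client


# Server keeps 'en-us': both kinds of client are served.
served_en_client = pick_language(["en"], ["en-us"])
served_en_us_client = pick_language(["en-us"], ["en-us"])

# Server renames the lexicon to 'en': a client sending only 'en-us'
# no longer matches, because 'en-us' is not a prefix of 'en'.
renamed_en_us_client = pick_language(["en-us"], ["en"])
```

With 'en-us' on the server, both clients get 'en-us' back; after a rename to
'en', the client that sends only 'en-us' gets no match at all.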
I hope this helps, one way or the other.

Cheers,

 - Andrew
[1] My assumptions appear to at least match those in this post:
<http://groups.google.com/groups?selm=Pine.HPP.3.95a.1000121173010.24389J-100000%40hpplus01.cern.ch>
[2] For more information, see Section 2.2 of
<http://kfa.univ.szczecin.pl/histvar/american.html>