[rt-devel] Language detection bug

Andrew Snare ASnare at allshare.nl
Fri Feb 7 05:23:48 EST 2003


At 01:28 PM 6/02/2003 -0500, Jesse Vincent wrote:
>On Thu, Feb 06, 2003 at 04:44:40PM +0100, Andrew Snare wrote:
> > At 08:48 PM 6/02/2003 +0800, Autrijus Tang wrote:
> > >This is because 'en-us' is provided, instead of 'en'.  This is arguably
> > >wrong,
> > >since there's no US-specific things in that lexicon -- maybe just mv
> > >rt/lib/RT/I18N/en_us.po to en.po?
> >
> > While this fix may work, I think there's still a bug in the
> > language-matching. According to my reading of RFC2616, Section 14.4, if
> > 'en' is in my list, RT should be matching that against the 'en-us' that it
> > can supply.
>
>Which part of the language in that section? I'm not seeing it.
>FWIW, RT is using Locale::Maketext to do the parsing of the language
>tags. Switching from en-us to en seems to be at least _one_ of the right
>things to do. If we can make a case for anything else, I'm sure Sean
>would be happy to let us try to sell him on it.

I'm reading it again; it's a little ambiguous. The text we're discussing, 
reformatted, is:
         A language-range matches a language-tag if:
                 1) it exactly equals the tag; or if
                 2) it exactly equals a prefix of the tag such that the
                    first tag character following the prefix is "-".
         The special range "*", if present in the Accept-Language field,
         matches every tag not matched by any other range present in the
         Accept-Language field.

Definitions, restated to give context and also highlight any bad 
assumptions[1] I'm making, are:
         language-range:
                 One of the language tags in the client-supplied
                 Accept-Language header. Eg: 'en' or 'en-au'
         language-tag:
                 The language tag of the available content on the server.
                 Eg: 'en-us'
                 (NOTE: This is not explicitly defined, unfortunately, and I
                  may be making an error in assuming this.)

The situation that has occurred is that people have 'en' in their
Accept-Language header, and the server has 'en-us' content available. It
seems to me that these should match, since 'en' is a prefix of 'en-us' and
the character following that prefix is "-".
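
To make that reading concrete, here is a minimal sketch of the two matching
rules quoted above. It is written in Python purely for illustration; it is
not RT's code and not how Locale::Maketext does it. The function name
range_matches_tag is hypothetical, the "*" range is deliberately left out,
and the lower-casing assumes that language tags compare case-insensitively.

```python
def range_matches_tag(language_range, language_tag):
    """Return True if a client language-range matches a server language-tag.

    Hypothetical helper illustrating the two quoted rules only; the
    special "*" range is not handled here.
    """
    # Assumption: tags are compared case-insensitively.
    rng = language_range.lower()
    tag = language_tag.lower()
    # Rule 1: the range exactly equals the tag.
    # Rule 2: the range equals a prefix of the tag, and the first tag
    #         character following the prefix is "-".
    return tag == rng or tag.startswith(rng + "-")


print(range_matches_tag("en", "en-us"))  # client 'en', server 'en-us'
print(range_matches_tag("en-us", "en"))  # client 'en-us', server 'en'
print(range_matches_tag("en", "eng"))    # prefix not followed by "-"
```

Under this reading the first call matches and the other two do not: the
rule only works from the shorter range to the longer tag.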

As you mention, switching from en-us to en is one of several possible 
solutions. I'd argue against this switch however, for the following reasons:
         1) If a user has 'en-us' on the list, but not 'en', they won't get
            the content. (The prefix rule is one-way.) This is apparently
            quite common.
         2) Although there might not be much specifically American in the
            translation, it will be American in subtle ways[2].

I hope this helps, one way or the other. Cheers,

  - Andrew

[1] My assumptions appear to at least match those in this post:
<http://groups.google.com/groups?selm=Pine.HPP.3.95a.1000121173010.24389J-100000%40hpplus01.cern.ch>
[2] For more information, see Section 2.2 of 
<http://kfa.univ.szczecin.pl/histvar/american.html>





