Punt to the operating system for character encodings #2

Merged 1 commit on Dec 5, 2015

Commits on Dec 5, 2015

  1. Punt to the operating system for character encodings

    Without this, "may contain any Unicode characters" seemed too
    ambiguous.
    
    I wish there were cleaner references for the {language}.{encoding}
    locales like en_US.UTF-8 and UTF-8.  But [1,2] seem too glib, and I
    can't find a more targeted UTF-8 link than just dropping folks into a
    Unicode chapter (which is what [1] does):
    
      The Unicode Standard, Version 6.0, §3.9 D92, §3.10 D95 (2011)
    
    With the current v8.0 (2015-06-17), it's still §3.9 D92 and §3.10 D95.
    
    The TR35 link is for:
    
      In addition, POSIX locales may also specify the character encoding,
      which requires the data to be transformed into that target encoding.
    
    and the POSIX §6.2 link is for:
    
      In other locales, the presence, meaning, and representation of any
      additional characters are locale-specific.
    
    [1]: https://en.wikipedia.org/wiki/UTF-8
    [2]: https://en.wikipedia.org/wiki/Locale#POSIX_platforms
    
    Signed-off-by: W. Trevor King <wking@tremily.us>
    Reviewed-by: Jesse Butler <jeeves.butler@gmail.com>
    wking committed Dec 5, 2015
    3606bcf