setlocale, localeconv - select or query locale
Synopsis
#include <locale.h>
char *setlocale(int category, const char *locale);
struct lconv *localeconv(void);
char *_setlocale_r(void *reent, int category, const char *locale);
struct lconv *_localeconv_r(void *reent);
Description
setlocale is the facility defined by ANSI C to condition the
execution environment for international collating and formatting
information; localeconv reports on the settings of the current locale.
This is a minimal implementation, supporting only the required "POSIX"
and "C" values for locale; strings representing other locales are not
honored unless _MB_CAPABLE is defined.
If _MB_CAPABLE is defined, POSIX locale strings are allowed, following the form
language[_TERRITORY][.charset][@modifier]
"language" is a two-character string per ISO 639 or, if none is available
for a given language, a three-character string per ISO 639-3.
"TERRITORY" is a country code per ISO 3166. For "charset" and "modifier"
see below.
In addition to the POSIX specifier, the following extension is supported
for backward compatibility with older implementations using newlib:
"C-charset". Instead of "C-", you can also specify "C.". Both variations
allow language-neutral locales to be specified with charsets other than ASCII,
for instance "C.UTF-8", which keeps all settings as in the C locale
but uses the UTF-8 charset.
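As an illustration of the locale strings described above, the following sketch tries a list of candidate names in order and stops at the first one that setlocale accepts. The helper name select_first_locale is hypothetical and not part of newlib; which candidates succeed depends on how the library was built.

```c
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper (not part of newlib): try each candidate locale
   name in order and return the first one that setlocale accepts, or
   NULL if every candidate is rejected. */
const char *select_first_locale(int category,
                                const char *const *candidates,
                                size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (setlocale(category, candidates[i]) != NULL)
            return candidates[i];
    }
    return NULL;
}
```

A caller might pass the candidates "C.UTF-8" and "C", in that order, so that a build that rejects the first name still ends up in the C locale.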
The following charsets are recognized:
"UTF-8", "JIS", "EUCJP", "SJIS", "KOI8-R", "KOI8-U", "GEORGIAN-PS",
"PT154", "TIS-620", "ISO-8859-x" with 1 <= x <= 16, or "CPxxx" with
xxx in [437, 720, 737, 775, 850, 852, 855, 857, 858, 862, 866, 874, 932,
1125, 1250, 1251, 1252, 1253, 1254, 1255, 1256, 1257, 1258].
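The numeric ranges above can be checked mechanically. The sketch below is only an illustration of the list as documented here, not the lookup newlib performs internally; the function names are hypothetical. It also excludes ISO-8859-12, which the Notes section says is refused.

```c
#include <stddef.h>

/* "CPxxx" numbers copied from the list in the text above. */
static const int listed_codepages[] = {
    437, 720, 737, 775, 850, 852, 855, 857, 858, 862, 866, 874, 932,
    1125, 1250, 1251, 1252, 1253, 1254, 1255, 1256, 1257, 1258
};

/* Hypothetical helper: returns 1 if "CP<number>" appears in the list. */
int is_listed_codepage(int number)
{
    size_t count = sizeof listed_codepages / sizeof listed_codepages[0];
    for (size_t i = 0; i < count; i++) {
        if (listed_codepages[i] == number)
            return 1;
    }
    return 0;
}

/* Hypothetical helper: returns 1 if "ISO-8859-<x>" is in the documented
   range 1..16, excluding 12, which does not exist. */
int is_listed_iso8859_part(int x)
{
    return x >= 1 && x <= 16 && x != 12;
}
```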
Charset names are case-insensitive. For instance, "EUCJP" and "eucJP"
are equivalent. Charset names with dashes can also be written without
dashes, as in "UTF8", "iso88591" or "koi8r". "EUCJP" and "EUCKR" are
also recognized with a dash, as "EUC-JP" and "EUC-KR".
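The matching rule can be pictured as a comparison that ignores letter case and dashes. The following is a minimal sketch under that assumption, with a hypothetical function name; it is not newlib's actual parser, and it is more permissive than the text strictly promises because it ignores a dash at any position.

```c
#include <ctype.h>

/* Advance past any run of dashes. */
static const char *skip_dashes(const char *s)
{
    while (*s == '-')
        s++;
    return s;
}

/* Hypothetical helper: returns 1 if the two charset names are equal
   when letter case and dashes are ignored, 0 otherwise. */
int charset_names_match(const char *a, const char *b)
{
    for (;;) {
        a = skip_dashes(a);
        b = skip_dashes(b);
        if (*a == '\0' || *b == '\0')
            return *a == *b;   /* match only if both names ended */
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
}
```

With this comparison, "UTF-8" matches "utf8" and "EUC-JP" matches "eucJP", while "KOI8-R" and "KOI8-U" remain distinct.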
Full support for all of the above charsets requires that newlib has been
built with multibyte support and with support for all ISO and Windows
code pages. Otherwise all single-byte charsets are simply mapped to ASCII.
Right now, only newlib for Cygwin is built with full charset support by default.
Under Cygwin, this implementation additionally supports the charsets
"GBK", "GB2312", "eucCN", "eucKR", and "Big5". Cygwin does not
support "JIS".
Cygwin additionally supports locales from the file /usr/share/locale/locale.alias.
("" is also accepted; if given, the settings are read from the
corresponding LC_* environment variables and $LANG according to POSIX rules.)
This implementation also supports the modifiers "cjknarrow" and
"cjkwide", which affect how the functions wcwidth and wcswidth
handle characters from the "CJK Ambiguous Width" category of characters
described at http://www.unicode.org/reports/tr11/#Ambiguous.
These characters have a width of 1 for single-byte charsets and a width of 2
for multibyte charsets other than UTF-8.
For UTF-8, their width depends on the language specifier:
it is 2 for "zh" (Chinese), "ja" (Japanese), and "ko" (Korean),
and 1 for everything else. Specifying "cjknarrow" or "cjkwide"
forces a width of 1 or 2, respectively, independent of charset and language.
If you use NULL as the locale argument, setlocale returns a
pointer to the string representing the current locale. The acceptable
values for category are defined in ‘locale.h’ as macros
beginning with "LC_".
localeconv returns a pointer to a structure (also defined in
‘locale.h’) describing the locale-specific conventions currently
in effect.
_localeconv_r and _setlocale_r are reentrant versions of
localeconv and setlocale respectively. The extra argument
reent is a pointer to a reentrancy structure.
Returns
A successful call to setlocale returns a pointer to a string
associated with the specified category for the new locale. The string
returned by setlocale is such that a subsequent call using that
string will restore that category (or all categories in case of LC_ALL)
to that state. The application shall not modify the returned string,
which may be overwritten by a subsequent call to setlocale.
On error, setlocale returns NULL.
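Because the returned string may be overwritten by a later call, code that wants to restore a locale afterwards has to copy the string first. The sketch below shows one way to do that; save_locale and restore_locale are hypothetical helper names, not part of newlib.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a heap-allocated copy of the current
   locale name for `category`, or NULL on failure. The copy is needed
   because setlocale may overwrite the string it returned. */
char *save_locale(int category)
{
    const char *current = setlocale(category, NULL);
    if (current == NULL)
        return NULL;
    size_t size = strlen(current) + 1;
    char *copy = malloc(size);
    if (copy != NULL)
        memcpy(copy, current, size);
    return copy;
}

/* Hypothetical helper: restore a locale saved with save_locale and
   free the copy. Returns 0 on success, -1 on failure. */
int restore_locale(int category, char *saved)
{
    if (saved == NULL)
        return -1;
    int ok = setlocale(category, saved) != NULL;
    free(saved);
    return ok ? 0 : -1;
}
```

A typical use would call save_locale, switch to a temporary locale with setlocale, do the locale-sensitive work, and then call restore_locale with the saved copy.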
localeconv returns a pointer to a structure of type lconv,
which describes the formatting and collating conventions in effect (in
this implementation, always those of the C locale).
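As a brief illustration of reading the returned structure, the sketch below looks at two of the standard struct lconv members. The helper names are hypothetical; decimal_point and thousands_sep are members defined by ISO C.

```c
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: return the decimal-point string of the
   current locale. The pointer refers to library-owned storage
   and must not be modified. */
const char *current_decimal_point(void)
{
    return localeconv()->decimal_point;
}

/* Hypothetical helper: print two numeric formatting fields of the
   current locale to `out`. */
void print_numeric_conventions(FILE *out)
{
    const struct lconv *conv = localeconv();
    fprintf(out, "decimal_point: \"%s\"\n", conv->decimal_point);
    fprintf(out, "thousands_sep: \"%s\"\n", conv->thousands_sep);
}
```

In the C locale, ISO C specifies that decimal_point is "." and thousands_sep is the empty string.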
Portability
ANSI C requires setlocale, but the only locale required across all
implementations is the C locale.
Notes
There is no ISO-8859-12 code page, so this implementation refuses it as well.
No supporting OS subroutines are required.