C Programming/wchar.h: Difference between revisions

From Wikibooks, open books for an open world
Rejected the last text change (by 27.251.143.2) and restored revision 2612872 by Fishpi
ce
Line 3: Line 3:
{{Wikipedia|template:C Standard library}}
{{Wikipedia|template:C Standard library}}
{{Wikipedia|wide character}}
{{Wikipedia|wide character}}
'''wchar.h''' is a [[C++ Programming#.h|header file]] in the [[C Programming/Standard libraries|C standard library]]. It is a part of the extension to the [[C (programming language)|C programming language]] standard done in 1995. It contains extended multibyte and wide character utilities. The standard header <wchar.h> is included to perform input and output operations on wide streams. It can also be used to manipulate the wide strings.<ref>http://www.qnx.com/developers/docs/6.4.1/dinkum_en/c99/wchar.html</ref>
'''wchar.h''' is a header file in the [[C Programming/Standard libraries|C standard library]]. It is a part of the extension to the C programming language standard done in 1995. It contains extended multibyte and wide character utilities. The standard header <wchar.h> is included to perform input and output operations on wide streams. It can also be used to manipulate the wide strings.<ref>http://www.qnx.com/developers/docs/6.4.1/dinkum_en/c99/wchar.html</ref>


==Wide Characters==
==Wide Characters==
C is a [[programming language]] that was developed in an environment where the dominant character set was the 7-bit [[ASCII]] code. Hence since then the 8-bit byte is the most common unit of encoding. However when a software is developed for an international purpose, it has to be able to represent different characters. For example character encoding schemes to represent the Indian, Chinese, Japanese writing systems should be available. The inconvenience of handling such varied multibyte characters can be eliminated by using characters that are simply a uniform number of bytes. [[ANSI C]] provides a type that allows manipulation of variable width characters as uniform sized data objects called [[wide characters]]. The wide character set is a [[superset]] of already existing character sets, including the 7-bit ASCII.<ref>http://books.google.co.in/books?id=4Mfe4sAMFUYC&pg=PT26&lpg=PT26&dq=wide+characters+as+superset&source=bl&ots=tPLP1nN4qh&sig=f2W0ys85Ms9lRdT4HBEf_yoNL2U&hl=en&ei=H3SMTpiJFpGzrAfzvOycAg&sa=X&oi=book_result&ct=result&resnum=2&sqi=2&ved=0CCIQ6AEwAQ#v=onepage&q&f=false</ref>
C is a programming language that was developed in an environment where the dominant character set was the 7-bit ASCII code. Hence since then the 8-bit byte is the most common unit of encoding. However when a software is developed for an international purpose, it has to be able to represent different characters. For example character encoding schemes to represent the Indian, Chinese, Japanese writing systems should be available. The inconvenience of handling such varied multibyte characters can be eliminated by using characters that are simply a uniform number of bytes. ANSI C provides a type that allows manipulation of variable width characters as uniform sized data objects called wide characters. The wide character set is a superset of already existing character sets, including the 7-bit ASCII.<ref>http://books.google.co.in/books?id=4Mfe4sAMFUYC&pg=PT26&lpg=PT26&dq=wide+characters+as+superset&source=bl&ots=tPLP1nN4qh&sig=f2W0ys85Ms9lRdT4HBEf_yoNL2U&hl=en&ei=H3SMTpiJFpGzrAfzvOycAg&sa=X&oi=book_result&ct=result&resnum=2&sqi=2&ved=0CCIQ6AEwAQ#v=onepage&q&f=false</ref>


==Declarations and Definitions==
==Declarations and Definitions==
Line 12: Line 12:
The standard header wchar.h contains the definitions or declarations of some constants.
The standard header wchar.h contains the definitions or declarations of some constants.
:'''NULL'''
:'''NULL'''
:It is a [[Null pointer]] constant. It never points to a real object.
:It is a null pointer constant. It never points to a real object.
:'''WCHAR_MIN'''
:'''WCHAR_MIN'''
:It indicates the lower limit or the minimum value for the type wchar_t.
:It indicates the lower limit or the minimum value for the type wchar_t.
Line 18: Line 18:
:It indicates the upper limit or the maximum value for the type wchar_t.
:It indicates the upper limit or the maximum value for the type wchar_t.
:'''WEOF'''
:'''WEOF'''
:It defines the return value of the type wint_t but the value does not correspond to any member of the extended character set. WEOF indicates the end of a character stream, the end of file([[EOF]]) or an error case.<ref>http://publib.boulder.ibm.com/infocenter/zos/v1r12/index.jsp?topic=%2Fcom.ibm.zos.r12.bpxbd00%2Fwcharh.htm</ref>
:It defines the return value of the type wint_t but the value does not correspond to any member of the extended character set. WEOF indicates the end of a character stream, the end of file (EOF) or an error case.<ref>http://publib.boulder.ibm.com/infocenter/zos/v1r12/index.jsp?topic=%2Fcom.ibm.zos.r12.bpxbd00%2Fwcharh.htm</ref>


===Data Types===
===Data Types===
Line 46: Line 46:
|-
|-
|<code>int wcscoll(const wchar_t *s1, const wchar_t *s2);</code>
|<code>int wcscoll(const wchar_t *s1, const wchar_t *s2);</code>
|compares two wide strings s1 and s2 using current [[locale]]'s [[collating order]].
|compares two wide strings s1 and s2 using current locale's collating order.
|-
|-
|<code>wchar_t *wcscpy(wchar_t *s1, const wchar_t s2);</code>
|<code>wchar_t *wcscpy(wchar_t *s1, const wchar_t s2);</code>
Line 55: Line 55:
|-
|-
|<code>size_t wcslen(const wchar_t *s);</code>
|<code>size_t wcslen(const wchar_t *s);</code>
|returns the number of wide characters(excluding the terminating null wide charater) in the wide string that s points to.
|returns the number of wide characters(excluding the terminating null wide character) in the wide string that s points to.
|}
|}



Revision as of 07:42, 4 June 2014

wchar.h is a header file in the C standard library. It was introduced by the 1995 amendment to the C programming language standard and contains extended multibyte and wide character utilities. Programs include the standard header <wchar.h> to perform input and output operations on wide streams and to manipulate wide strings.[1]

Wide Characters

C was developed in an environment where the dominant character set was the 7-bit ASCII code, and the 8-bit byte has remained the most common unit of encoding ever since. Software developed for an international audience, however, has to represent many more characters, for example those of the Indian, Chinese and Japanese writing systems. Encoding schemes for such scripts typically use a varying number of bytes per character, which is inconvenient to process. That inconvenience can be eliminated by using characters that all occupy the same number of bytes. ANSI C provides the type wchar_t, which allows variable-width characters to be manipulated as uniformly sized data objects called wide characters. The wide character set is a superset of the already existing character sets, including 7-bit ASCII.[2]
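
As a rough illustration of the "uniform size" idea, the sketch below computes how much storage the characters of a wide string occupy. The helper name wide_storage_bytes is hypothetical and not part of <wchar.h>; only wcslen and wchar_t come from the header.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper (not part of <wchar.h>): number of bytes occupied
 * by the characters of a wide string, excluding the terminating null
 * wide character. Every element has the same size, sizeof(wchar_t). */
size_t wide_storage_bytes(const wchar_t *s)
{
    return wcslen(s) * sizeof(wchar_t);
}
```

Because every element is one wchar_t, a string such as L"caf\u00e9" has four elements regardless of how many bytes the accented letter needs in a multibyte encoding.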

Declarations and Definitions

Macros

The standard header wchar.h defines the following macros.

NULL
It is a null pointer constant. It never points to a real object.
WCHAR_MIN
It indicates the lower limit or the minimum value for the type wchar_t.
WCHAR_MAX
It indicates the upper limit or the maximum value for the type wchar_t.
WEOF
It expands to a constant of type wint_t whose value does not correspond to any member of the extended character set. Wide-character functions return WEOF to indicate the end of a character stream, that is the end of file (EOF), or an error.[3]
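
The macros above can be exercised with a small sketch such as the following. The helper names fits_in_wchar and is_weof are hypothetical and exist only for illustration; the actual values of WCHAR_MIN, WCHAR_MAX and WEOF depend on the implementation.

```c
#include <wchar.h>

/* Hypothetical helper: reports whether v lies in the range of wchar_t,
 * as delimited by WCHAR_MIN and WCHAR_MAX. */
int fits_in_wchar(long long v)
{
    return v >= (long long)WCHAR_MIN && v <= (long long)WCHAR_MAX;
}

/* Hypothetical helper: reports whether a value returned by a
 * wide-character function is the WEOF sentinel. */
int is_weof(wint_t c)
{
    return c == WEOF;
}
```

The parameter of is_weof is a wint_t rather than a wchar_t because WEOF is only guaranteed to be representable in wint_t.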

Data Types

mbstate_t
An object of type mbstate_t holds the conversion state that has to be carried over from one call of a multibyte conversion function to the next.
size_t
It is an unsigned integer type used for sizes and counts. It is the type of the value returned by the sizeof operator.
wchar_t
An object of type wchar_t can hold a wide character. It is also required for declaring or referencing wide characters and wide strings.
wint_t
This type is an integer type that can hold any value corresponding to the members of the extended character set. It can hold all values of the type wchar_t as well as the value of the macro WEOF. This type is unchanged by integral promotions.
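
The following sketch shows how these types typically work together. It converts a multibyte string to a wide string one character at a time with mbrtowc, a <wchar.h> function that is not listed on this page, carrying the conversion state in an mbstate_t object. The helper name to_wide and its error convention are assumptions made for this example.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: converts the multibyte string src into the wide
 * buffer dst, which has room for cap elements. Returns the number of
 * wide characters stored (excluding the terminator), or (size_t)-1 if
 * cap is 0 or src contains an invalid or incomplete sequence. */
size_t to_wide(wchar_t *dst, size_t cap, const char *src)
{
    mbstate_t state;
    size_t left = strlen(src);   /* bytes still to convert */
    size_t count = 0;            /* wide characters stored so far */

    if (cap == 0)
        return (size_t)-1;
    memset(&state, 0, sizeof state);   /* initial conversion state */

    while (left > 0 && count + 1 < cap) {
        size_t used = mbrtowc(&dst[count], src, left, &state);
        if (used == (size_t)-1 || used == (size_t)-2)
            return (size_t)-1;
        src += used;
        left -= used;
        count++;
    }
    dst[count] = L'\0';
    return count;
}
```

The result depends on the current locale. In the default "C" locale, plain ASCII input converts byte for byte.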

Functions

Wide-character string functions

Name Notes
wchar_t *wcscat(wchar_t *s1, const wchar_t *s2); appends a copy of the wide string that s2 points to, to the end of the wide string that s1 points to.
wchar_t *wcschr(const wchar_t *s, wchar_t c); searches the wide string s for the wide character c.
int wcscmp(const wchar_t *s1, const wchar_t *s2); compares two wide strings that s1 and s2 point to.
int wcscoll(const wchar_t *s1, const wchar_t *s2); compares two wide strings s1 and s2 using current locale's collating order.
wchar_t *wcscpy(wchar_t *s1, const wchar_t *s2); copies the wide string that s2 points to, to the location that s1 points to.
size_t wcscspn(const wchar_t *s1, const wchar_t *s2); returns the index of the first element of s1 that equals any one of the elements of s2, or the length of s1 if there is no such element.
size_t wcslen(const wchar_t *s); returns the number of wide characters (excluding the terminating null wide character) in the wide string that s points to.
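
A minimal sketch of how the string functions above combine is given below. The helper build_greeting and its fixed prefix are hypothetical. It uses wcslen to check that the result fits before calling wcscpy and wcscat, since neither of those functions checks the size of the destination.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: writes "Hello, " followed by name into dst,
 * which has room for cap wide characters. Returns dst on success or
 * NULL if the result, including its terminator, would not fit. */
wchar_t *build_greeting(wchar_t *dst, size_t cap, const wchar_t *name)
{
    static const wchar_t prefix[] = L"Hello, ";
    size_t needed = wcslen(prefix) + wcslen(name) + 1;

    if (needed > cap)
        return NULL;
    wcscpy(dst, prefix);   /* copy the prefix, terminator included */
    wcscat(dst, name);     /* append the name after the prefix */
    return dst;
}
```

The resulting string can then be inspected with wcscmp, wcschr or wcscspn in the same way as a narrow string is inspected with strcmp, strchr or strcspn.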

Wide-character array functions

Name Notes
wchar_t *wmemchr(const wchar_t *s, wchar_t c, size_t n); searches the first n elements of the array that s points to for the first element that equals c.
int wmemcmp(const wchar_t *s1, const wchar_t *s2, size_t n); compares the first n elements of the two arrays that s1 and s2 point to, stopping at the first pair of elements that are not equal.
wchar_t *wmemcpy(wchar_t *s1, const wchar_t *s2, size_t n); copies n wide characters from the array pointed to by s2 to the array pointed to by s1. If the objects in s1 and s2 overlap, the behavior is undefined.
wchar_t *wmemmove(wchar_t *s1, const wchar_t *s2, size_t n); works like wmemcpy, but copies correctly even if the objects in arrays s1 and s2 overlap.
wchar_t *wmemset(wchar_t *s, wchar_t c, size_t n); sets the first n elements of the array that s points to, to the wide character c.

[4]
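
The difference between wmemcpy and wmemmove matters when source and destination overlap. The sketch below shifts the contents of an array by one position, which is an overlapping copy and therefore needs wmemmove. The helper name shift_right is hypothetical.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: shifts the n elements of buf one position to the
 * right, dropping the last element and storing fill in the first slot.
 * Source and destination overlap, so wmemmove is used, not wmemcpy. */
void shift_right(wchar_t *buf, size_t n, wchar_t fill)
{
    if (n == 0)
        return;
    wmemmove(buf + 1, buf, n - 1);
    buf[0] = fill;
}
```

Unlike the string functions, the array functions take an explicit element count and do not stop at a null wide character.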

Conversion Functions

Name Notes
wint_t btowc(int c); returns the wide character equivalent of the single-byte character c, or WEOF if c is EOF or is not a valid single-byte character.
int wctob(wint_t c); returns the single-byte equivalent of the wide character c, or EOF if c has no single-byte representation.
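
As a small sketch of the two conversion functions, the hypothetical helper below converts a single byte to a wide character and back. The result depends on the current locale. The example assumes a character that is valid as a single byte, such as an ASCII letter.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: converts the single-byte character c to a wide
 * character and back again. Returns EOF if either step fails. */
int single_byte_round_trip(int c)
{
    wint_t wide = btowc(c);   /* WEOF if c has no wide equivalent */

    if (wide == WEOF)
        return EOF;
    return wctob(wide);       /* EOF if wide needs more than one byte */
}
```

The two failure values differ: btowc reports failure with WEOF, whereas wctob reports it with EOF.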

Wide-Character I/O Functions

Name Notes
wint_t fgetwc(FILE *stream); reads a wide character from a file.
wchar_t *fgetws(wchar_t *s, int n, FILE *stream); reads a wide character string from a file.
wint_t fputwc(wchar_t c, FILE *stream); writes a wide character to a file.
int fputws(const wchar_t *s, FILE *stream); writes a wide string to a file.
int fwprintf(FILE *stream, const wchar_t *format, ...); generates formatted text and writes it to the file.
int fwscanf(FILE *stream, const wchar_t *format, ...); reads formatted text from a file.
wint_t getwc(FILE *stream); reads a wide character from a file.
wint_t getwchar(void); reads a wide character from stdin.
wint_t putwc(wchar_t c, FILE *stream); writes a wide character to a file.
wint_t putwchar(wchar_t c); writes a wide character to stdout.
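
The sketch below illustrates a typical read loop with the I/O functions above. It writes a wide string to a temporary file with fputws and reads it back one character at a time with fgetwc until WEOF is returned. The helper name echo_through_file is hypothetical, and the use of tmpfile is an assumption made to keep the example self-contained.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: writes text to a temporary file, reads it back
 * into out (room for cap wide characters) and returns the number of
 * wide characters read, or -1 on failure. */
long echo_through_file(const wchar_t *text, wchar_t *out, size_t cap)
{
    FILE *fp;
    wint_t c;        /* wint_t rather than wchar_t, so WEOF is representable */
    size_t n = 0;

    if (cap == 0 || (fp = tmpfile()) == NULL)
        return -1;
    if (fputws(text, fp) < 0) {
        fclose(fp);
        return -1;
    }
    rewind(fp);      /* reposition before switching from writing to reading */

    while (n + 1 < cap && (c = fgetwc(fp)) != WEOF)
        out[n++] = (wchar_t)c;
    out[n] = L'\0';

    fclose(fp);
    return (long)n;
}
```

The return value of fgetwc is stored in a wint_t so that WEOF can be distinguished from every valid wide character.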

References