src/utf8.c File Reference

Source - Unicode UTF-8.

#include <stdio.h>
#include "check.h"


Defines

#define UTF8_HEAD_7   0x00
 header bits of a one-byte UTF-8 character (0xxxxxxx)
#define UTF8_HEAD_7_MASK   0x80
 mask covering the header bit of a one-byte UTF-8 character
#define UTF8_HEAD_11   0xc0
 header bits of the first byte of a two-byte UTF-8 character (110xxxxx)
#define UTF8_HEAD_11_MASK   0xe0
 mask covering the header bits of the first byte of a two-byte UTF-8 character
#define UTF8_TAIL   0x80
 header bits of the non-first bytes of a multi-byte UTF-8 character (10xxxxxx)
#define UTF8_TAIL_MASK   0xc0
 mask covering the header bits of the non-first bytes of a multi-byte UTF-8 character
#define UTF8_TAIL_SHIFT   6
 number of data bits carried by each non-first byte of a multi-byte UTF-8 character
#define UTF8_SHIFTER(data, byte)   (data >> (byte*UTF8_TAIL_SHIFT))
 shift data right by byte groups of UTF8_TAIL_SHIFT bits to select the bits to encode
#define UTF8_HEADER_7(data)   (UTF8_HEAD_7 | (data & ~UTF8_HEAD_7_MASK))
 add the UTF-8 one-byte header bits to data
#define UTF8_HEADER_11(data)   (UTF8_HEAD_11 | (data & ~UTF8_HEAD_11_MASK))
 add the UTF-8 two-byte header bits to data
#define UTF8_TAILER(data)   (UTF8_TAIL | (data & ~UTF8_TAIL_MASK))
 add the UTF-8 non-first-byte header bits to data
#define IS_UTF8_HEADER_7(byte)   ((unsigned char)(byte & UTF8_HEAD_7_MASK) == UTF8_HEAD_7)
 test whether the byte is a one-byte UTF-8 character
#define IS_UTF8_HEADER_11(byte)   ((unsigned char)(byte & UTF8_HEAD_11_MASK) == UTF8_HEAD_11)
 test whether the byte is the first byte of a two-byte UTF-8 character
#define IS_UTF8_TAIL(byte)   ((unsigned char)(byte & UTF8_TAIL_MASK) == UTF8_TAIL)
 test whether the byte is a non-first byte of a multi-byte UTF-8 character
#define UTF8_UNSHIFTER(data, byte)   (data << (byte*UTF8_TAIL_SHIFT))
 shift decoded bits left by byte groups of UTF8_TAIL_SHIFT bits to rebuild the data
#define UTF8_UNHEADER_7(data)   (data & ~UTF8_HEAD_7_MASK)
 remove the UTF-8 one-byte header bits
#define UTF8_UNHEADER_11(data)   (data & ~UTF8_HEAD_11_MASK)
 remove the UTF-8 two-byte header bits
#define UTF8_UNTAILER(data)   (data & ~UTF8_TAIL_MASK)
 remove the UTF-8 non-first-byte header bits
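
How the encoding and decoding macros compose can be sketched with a single latin1 code point. The macro definitions below are copied from the list above; the two helper functions (sketch_encode_two_bytes and sketch_decode_two_bytes) are hypothetical names introduced only for this illustration and are not part of utf8.c. The macros do not parenthesise their parameters, so the sketch passes plain variables only.

```c
#include <assert.h>

/* Macros copied verbatim from the Defines list. */
#define UTF8_HEAD_11   0xc0
#define UTF8_HEAD_11_MASK   0xe0
#define UTF8_TAIL   0x80
#define UTF8_TAIL_MASK   0xc0
#define UTF8_TAIL_SHIFT   6
#define UTF8_SHIFTER(data, byte)   (data >> (byte*UTF8_TAIL_SHIFT))
#define UTF8_HEADER_11(data)   (UTF8_HEAD_11 | (data & ~UTF8_HEAD_11_MASK))
#define UTF8_TAILER(data)   (UTF8_TAIL | (data & ~UTF8_TAIL_MASK))
#define UTF8_UNSHIFTER(data, byte)   (data << (byte*UTF8_TAIL_SHIFT))
#define UTF8_UNHEADER_11(data)   (data & ~UTF8_HEAD_11_MASK)
#define UTF8_UNTAILER(data)   (data & ~UTF8_TAIL_MASK)

/* Hypothetical helper: split a code point in 0x80..0xff into two UTF-8 bytes. */
void sketch_encode_two_bytes(unsigned int data, unsigned char out[2])
{
    unsigned int high;

    assert(data >= 0x80 && data <= 0xff);
    high = UTF8_SHIFTER(data, 1);                  /* bits above the low 6 */
    out[0] = (unsigned char)UTF8_HEADER_11(high);  /* 110xxxxx */
    out[1] = (unsigned char)UTF8_TAILER(data);     /* 10xxxxxx */
}

/* Hypothetical helper: rebuild the code point from a two-byte sequence. */
unsigned int sketch_decode_two_bytes(const unsigned char in[2])
{
    unsigned int head = in[0];
    unsigned int tail = in[1];
    unsigned int high = UTF8_UNHEADER_11(head);    /* strip 110 */

    return UTF8_UNSHIFTER(high, 1) | UTF8_UNTAILER(tail);
}
```

With these definitions, the latin1 byte 0xe9 encodes to the pair 0xc3 0xa9, and decoding that pair gives 0xe9 back.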

Functions

size_t iso8859_utf8 (const char *source, char *dest, const size_t dest_size)
 convert ISO 8859-1 (latin1) to UTF-8
size_t utf8_iso8859 (const char *source, char *dest, const size_t dest_size)
 convert UTF-8 to ISO 8859-1 (latin1)

Detailed Description

Source - Unicode UTF-8.

Author:
Julien Blitte
Version:
0.1

Definition in file utf8.c.


Function Documentation

size_t iso8859_utf8 (const char *source, char *dest, const size_t dest_size)

convert ISO 8859-1 (latin1) to UTF-8

Parameters:
source latin1 string to convert to UTF-8
dest destination where the new UTF-8 string is stored
dest_size size of dest
Returns:
size of the new string

Definition at line 58 of file utf8.c.

References check, UTF8_HEADER_11, UTF8_HEADER_7, UTF8_SHIFTER, and UTF8_TAILER.

Referenced by db5_select_filename(), db5_shortname_to_localfile(), and log_dump_latin1().
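
The conversion described above can be approximated as follows. This is a minimal sketch and not the code of utf8.c: the function name latin1_to_utf8_sketch is hypothetical, assert stands in for the project's check (whose definition is not shown here), and the choices to NUL-terminate dest, to stop when dest is full, and to return the byte count without the terminator are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Macros copied verbatim from the Defines list. */
#define UTF8_HEAD_7   0x00
#define UTF8_HEAD_7_MASK   0x80
#define UTF8_HEAD_11   0xc0
#define UTF8_HEAD_11_MASK   0xe0
#define UTF8_TAIL   0x80
#define UTF8_TAIL_MASK   0xc0
#define UTF8_TAIL_SHIFT   6
#define UTF8_SHIFTER(data, byte)   (data >> (byte*UTF8_TAIL_SHIFT))
#define UTF8_HEADER_7(data)   (UTF8_HEAD_7 | (data & ~UTF8_HEAD_7_MASK))
#define UTF8_HEADER_11(data)   (UTF8_HEAD_11 | (data & ~UTF8_HEAD_11_MASK))
#define UTF8_TAILER(data)   (UTF8_TAIL | (data & ~UTF8_TAIL_MASK))

/* Hypothetical stand-in for iso8859_utf8; behaviour on overflow is assumed. */
size_t latin1_to_utf8_sketch(const char *source, char *dest, const size_t dest_size)
{
    size_t out = 0;

    assert(source != NULL);
    assert(dest != NULL);
    assert(dest_size > 0);

    for (; *source != '\0'; source++) {
        unsigned int c = (unsigned char)*source;

        if (c < 0x80) {
            /* ASCII range: one byte, keep room for the terminator */
            if (out + 1 >= dest_size)
                break;
            dest[out++] = (char)UTF8_HEADER_7(c);
        } else {
            /* 0x80..0xff: two bytes, keep room for the terminator */
            unsigned int high = UTF8_SHIFTER(c, 1);

            if (out + 2 >= dest_size)
                break;
            dest[out++] = (char)UTF8_HEADER_11(high);
            dest[out++] = (char)UTF8_TAILER(c);
        }
    }
    dest[out] = '\0';
    return out;
}
```

Since every latin1 byte expands to at most two UTF-8 bytes, a dest buffer of twice the source length plus one byte is always large enough for this sketch.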

size_t utf8_iso8859 (const char *source, char *dest, const size_t dest_size)

convert UTF-8 to ISO 8859-1 (latin1)

Parameters:
source UTF-8 string to convert to latin1
dest destination where the new latin1 string is stored
dest_size size of dest
Returns:
size of the new string

Definition at line 95 of file utf8.c.

References check, IS_UTF8_HEADER_11, IS_UTF8_HEADER_7, UTF8_UNHEADER_11, UTF8_UNHEADER_7, UTF8_UNSHIFTER, and UTF8_UNTAILER.

Referenced by db5_generate_row(), db5_insert(), and db5_longname_to_shortname().
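
The reverse conversion can be sketched in the same way. Again this is an illustration under stated assumptions and not the code of utf8.c: utf8_to_latin1_sketch is a hypothetical name, assert stands in for check, and the handling of input that latin1 cannot represent (replacing it with '?') is an assumption, as is the IS_UTF8_TAIL validation of the second byte, which the reference list above does not mention.

```c
#include <assert.h>
#include <stddef.h>

/* Macros copied verbatim from the Defines list. */
#define UTF8_HEAD_7   0x00
#define UTF8_HEAD_7_MASK   0x80
#define UTF8_HEAD_11   0xc0
#define UTF8_HEAD_11_MASK   0xe0
#define UTF8_TAIL   0x80
#define UTF8_TAIL_MASK   0xc0
#define UTF8_TAIL_SHIFT   6
#define IS_UTF8_HEADER_7(byte)   ((unsigned char)(byte & UTF8_HEAD_7_MASK) == UTF8_HEAD_7)
#define IS_UTF8_HEADER_11(byte)   ((unsigned char)(byte & UTF8_HEAD_11_MASK) == UTF8_HEAD_11)
#define IS_UTF8_TAIL(byte)   ((unsigned char)(byte & UTF8_TAIL_MASK) == UTF8_TAIL)
#define UTF8_UNSHIFTER(data, byte)   (data << (byte*UTF8_TAIL_SHIFT))
#define UTF8_UNHEADER_7(data)   (data & ~UTF8_HEAD_7_MASK)
#define UTF8_UNHEADER_11(data)   (data & ~UTF8_HEAD_11_MASK)
#define UTF8_UNTAILER(data)   (data & ~UTF8_TAIL_MASK)

/* Hypothetical stand-in for utf8_iso8859; error handling is assumed. */
size_t utf8_to_latin1_sketch(const char *source, char *dest, const size_t dest_size)
{
    size_t out = 0;

    assert(source != NULL);
    assert(dest != NULL);
    assert(dest_size > 0);

    /* Each iteration writes exactly one latin1 byte; keep room for '\0'. */
    while (*source != '\0' && out + 1 < dest_size) {
        unsigned int head = (unsigned char)*source++;
        unsigned int next = (unsigned char)*source;  /* may be the terminator */

        if (IS_UTF8_HEADER_7(head)) {
            dest[out++] = (char)UTF8_UNHEADER_7(head);
        } else if (IS_UTF8_HEADER_11(head) && IS_UTF8_TAIL(next)) {
            unsigned int high = UTF8_UNHEADER_11(head);
            unsigned int cp = UTF8_UNSHIFTER(high, 1) | UTF8_UNTAILER(next);

            source++;                                /* consume the tail byte */
            dest[out++] = (cp <= 0xff) ? (char)cp : '?';
        } else {
            /* longer sequences and stray tail bytes have no latin1 form */
            dest[out++] = '?';
        }
    }
    dest[out] = '\0';
    return out;
}
```

In this sketch a three-byte sequence such as the euro sign produces one '?' per input byte; a real implementation might prefer to emit a single replacement character per code point.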


Generated on Mon Jan 11 00:15:07 2010 for db5fuse by  doxygen 1.6.1