strusBase  0.17
utf8.hpp File Reference

Helpers for UTF-8 encoding/decoding.

#include "strus/base/bitOperations.hpp"
#include "strus/base/stdint.h"


Namespaces

 strus
 Wrapper to structures needed for atomic counters.
 

Enumerations

enum  {
  strus::B11111111 =0xFF, strus::B01111111 =0x7F, strus::B00111111 =0x3F, strus::B00011111 =0x1F,
  strus::B00001111 =0x0F, strus::B00000111 =0x07, strus::B00000011 =0x03, strus::B00000001 =0x01,
  strus::B00000000 =0x00, strus::B10000000 =0x80, strus::B11000000 =0xC0, strus::B11100000 =0xE0,
  strus::B11110000 =0xF0, strus::B11111000 =0xF8, strus::B11111100 =0xFC, strus::B11111110 =0xFE,
  strus::B11011111 =B11000000|B00011111, strus::B11101111 =B11100000|B00001111, strus::B11110111 =B11110000|B00000111, strus::B11111011 =B11111000|B00000011,
  strus::B11111101 =B11111100|B00000001
}
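
The mask names spell out their bit patterns, so a typical use is to test the high bits of a byte. The following is a minimal sketch of that idea, not code from strus: the namespace `sketch` and the functions `isContinuation`, `leadLength` and `payloadMask` are hypothetical, the enumerators are local copies of a few values from the list above, and only standard 1 to 4 byte UTF-8 is handled.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {  // hypothetical namespace, not part of strus

// Local copies of a few mask values from the enumeration above.
enum : std::uint8_t {
    B10000000 = 0x80, B11000000 = 0xC0, B11100000 = 0xE0,
    B11110000 = 0xF0, B11111000 = 0xF8
};

// A continuation byte has the bit pattern 10xxxxxx.
inline bool isContinuation(unsigned char ch) {
    return (ch & B11000000) == B10000000;
}

// Sequence length implied by a lead byte; 0 if the byte cannot start
// a standard UTF-8 sequence (continuation byte or invalid lead byte).
inline unsigned leadLength(unsigned char ch) {
    if ((ch & B10000000) == 0)         return 1;  // 0xxxxxxx
    if ((ch & B11100000) == B11000000) return 2;  // 110xxxxx
    if ((ch & B11110000) == B11100000) return 3;  // 1110xxxx
    if ((ch & B11111000) == B11110000) return 4;  // 11110xxx
    return 0;
}

// Mask selecting the payload bits of a lead byte for a sequence of
// the given length: 0x7F, 0x1F, 0x0F, 0x07 for lengths 1, 2, 3, 4.
inline unsigned char payloadMask(unsigned len) {
    assert(len >= 1 && len <= 4);
    return len == 1 ? 0x7F : static_cast<unsigned char>(0xFF >> (len + 1));
}

}  // namespace sketch
```

The payload masks produced for lengths 2, 3 and 4 have the same values as B00011111, B00001111 and B00000111 in the enumeration.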
 

Functions

static bool strus::utf8midchr (unsigned char ch)
 Return true if the byte passed as argument is not the first byte of a multi-byte encoded Unicode character, i.e. it is a continuation byte.
 
static const char * strus::utf8prev (char const *end)
 Skip back to the beginning of a UTF-8 encoded character from a pointer into it.
 
static unsigned char strus::utf8charlen (unsigned char ch)
 Get the length of a UTF-8 encoded character from its first byte.
 
int32_t strus::utf8decode (const char *itr, unsigned int charsize)
 Decoding of a single UTF-8 character in a string.
 
std::size_t strus::utf8encode (char *buf, int32_t chr)
 Encoding of a single UTF-8 character into a string buffer.
 

Detailed Description

Helpers for UTF-8 encoding/decoding.
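
The sketch below illustrates what the five helpers do, using standalone re-implementations rather than the strus code. Everything in the namespace `sketch` is hypothetical. It assumes standard UTF-8 with at most 4 bytes per character, returns -1 or 0 on errors as its own convention, and does not reject overlong encodings. The real functions may differ in range, in error handling and in pointer conventions (the parameter of `utf8prev` is named `end`, so check the source before relying on which byte it expects).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace sketch {  // hypothetical; not the strus implementation

// True for continuation bytes (10xxxxxx), i.e. not the first byte of a character.
inline bool midchr(unsigned char ch) { return (ch & 0xC0) == 0x80; }

// Step back from a pointer to any byte of a character to its first byte.
// Assumes the pointer lies inside a well-formed UTF-8 buffer.
inline const char* prev(const char* p) {
    while (midchr(static_cast<unsigned char>(*p))) --p;
    return p;
}

// Length in bytes of the character starting with lead byte ch, 0 if invalid.
inline unsigned charlen(unsigned char ch) {
    if (ch < 0x80)           return 1;
    if ((ch & 0xE0) == 0xC0) return 2;
    if ((ch & 0xF0) == 0xE0) return 3;
    if ((ch & 0xF8) == 0xF0) return 4;
    return 0;
}

// Decode one character of len bytes; returns the code point or -1 on error.
inline std::int32_t decode(const char* itr, unsigned len) {
    const unsigned char* s = reinterpret_cast<const unsigned char*>(itr);
    if (len == 1) return s[0] < 0x80 ? s[0] : -1;
    if (len < 2 || len > 4) return -1;
    std::int32_t cp = s[0] & (0xFF >> (len + 1));  // payload bits of the lead byte
    for (unsigned i = 1; i < len; ++i) {
        if (!midchr(s[i])) return -1;              // expected 10xxxxxx
        cp = (cp << 6) | (s[i] & 0x3F);            // append 6 payload bits
    }
    return cp;
}

// Encode cp into buf (at least 4 bytes); returns bytes written, 0 if out of range.
inline std::size_t encode(char* buf, std::int32_t cp) {
    assert(buf != nullptr);
    if (cp < 0 || cp > 0x10FFFF) return 0;
    if (cp < 0x80) { buf[0] = static_cast<char>(cp); return 1; }
    static const unsigned char lead[] = {0, 0, 0xC0, 0xE0, 0xF0};
    const std::size_t len = cp < 0x800 ? 2 : (cp < 0x10000 ? 3 : 4);
    for (std::size_t i = len - 1; i > 0; --i) {    // fill continuation bytes from the end
        buf[i] = static_cast<char>(0x80 | (cp & 0x3F));
        cp >>= 6;
    }
    buf[0] = static_cast<char>(lead[len] | cp);    // length marker plus remaining bits
    return len;
}

}  // namespace sketch
```

With these definitions, encoding U+20AC writes the three bytes E2 82 AC, `charlen` applied to the first byte gives 3, `decode` over those three bytes returns 0x20AC again, and `prev` called on a pointer to the last byte returns a pointer to the first.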