In a recent issue of MSDN Magazine I found a good article (https://msdn.microsoft.com/magazine/mt763237?MC=CCPLUS&MC=Windows) explaining how to convert between UTF-8 and UTF-16 encoded strings using Win32 API functions. Besides a well-written overview, it provides usable C++ code for download.
To write platform-independent code, I sum up here how far we can get with C++11 and the standard library. Note: the code compiles and runs with C++0x-capable compilers such as VS 2010.
storage type
To store UTF-8 encoded strings we can use std::string. To store UTF-16 encoded strings we should use std::u16string, which is std::basic_string with the underlying character type char16_t. In contrast to std::wstring, whose wchar_t is 16 bits wide on Windows but typically 32 bits wide on Linux and macOS, std::u16string is the same on all platforms.
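To make the difference between the storage types concrete, here is a minimal sketch that spells out one code point (U+1F600) in each of them. It assumes a compiler that supports the C++11 u"" and U"" string literal prefixes; the names kUtf8, kUtf16, kUtf32 and CheckCodeUnitCounts are made up for this illustration. The UTF-8 bytes are written as hex escapes so that the result does not depend on the source file encoding.

```cpp
#include <cassert>
#include <string>

// One code point, U+1F600, spelled out in each of the three storage types.
// All names below are illustrative only.
const std::string    kUtf8  = "\xF0\x9F\x98\x80";  // four 8-bit code units
const std::u16string kUtf16 = u"\U0001F600";       // two 16-bit code units (surrogate pair)
const std::u32string kUtf32 = U"\U0001F600";       // one 32-bit code unit

// size() counts code units, not characters, so the three results differ.
void CheckCodeUnitCounts()
{
    assert(kUtf8.size() == 4);
    assert(kUtf16.size() == 2);
    assert(kUtf16[0] == 0xD83D && kUtf16[1] == 0xDE00);
    assert(kUtf32.size() == 1);
}
```

Note that size() returns the number of code units in every case, so the same character reports a different length depending on the encoding.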
utf8 -> utf16
#include <locale>
#include <codecvt>
#include <string>

// Facet that converts between UTF-8 bytes and UTF-16 code units.
typedef std::codecvt_utf8_utf16<char16_t> conversionFacet;

std::u16string Utf8ToUtf16(const std::string& utf8)
{
    std::u16string utf16;
    std::wstring_convert<conversionFacet, char16_t> converter;
    utf16 = converter.from_bytes(utf8);
    return utf16;
}

The workhorse here is the class template std::wstring_convert. Despite its name, it is not limited to std::wstring: thanks to its template parameters it also works with char16_t.
As its first template parameter it takes a conversion facet. In this case we use the class template std::codecvt_utf8_utf16 as the conversion facet.
The inverse conversion can be implemented in a similar way, just by calling the member function to_bytes.
utf16 -> utf8
std::string Utf16ToUtf8(const std::u16string& utf16)
{
    std::string utf8;
    std::wstring_convert<conversionFacet, char16_t> converter;
    utf8 = converter.to_bytes(utf16);
    return utf8;
}
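The two functions can be checked against each other with a round trip. The sketch below repeats them so that it compiles on its own and adds two hypothetical helpers that are not part of the original code: RoundTripDemo and TryUtf8ToUtf16. The latter relies on std::wstring_convert throwing std::range_error on a failed conversion when it was constructed without an error string, as is the case here.

```cpp
#include <cassert>
#include <codecvt>
#include <locale>
#include <stdexcept>
#include <string>

typedef std::codecvt_utf8_utf16<char16_t> conversionFacet;

std::u16string Utf8ToUtf16(const std::string& utf8)
{
    std::wstring_convert<conversionFacet, char16_t> converter;
    return converter.from_bytes(utf8);
}

std::string Utf16ToUtf8(const std::u16string& utf16)
{
    std::wstring_convert<conversionFacet, char16_t> converter;
    return converter.to_bytes(utf16);
}

// Hypothetical helper: reports malformed UTF-8 via the return value
// instead of letting std::range_error propagate to the caller.
bool TryUtf8ToUtf16(const std::string& utf8, std::u16string& utf16)
{
    try {
        utf16 = Utf8ToUtf16(utf8);
        return true;
    } catch (const std::range_error&) {
        return false;
    }
}

// Hypothetical demo: converts to UTF-16 and back, then feeds in bad input.
void RoundTripDemo()
{
    const std::string original = "Z\xC3\xBCrich";  // "Zürich" as UTF-8 bytes
    const std::u16string utf16 = Utf8ToUtf16(original);
    assert(utf16.size() == 6);               // six UTF-16 code units
    assert(Utf16ToUtf8(utf16) == original);  // the round trip is lossless

    std::u16string out;
    assert(!TryUtf8ToUtf16("\xFF", out));    // 0xFF never occurs in valid UTF-8
}
```

Whether to let the exception propagate or to wrap it as shown is a design choice; the wrapper is convenient when the input comes from an untrusted source such as a file or a network socket.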
summary
With C++11 we can use std::string, std::u16string and std::u32string for better platform-independent Unicode support.
I showed a simple example of converting between the UTF-8 and UTF-16 encodings. Other conversions can be implemented in a similar way, for example UTF-16 -> UTF-32.
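As one possible way to do the UTF-16 -> UTF-32 conversion mentioned above, the following sketch chains two converters and uses UTF-8 as an intermediate step. This is an assumption about how one might do it, not the only approach; the names Utf16ToUtf32 and CheckUtf16ToUtf32 are hypothetical. The second converter uses the standard facet std::codecvt_utf8<char32_t>, which maps UTF-8 bytes to UTF-32 code units.

```cpp
#include <cassert>
#include <codecvt>
#include <locale>
#include <string>

// Hypothetical helper: UTF-16 -> UTF-32, going through UTF-8 in between.
std::u32string Utf16ToUtf32(const std::u16string& utf16)
{
    // Step 1: UTF-16 code units -> UTF-8 bytes.
    std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> toUtf8;
    // Step 2: UTF-8 bytes -> UTF-32 code units.
    std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> toUtf32;
    return toUtf32.from_bytes(toUtf8.to_bytes(utf16));
}

// Hypothetical self-check: a surrogate pair collapses into one code unit.
void CheckUtf16ToUtf32()
{
    const std::u32string utf32 = Utf16ToUtf32(u"\U0001F600");
    assert(utf32.size() == 1);
    assert(utf32[0] == 0x1F600);
}
```

The intermediate std::string costs an extra allocation, which is usually acceptable for short strings; for large amounts of data a direct surrogate-pair decoder may be preferable.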