Channel: Active questions tagged gcc - Stack Overflow

C++ utf-8 literals in GCC and MSVC


Here I have some simple code:

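    #include <cstddef>
    #include <cstdint>
    #include <iostream>

    int main()
    {
        // One hex escape inside a UTF-8 string literal.
        const unsigned char utf8_string[] = u8"\xA0";

        // sizeof counts the terminating NUL as well.
        std::cout << std::hex << "Size: " << sizeof(utf8_string) << std::endl;

        // Print every byte as a number rather than as a character.
        for (std::size_t i = 0; i < sizeof(utf8_string); i++) {
            std::cout << std::hex << (uint16_t)utf8_string[i] << std::endl;
        }
    }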

I see different behavior here with MSVC and GCC. MSVC treats "\xA0" as an unencoded Unicode code point and encodes it to UTF-8. So in MSVC the output is:

C2A0

This is the correct UTF-8 encoding of the Unicode character U+00A0.

But with GCC nothing happens. It treats the string as plain bytes. There is no change even if I remove the u8 prefix from the string literal.

Both compilers encode to UTF-8 and output C2A0 if the string is set to u8"\u00A0".
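To make the comparison easier to reproduce, here is a minimal sketch that dumps the bytes of both literals side by side. The names hex_dump, from_hex_escape, from_ucn and print_comparison are made up for this illustration and are not part of the code above. In the language's terms, \x gives the value of a single code unit, while \u names a Unicode code point that the compiler then has to encode.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: render every byte of an array (terminating NUL
// included) as space-separated uppercase hex, e.g. "C2 A0 00".
template <std::size_t N>
std::string hex_dump(const unsigned char (&bytes)[N]) {
    std::string out;
    char buf[4];
    for (std::size_t i = 0; i < N; ++i) {
        std::snprintf(buf, sizeof buf, "%02X", static_cast<unsigned>(bytes[i]));
        if (i != 0) out += ' ';
        out += buf;
    }
    return out;
}

// \xA0 specifies one code unit with the value 0xA0.
const unsigned char from_hex_escape[] = u8"\xA0";
// \u00A0 specifies the code point U+00A0, which UTF-8 encodes as C2 A0.
const unsigned char from_ucn[] = u8"\u00A0";

// Prints both dumps so the output can be compared across compilers.
void print_comparison() {
    // Both compilers in the question agree on this one: C2 A0 plus NUL.
    assert(sizeof(from_ucn) == 3);
    std::printf("\\xA0   -> %s\n", hex_dump(from_hex_escape).c_str());
    std::printf("\\u00A0 -> %s\n", hex_dump(from_ucn).c_str());
}
```

Calling print_comparison() from main on each compiler shows directly whether the \x escape was passed through as a single byte or re-encoded into two.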

Why do the compilers behave differently, and which one actually does it right?

Software used for test:

GCC 8.3.0

MSVC 19.00.23506

C++11

