Create UTF-8 file in Qt

I'm trying to create a UTF-8 encoded file in Qt.

#include <QtCore>

int main()
{
    QString unicodeString = "Some Unicode string";
    QFile fileOut("D:\\Temp\\qt_unicode.txt");
    if (!fileOut.open(QIODevice::WriteOnly | QIODevice::Text))
    {
        return -1;
    }

    QTextStream streamFileOut(&fileOut);
    streamFileOut.setCodec("UTF-8");
    streamFileOut << unicodeString;
    streamFileOut.flush();

    fileOut.close();

    return 0;
}

I assumed that, since QString is Unicode by default, setting the codec of the output stream to UTF-8 would produce a UTF-8 file. But it doesn't; the file is ANSI. What am I doing wrong? Is something wrong with my strings? Can you correct my code so that it creates a UTF-8 file?

My next step will be to read an ANSI file and save it as a UTF-8 file, so I'll have to convert each string I read, but for now I want to start with just creating the file. Thank you.



Solution 1:[1]

This is how I create a UTF-8 encoded text file without a BOM in Qt:

file.open(QIODevice::WriteOnly | QIODevice::Text);
QTextStream out(&file);
out.setCodec("UTF-8");                // encode everything written to the stream as UTF-8 ...
vcfline = ctn;                        // assign some UTF-8 characters
out.setGenerateByteOrderMark(false);  // do not write a BOM at the start of the file
out << vcfline;                       // .....
file.close();

The resulting file is encoded as UTF-8 without a BOM.

Solution 2:[2]

Don't forget that UTF-8 encodes each ASCII character as a single byte. Only non-ASCII characters, such as accented letters, are encoded with more bytes (2 to 4 bytes per character).

This means that as long as your text contains only ASCII characters (which is the case for your unicodeString), the file will contain only single-byte (8-bit) characters and will be byte-for-byte identical to an ASCII file. This is the backward compatibility with ASCII:

UTF-8 can represent every character in the Unicode character set, but unlike them [UTF-16 and UTF-32], possesses the advantages of being backward-compatible with ASCII
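
To make the byte counts concrete, here is a minimal sketch of the UTF-8 encoding rule in plain C++ (no Qt). The helper name encodeUtf8 is hypothetical and only for illustration. It assumes the input is a valid Unicode code point (no surrogates, nothing above U+10FFFF) and does no validation.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not part of Qt): encode one Unicode code point as UTF-8.
// Assumes cp is a valid code point; no validation is performed.
std::string encodeUtf8(std::uint32_t cp)
{
    std::string out;
    if (cp < 0x80) {
        // ASCII range: one byte, identical to the ASCII value.
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        // Two bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        // Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        // Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

With this sketch, encodeUtf8('A') yields the single byte 0x41, while encodeUtf8(0xE9) (the letter é) yields the two bytes C3 A9.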

To check whether your code is working, put some accented characters in your unicodeString, for instance.

I tested your code with accented characters, and it works fine.
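
This is likely why an editor reports the original file as ANSI: if every byte is below 0x80, the UTF-8 file and the ANSI file are the same bytes, so there is nothing to tell them apart. A small sketch of that check in plain C++ follows; isPureAscii is a hypothetical helper name, and it works on the raw bytes of a file already read into a std::string.

```cpp
#include <string>

// Hypothetical helper: returns true when every byte is in the ASCII range
// (0x00..0x7F). Such content is byte-for-byte identical whether it is
// interpreted as ASCII, an ANSI code page, or UTF-8.
bool isPureAscii(const std::string& bytes)
{
    for (unsigned char b : bytes) {
        if (b >= 0x80) {
            return false;  // a non-ASCII byte: the encoding now matters
        }
    }
    return true;
}
```

For the string in the question, isPureAscii("Some Unicode string") is true. Once an accented character is written as UTF-8, for example "caf\xC3\xA9", it becomes false.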

If you want a BOM at the beginning of your file, you can write the BOM character (QChar(QChar::ByteOrderMark)) to the stream before any other output.
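
In the file itself, the BOM character U+FEFF appears as the three bytes EF BB BF when encoded as UTF-8. The sketch below, in plain C++ without Qt, shows how one might detect or prepend those bytes on raw file content. The names kUtf8Bom, hasUtf8Bom and withUtf8Bom are hypothetical. In Qt 5, the same QTextStream::setGenerateByteOrderMark call used in Solution 1 can be passed true instead of false to ask the stream to write the BOM.

```cpp
#include <string>

// U+FEFF encoded as UTF-8. Illustrative constant, not a Qt symbol.
const std::string kUtf8Bom = "\xEF\xBB\xBF";

// Hypothetical helper: true when the raw bytes start with the UTF-8 BOM.
bool hasUtf8Bom(const std::string& bytes)
{
    return bytes.compare(0, kUtf8Bom.size(), kUtf8Bom) == 0;
}

// Hypothetical helper: returns the payload with a UTF-8 BOM in front,
// leaving it unchanged when a BOM is already present.
std::string withUtf8Bom(const std::string& payload)
{
    return hasUtf8Bom(payload) ? payload : kUtf8Bom + payload;
}
```

Checking the first three bytes of the output file with something like hasUtf8Bom is a quick way to confirm whether a BOM was written.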

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 user2006121
Solution 2 lesenk