C++: How do I write the contents of std::string to a UTF-8 encoded file?
I am using C++ on Windows. I have some data in a std::string that I want to write to a file with UTF-8 encoding. How do I do this?
Solution 1:[1]
> I have some data in a std::string that I want to write to a file with UTF-8 encoding. How do I do this?
If the string contains the text in UTF-8 encoding, then simply write the data. You can use std::ofstream for example.
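As a minimal sketch of the "simply write the data" case, the function below copies the bytes of the string to a file unchanged. The name write_utf8_file is hypothetical, and the code assumes the string already holds valid UTF-8; it performs no conversion or validation.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: writes the bytes of `utf8_text` to `path` unchanged.
// Assumes `utf8_text` already holds UTF-8; nothing is converted or validated.
// Returns true if the file was opened and all bytes were written.
bool write_utf8_file(const std::string& path, const std::string& utf8_text) {
    // Binary mode keeps the runtime on Windows from translating "\n" to "\r\n".
    std::ofstream out(path, std::ios::binary);
    if (!out) {
        return false;
    }
    out.write(utf8_text.data(), static_cast<std::streamsize>(utf8_text.size()));
    out.flush();
    return static_cast<bool>(out);
}
```

A call such as write_utf8_file("out.txt", "caf\xC3\xA9\n") would store the two-byte UTF-8 sequence for "é" exactly as given. Binary mode is a design choice here: the bytes on disk then match the bytes in the string.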
If the string doesn't contain UTF-8 data, you must first convert it from whatever encoding it is currently in before writing. The C++ standard library has no general character encoding conversion functions (apart from a few deprecated ones). There is generally no guaranteed way to detect the current encoding; you simply have to know it beforehand.
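To illustrate what "convert first" can look like, here is a sketch for one specific case: it assumes the source bytes are ISO-8859-1 (Latin-1), where each byte value equals its Unicode code point. The function name latin1_to_utf8 is hypothetical, and the code is not correct for other encodings (for example, Windows-1252 differs from Latin-1 in the range 0x80 to 0x9F).

```cpp
#include <string>

// Hypothetical helper: converts ISO-8859-1 (Latin-1) bytes to UTF-8.
// Assumption: the input really is Latin-1; other encodings need a real
// conversion library or platform API.
std::string latin1_to_utf8(const std::string& latin1) {
    std::string utf8;
    utf8.reserve(latin1.size() * 2);  // each input byte yields at most 2 bytes
    for (unsigned char byte : latin1) {
        if (byte < 0x80) {
            // ASCII range: identical single byte in UTF-8.
            utf8.push_back(static_cast<char>(byte));
        } else {
            // U+0080..U+00FF: two-byte form 110xxxxx 10xxxxxx.
            utf8.push_back(static_cast<char>(0xC0 | (byte >> 6)));
            utf8.push_back(static_cast<char>(0x80 | (byte & 0x3F)));
        }
    }
    return utf8;
}
```

For encodings other than Latin-1 you would typically rely on a platform API or a library such as ICU or iconv rather than hand-written code.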
> But when I check the encoding of the created file in notepad, it is ANSI and not UTF-8
As mentioned above regarding the source encoding of the string, there is no guaranteed way to detect an encoding. Notepad doesn't have this superpower either. It probably uses simple heuristics to guess the encoding, and sometimes it guesses wrong.
UTF-8 represents the characters of 7-bit ASCII exactly as ASCII itself does. (Notepad's "ANSI" refers to the system's Windows code page, which is a superset of ASCII.) If your string contains only those characters, then the UTF-8 encoding of the string is indistinguishable from ASCII. In that case, Notepad is likely to report "ANSI" (although technically that guess is also correct, since the file is valid in both encodings).
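If you want editors such as Notepad to recognize the file as UTF-8 without guessing, one common option is to start the file with the UTF-8 byte order mark (the bytes EF BB BF). The sketch below shows this; the name write_utf8_file_with_bom is hypothetical, and the BOM is optional in UTF-8, so some tools (particularly on non-Windows platforms) may treat it as unwanted leading bytes.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: writes a UTF-8 byte order mark followed by the bytes
// of `utf8_text`. Assumes `utf8_text` already holds UTF-8.
// Returns true if the file was opened and all bytes were written.
bool write_utf8_file_with_bom(const std::string& path,
                              const std::string& utf8_text) {
    std::ofstream out(path, std::ios::binary);
    if (!out) {
        return false;
    }
    // UTF-8 BOM: the encoding of U+FEFF, used only as a signature here.
    out.write("\xEF\xBB\xBF", 3);
    out.write(utf8_text.data(), static_cast<std::streamsize>(utf8_text.size()));
    out.flush();
    return static_cast<bool>(out);
}
```

Whether to emit the BOM depends on who reads the file: it helps detection in Windows tools, while a pure-ASCII file without it remains valid UTF-8 either way.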
Solution 2:[2]
This is similar to "How do I write a UTF-8 encoded string to a file in windows, in C++".
Note that file I/O APIs differ across platforms. On Windows you have CreateFile, WriteFile, ReadFile, and CloseHandle, which are not limited to files and can also operate on device drivers, whereas on Linux you have a different set of functions. It's best to check the APIs of the platform you intend to use (in your case, Windows).
Solution 3:[3]
If some of your data is in MM/dd/yyyy and some in M/d/yyyy, this really makes for a bit of a mess. I would likely do something like this:
--Add a new varchar column (yes varchar) to save a copy of your bad data
ALTER TABLE dbo.YourData ADD BadDate varchar(10);
GO
--Change the data you have to the ISO format yyyyMMdd and store the bad data in the BadDate column
UPDATE dbo.YourData
SET DateColumn = CONVERT(varchar(8),TRY_CONVERT(date,DateColumn,101),112),
BadDate = CASE WHEN TRY_CONVERT(date,DateColumn,101) IS NULL THEN DateColumn END
WHERE DateColumn IS NOT NULL;
GO
--Change data type of your data column
ALTER TABLE dbo.YourData ALTER COLUMN DateColumn date NULL;
GO
--You can view your bad data with:
SELECT BadDate
FROM dbo.YourData
WHERE BadDate IS NOT NULL;
Solution 4:[4]
If you just want to locate rows with values that are obviously bad dates, disregarding any ambiguities, just use TRY_CONVERT and check for NULLs.
For example:
with dates as (
select * from (values('01/02/2021'),('02/01/2021'),('33/02/2021'),('01/13/2021'))v(d)
)
select *
from dates
where Try_Convert(date, d) is null
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | Nasser GHOSEIRI |
| Solution 3 | Larnu |
| Solution 4 | Stu |
