Java UTF-8 encoding with OutputStream not working
I need to write a program that writes UTF-8 data to a file.
I found examples on the internet, but I cannot get the desired result.
Code:
import java.io.BufferedWriter;
import java.io.FileOutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;

public class UTF8WriterDemo {
    public static void main(String[] args) {
        Writer out = null;
        try {
            out = new BufferedWriter(
                    new OutputStreamWriter(new FileOutputStream("c://java//temp.txt"), "UTF-8"));
            String text = "This texáát will be added to File !!";
            out.write(text);
            out.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Everything runs successfully, but in the end the special characters are not displayed properly: This texáát will be added to File !!
I tried several examples from the internet with the same result.
I use Visual Studio Code.
Where could the problem be?
Thank you
Solution 1:[1]
Your code is correct. You probably already have a file named temp.txt, so Java writes the text to the existing file (replacing its previous content). The problem may be the encoding that is already set for that file.
In other words, you can't (or at least shouldn't) write UTF-8 text to a file that uses, for example, WINDOWS-1250 encoding; otherwise you get exactly the result you described.
If the file did not exist, Java would automatically create it with UTF-8 encoding.
Possible solutions:
- Change the encoding of your current file (in most text editors you can open it, use Save As, and then specify UTF-8 as the encoding).
- Delete the file, and Java will create it automatically with the proper encoding.
By the way, you should pass a StandardCharsets constant instead of a String charsetName, so that you do not have to handle UnsupportedEncodingException:
new OutputStreamWriter(new FileOutputStream("temp.txt"), StandardCharsets.UTF_8)
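A minimal sketch of that suggestion, assuming Java 7 or later. It uses try-with-resources so the writer is closed even if an exception is thrown, and then dumps the raw bytes to show what actually landed in the file. The class name Utf8WriteSketch, the helper names, and the scratch file are hypothetical and not part of the original code. The accented character is written as a \u escape so the result does not depend on the encoding of the source file.

```java
import java.io.BufferedWriter;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class Utf8WriteSketch {

    // Writes text as UTF-8, replacing any previous content of the file.
    static void writeUtf8(Path file, String text) throws IOException {
        try (Writer out = new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(file.toFile()), StandardCharsets.UTF_8))) {
            out.write(text);
        }
    }

    // Renders the raw bytes of the file as space-separated uppercase hex.
    static String hexDump(Path file) throws IOException {
        StringBuilder sb = new StringBuilder();
        for (byte b : Files.readAllBytes(file)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical scratch file; substitute your own path.
        Path file = Files.createTempFile("utf8-demo", ".txt");
        writeUtf8(file, "t\u00e1"); // "t" followed by a-acute
        System.out.println(hexDump(file)); // prints 74 C3 A1
        Files.delete(file);
    }
}
```

If the dump shows C3 A1 for each á, the file content is valid UTF-8, and any garbling comes from how the file is being displayed.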
Solution 2:[2]
When you say "I see special characters not showing properly", where are you seeing them?
What you show next looks like the string, UTF-8 encoded (i.e. each accented a is represented by 2 chars, in what appears to be the appropriate encoding).
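The "2 chars per accented letter" effect can be reproduced in a few lines. This is only a sketch: it assumes the reader decodes the bytes as ISO-8859-1, which is one possible wrong guess (the viewer in the question may have used a different single-byte charset), and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class MojibakeSketch {

    // Encodes text as UTF-8, then decodes the same bytes with the wrong charset,
    // which is what a viewer does when it guesses the encoding incorrectly.
    static String misread(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String shown = misread("\u00e1"); // a single a-acute
        System.out.println(shown.length()); // prints 2
        // Print code points instead of the characters themselves,
        // so the console encoding cannot distort the result.
        shown.codePoints().forEach(cp -> System.out.printf("U+%04X%n", cp));
        // prints U+00C3 then U+00A1
    }
}
```

The bytes in the file are the same in both cases; only the charset used to interpret them differs.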
I would expect the issue to be that the Java code does not write a BOM at the beginning of the file, which leaves the interpretation of the UTF-8 sequences to the discretion of the reader.
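If a BOM turns out to be what the reading program needs, one way to add it is to write the character U+FEFF first, which the UTF-8 encoder turns into the bytes EF BB BF. This is a sketch under that assumption, not a confirmed fix for the question; the class name, method name, and scratch file are hypothetical.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class Utf8BomSketch {

    // Writes a UTF-8 byte order mark followed by the text.
    // Java's UTF-8 encoder does not emit a BOM on its own, so it is written explicitly.
    static void writeUtf8WithBom(Path file, String text) throws IOException {
        try (Writer out = new OutputStreamWriter(
                new FileOutputStream(file.toFile()), StandardCharsets.UTF_8)) {
            out.write('\uFEFF'); // encoded as EF BB BF
            out.write(text);
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical scratch file; substitute your own path.
        Path file = Files.createTempFile("utf8-bom-demo", ".txt");
        writeUtf8WithBom(file, "t\u00e1");
        byte[] bytes = Files.readAllBytes(file);
        System.out.printf("%02X %02X %02X%n",
                bytes[0] & 0xFF, bytes[1] & 0xFF, bytes[2] & 0xFF); // prints EF BB BF
        Files.delete(file);
    }
}
```

Whether a BOM is desirable depends on the consumer: some editors use it to detect UTF-8, while other tools treat it as unexpected leading bytes.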
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | user1664043 |
