ByteArrayOutputStream.toString() generating extra characters

I have the following code:

ByteArrayOutputStream baos = new ByteArrayOutputStream();

int size = 4096;
byte[] bytes = new byte[size];

while (is.read(bytes, 0, size) != -1)
{
    baos.write(bytes);
    baos.flush();
}

When I do:

String s = baos.toString();

I get \u0000 characters appended to my string. If my character data fills only X of the buffer's Y bytes, the remaining Y-X bytes come through as \u0000, which makes it impossible to compare the result with equals. What am I doing wrong here? How should I be converting the bytes to a String in this case?



Solution 1:[1]

You should only be writing as much data as you are reading in each time through the loop:

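ByteArrayOutputStream baos = new ByteArrayOutputStream();

int size;
byte[] bytes = new byte[4096];

// The parentheses around the assignment are required: without them,
// != binds tighter than = and the code does not compile.
while ((size = is.read(bytes, 0, bytes.length)) != -1)
{
    // Write only the bytes that were actually read in this pass.
    baos.write(bytes, 0, size);
}
baos.flush();
String s = baos.toString();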

You might consider specifying an explicit character set when converting the bytes to a String. The no-arg toString() method uses the platform default encoding.
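As a minimal sketch of the charset suggestion, the loop can be wrapped in a helper that decodes with an explicit charset. The class name StreamToString and the method readAll are made up for illustration, and UTF-8 is only an assumption about how the incoming bytes are encoded:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class StreamToString {
    // Hypothetical helper: reads the stream to the end and decodes the
    // collected bytes with the charset supplied by the caller.
    static String readAll(InputStream is, Charset charset) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = is.read(buffer, 0, buffer.length)) != -1) {
            // Copy only the n bytes that this read call filled in.
            baos.write(buffer, 0, n);
        }
        // Decode explicitly instead of relying on the platform default.
        return new String(baos.toByteArray(), charset);
    }

    public static void main(String[] args) throws IOException {
        // Assumption for the demo: the data is UTF-8 encoded text.
        byte[] data = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        String s = readAll(new ByteArrayInputStream(data), StandardCharsets.UTF_8);
        System.out.println(s.equals("h\u00e9llo")); // prints true
    }
}
```

The charset passed in has to match the one the bytes were written with; otherwise non-ASCII characters may still decode incorrectly.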

Solution 2:[2]

The entire array (all 4096 bytes) is being written to the output - arrays have no idea of how much "useful data" they contain!

Store how much was read in a variable (InputStream.read returns the number of bytes read) and pass it to the OutputStream.write overload that takes an offset and a length, so that only the portion of the array containing useful data is written.

While the above change should "fix" the problem, it is generally recommended to use the string<->byte[] conversion forms that take in an explicit character set.
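To make the difference between the two write overloads concrete, here is a small sketch that counts how many bytes end up in the output under each approach. The class PaddingDemo and its method names are hypothetical, and the 5-byte input with a 16-byte buffer is chosen only to keep the numbers small:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class PaddingDemo {
    // Mirrors the code in the question: the whole buffer is written on
    // every pass, including the unused tail that is still zero-filled.
    static int copyWholeBuffer(InputStream is, int bufferSize) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        byte[] buffer = new byte[bufferSize];
        while (is.read(buffer, 0, bufferSize) != -1) {
            baos.write(buffer);
        }
        return baos.size();
    }

    // Keeps the count returned by read and writes only that many bytes.
    static int copyBytesRead(InputStream is, int bufferSize) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        byte[] buffer = new byte[bufferSize];
        int n;
        while ((n = is.read(buffer, 0, bufferSize)) != -1) {
            baos.write(buffer, 0, n);
        }
        return baos.size();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {1, 2, 3, 4, 5};
        // 5 bytes of input, 16-byte buffer: 11 padding bytes are written.
        System.out.println(copyWholeBuffer(new ByteArrayInputStream(data), 16)); // prints 16
        System.out.println(copyBytesRead(new ByteArrayInputStream(data), 16));   // prints 5
    }
}
```

The extra 11 bytes in the first case are the zero bytes that show up as \u0000 once the output is converted to a String.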

Solution 3:[3]

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Ted Hopp
Solution 2
Solution 3 afk5min