How to convert a byte array back into a string?

I have written code for Huffman compression. My padded encoded string is converted into a byte array using the following code:

def make_byte_array(self, padded_text):
    byte_array = bytearray()
    for i in range(0, len(padded_text), 8):
        byte_array.append(int(padded_text[i:i + 8], 2))

    return byte_array

How can I convert the byte array back into the original string of my padded encoded text?
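
One possible approach, sketched here rather than taken from the original post: since make_byte_array packs every 8 characters into one byte, the inverse is to expand each byte back into exactly 8 binary digits. The name byte_array_to_bit_string is hypothetical, and make_byte_array is repeated as a plain function only so the example runs on its own.

```python
def make_byte_array(padded_text):
    """Same packing logic as in the question, written as a plain function."""
    byte_array = bytearray()
    for i in range(0, len(padded_text), 8):
        byte_array.append(int(padded_text[i:i + 8], 2))
    return byte_array


def byte_array_to_bit_string(byte_array):
    """Hypothetical inverse: expand every byte to exactly 8 binary digits."""
    # format(byte, "08b") keeps leading zeros, which bin(byte) would drop
    return "".join(format(byte, "08b") for byte in byte_array)


padded_text = "0000000111111111"
packed = make_byte_array(padded_text)
restored = byte_array_to_bit_string(packed)
print(restored == padded_text)  # True
```

The zero-padded width matters: a byte such as 1 must come back as "00000001", otherwise the bit positions of every later code shift.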

Edit:

Here is some more context that might make the question easier to answer. The padded compressed text I am converting into a byte array is saved to a binary file. When I read the binary file, this is the output I get:

b'\xd8\xd2.\xfdc\xa9\xfd\xc4\xa2R\xf8\xack\xb4\xfe\x07&@'

The above string is what I need to convert back into the compressed padded text.
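
As an illustration using the bytes shown above, here is a sketch of a second way to get the bit string: treat the whole bytes object as one big integer and zero-fill to the full width. The helper name bits_from_bytes and the file name compressed.bin are assumptions; the file is written to a temporary directory only so the example can read it back.

```python
import os
import tempfile


def bits_from_bytes(data):
    """Hypothetical helper: bytes -> string of 0s and 1s, 8 digits per byte."""
    if not data:
        return ""
    as_int = int.from_bytes(data, byteorder="big")
    # bin() drops leading zeros, so pad back to the full width
    return bin(as_int)[2:].zfill(len(data) * 8)


original = b'\xd8\xd2.\xfdc\xa9\xfd\xc4\xa2R\xf8\xack\xb4\xfe\x07&@'

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "compressed.bin")  # assumed file name
    with open(path, "wb") as f:
        f.write(original)
    with open(path, "rb") as f:  # "rb" returns bytes, not str
        read_back = f.read()

bit_string = bits_from_bytes(read_back)
print(len(bit_string))    # 144 (18 bytes * 8 bits)
print(bit_string[:8])     # 11011000 (0xd8)
print(bit_string[-8:])    # 01000000 (b'@' is 0x40)
```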

To explain my code further: so far I have only written the compression part, using Huffman coding. Below is the code for my Huffman compression:

import heapq

# NOTE: HeapNode is used below, but its definition is not included in this question.


class HuffmanCoding:
    def __init__(self, text_to_compress):
        self.text_to_compress = text_to_compress  # text that will be compressed
        self.heap = []
        self.codes = {}  # will store the Huffman code of each character
        self.decompress_map = {}

    def get_frequency(self):  # counts how often each character occurs in the text
        frequency_Dictionary = {}  # maps each character to its frequency

        for character in self.text_to_compress:  # iterates through the text to be compressed
            if character in frequency_Dictionary:
                # character already seen: increase its count by 1
                frequency_Dictionary[character] = frequency_Dictionary[character] + 1
            else:
                # first occurrence of this character: start its count at 1
                frequency_Dictionary[character] = 1

        return frequency_Dictionary

    def make_queue(self, frequency):  # creates the priority queue of each character and its associated frequency
        for key in frequency:
            node = HeapNode(key, frequency[key])  # create node (character) and store its frequency alongside it
            heapq.heappush(self.heap, node)  # Push the node into the heap

    def merge_nodes(self):
        # Builds the Huffman tree by repeatedly popping the two minimum nodes and merging them,
        # until there is only one node left
        while len(self.heap) > 1:
            node1 = heapq.heappop(self.heap)  # pop node from top of heap
            node2 = heapq.heappop(self.heap)  # pop next node which is now at the top of heap

            merged = HeapNode(None, node1.freq + node2.freq)  # merge the two nodes we popped out from heap
            merged.left = node1
            merged.right = node2

            heapq.heappush(self.heap, merged)  # push merged node back into the heap

    def make_codes(self, root, current_code):  # Creates Huffman code for each character
        if root is None:
            return

        if root.char is not None:
            self.codes[root.char] = current_code
            self.decompress_map[current_code] = root.char

        self.make_codes(root.left, current_code + "0")  # Every time you traverse left, add a 0 - Recursive Call
        self.make_codes(root.right, current_code + "1")  # Every time you traverse right, add a 1 - Recursive Call

    def assignCodes(self):  # Assigns codes to each character
        root = heapq.heappop(self.heap)  # extracts root node from heap
        current_code = ""
        self.make_codes(root, current_code)

        # Save the code table once, after the whole tree has been traversed
        # (a local flag inside the recursive method would be reset on every call)
        with open("Codes.txt", mode="w") as codeDict:
            codeDict.write(str(self.decompress_map))

    def get_compressed_text(self, text):  # Replaces characters in original text with codes
        compressed_text = ""
        for character in text:
            compressed_text += self.codes[character]
        return compressed_text

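    def pad_encoded_text(self, compressed_text):
        # works out how much extra padding is required (0 if the length is already a multiple of 8)
        extra_padding = (8 - len(compressed_text) % 8) % 8
        compressed_text += "0" * extra_padding  # adds the amount of 0's that are required
        # NOTE: the padding length is not stored anywhere, so a decoder has to obtain it some other way

        return compressed_text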

    def make_byte_array(self, padded_text):

        byte_array = bytearray()
        for i in range(0, len(padded_text), 8):
            byte_array.append(int(padded_text[i:i + 8], 2))

        return byte_array

    def show_compressed_text(self):

        frequency = self.get_frequency()
        self.make_queue(frequency)
        self.merge_nodes()
        self.assignCodes()

        encoded_text = self.get_compressed_text(self.text_to_compress)
        padded_encoded_text = self.pad_encoded_text(encoded_text)

        byte_array = self.make_byte_array(padded_encoded_text)
        return bytes(byte_array)
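
The class relies on a HeapNode class whose definition is not shown in the question. The following is only a guess at what it might look like, based on how it is used above (a char, a freq, left and right children, and ordering by frequency so that heapq can compare nodes); the real class may differ.

```python
import heapq


class HeapNode:
    """Assumed shape of the node class; not taken from the question."""

    def __init__(self, char, freq):
        self.char = char      # character for a leaf, None for a merged node
        self.freq = freq      # frequency of the character or subtree
        self.left = None
        self.right = None

    def __lt__(self, other):
        # heapq only needs "<" to keep the lowest frequency on top
        return self.freq < other.freq


heap = []
for char, freq in {"a": 5, "b": 2, "c": 9}.items():
    heapq.heappush(heap, HeapNode(char, freq))

print(heapq.heappop(heap).char)  # b (lowest frequency)
```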

The class HuffmanCoding takes in the text to be compressed. The show_compressed_text method then returns my compressed text as bytes, which I write to a binary file. I want to open this binary file and convert the bytes inside back into the string of 0s and 1s that represents my compressed text, so that I can work on decompressing it. Hopefully that makes sense.
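
Once the string of 0s and 1s has been recovered, the decompression step could look roughly like the sketch below. It rests on several assumptions: the code table is loaded from text in the format written to Codes.txt (str of a dict, parsed here with ast.literal_eval), the toy table and bit string are made up for illustration, and padding_length is a hypothetical parameter, because the code above does not record how many padding zeros were added.

```python
import ast


def decode_bits(padded_bits, decompress_map, padding_length):
    """Sketch of a decoder: strip the padding, then match codes greedily."""
    bits = padded_bits[:-padding_length] if padding_length else padded_bits

    decoded = []
    current_code = ""
    for bit in bits:
        current_code += bit
        # Huffman codes are prefix-free, so the first match is the right one
        if current_code in decompress_map:
            decoded.append(decompress_map[current_code])
            current_code = ""
    return "".join(decoded)


# Made-up stand-in for the contents of Codes.txt
codes_file_text = "{'0': 'a', '10': 'b', '11': 'c'}"
decompress_map = ast.literal_eval(codes_file_text)

# "abca" encodes to 0 10 11 0 = "010110", plus 2 padding zeros
padded_bits = "01011000"
print(decode_bits(padded_bits, decompress_map, padding_length=2))  # abca
```

Without the padding length, the two trailing zeros in this toy example would decode as two extra "a" characters, which is why the decoder needs that number from somewhere, for instance stored alongside the compressed data.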

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
