JSON to Markdown table formatting

I'm trying to build a function that converts JSON data into a list, which then serves as the basis for building Markdown tables.

I have a first prototype:

#!/usr/bin/env python3
import json

data = {
  "statistics": {
    "map": [
      {
        "map_name": "Location1",
        "nan": "loc1",
        "dont": "ignore this",
        "packets": "878607764338"
      },
      {
        "map_name": "Location2",
        "nan": "loc2",
        "dont": "ignore this",
        "packets": "67989088698"
      },
    ],
    "map-reset-time": "Thu Jan  6 05:59:47 2022\n"
  }
}
headers = ['Name', 'NaN', 'Packages']

def jsonToList(data):
    """adds the desired json fields"""
    # Will be rewritten to accept different data fields.
    json_obj = data

    ips = []
    for piece in json_obj['statistics']['map']:
        this_ip = [piece['map_name'], piece['nan'], piece['packets']]
        ips.append(this_ip)

    return ips 

def markdownTable(data, headers):
    # Find the maximum length over all elements in the list
    n = max(len(x) for l in data for x in l)
    # Print the rows
    headerLength = len(headers)
  
    # expected "|        Name|         NaN|    Packages|"
    for i in range(len(headers)):
      # Takes the max number of characters and subtracts the length of the header word
      hn = n - len(headers[i])
      # Prints | [space based on row above][header word]
      print("|" + " " * hn + f"{headers[i]}", end='')
      # If this is the last header, add the closing pipe
      if i == headerLength-1:
        print("|") # End pipe for headers

        # expected |--------|--------|--------|
        print("|", end='') # Start pipe for sep row
        for i in   range(len(headers)):
          print ("-" *n + "|", end='')

        # Seems to add an extra line; however, if it's not there,
        # the Location1 row is printed on the same line as the separator
        print("\n", end='') 
        
    dataLength = len(data)
    for row in data:
      for x in row:
        hn = n - len(x)
        print(f"|" + " " * hn + x, end='')
      print("|")
 

if __name__ == "__main__":
    da = jsonToList(data)
    markdownTable(da, headers)

This code outputs as expected a table that can be used as markdown.

|        Name|         NaN|    Packages|
|------------|------------|------------|
|   Location1|        loc1|878607764338|
|   Location2|        loc2| 67989088698|

I was wondering if anyone has any good ideas regarding the placement of the words (centered). Currently I compute n = max(len(x) for l in data for x in l), subtract the length of the current string from it, and put that many spaces in front of the string. This works well for right alignment, but if I would like to have the text centered there's an issue.

Additionally, general feedback on ways to optimize the code is much appreciated, as are ways to go directly from JSON. This is my first attempt, so I'd be glad to hear from anyone who has built a similar function before.



Solution 1:[1]

If you are looking to justify text within Python there are some built-in methods to do so.

ljust, center, and rjust

The ljust, center, and rjust methods can be called on a str instance and return a string which is padded to the given length with the given fill character (space by default).

>>> s = 'foo'
>>> s.ljust(10)
'foo       '
>>> s.center(10)
'   foo    '
>>> s.rjust(10)
'       foo'
>>> # Use a different fill character
>>> s.center(11, '*')
'****foo****'
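Applied to the table in the question, these methods can pad every cell of a row to a shared width. The following is a minimal sketch, not the asker's code: the helper name pad_cells and its align argument are made up for illustration, and the width is assumed to be chosen by the caller.

```python
def pad_cells(cells, width, align="center"):
    """Pad every string in cells to width using the chosen str method.

    pad_cells and align are illustrative names, not from the original code.
    """
    methods = {"left": str.ljust, "center": str.center, "right": str.rjust}
    pad = methods[align]
    return [pad(cell, width) for cell in cells]


row = ["Location1", "loc1", "878607764338"]
# Build one table row with right-aligned cells that are 14 characters wide.
print("|" + "|".join(pad_cells(row, 14, "right")) + "|")
```

Switching the last argument to "left" or "center" changes the alignment without touching the rest of the row-building code.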

Format String Syntax

Alternatively, you can use the Format String Syntax (demonstrated using f-strings below). These are especially useful if you need to combine the padded string with other text as that can be included in the same string.

>>> f'{s:<10}'
'foo       '
>>> f'{s:^10}'
'   foo    '
>>> f'{s:>10}'
'       foo'
>>> # Pass in a length
>>> length = 11
>>> f'{s:^{length}}'
'    foo    '
>>> # Specify a fill character
>>> f'{s:*^11}'
'****foo****'
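Because both the alignment character and the width can be nested fields, a whole table row can be built from one format spec. A small sketch under that assumption; format_row is a hypothetical helper name, and the width of 14 is just the value that fits the sample data from the question.

```python
def format_row(cells, width, align="^"):
    """Join cells into one table row; align is '<', '^' or '>'."""
    # The alignment and width are filled into the format spec at run time.
    return "|" + "|".join(f"{cell:{align}{width}}" for cell in cells) + "|"


print(format_row(["Name", "NaN", "Packages"], 14))
print(format_row(["Location1", "loc1", "878607764338"], 14, ">"))
```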

Solution 2:[2]

general feedback on ways to optimize the code is much appreciated

I might do something like this:

data = {
  "statistics": {
    "map": [
      {
        "map_name": "Location1",
        "nan": "loc1",
        "dont": "ignore this",
        "packets": "878607764338"
      },
      {
        "map_name": "Location2",
        "nan": "loc2",
        "dont": "ignore this",
        "packets": "67989088698"
      },
    ],
    "map-reset-time": "Thu Jan  6 05:59:47 2022\n"
  }
}

header_map = {
    'Name': 'map_name',
    'NaN': 'nan',
    'Packages': 'packets'
}

def markdownTable(data):
    rows = []
    length = max(len(v) for d in data['statistics']['map'] for k, v in d.items() if k in header_map.values()) + 2
    # build header
    rows.append('|'.join(s.center(length) for s in header_map.keys()))
    rows.append('|'.join('-' * length for x in range(len(header_map))))
    # build body
    for item in data['statistics']['map']:
        rows.append('|'.join(v.center(length) for k, v in item.items() if k in header_map.values()))
    # Print rows
    for row in rows:
        print(f'|{row}|')

if __name__ == "__main__":
    markdownTable(data)

Note that there is no need to restructure the data with your jsonToList function. Just iterate through the existing data structure. Also, I created a header_map dictionary, which maps the table headers to the keys within the source data. In the code, just call either the header_map.keys() or header_map.values() as appropriate.
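One caveat: filtering item.items() yields the cells in the key order of each record, not in header order. If the input might list its keys differently, looking each key up through header_map.values() keeps the columns lined up with the headers. The sketch below also shows one way to start from JSON text with json.loads; the raw string and the extract_rows helper are made up for illustration, and missing keys are assumed to become empty cells.

```python
import json

# Hypothetical JSON text; the key order differs from the header order on purpose.
raw = '{"statistics": {"map": [{"packets": "1", "dont": "x", "nan": "a", "map_name": "A"}]}}'

header_map = {"Name": "map_name", "NaN": "nan", "Packages": "packets"}


def extract_rows(data, header_map):
    """Return one list of cells per record, ordered like header_map."""
    return [
        [str(record.get(key, "")) for key in header_map.values()]
        for record in data["statistics"]["map"]
    ]


rows = extract_rows(json.loads(raw), header_map)
print(rows)
```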

As a general rule, I try to avoid string concatenation with + unless absolutely necessary; hence my use of '|'.join(). Each call to join is passed a generator expression for that row.

Finally, I first built up a collection of rows, then iterated over them and printed each row (adding the opening and closing pipes). I could have printed each row the first time round, but including the entire join call within an f-string is not very readable. For example:

print(f"|{'|'.join(s.center(length) for s in header_map.keys())}|")

Also, with the second loop, I only need to define the format for the opening and closing pipes once, which is more DRY (Don't Repeat Yourself). Looking back, I could have also done the join in the second iteration for the same reason. This would have had the additional benefit of having the rows hold the raw data, which could have additional processing done to it if needed.
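That variation, with the rows kept as raw cells and the join deferred to a single place, might look roughly like this. It is a sketch rather than a drop-in replacement: render_table is a hypothetical name, it takes already extracted rows as input, and unlike the function above it also counts the header lengths when computing the width.

```python
def render_table(headers, rows):
    """Return the table lines for raw, unpadded header and body cells."""
    # Shared column width: longest cell plus one space on each side.
    width = max(len(cell) for line in [headers, *rows] for cell in line) + 2
    separator = ["-" * width] * len(headers)
    # Padding, joining and the outer pipes are all applied here, once.
    return [
        "|" + "|".join(cell.center(width) for cell in line) + "|"
        for line in [headers, separator, *rows]
    ]


rows = [
    ["Location1", "loc1", "878607764338"],
    ["Location2", "loc2", "67989088698"],
]
for line in render_table(["Name", "NaN", "Packages"], rows):
    print(line)
```

Since rows stays a plain list of lists, it can still be sorted or filtered before rendering.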

The output of the markdownTable function above is as follows:

|     Name     |     NaN      |   Packages   |
|--------------|--------------|--------------|
|  Location1   |     loc1     | 878607764338 |
|  Location2   |     loc2     | 67989088698  |

Notice that with center justification, an odd amount of padding cannot be split evenly; here the extra space ends up to the right of the text. Therefore, the values in the Name column and the last value in the Packages column appear to be off center by half a character. The numbers should probably be right-justified, which would complicate the code.
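One possible way to handle that is to pass an alignment per column and size each column separately. The sketch below is an assumption-laden illustration, not part of the original answer: format_table and aligns are made-up names, and the colons in the separator row only have an effect in renderers that support GitHub-style table alignment.

```python
def format_table(headers, rows, aligns):
    """Build table lines; aligns holds one of '<', '^', '>' per column."""
    table = [headers, *rows]
    # One width per column; at least 3 so the separator keeps a dash.
    widths = [max(3, *(len(line[i]) for line in table)) for i in range(len(headers))]

    def fmt(cells):
        padded = (f"{c:{a}{w}}" for c, a, w in zip(cells, aligns, widths))
        return "| " + " | ".join(padded) + " |"

    # A colon on one or both ends of the dashes marks the alignment.
    ends = {"<": (":", "-"), "^": (":", ":"), ">": ("-", ":")}
    sep = [ends[a][0] + "-" * (w - 2) + ends[a][1] for a, w in zip(aligns, widths)]
    return [fmt(headers), fmt(sep), *(fmt(r) for r in rows)]


rows = [
    ["Location1", "loc1", "878607764338"],
    ["Location2", "loc2", "67989088698"],
]
for line in format_table(["Name", "NaN", "Packages"], rows, ["<", "^", ">"]):
    print(line)
```

With ">" for the last column, the two packet counts line up on their final digit in the plain text as well.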

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Waylan
Solution 2 Waylan