Parsing (reading) prettytable text tables

I couldn't find any information about reading ASCII tables (the kind prettytable prints). I need to parse some tables that look like this:

+--------+--------+-------------------+
|               Planets               |
+--------+--------+-------------------+
| Planet | R (km) | mass (x 10^29 kg) |
+--------+--------+-------------------+
|  Sun   | 696000 |    1989100000     |
|(Solar) |        |                   |
+--------+--------+-------------------+
| Earth  |  6371  |      5973.6       |
+--------+--------+-------------------+
|  Moon  |  1737  |       73.5        |
+--------+--------+-------------------+
|  Mars  |  3390  |      641.85       |
+--------+--------+-------------------+

As you can see, this table contains a heading and a subtitle, but that isn't the main problem. Here is how I tried to parse it (all attempts were in Python; maybe a more suitable language exists):

  • module petl doesn't work; it could not read +--------+--------+-------------------+
  • np.genfromtxt or something similar doesn't work either, because there's no array
  • module asciitable doesn't work either:
asciitable.read("sample.txt",delimiter='|',guess=False,numpy=False, quotechar="'")
InconsistentTableError: Number of header columns (1) inconsistent with data columns (9) at data line 0
Header values: ['+----------+------------------------------------------------------------+---------------+-----------------+----------------+--------------------+-----------------------+']

And of course I tried all the combinations of parameters.

  • And yes, I tried the simple (but not so simple) way and used regular expressions. But when a "row" spans more than one line, it is too hard to catch all the exceptions.
  • I also tried a simple split, but that didn't work out well either...

I heard about a numpy approach that substitutes 0 and 1, but it's too hard with my table. Please help.
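
For reference, the split idea can probably be made to cope with multi-line rows if the border lines are treated as row separators instead of data. The following is only a minimal sketch under some assumptions: every line starts and ends with "+" or "|", no cell contains a "|" character, and all physical lines of one row have the same number of cells. The name parse_ascii_table is made up for this example and does not come from any library.

```python
def parse_ascii_table(text):
    """Return a list of logical rows; each row is a list of cell strings.

    Assumes a prettytable-like layout: '+' border lines separate rows and
    '|' separates cells (hypothetical helper, not a library function).
    """
    rows = []      # finished logical rows
    pending = []   # physical lines collected for the current logical row

    def flush():
        if pending:
            # zip(*pending) groups the pieces of each column together;
            # non-empty pieces are joined with a space.
            rows.append([" ".join(p for p in parts if p)
                         for parts in zip(*pending)])
            pending.clear()

    for line in text.strip().splitlines():
        line = line.strip()
        if line.startswith("+"):
            flush()  # a border line closes the current logical row
        elif line.startswith("|"):
            # Drop the outer pipes, then split the rest into cells.
            pending.append([cell.strip() for cell in line[1:-1].split("|")])
    flush()  # in case the text does not end with a border line
    return rows


sample = """
+--------+--------+-------------------+
|               Planets               |
+--------+--------+-------------------+
| Planet | R (km) | mass (x 10^29 kg) |
+--------+--------+-------------------+
|  Sun   | 696000 |    1989100000     |
|(Solar) |        |                   |
+--------+--------+-------------------+
| Earth  |  6371  |      5973.6       |
+--------+--------+-------------------+
|  Moon  |  1737  |       73.5        |
+--------+--------+-------------------+
|  Mars  |  3390  |      641.85       |
+--------+--------+-------------------+
"""

rows = parse_ascii_table(sample)
title, header, *data = rows  # first row is the heading, second the column names
```

With the sample table above, the two physical lines of the Sun row end up merged into a single row whose first cell is "Sun (Solar)". All values stay strings; converting the numeric columns with float() would be a separate step.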



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
