Category "tabula-py"

Easiest way to ignore or drop one header row from the first page when parsing a table spanning several pages

I am parsing a PDF with tabula-py, and I need to ignore the first two tables, but then parse the rest of the tables as one, and export to a CSV. On the first re
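
One possible approach to the kind of problem described above, sketched with pandas only. It assumes the tables have already been extracted as a list of DataFrames (for example by `tabula.read_pdf(path, pages="all", multiple_tables=True, pandas_options={"header": None})`, so that every table has plain integer column labels). The function name `combine_tables` and its parameters are hypothetical, and the excerpt is truncated, so the asker's actual requirements may differ.

```python
import pandas as pd


def combine_tables(tables, skip_tables=2, drop_first_rows=1):
    """Skip leading tables, drop header row(s) from the first kept one, then concatenate.

    Assumes each item in `tables` is a DataFrame with the same integer column
    labels, e.g. tables read without a header row (an assumption, not a fact
    taken from the question).
    """
    kept = list(tables[skip_tables:])
    if not kept:
        return pd.DataFrame()
    # Remove the unwanted header row(s) that appear only in the first kept table.
    kept[0] = kept[0].iloc[drop_first_rows:]
    # Stack the remaining tables into one frame with a fresh 0..n-1 index.
    return pd.concat(kept, ignore_index=True)
```

The combined frame could then be written out with `combine_tables(tables).to_csv("output.csv", index=False, header=False)`, where the output filename is a placeholder.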

tabula-py not reading all rows for PDFs with alternating row colors when lattice is set to True

I am trying to extract all rows from the PDF attached here. Here is the code I used: def parse_latticepdf_pages(pdf): pages = read_pdf( pdf,