'Delete empty row - openpyxl

After spending the last few hours trying to find a way to do this, I've decided to just ask.

I've gone through the openpyxl docs more than a few times, as well as going through the questions asked here and here, and even the chapter from this online book, and none of it really answers what I'm attempting to do.

Here's the code I have right now:

for row in ws.iter_rows():
    i = 1
    if row[i].internal_value() == None:
        ws.Rows(i).Delete()
    else:
        i + 1

I've tried many different things with this, and right now I'm getting an error:

TypeError: 'NoneType' object is not callable

What am I doing wrong, and how can I fix it so that I'm iterating over all of the rows and deleting any that either are completely empty, or (if it's easier to implement) have an empty first cell?

Thanks



Solution 1:[1]

May be for someone next code will be useful:

index_row = []

# loop each row in column A
for i in range(1, ws.max_row):
    # define emptiness of cell
    if ws.cell(i, 1).value is None:
        # collect indexes of rows
        index_row.append(i)

# loop each index value
for row_del in range(len(index_row)):
    ws.delete_rows(idx=index_row[row_del], amount=1)
    # exclude offset of rows through each iteration
    index_row = list(map(lambda k: k - 1, index_row))

Solution 2:[2]

2018 update: I was searching how to delete a row today and found that the functionality is added in openpyxl 2.5.0-b2. Just tried and it worked perfectly. Here's the link where I found the answer: https://bitbucket.org/openpyxl/openpyxl/issues/964/delete_rows-does-not-work-on-deleting

And here's the syntax to delete one row:

ws.delete_rows(index, 1)

where: 'ws' is the worksheet, 'index' is the row number, and '1' is the number of rows to delete.

There's also the ability to delete columns, but I haven't tried that.

Solution 3:[3]

One of the reasons I found is you are initializing the value of i to 1 every time the loop is running make it as follows:

i=1
for row in ws.iter_rows():
    if row[i].internal_value() == None:
        ws.Rows(i).Delete()
    else:
        i + 1

Rest can answer after observing the full code.

Solution 4:[4]

The same logic can be applied to delete empty columns.

from openpyxl import *
import numpy as np
import os


path = "filepath"

workbooks = os.listdir(path)
workbooks = [_ for _ in workbooks if not _.startswith('~')]



for workbook in workbooks:
    wb2 = load_workbook(os.path.join(path, workbook))
    for sheet in wb2.worksheets:
        max_row_in_sheet = sheet.max_row
        max_col_in_sheet = sheet.max_column
        array_3 = np.array([])
        array_4 = np.array([])
        r = 1
        c = 1
        for r in range(1, max_row_in_sheet+1):
            array_1 = np.array([])
            array_2 = np.array([])
            for c in range (1, max_col_in_sheet+1):
                if sheet.cell(row = r, column = c).value == None:
                    array_1 = np.append(array_2, c)
                    array_2 = array_1
            if len(array_1) == max_col_in_sheet:
                array_3 = np.append(array_4, r)
                array_4 = array_3
                array_3 = array_3.astype(int)
        if len(array_3) != 0:
            index_of_last_array_element = len(array_3) - 1
            while index_of_last_array_element != -1:
                sheet.delete_rows(array_3[index_of_last_array_element], 1)
                index_of_last_array_element = index_of_last_array_element - 1
    wb2.save(workbook)

Solution 5:[5]

There are lots of reasons why openpyxl doesn't provide this possibility but you might be able to work something out based on this snippet: https://bitbucket.org/snippets/openpyxl/qyzKn

Otherwise have a look at xlwings for remote controlling Excel without having to mess with COM.

Solution 6:[6]

openpyxl.worksheet.worksheet.Worksheet.insert_rows()
openpyxl.worksheet.worksheet.Worksheet.insert_cols()
openpyxl.worksheet.worksheet.Worksheet.delete_rows()
openpyxl.worksheet.worksheet.Worksheet.delete_cols()

Specific row:

ws.insert_rows(7)

Column range (same for row):

ws.delete_cols(6, 3)

From the OpenPyXL documentation

(This is a 2018 feature, so remember to upgrade: python3 -m pip install openpyxl --upgrade)

Solution 7:[7]

This script iterates through all worksheets in the workbook and delete rows in list "rows_to_delete". Be sure to remove any table formatting before running the script. In other words, you want to convert your table to a normal range first.

import openpyxl
rows_to_delete = [None, '', ' ']
for i in wb.sheetnames:
    print(f'Now in sheet: {i}')
    ws = wb[i]
    # loop each row in column B
    column_b = range(1, ws.max_row)
    for i in reversed(column_b):
        if ws.cell(i, 2).value in rows_to_delete:
            print(f'Deleting Row: {ws.cell(i,2).row}')
            ws.delete_rows(ws.cell(i,2).row)

Solution 8:[8]

this works for both rows and cols:

import openpyxl
from openpyxl import *
import numpy as np

wb2 = openpyxl.load_workbook('/content/Drafts .xlsx')

for sheet in wb2.worksheets: 
    print ('Your currently in ', sheet)  
    max_row_in_sheet = sheet.max_row  
    max_col_in_sheet = sheet.max_column 
    print (max_row_in_sheet, max_col_in_sheet)

    array_3 = np.array([]) 
    array_4 = np.array([]) 

    r = 1  # initially declaring row as 1
    c = 1  # initially declaring column as 1
    for r in range(1, max_row_in_sheet + 1):  # 31 row
        array_1 = np.array([])
        array_2 = np.array([])
        for c in range(1, max_col_in_sheet + 1):  # 9 cols
            if sheet.cell(row=r, column=c).value == None:  # (9,1)
                array_1 = np.append(array_2, c)
                array_2 = array_1  # 1,2,3,4,5,6,7,8,9
        if len(array_1) == max_col_in_sheet:  # ( 9 == 9 )
            array_3 = np.append(array_4, r)  # 9
            array_4 = array_3
            array_3 = array_3.astype(int)
    if len(array_3) != 0:  # 11len
        index_of_last_array_element = len(array_3) - 1
        while index_of_last_array_element != -1:
            sheet.delete_rows(array_3[index_of_last_array_element], 1)
            index_of_last_array_element = index_of_last_array_element \
                - 1

    max_row_in_sheet = sheet.max_row  # maximum enterd row
    max_col_in_sheet = sheet.max_column  # maximum entered column

    print 'Maximum Rows and Cols after Removing'
    print (max_row_in_sheet, max_col_in_sheet)
    print '======================================'
    col_arr = []
    for x in range(1, sheet.max_column + 1):
        col_arr.append(0)

    for r in range(1, max_row_in_sheet + 1):
        array_1 = np.array([])
        array_2 = np.array([])
        for c in range(1, max_col_in_sheet + 1):
            if sheet.cell(row=r, column=c).value == None:
                array_1 = np.append(array_2, c)
                array_2 = array_1
                col_arr[c - 1] += 1
    print col_arr

    array_2 = [int(x) for x in array_2]
    print len(array_2)
    print array_2
    if len(array_2) != 0:
        index = len(array_2) - 1
        print index
        while index != -1:
            temp = array_2[index]

            # print(temp)

            sheet.delete_cols(temp, 1)
            index = index - 1

wb2.save('/content/outputs.xlsx')

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Oleksandr Shevelskyi
Solution 2
Solution 3 Yugansh Jindal
Solution 4
Solution 5 Charlie Clark
Solution 6 Punnerud
Solution 7 Tony S
Solution 8 joanis