'Delete empty row - openpyxl
After spending the last few hours trying to find a way to do this, I've decided to just ask.
I've gone through the openpyxl docs more than a few times, as well as going through the questions asked here and here, and even the chapter from this online book, and none of it really answers what I'm attempting to do.
Here's the code I have right now:
for row in ws.iter_rows():
i = 1
if row[i].internal_value() == None:
ws.Rows(i).Delete()
else:
i + 1
I've tried many different things with this, and right now I'm getting an error:
TypeError: 'NoneType' object is not callable
What am I doing wrong, and how can I fix it so that I'm iterating over all of the rows and deleting any that either are completely empty, or (if it's easier to implement) have an empty first cell?
Thanks
Solution 1:[1]
May be for someone next code will be useful:
index_row = []
# loop each row in column A
for i in range(1, ws.max_row):
# define emptiness of cell
if ws.cell(i, 1).value is None:
# collect indexes of rows
index_row.append(i)
# loop each index value
for row_del in range(len(index_row)):
ws.delete_rows(idx=index_row[row_del], amount=1)
# exclude offset of rows through each iteration
index_row = list(map(lambda k: k - 1, index_row))
Solution 2:[2]
2018 update: I was searching how to delete a row today and found that the functionality is added in openpyxl 2.5.0-b2. Just tried and it worked perfectly. Here's the link where I found the answer: https://bitbucket.org/openpyxl/openpyxl/issues/964/delete_rows-does-not-work-on-deleting
And here's the syntax to delete one row:
ws.delete_rows(index, 1)
where: 'ws' is the worksheet, 'index' is the row number, and '1' is the number of rows to delete.
There's also the ability to delete columns, but I haven't tried that.
Solution 3:[3]
One of the reasons I found is you are initializing the value of i to 1 every time the loop is running make it as follows:
i=1
for row in ws.iter_rows():
if row[i].internal_value() == None:
ws.Rows(i).Delete()
else:
i + 1
Rest can answer after observing the full code.
Solution 4:[4]
The same logic can be applied to delete empty columns.
from openpyxl import *
import numpy as np
import os
path = "filepath"
workbooks = os.listdir(path)
workbooks = [_ for _ in workbooks if not _.startswith('~')]
for workbook in workbooks:
wb2 = load_workbook(os.path.join(path, workbook))
for sheet in wb2.worksheets:
max_row_in_sheet = sheet.max_row
max_col_in_sheet = sheet.max_column
array_3 = np.array([])
array_4 = np.array([])
r = 1
c = 1
for r in range(1, max_row_in_sheet+1):
array_1 = np.array([])
array_2 = np.array([])
for c in range (1, max_col_in_sheet+1):
if sheet.cell(row = r, column = c).value == None:
array_1 = np.append(array_2, c)
array_2 = array_1
if len(array_1) == max_col_in_sheet:
array_3 = np.append(array_4, r)
array_4 = array_3
array_3 = array_3.astype(int)
if len(array_3) != 0:
index_of_last_array_element = len(array_3) - 1
while index_of_last_array_element != -1:
sheet.delete_rows(array_3[index_of_last_array_element], 1)
index_of_last_array_element = index_of_last_array_element - 1
wb2.save(workbook)
Solution 5:[5]
There are lots of reasons why openpyxl doesn't provide this possibility but you might be able to work something out based on this snippet: https://bitbucket.org/snippets/openpyxl/qyzKn
Otherwise have a look at xlwings for remote controlling Excel without having to mess with COM.
Solution 6:[6]
openpyxl.worksheet.worksheet.Worksheet.insert_rows()
openpyxl.worksheet.worksheet.Worksheet.insert_cols()
openpyxl.worksheet.worksheet.Worksheet.delete_rows()
openpyxl.worksheet.worksheet.Worksheet.delete_cols()
Specific row:
ws.insert_rows(7)
Column range (same for row):
ws.delete_cols(6, 3)
From the OpenPyXL documentation
(This is a 2018 feature, so remember to upgrade: python3 -m pip install openpyxl --upgrade)
Solution 7:[7]
This script iterates through all worksheets in the workbook and delete rows in list "rows_to_delete". Be sure to remove any table formatting before running the script. In other words, you want to convert your table to a normal range first.
import openpyxl
rows_to_delete = [None, '', ' ']
for i in wb.sheetnames:
print(f'Now in sheet: {i}')
ws = wb[i]
# loop each row in column B
column_b = range(1, ws.max_row)
for i in reversed(column_b):
if ws.cell(i, 2).value in rows_to_delete:
print(f'Deleting Row: {ws.cell(i,2).row}')
ws.delete_rows(ws.cell(i,2).row)
Solution 8:[8]
this works for both rows and cols:
import openpyxl
from openpyxl import *
import numpy as np
wb2 = openpyxl.load_workbook('/content/Drafts .xlsx')
for sheet in wb2.worksheets:
print ('Your currently in ', sheet)
max_row_in_sheet = sheet.max_row
max_col_in_sheet = sheet.max_column
print (max_row_in_sheet, max_col_in_sheet)
array_3 = np.array([])
array_4 = np.array([])
r = 1 # initially declaring row as 1
c = 1 # initially declaring column as 1
for r in range(1, max_row_in_sheet + 1): # 31 row
array_1 = np.array([])
array_2 = np.array([])
for c in range(1, max_col_in_sheet + 1): # 9 cols
if sheet.cell(row=r, column=c).value == None: # (9,1)
array_1 = np.append(array_2, c)
array_2 = array_1 # 1,2,3,4,5,6,7,8,9
if len(array_1) == max_col_in_sheet: # ( 9 == 9 )
array_3 = np.append(array_4, r) # 9
array_4 = array_3
array_3 = array_3.astype(int)
if len(array_3) != 0: # 11len
index_of_last_array_element = len(array_3) - 1
while index_of_last_array_element != -1:
sheet.delete_rows(array_3[index_of_last_array_element], 1)
index_of_last_array_element = index_of_last_array_element \
- 1
max_row_in_sheet = sheet.max_row # maximum enterd row
max_col_in_sheet = sheet.max_column # maximum entered column
print 'Maximum Rows and Cols after Removing'
print (max_row_in_sheet, max_col_in_sheet)
print '======================================'
col_arr = []
for x in range(1, sheet.max_column + 1):
col_arr.append(0)
for r in range(1, max_row_in_sheet + 1):
array_1 = np.array([])
array_2 = np.array([])
for c in range(1, max_col_in_sheet + 1):
if sheet.cell(row=r, column=c).value == None:
array_1 = np.append(array_2, c)
array_2 = array_1
col_arr[c - 1] += 1
print col_arr
array_2 = [int(x) for x in array_2]
print len(array_2)
print array_2
if len(array_2) != 0:
index = len(array_2) - 1
print index
while index != -1:
temp = array_2[index]
# print(temp)
sheet.delete_cols(temp, 1)
index = index - 1
wb2.save('/content/outputs.xlsx')
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Oleksandr Shevelskyi |
| Solution 2 | |
| Solution 3 | Yugansh Jindal |
| Solution 4 | |
| Solution 5 | Charlie Clark |
| Solution 6 | Punnerud |
| Solution 7 | Tony S |
| Solution 8 | joanis |
