'Duplicated data in excel using Openpyxl
I have created a python script that will append data in excel. However, data that are being transferred in excel is having multiple duplication. Can someone help me fix my script?
tree = ET.parse('users.xml')
root = tree.getroot()
#create excel
wb = Workbook()
ws = wb.active
ws.title = ("Active Users")
df=pd.DataFrame(columns=["Login", "User Name", "Role", "Status"])
for user in root.findall('user'):
login = user.find('login').text
for m in tls.getUserByLogin(login):
user_status = int(m.get("isActive"))
if user_status == 1:
lastname = m.get("lastName")
firstname = m.get("firstName")
userLogin = m.get("login")
activeStatus = ("Active User")
role = m.get("globalRole")
tproject = m.get("tprojectRoles")
print("Login: " + userLogin + " " + lastname + " " + firstname + " Role: " + str(role['name']) + " " + str(activeStatus))
df.loc[len(df.index)] =[userLogin, lastname, str(role['name']), str(activeStatus)]
for row in dataframe_to_rows(df, index = False):
ws.append(row)
else:
inactive = (str(m.get("firstName")) + " " + str(m.get("lastName")) +": User is not Active")
print(inactive)
wb.save(filename = 'userData.xlsx')
The output in excel is this: Login = A1 , User Name = B1, Role = C1, Status = D1
- Login User Name Role Status
- admin Administrator Admin Active
- Login User Name Role Status
- admin Administrator Admin Active
- user1 Pedro leader Active
- Login User Name Role Status
- admin Administrator Admin Active
- user1 Pedro leader Active
- user2 Juan leader Active
Also, for my else loop for inactive users, is it possible to append them in the same excel file to another sheet? Thank you all
Solution 1:[1]
The ws.append() and ws.save should be outside of the ALL for loops, including the first one. Updated code here.
tree = ET.parse('users.xml')
root = tree.getroot()
#create excel
wb = Workbook()
ws = wb.active
ws.title = ("Active Users")
df=pd.DataFrame(columns=["Login", "User Name", "Role", "Status"])
for user in root.findall('user'):
login = user.find('login').text
for m in tls.getUserByLogin(login):
user_status = int(m.get("isActive"))
if user_status == 1:
lastname = m.get("lastName")
firstname = m.get("firstName")
userLogin = m.get("login")
activeStatus = ("Active User")
role = m.get("globalRole")
tproject = m.get("tprojectRoles")
print("Login: " + userLogin + " " + lastname + " " + firstname + " Role: " + str(role['name']) + " " + str(activeStatus))
df.loc[len(df.index)] =[userLogin, lastname, str(role['name']), str(activeStatus)]
else:
inactive = (str(m.get("firstName")) + " " + str(m.get("lastName")) +": User is not Active")
print(inactive)
### MOVED code here - note it should be outside ALL for loops ####
for row in dataframe_to_rows(df, index = False):
ws.append(row)
wb.save(filename = 'userData.xlsx')
Solution 2:[2]
Are you sure that users.xml only contains a unique user?
If you're not sure, I think it's better to check existing user logic.
to achieve that you can use a dictionary or array to temporary store your user in a loop and check if the current user was exists
. . .
user_tmp = []
for user in root.findall('user'):
login = user.find('login').text
# Check if login is in the list
if login not in user_tmp:
user_tmp.append(login)
else:
# if login is in the list, continue the loop
continue
. . .
since you are using the Pandas data frame, you can generate multiple sheets when saving the data frame with toExcel
# Example, you generate an active user in df_active and inactive user in # create a excel writer object
with pd.ExcelWriter("path to file\filename.xlsx") as writer:
# use to_excel function and specify the sheet_name and index
# to store the dataframe in specified sheet
df_active.to_excel(writer, sheet_name="Active", index=False)
df_inactive.to_excel(writer, sheet_name="Inactive", index=False)
I hope you can get hints to solve your issues from my suggestions.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Redox |
| Solution 2 | taipei |
