Is there a better and faster way to check user input than IF statements using Python and Streamlit?
I have a dataframe with about 22 columns. I want to let the user apply a custom filter based on their input: the app displays a list of checkboxes, and the filter is built from the checked ones.
Example of dataframe:
data = {'name':['Tom', 'nick', 'krish', 'jack', 'Tom'],
'nickname':['jack','krish','karim','joe', 'joe'],
'date':['2013','2018','2022','2013','2013'],
'loc':['loc1','loc2','loc1','loc3','loc2'],
'dep':['manager','accounting','sales','sales','HR'],
'status':['in','out','out','in','in'],
'desc':['the boss ','employee with good attitude','can work harder',' he got the will to work in a team',''],
'age':[20, 18, 19, 18, 22]}
Each field can be checked to take the user's input.
The more fields I have, the more IF statements I need. Is there a better and faster way to do it?
The code:
import streamlit as st
import pandas as pd
data = {'name':['Tom', 'nick', 'krish', 'jack', 'Tom'],
'nickname':['jack','krish','karim','joe', 'joe'],
'date':['2013','2018','2022','2013','2013'],
'loc':['loc1','loc2','loc1','loc3','loc2'],
'dep':['manager','accounting','sales','sales','HR'],
'status':['in','out','out','in','in'],
'desc':['the boss ','employee with good attitude','can work harder',' he got the will to work in a team',''],
'age':[20, 18, 19, 18, 22]}
df = pd.DataFrame(data)
st.write(df)
df_result_search = pd.DataFrame()
df['date'] = pd.to_datetime(df['date'])
df = df.sort_values(by='date',ascending=True)
date_sort=df.date.unique()
searchcheckbox_name_nickname = st.checkbox("Name or Nickname ",value = False,key=1)
searchcheckbox_age = st.checkbox("age",value = False,key=2)
searchcheckbox_date = st.checkbox("Date",value = False,key=3)
searchcheckbox_loc = st.checkbox("Loc",value = False,key=4)
if searchcheckbox_name_nickname:
    name_search = st.text_input("name")
    nickname_search = st.text_input("nickname")
else:
    name_search = ''
    nickname_search = ''
if searchcheckbox_age:
    age_search = st.number_input("age",min_value=0)
else:
    age_search = 0
if searchcheckbox_date:
    # widget keys must be unique; key=1 is already used by the first checkbox
    date_search = st.select_slider("Select date",date_sort,key=5)
else:
    date_search = ''
if searchcheckbox_loc:
    loc_search = st.multiselect("Select location",df['loc'].unique())
else:
    loc_search = ''
if st.button("search"):
    # 1. only name/nickname is checked
    if searchcheckbox_name_nickname and not searchcheckbox_age and not searchcheckbox_date and not searchcheckbox_loc:
        # if name is specified but not the nickname
        if name_search != '' and nickname_search == '':
            df_result_search = df[df['name'].str.contains(name_search, case=False, na=False)]
        # if nickname is specified but not the name
        elif name_search == '' and nickname_search != '':
            df_result_search = df[df['nickname'].str.contains(nickname_search, case=False, na=False)]
        # if both name and nickname are specified
        elif name_search != '' and nickname_search != '':
            df_result_search = df[(df['name'].str.contains(name_search, case=False, na=False)) & (df['nickname'].str.contains(nickname_search, case=False, na=False))]
        # if user does not enter anything
        else:
            st.warning('Please enter at least a name or a nickname')
    # . name/nickname + loc is checked
    elif searchcheckbox_name_nickname and searchcheckbox_loc and not searchcheckbox_date and not searchcheckbox_age:
        if name_search != '' and nickname_search == '' and loc_search != '':
            df_result_search = df[df['name'].str.contains(name_search, case=False, na=False) & (df['loc'].isin(loc_search))]
        # if nickname is specified but not the name
        elif name_search == '' and nickname_search != '' and loc_search != '':
            df_result_search = df[df['nickname'].str.contains(nickname_search, case=False, na=False) & (df['loc'].isin(loc_search))]
        # if both name and nickname are specified
        elif name_search != '' and nickname_search != '' and loc_search != '':
            df_result_search = df[(df['name'].str.contains(name_search, case=False, na=False)) & (df['nickname'].str.contains(nickname_search, case=False, na=False)) & (df['loc'].isin(loc_search))]
    # . name/nickname + date is checked
    elif searchcheckbox_name_nickname and searchcheckbox_date and not searchcheckbox_age:
        if name_search != '' and nickname_search == '' and date_search != '':
            df_result_search = df[df['name'].str.contains(name_search, case=False, na=False) & (df['date'] == date_search)]
        # if nickname is specified but not the name
        elif name_search == '' and nickname_search != '' and date_search != '':
            df_result_search = df[df['nickname'].str.contains(nickname_search, case=False, na=False) & (df['date'] == date_search)]
        # if both name and nickname are specified
        elif name_search != '' and nickname_search != '' and date_search != '':
            df_result_search = df[(df['name'].str.contains(name_search, case=False, na=False)) & (df['nickname'].str.contains(nickname_search, case=False, na=False)) & (df['date'] == date_search)]
    # . only age is checked
    elif not searchcheckbox_name_nickname and not searchcheckbox_date and searchcheckbox_age:
        if age_search != 0:
            df_result_search = df[df['age'] == age_search]
    # . only date is checked
    elif not searchcheckbox_name_nickname and not searchcheckbox_age and searchcheckbox_date:
        if date_search != '':
            df_result_search = df[df['date'] == date_search]
    # . only loc is checked
    elif not searchcheckbox_name_nickname and not searchcheckbox_age and not searchcheckbox_date and searchcheckbox_loc:
        if loc_search != '':
            df_result_search = df[df['loc'].isin(loc_search)]
    # . if all are checked
    else:
        # loc_search is a list (multiselect), so use isin() rather than ==
        df_result_search = df[(df['name'].str.contains(name_search, case=False, na=False)) & (df['nickname'].str.contains(nickname_search, case=False, na=False)) & (df['age'] == age_search) & (df['date'] == date_search) & (df['loc'].isin(loc_search))]
    st.write("{} Records ".format(str(df_result_search.shape[0])))
    st.dataframe(df_result_search)
Based on the answer of @Jérôme Richard, this is the error that is displayed. How can I fix it, and where is the error?
Solution 1:[1]
Assuming the filters on each field are independent, you can filter the rows of the dataframe by producing a boolean column (i.e. a NumPy array) for each filter. You can then combine the results of the filters with logical ORs and logical ANDs to produce a final filtering mask. Here is a simplified example:
import numpy as np
# Start with masks that keep every row.
names_filter_mask = np.ones(len(df), dtype=bool)
nickname_filter_mask = np.ones(len(df), dtype=bool)
# One condition per column.
# If there are many columns, you can use a for loop
# iterating over a predefined list of filtered columns
# and store the filter masks in an N x len(df) array.
# should_filter_*() and get_*_from_user_interface() are placeholders for the UI logic.
if should_filter_name():
    name_search = get_name_from_user_interface()
    names_filter_mask = df['name'].str.contains(name_search, case=False, na=False).to_numpy()
if should_filter_nickname():
    nickname_search = get_nickname_from_user_interface()
    nickname_filter_mask = df['nickname'].str.contains(nickname_search, case=False, na=False).to_numpy()
# Mix the filters
merged_filter_mask = names_filter_mask & nickname_filter_mask
# Get the filtered lines
df_result_search = df[merged_filter_mask]
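The loop over a predefined list of columns mentioned in the comments could be sketched roughly as follows. This is only an illustration under assumptions: the `FILTERS` registry, the `is_empty` and `apply_filters` helpers and the `user_input` dictionary are hypothetical names that do not appear in the answer, and the dataframe is a trimmed copy of the one in the question. In a Streamlit app, the dictionary values would presumably come from the widgets that are shown when their checkbox is ticked.

```python
import numpy as np
import pandas as pd

# Trimmed copy of the question's sample data.
df = pd.DataFrame({
    'name': ['Tom', 'nick', 'krish', 'jack', 'Tom'],
    'nickname': ['jack', 'krish', 'karim', 'joe', 'joe'],
    'loc': ['loc1', 'loc2', 'loc1', 'loc3', 'loc2'],
    'age': [20, 18, 19, 18, 22],
})

# Hypothetical registry: one mask-building function per filterable column.
# Adding a new field means adding one entry here, not another IF branch.
FILTERS = {
    'name': lambda d, v: d['name'].str.contains(v, case=False, na=False),
    'nickname': lambda d, v: d['nickname'].str.contains(v, case=False, na=False),
    'loc': lambda d, v: d['loc'].isin(v),
    'age': lambda d, v: d['age'] == v,
}

def is_empty(value):
    """True for inputs that should not restrict the result (None, '', [])."""
    return value is None or (isinstance(value, (str, list)) and len(value) == 0)

def apply_filters(d, user_input):
    """AND together the masks of every filter the user actually filled in."""
    mask = np.ones(len(d), dtype=bool)
    for column, value in user_input.items():
        if is_empty(value):
            continue  # unchecked or blank field: keep all rows for this column
        mask &= FILTERS[column](d, value).to_numpy()
    return d[mask]

# Example input: name and loc are filled in, nickname and age are left out.
user_input = {'name': 'tom', 'nickname': '', 'loc': ['loc2'], 'age': None}
result = apply_filters(df, user_input)
print(result)
```

With this layout the number of branches no longer grows with the number of fields. If some fields should be combined with OR instead (for example name or nickname), their two masks can be merged with `|` first and the result AND-ed into `mask`.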
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Jérôme Richard |

