'Is there a better and fast way to check user input beside IF statement using python and streamlit?

I have a dataframe that includes about 22 columns. I want to allow the user to perform a custom filter based on his input. Where the app displays a list of checkboxes that the filter is made based on the checked one.

Example of dataframe:

data = {'name':['Tom', 'nick', 'krish', 'jack', 'Tom'],
        'nickname':['jack','krish','karim','joe', 'joe'],
        'date':['2013','2018','2022','2013','2013'],
        'loc':['loc1','loc2','loc1','loc3','loc2'],
        'dep':['manager','accounting','sales','sales','HR'],
        'status':['in','out','out','in','in'],
        'desc':['the boss ','employee with good attitude','can work harder',' he got the will to work in a team',''],
        'age':[20, 18, 19, 18, 22]}

Each field can be checked to take the user input.

The more I have fields the more I will use IF statement. Is there a better and fast way to do it?

The code:

import streamlit as st 
import pandas as pd

data = {'name':['Tom', 'nick', 'krish', 'jack', 'Tom'],
        'nickname':['jack','krish','karim','joe', 'joe'],
        'date':['2013','2018','2022','2013','2013'],
        'loc':['loc1','loc2','loc1','loc3','loc2'],
        'dep':['manager','accounting','sales','sales','HR'],
        'status':['in','out','out','in','in'],
        'desc':['the boss ','employee with good attitude','can work harder',' he got the will to work in a team','']
        'age':[20, 18, 19, 18, 22]}

df = pd.DataFrame(data)
st.write(df)
df_result_search = pd.DataFrame() 


df['date'] = pd.to_datetime(df['date'])
df = df.sort_values(by='date',ascending=True)
date_sort=df.date.unique()

searchcheckbox_name_nickname = st.checkbox("Name or Nickname ",value = False,key=1)
searchcheckbox_age = st.checkbox("age",value = False,key=2)
searchcheckbox_date = st.checkbox("Date",value = False,key=3)
searchcheckbox_loc = st.checkbox("Loc",value = False,key=4)

if searchcheckbox_name_nickname:
    name_search = st.text_input("name")
    nickname_search = st.text_input("nickname")
else:
    name_search = ''
    nickname_search = ''

if searchcheckbox_age:   
    age_search = st.number_input("age",min_value=0)
else:
    age_search = 0

if searchcheckbox_date:
    date_search = st.select_slider("Select date",date_sort,key=1)
   
else:
    date_search = ''

if searchcheckbox_loc:
    loc_search = st.multiselect("Select location",df['loc'].unique())
   
else:
    loc_search = ''


if st.button("search"):
    # 1. only name/nickname is checked
    if searchcheckbox_name_nickname and not searchcheckbox_age and not searchcheckbox_date and not searchcheckbox_loc:
        # if name is specified but not the nickname
        if name_search != '' and nickname_search == '':
            df_result_search = df[df['name'].str.contains(name_search, case=False, na=False)]
        # if nickname is specified but not the name
        elif name_search == '' and nickname_search != '':
            df_result_search = df[df['nickname'].str.contains(nickname_search, case=False, na=False)]
        # if both name and nickname are specified
        elif name_search != '' and nickname_search != '':
            df_result_search = df[(df['name'].str.contains(name_search, case=False, na=False)) & (df['nickname'].str.contains(nickname_search, case=False, na=False))]
        # if user does not enter anything
        else:
            st.warning('Please enter at least a name or a nickname')
    
    # .  name/nickname + loc is checked
    elif searchcheckbox_name_nickname and searchcheckbox_loc and not searchcheckbox_date and not searchcheckbox_age:
        if name_search != '' and nickname_search == '' and loc_search !='':
            df_result_search = df[df['name'].str.contains(name_search, case=False, na=False)& (df['loc'].isin(loc_search))]
        # if nickname is specified but not the name
        elif name_search == '' and nickname_search != '' and loc_search !='':
            df_result_search = df[df['nickname'].str.contains(nickname_search, case=False, na=False) & (df['loc'].isin(loc_search))]
        # if both name and nickname are specified
        elif name_search != '' and nickname_search != '' and loc_search !='':
            df_result_search = df[(df['name'].str.contains(name_search, case=False, na=False)) & (df['nickname'].str.contains(nickname_search, case=False, na=False)) & (df['loc'].isin(loc_search))]
       
    # . name/nickname + date is checked
    elif searchcheckbox_name_nickname and searchcheckbox_date and not searchcheckbox_age:
        if name_search != '' and nickname_search == '' and date_search !='':
            df_result_search = df[df['name'].str.contains(name_search, case=False, na=False)& (df['date'] == date_search)]
        # if nickname is specified but not the name
        elif name_search == '' and nickname_search != '' and date_search !='':
            df_result_search = df[df['nickname'].str.contains(nickname_search, case=False, na=False) & (df['date'] == date_search)]
        # if both name and nickname are specified
        elif name_search != '' and nickname_search != '' and date_search !='':
            df_result_search = df[(df['name'].str.contains(name_search, case=False, na=False)) & (df['nickname'].str.contains(nickname_search, case=False, na=False)) & (df['date'] == date_search)]
       

    # . only age is checked
    elif not searchcheckbox_name_nickname and not searchcheckbox_date and searchcheckbox_age:
        if age_search != 0:
            df_result_search = df[df['age'] == age_search]
            
    # . only date is checked
    elif not searchcheckbox_name_nickname and not searchcheckbox_age and searchcheckbox_date:
        if date_search != '':
            df_result_search = df[df['date']==date_search]
    
    # . only loc is checked         
    elif not searchcheckbox_name_nickname and not searchcheckbox_age and not searchcheckbox_date and searchcheckbox_loc:
        if loc_search != '':
            df_result_search = df[df['loc'].isin(loc_search)]
    
    # . if all are checked
    else:
        df_result_search = df[(df['name'].str.contains(name_search, case=False, na=False)) & (df['nickname'].str.contains(nickname_search, case=False, na=False)) & (df['age'] == age_search) & (df['date'] == date_search) & (df['loc'] == loc_search)]
        
                    
    st.write("{} Records ".format(str(df_result_search.shape[0])))
    st.dataframe(df_result_search)

based on the answer of @Jérôme Richard this the error that is displayed how to fix it and where is the error ?

enter image description here



Solution 1:[1]

Assuming the filtering of each field is independent, you can filter the row of the dataframe by producing a column (ie. Numpy array) of boolean for each filter. Then you can apply several logical-ORs and logical-ANDs to mix the result of each filter so to produce a final filtering mask. Here is a simplified example:

import numpy as np

names_filter_mask = np.ones(len(df), dtype=bool)
nickname_filter_mask = np.ones(len(df), dtype=bool)

# One condition per column.
# If there is many column you can use a for loop 
# iterating over a predefined list of filtered column
# and store the filter mast in a N x len(df) array.

if should_filter_name():
    name_search = get_name_from_user_interface()
    names_filter_mask = df['name'].str.contains(name_search, case=False, na=False).to_numpy()

if should_filter_nickname():
    nickname_search = get_nickname_from_user_interface()
    nickname_filter_mask = df['nickname'].str.contains(nickname_search, case=False, na=False).to_numpy()

# Mix the filters
merged_filter_mask = names_filter_mask & nickname_filter_mask

# Get the filtered lines
df_result_search = df[merged_filter_mask]

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Jérôme Richard