Compare two timetables

I am trying to compare two timetables for a college project: find the free slots in one, then check which events are taking place in the other at those times. For example, if the student has a free slot from 4-5 on a Tuesday, my code would check whether any events are on at that time. I scraped the events website to get the event data. I can't seem to get the comparison part right and am looking for help with that section. This is my code:

import pandas as pd
import time
from selenium import webdriver 
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.support.ui import Select

This is for scraping the events page:

event_dates = {} #dictionary to store event dates and time
df = pd.read_html('https://www.dcu.ie/students/events')[0]['Event date']
event_dates.update(df)
print(event_dates)

This is for the timetable, checking for free slots in it:

s = Service(ChromeDriverManager().install())
driver = webdriver.Chrome(service=s)
driver.maximize_window() #opens up website, and fills in information  

driver.implicitly_wait(30)

driver.get('https://opentimetable.dcu.ie/')

select = Select(driver.find_element(By.TAG_NAME, "select")) #searches for the select on the timetable
select.select_by_visible_text("Programmes of Study") # goes to programmes of study 

search = driver.find_element(By.ID, "textSearch")
search.send_keys("CASE2") #types course code into search box

checkbox = driver.find_element(By.XPATH, './/input[following-sibling::div[contains(text(), "CASE2")]]')

checkbox.click() #clicks button for course code

time.sleep(3)

html = driver.find_element(By.ID, "week-pdf-content").get_attribute('outerHTML')
df2 = pd.read_html(html)[0] 


#For printing free slots on timetable 
df3 = df2.set_index('Unnamed: 0') 
#print(df3.head(10).to_dict()) #for checking dataframe 
for column in df3:
    print("You have free slots at the following times:")
    print(f'{column}:{", ".join(df3[df3[column].isna()].index.drop_duplicates().to_list())}')

So currently I get this output from the events page: Events

and from the timetable I get this output

timetable

So my expected output is a string saying you have a free slot on, e.g., Tuesday at 3-4, and this event is on at this time.

Any help appreciated, thanks.



Solution 1:[1]

The best option is to convert the datetime strings to datetime objects and then do your operations on those. I am using an example to explain my logic; you can wrap it in a function and scale it.

Convert the string 'February 25, 09:00 - February 25, 17:00' to 2 datetime objects (start and end).

import datetime as dt


time_range_string = 'February 25, 09:00  - February 25, 17:00'

# Split the string to start and end time
time_range_list = time_range_string.split("-")

# getting the start and end time string
start_time_string, end_time_string = [part.strip() for part in time_range_list]

start_time = dt.datetime.strptime(start_time_string, '%B %d, %H:%M')
end_time = dt.datetime.strptime(end_time_string, '%B %d, %H:%M')

Now you have two datetime objects (start_time and end_time). Since the string contains no year, strptime sets the year to its default, 1900.
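With the year left at 1900, the weekday of the parsed object will not match the real event, which matters if you want to say "Tuesday" in the output. A minimal sketch of one way around this, assuming you know which year the events fall in (the value 2022 and the helper name parse_with_year are illustrative assumptions, not part of the scraped data):

```python
import datetime as dt

# Assumption: the events take place in 2022; the scraped text carries no year.
ASSUMED_YEAR = 2022


def parse_with_year(text, year=ASSUMED_YEAR):
    """Parse a string like 'February 25, 09:00' as a datetime in `year`."""
    # Prepending the year lets strptime validate the date against that year
    # (for example, February 29 only parses in a leap year).
    return dt.datetime.strptime(f"{year} {text.strip()}", "%Y %B %d, %H:%M")


start = parse_with_year("February 25, 09:00")
print(start)            # full datetime including the assumed year
print(start.weekday())  # Monday is 0, Sunday is 6
```

The rest of the answer works the same way with these objects; only the date passed to combine needs to use the same year.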

Consider the string '8:00, 9:00, 10:00, 11:00, 12:00, 13:00, 14:00, 15:00, 16:00, 17:00, 18:00, 19:00, 20:00, 21:00'. Split these to a list.

time_string = '8:00, 9:00, 10:00, 11:00, 12:00, 13:00, 14:00, 15:00, 16:00, 17:00, 18:00, 19:00, 20:00, 21:00'
time_list = time_string.split(',')

Now combine the required date (in this case datetime(1900, 2, 25)) with each time to form a new datetime object. Then we can check whether the new object falls between start_time and end_time.

for time_string in time_list:
    # Combine date and time
    current_date = dt.datetime.combine(dt.datetime(1900, 2, 25), dt.datetime.strptime(time_string.strip(), '%H:%M').time())
    # Check whether current_date is between start_time and end_time
    if start_time <= current_date <= end_time:
        print(current_date)

This will print all times that are between the start_time and end_time. You may change the code and logic to get the required output. Create a function and then scale it to all cases.
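As a rough sketch of what such a function could look like: the names parse_event_range and slots_during_event are hypothetical, the year is an assumption you have to supply, and the free slots are assumed to be the slot start times for the same date as the event.

```python
import datetime as dt


def parse_event_range(text, year):
    """Split 'February 25, 09:00 - February 25, 17:00' into (start, end)."""
    start_text, end_text = (part.strip() for part in text.split("-"))
    fmt = "%Y %B %d, %H:%M"
    start = dt.datetime.strptime(f"{year} {start_text}", fmt)
    end = dt.datetime.strptime(f"{year} {end_text}", fmt)
    return start, end


def slots_during_event(free_slots, day, start, end):
    """Return the free slots on `day` that begin while the event is running."""
    hits = []
    for slot in free_slots:
        slot_time = dt.datetime.strptime(slot.strip(), "%H:%M").time()
        moment = dt.datetime.combine(day, slot_time)
        # A slot that begins exactly when the event ends does not count.
        if start <= moment < end:
            hits.append(slot.strip())
    return hits


# Example data: the event string from above and a made-up list of free slots.
event = "February 25, 09:00  - February 25, 17:00"
start, end = parse_event_range(event, year=2022)  # the year is an assumption
free = ["8:00", "9:00", "16:00", "17:00"]
matches = slots_during_event(free, start.date(), start, end)
for slot in matches:
    print(f"You have a free slot on {start:%A} at {slot}; an event is on at this time.")
```

Unlike the loop above, this sketch treats the end time as exclusive, so a free slot starting at 17:00 is not matched against an event that finishes at 17:00. Change the comparison back to <= if you prefer the original behaviour.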

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Fahid Latheef A