How to find the session time from a stream of user login events in Python?
Given user login events that arrive continuously in a stream processing system, as in the sample below, where each line is one event (you can assume the pipeline is external to this program and calls a function once per line):
1532926994 User01 LogOutSuccessful
1532926981 User02 LogInSuccessful
1532926982 User04 LogInFailed
1532926992 User01 LogInSuccessful
1532926986 User02 LogOutSuccessful
1532927003 User03 LogOutSuccessful
Implement a module using only the standard library (e.g. no Spark) that continuously processes these events and, as soon as a successful logout occurs, outputs the user's session duration in a structured format similar to:
{ "username": "User02", "session_duration": 5 }
{ "username": "User05", "session_duration": 10 }
I was asked this question in an interview. I could parse and extract the data, but I was not able to store the user events and use them to compute the session time. Any guidance would be appreciated.
We have to write a Python function that takes one line at a time and takes some action when it receives LogInSuccessful or LogOutSuccessful. I think we can ignore LogInFailed for the moment.
def get_user_session_time(stream_text):
    user_info = stream_text.split()
    if user_info[2] == 'LogInSuccessful':
        # store the data somewhere
        pass
    elif user_info[2] == 'LogOutSuccessful':
        # get the data stored in the above step
        # compute the session time
        # print the key value pair { "username": "User02", "session_duration": 5 }
        pass
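One possible way to fill in the skeleton above is to keep the unmatched timestamps in dictionaries keyed by username. The sketch below is only an illustration under stated assumptions, not a definitive answer: the `SessionTracker` class and its `process` method are hypothetical names, each user is assumed to have at most one open session at a time, and malformed lines are silently skipped. Because the sample shows a logout for User01 arriving before the matching login, the sketch stores pending logouts as well and emits the duration once both events have been seen; a logout with no login (User03 in the sample) simply stays pending.

```python
import json


class SessionTracker:
    """Hypothetical helper that pairs login and logout events per user."""

    def __init__(self):
        self.logins = {}   # username -> timestamp of the unmatched login
        self.logouts = {}  # username -> timestamp of the unmatched logout

    def process(self, line):
        """Consume one event line; return a JSON string if a session completed, else None."""
        parts = line.split()
        if len(parts) != 3:
            return None  # assumption: skip malformed lines
        try:
            timestamp = int(parts[0])
        except ValueError:
            return None  # assumption: skip lines with a non-numeric timestamp
        username, event = parts[1], parts[2]

        if event == "LogInSuccessful":
            self.logins[username] = timestamp
        elif event == "LogOutSuccessful":
            self.logouts[username] = timestamp
        else:
            return None  # e.g. LogInFailed does not affect sessions

        login = self.logins.get(username)
        logout = self.logouts.get(username)
        # Wait until both events are known and the logout is not older than the login.
        if login is None or logout is None or logout < login:
            return None

        # The session is complete: forget both events and report the duration.
        del self.logins[username]
        del self.logouts[username]
        return json.dumps({"username": username, "session_duration": logout - login})


if __name__ == "__main__":
    tracker = SessionTracker()
    sample = [
        "1532926994 User01 LogOutSuccessful",
        "1532926981 User02 LogInSuccessful",
        "1532926982 User04 LogInFailed",
        "1532926992 User01 LogInSuccessful",
        "1532926986 User02 LogOutSuccessful",
        "1532927003 User03 LogOutSuccessful",
    ]
    for event_line in sample:
        result = tracker.process(event_line)
        if result is not None:
            print(result)
```

On the sample events from the question, this prints a duration of 2 for User01 and 5 for User02. If events were guaranteed to arrive in order, the `logouts` dictionary could probably be dropped and the duration computed directly when the logout line arrives.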
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
