Category "pandas"

Parse Year Week columns to Date

I have a data frame with columns Year and Week that I am trying to parse to date into a new column called Date. import datetime df['Date']=datetime.datetime.fro

How can I apply multiple conditions in Pandas, with Python?

How can I apply multiple conditions in pandas? For example, I have this dataframe: Country VAT RO RO1449488 RO RO1449489 RO RO1449486

How to transform TfidfVectorizer() outputs in dataframes

I found this answer about the model and specific outputs (How to get top n terms with highest tf-idf score - Big sparse matrix). It was great. I would like to k

Is there a method to split multiline cells into separate rows?

I have a data frame (as shown in the image link). Some cells contain multiple lines and an unequal number of values. How can I split an

How to import a .sql file into DuckDB database?

I'm exploring DuckDB for one of my projects. Here I have a sample database file downloaded from https://www.wiley.com/en-us/SQL+for+Data+Scientists%3A+A+Beginner

Pandas - Key Exception - Length mismatch: sometimes "Expected axis has 3 elements", sometimes 2 elements

I have built a script to update stock values from yahoo finance with pandas. Sometimes the script works fine, but at some point it gets an error: Key Exception

Polynomial Expansion without sklearn

I want to try and recreate this function from scratch (without using sklearn): # The matrix is M which is 1000x10 matrix. from sklearn.preprocessing import Po

Problem with websocket output into dataframe with pandas

I have a websocket connection to Binance in my script. The websocket runs forever as usual. I get each pair's output as separate outputs for my multiple stream

Change labels on facet plots

I currently have a plot with 6 facets labeled 1 to 6 in pandas. I wish to change these labels to road types if possible (Motorway, A Road, B Road, etc.). The code I

I am trying to merge two dataframes

I have this dataframe: firm formtype Date_Filed GameStop Corp. 8-K 2021-04-01. I want to change the Date_Filed to 2021-04-01 00:00:00. I am using

pandas.query in a chain not giving expected results

I have a pandas dataframe that looks like this: df = pd.DataFrame( { "ID": [1, 2, 3, 4], "Name": ["Alpha", "Beta", "Gamma", "Delta"],

Calculations on a pandas DataFrame column conditional on another column

I notice several 'set value of new column based on value of another'-type questions, but from what I gather, I have not found that they address dividing values

Type error for a column that exists within the dataframe I am trying to call

Essentially, I am getting a key error in my Jupyter notebook when trying to merge two data frames. As I understand it, a key error will only occur if said colum

DataFrame challenge: mapping ID to value in different row. Preferably with Polars

Consider this example: import polars as pl df = pl.DataFrame({ 'ID': ['0', '1', '2', '3', '4', '5','6', '7', '8', '9', '10'], 'Name' : ['A','','','','B

Select Value from largest index for each year [duplicate]

New to the Python world and I'm working through a problem where I need to pull a value for the largest index value for each year. Will provide

How to evenly spread out date data (pandas)

I'm working on a project and I'm struggling with some formats of dataframes. I have two dataframes, each containing a different number of months. I want all the

Flatten XML data as a pandas dataframe

How can I convert this XML file at this address into a pandas dataframe? I have downloaded the XML as a file and called it '058com.xml' and run the code below,

How to create a dummy only if a column has non-zero values for certain dates but zero for other dates

Let's say I want to identify traders who only traded during bull runs but did not trade (zero values) during downturns or stable periods. Let's say we have two

Retrieve name of column from its Index in Pandas

I have a pandas dataframe and a numpy array of values of that dataframe. I have the index of a specific column and I already have the row index of an important

Python, Pandas and intersection - not PIVOT

This isn't a straightforward pivot question. I don't want to create new named columns (or numbered ones). What I am looking for is to find a way to search for