Category "pandas"

Quick way to visualise multiple columns in Altair with regression lines

So the way I have been visualising multiple columns quickly in Altair is to use repeat. This method is ok until I want to add regression lines using transform_r

Pandas subplot date ticks appear unevenly spaced with irregular time series

I created this example after seeing the issue multiple times. This helped me realize that the problem comes when plotting the time series of a data frame with i

Add a column to pandas dataframe containing the proportions for a particular column, based on grouping column

I have some data for which I want to do the following: group by a set of columns G; for each grouping, find the proportion of a particular column within the group
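A minimal sketch of one reading of this question, assuming "proportion" means each row's value divided by its group's total. The column names `group` and `value` and the sample data are hypothetical, not taken from the question.

```python
import pandas as pd

# Hypothetical data: "group" stands in for the grouping columns G,
# "value" for the column whose within-group proportion is wanted.
df = pd.DataFrame(
    {"group": ["a", "a", "b", "b", "b"], "value": [1, 3, 2, 2, 4]}
)

# transform("sum") returns each group's total aligned to the original rows,
# so the division yields a per-row share of the group total.
group_total = df.groupby("group")["value"].transform("sum")
df["proportion"] = df["value"] / group_total

print(df)
```

Using `transform` rather than `agg` keeps the result the same length as the original frame, so it can be assigned directly as a new column.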

MACD stock indicator function using ewm() from pandas library

Here is the test code for my macd function, however, the values I am getting are incorrect. I don't know if it is because my span is in days and my data is in 2

How to export large pandas Data Frame to excel format?

I have converted binary files to NumPy array and then pandas data frame. The final shape is 217 rows × 524289 columns. When I tried to save it as .xlsx fo

DataFrame append to DataFrame row by row and reset if condition is matched

I have a DataFrame which I want to slice into many DataFrames by adding rows by one until the sum of column Score of the DataFrame is greater than 50,000. Once

pandas, access a series of lists as a set and take the set difference of 2 set series

Given 2 pandas series, both consisting of lists (i.e. each row in the series is a list), I want to take the set difference of the 2 columns. For example, in the data
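A rough sketch of a row-wise set difference, assuming both series share the same length and index. The series `s1` and `s2` and their contents are made up for illustration.

```python
import pandas as pd

# Hypothetical series in which every element is a list.
s1 = pd.Series([[1, 2, 3], [4, 5]])
s2 = pd.Series([[2, 3], [5, 6]])

# Convert each pair of lists to sets and subtract row by row.
# A plain comprehension is used because these are object-dtype columns,
# for which pandas has no vectorised set operations.
diff = pd.Series(
    [set(a) - set(b) for a, b in zip(s1, s2)],
    index=s1.index,
)

print(diff)
```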

Groupby by a column and select specific value from other column in pandas dataframe

Input dataframe: +-------------------------------+ |ID Owns_car owns_bike| +-------------------------------+ | 1 1 0 | | 5

reshaping the dataset in python

I have this dataset: Account lookup FY11USD FY12USD FY11local FY12local Sales CA 1000 5000 800 4800 Sales JP 5000 6500 10 15 Trying to arrive to get the data

Can we append a dataframe to a Snowflake table that already has data, when some columns are the same and some are different?

I have a dataframe that contains some columns, and a Snowflake table that has some columns. Some columns are the same and some are different between them. As

How to store the output produced inside a function when using concurrent.futures.ProcessPoolExecutor

I am currently trying to store the output obtained in a function during multiprocessing by using concurrent.futures.ProcessPoolExecutor from concurrent.futures

How to convert JSON to a pandas dataframe when the value is entirely in string format

I am trying to convert the data from a JSON to a dataframe. My JSON {"data":"key=IAfpK, age=58, key=WNVdi, age=64, key=jp9zt, age=47, key=0Sr4C, age=68, key=CGEqo,

How to line plot timeseries data on a bar plot

I have the following data frame: data = {'date': ['3/24/2020', '3/25/2020', '3/26/2020', '3/27/2020'], 'Total1': [133731.9147, 141071.6383, -64629.74024

group by and count unique entries with more than one entry in time period

I have a pandas df as follows: Date UserID 2022-01-01 ABC 2022-01-02 ABC 2022-01-03 ABC 2022-01-01 DEF 2022-01-05 DEF

Converting tensorflow dataset to pandas dataframe

I am very new to deep learning and computer vision. I want to do a face recognition project. For that I downloaded some images from the Internet and converte

Timeseries dataframe returns an error when using Pandas align - ValueError: cannot join with no overlapping index names

My goal: I have two time-series data frames, one with a time interval of 1m and the other with a time interval of 5m. The 5m data frame is a resampled version o

How can I extract the day of the week from a timestamp in pandas

I have a timestamp column in a dataframe as below, and I want to create another column called day of week from that. How can I do it? Input: Pickup date/time
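One possible approach, sketched with hypothetical data: parse the column with `pd.to_datetime` and read the weekday through the `.dt` accessor. The column name `pickup` and the timestamps are assumptions, since the question's input is cut off.

```python
import pandas as pd

# Hypothetical timestamps; the real column is truncated in the excerpt.
df = pd.DataFrame({"pickup": ["2020-03-24 08:15:00", "2020-03-28 17:40:00"]})

# Parse the strings into datetime64 values so the .dt accessor is available.
df["pickup"] = pd.to_datetime(df["pickup"])

# day_name() gives the weekday as text; dayofweek gives Monday=0 ... Sunday=6.
df["day_of_week"] = df["pickup"].dt.day_name()
df["day_number"] = df["pickup"].dt.dayofweek

print(df)
```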

Probability of selling the same items again in a pandas dataframe

I need to know the probability of selling similar items together, based on a sales history formatted like this: pd.DataFrame({"sale_id": [1, 1, 1, 2, 2, 3, 3, 3

thresh in dropna for DataFrame in pandas in python

df1 = pd.DataFrame(np.arange(15).reshape(5,3)) df1.iloc[:4,1] = np.nan df1.iloc[:2,2] = np.nan df1.dropna(thresh=1 ,axis=1) It seems that no nan value has bee
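A small sketch of how `thresh` behaves with the frame from the excerpt: `thresh` is the minimum number of non-NA values a column (with `axis=1`) must have to be kept, not the number of NaNs that triggers a drop. The only change from the excerpt is that the array is created as float, an assumption made here so that assigning NaN does not need a dtype upcast.

```python
import numpy as np
import pandas as pd

# Same shape and NaN pattern as in the excerpt, but built as float.
df1 = pd.DataFrame(np.arange(15, dtype=float).reshape(5, 3))
df1.iloc[:4, 1] = np.nan
df1.iloc[:2, 2] = np.nan

# Count of non-NA values per column: this is what thresh is compared against.
non_na = df1.notna().sum()

# thresh=1 keeps every column with at least one non-NA value,
# so nothing is dropped here.
kept_thresh_1 = df1.dropna(thresh=1, axis=1)

# thresh=2 requires at least two non-NA values, which drops column 1.
kept_thresh_2 = df1.dropna(thresh=2, axis=1)

print(non_na)
print(kept_thresh_1.columns.tolist())
print(kept_thresh_2.columns.tolist())
```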

What causes these Int64 columns to cause a TypeError?

I have a pandas DataFrame with several flag/dummy variables of type Int64. I am aggregating on other fields and taking the mean value in order to calculate a pe