Category "pandas"

Pandas: transform column names to row values

I'm trying to achieve the transformation below on a pandas DataFrame. The Date columns are essentially being expanded to multiple rows and we get an entry per m

Pandas: Rolling window to count the frequency - Fastest approach

I would like to count the frequency of a value for the past x days. In the example below, I would like to count the frequency of value in the Name column for th

Dropping invalid columns FutureWarning

# Select days that are sunny: sunny
sunny = df_clean.loc[df_clean['sky_condition']=='CLR']
# Select days that are overcast: overcast
overcast = df_clean.loc[df

returning results from python script to variable in Jupyter notebook

I have a python script that returns a pandas dataframe and I want to run the script in a Jupyter notebook and then save the results to a variable. The data are

Retrieve click data from Python Holoviews / Datashader

I'm coming from Python-Dash trying to achieve an interactive graphing functionality by creating a second graph using the click data of the first one. Similar to

Drawing zone over plt.imshow

I'm plotting some .tiff images using GDAL and matplotlib. Currently images look like the one in the example and I would like to mark a zone over the image. I hav

Input contains NaN, infinity or a value too large for dtype('float64') but I've manually changed NaN values in my database to equal 0

I've been having trouble with my regression formula. My dataset hasn't got any NaN values as I went through my database and replaced any blank cells with the va
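
One way to check whether non-finite values are still present after a manual cleanup is to count them per column. The snippet below is only a diagnostic sketch: the frame `X` and its columns `a` and `b` are made-up stand-ins, not the asker's data.

```python
import numpy as np
import pandas as pd

# Hypothetical feature frame standing in for the regression inputs.
X = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [0.5, 2.0, np.inf]})

# Restrict to numeric columns, then flag every NaN, +inf, or -inf cell.
numeric = X.select_dtypes(include="number")
bad_mask = ~np.isfinite(numeric)

# Number of offending cells per column.
bad_counts = bad_mask.sum()
print(bad_counts)
```

Any column with a non-zero count still holds a value that would trigger the error, including infinities that a blank-cell replacement would not catch.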

Python pandas dataframe populate hierarchical levels from parent child

I have the following dataframe which contains Parent child relation: data = pd.DataFrame({'Parent':['a','a','b','c','c','f','q','z','k'],

Quick way to visualise multiple columns in Altair with regression lines

So the way I have been visualising multiple columns quickly in Altair is to use repeat. This method is ok until I want to add regression lines using transform_r

Pandas subplot date ticks appear unevenly spaced with irregular time series

I created this example after seeing the issue multiple times. This helped me realize that the problem comes when plotting the time series of a data frame with i

Add a column to pandas dataframe containing the proportions for a particular column, based on grouping column

I have some data for which I want to do the following: group by a set of columns G; for each grouping, find the proportion of a particular column within the group
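
A minimal sketch of the within-group proportion described above, assuming a single grouping column named `group` and a numeric column named `value` (both names are hypothetical):

```python
import pandas as pd

# Hypothetical data: one grouping column and one numeric column.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b", "b"],
    "value": [1, 3, 2, 2, 4],
})

# transform("sum") broadcasts each group's total back onto its rows,
# so the division gives every row's share of its own group.
group_total = df.groupby("group")["value"].transform("sum")
df["proportion"] = df["value"] / group_total
print(df)
```

For several grouping columns, a list such as `["g1", "g2"]` can be passed to `groupby` in place of the single name.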

MACD stock indicator function using ewm() from pandas library

Here is the test code for my macd function, however, the values I am getting are incorrect. I don't know if it is because my span is in days and my data is in 2

How to export large pandas Data Frame to excel format?

I have converted binary files to NumPy array and then pandas data frame. The final shape is 217 rows × 524289 columns. When I tried to save it as .xlsx fo

DataFrame append to DataFrame row by row and reset if condition is matched

I have a DataFrame which I want to slice into many DataFrames by adding rows by one until the sum of column Score of the DataFrame is greater than 50,000. Once
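
The slicing described above could be sketched as a single pass that keeps a running total and starts a new chunk once the threshold is exceeded. The excerpt is cut off, so two choices here are assumptions: the row that pushes the total past 50,000 stays in the current chunk, and leftover rows form a final, smaller chunk. The helper name `split_by_running_total` is hypothetical.

```python
import pandas as pd

def split_by_running_total(df, column="Score", threshold=50_000):
    """Split df into consecutive chunks, closing a chunk once its sum exceeds threshold."""
    chunks, start, running = [], 0, 0
    for pos, value in enumerate(df[column].to_numpy()):
        running += value
        if running > threshold:
            # Assumption: the row that crosses the threshold closes the chunk.
            chunks.append(df.iloc[start:pos + 1])
            start, running = pos + 1, 0
    if start < len(df):
        # Assumption: remaining rows are kept as a last, smaller chunk.
        chunks.append(df.iloc[start:])
    return chunks

# Made-up scores for illustration.
df = pd.DataFrame({"Score": [30_000, 25_000, 10_000, 45_000, 5_000]})
parts = split_by_running_total(df)
print([len(p) for p in parts])
```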

pandas, access a series of lists as a set and take the set difference of 2 set series

Given 2 pandas series, both consisting of lists (i.e. each row in the series is a list), I want to take the set difference of 2 columns. For example, in the data
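
A row-wise set difference between two series of lists can be sketched with a plain comprehension. The series `left` and `right` below are invented examples and are assumed to be the same length and aligned by position.

```python
import pandas as pd

# Hypothetical series where every element is a list.
left = pd.Series([[1, 2, 3], [4, 5], [6]])
right = pd.Series([[2], [4, 5], [7]])

# Convert each pair of lists to sets and subtract, row by row.
diff = pd.Series(
    [set(a) - set(b) for a, b in zip(left, right)],
    index=left.index,
)
print(diff)
```

Because the elements are Python objects, pandas has no vectorized shortcut for this; the comprehension is usually at least as fast as `apply`.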

Groupby by a column and select specific value from other column in pandas dataframe

Input dataframe: +-------------------------------+ |ID Owns_car owns_bike| +-------------------------------+ | 1 1 0 | | 5

reshaping the dataset in python

I have this dataset:

Account  lookup  FY11USD  FY12USD  FY11local  FY12local
Sales    CA      1000     5000     800        4800
Sales    JP      5000     6500     10         15

Trying to arrive to get the data

Can we append a dataframe to snowflake table having some data, when some columns are same and some columns are different?

I have a dataframe which contains some columns and snowflake table is having some columns. Some columns are same and some columns are different between them. As

How to store the variables output inside a function during concurrent.futures.ProcessPoolExecutor from concurrent.futures

I am currently trying to store the output obtained in a function during multiprocessing by using concurrent.futures.ProcessPoolExecutor from concurrent.futures