Category "pandas"

Drop rows of dataframe if the rows have continuously the same value

I am dealing with metered time series data that should not have the exact same value for more than n steps. I want to build a script that, given a threshold n,
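A minimal sketch of one way to approach this, assuming the goal is to drop every row that belongs to a run of identical consecutive values longer than n (the excerpt is cut off, so that reading is an assumption). The function name `drop_long_runs`, the column name `value`, and the sample data are hypothetical.

```python
import pandas as pd


def drop_long_runs(df, column, n):
    """Drop rows that sit inside a run of more than n identical consecutive values."""
    values = df[column]
    # A new run starts wherever the value differs from the previous row.
    run_id = values.ne(values.shift()).cumsum()
    # Length of the run each row belongs to.
    run_length = values.groupby(run_id).transform("size")
    return df[run_length <= n]


# Hypothetical sample data: the value 2 repeats four times in a row.
df = pd.DataFrame({"value": [1, 2, 2, 2, 2, 3, 3, 4]})
result = drop_long_runs(df, "value", n=3)
```

Note that NaN never compares equal to itself, so each missing value is treated here as its own run of length one.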

Python sum of values in dataset

I have this dataframe (ID is a string and Value a float): ID Value 1 0.0 1.1 0.0 1.2 0.0 1.2.1 2750

Unmanaged memory jamming cluster during dask's merge_asof method

I am trying to merge large dataframes using dask.dataframe.multi.merge_asof, but I am running into issues with accumulating unmanaged memory on the cluster. I h

Extract a value inside a json column in pandas

I have a json column in a pandas dataframe and I need to create a new column based on a value in the json column. case# json_col 123 [{'priorit

Dynamic bin per row in a dataset in Pandas

I am having trouble dynamically binning my dataset for further calculation. My goal is to have specific bin/labels for each individual row in my dataframe, base

Create a list from given data to use in read_fwf

In read_fwf, the colspecs parameter is given as a list, as in this example: data2 = pd.read_fwf("sample.txt",index_col='Order number',names=['Order number', 'code',

How to turn the item in the column to multiple columns?

I am trying to do this. So, currently my df looks like this. col_names = ['movie_id', 'movie_title', 'genres'] df = pd.read_csv('/content/drive/MyDrive/testing/m

Create a function that will accept a DataFrame as input and return pie-charts for all the appropriate Categorical features

I can create one pie chart using the 'Churn' column to group the data. However, I am not sure how to create a function that will accept a DataFrame as input and return

Pandas read_csv specify AWS Profile

Pandas (v1.0.5) uses the s3fs library to connect to AWS S3 and read data. By default, s3fs uses the credentials found in the default profile of the ~/.aws/credentials file

python -- pandas dataframe to nested json with hierarchy level

To generate checkboxes, I need to convert a pandas DataFrame to JSON format. First, I have a pandas DataFrame: cast title type Daniel Craig Sky Fa

Categorize and order bar chart by Hue

I have a problem. I want to show the two highest countries of each category. But unfortunately I only get the below output. However, I would like the part to be

Why do I get a 'FutureWarning' with pandas.concat?

Has anyone else encountered this FutureWarning? I got it when using Tiingo with pandas_datareader. The warning looks like this: python3.8/site-packages/pandas_datareade

Problem reading a products CSV file with pandas in Python

I have a products CSV file and I am trying to read it with pandas in Python, but I get this error. My code: import pandas as pd df = pd.read_csv('D:\\work\\am

How to match the unique ids that I created in df1 to df2 based on two column values?

I have two dataframes, and I am struggling to match the unique ids that I created in df1 to df2 based on 'name' and 'version' values. I need to add a column to

How to convert DataFrame.append() to pandas.concat()?

In pandas 1.4.0: append() was deprecated, and the docs say to use concat() instead. FutureWarning: The frame.append method is deprecated and will be removed fr
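As a rough illustration of the replacement the deprecation notice points to, the sketch below appends one row with `pd.concat` instead of `DataFrame.append`. The frame and the row are made-up data, not taken from the question.

```python
import pandas as pd

# Hypothetical starting frame and new row.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
new_row = {"a": 5, "b": 6}

# Deprecated form: df = df.append(new_row, ignore_index=True)
# Equivalent with concat: wrap the row in a one-row DataFrame first.
df = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)
```

When many rows are added in a loop, it is usually cheaper to collect them in a list and call `pd.concat` once at the end.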

Pivot dataframe with duplicate index by aggregating per group

I am facing the following dataframe. Date Security Field Value 0 2022-05-03 08:00:12.394000 CFI2Z2 VALUE 83.3 1 2022-05-03 08:00:12.394000 CFI2Z2 VOLUME 1 2 2

How to merge every two columns, with pandas, substituting only if the left column value is nan or 0 [duplicate]

I have 2n columns and each pair looks like this: 1 0 2 0 45 1 44 10 43 22 0 55 0 46 0 75 I want to turn each pair of columns int

AttributeError: 'TimedeltaProperties' object has no attribute 'minute'

I have a dataframe that looks like this df [output]: date time 2020-02-28 00:30:45 2020-02-28 00:30:45 2020-03-09 00:21:06 2020-03-09 00:21:06 2020-
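The error in the title is consistent with how the `.dt` accessor behaves on a timedelta column: it exposes `days`, `seconds`, `microseconds`, `nanoseconds`, and `components`, but has no `minute` attribute. A minimal sketch of two possible alternatives, assuming the `time` column holds timedeltas (the data below is hypothetical):

```python
import pandas as pd

# Hypothetical timedelta column resembling the excerpt.
df = pd.DataFrame({"time": pd.to_timedelta(["00:30:45", "00:21:06"])})

# Minutes part of each duration (0-59), taken from the components table.
df["minute"] = df["time"].dt.components["minutes"]

# Whole duration expressed in minutes, including the fractional part.
df["total_minutes"] = df["time"].dt.total_seconds() / 60
```

Which of the two is wanted depends on whether the minutes field or the full elapsed time is needed.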

efficient way of computing a list with mean of values in another list

I need to compute a list with the mean values of another list. To be more precise, the input list has this form: input_list = ['1.538075/42.507325', '1.53796

Convert a Pandas DataFrame into a single row DataFrame

I've seen similar questions but mine is more direct and abstract. I have a dataframe with "n" rows, where "n" is a small number. We can assume the index is just th