Category "dataframe"

Most efficient way to search over a DataFrame in Python [duplicate]

I have a DataFrame with this kind of data: df = pd.DataFrame({ 'id' : ['a', 'a', 'b', 'b', 'c', 'c'], 'alias' : ['value'+str(i) fo

Create dataframe based on matching

I want to create a df in R with two variables, they have different number of rows. This is an abstract example: I want to match a 3 to "Fail" (without writing i

How to conditionally assign values from another dataframe?

I want to merge 2 dataframes without using the function '.merge' and I try to assign a value to a dataframe column based on an interval and an id. intervals = p

PerformanceWarning: DataFrame is highly fragmented. How to convert to a more efficient approach using pd.concat with designated column names

I got the following warning while running under Python 3.8 with the newest pandas. PerformanceWarning: DataFrame is highly fragmented. This is the place where I c
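A minimal sketch of the usual way to avoid this warning, assuming the fragmentation comes from adding many columns one at a time (the excerpt is cut off before the code, so this is a guess at the pattern). The column names `base` and `col_<i>` are hypothetical. The idea is to collect the new columns in a dict and attach them with a single `pd.concat` call:

```python
import pandas as pd

# Hypothetical starting frame; the original data is not shown in the excerpt.
df = pd.DataFrame({"base": range(5)})

# Build all new columns first, keyed by the column name we want.
new_cols = {f"col_{i}": df["base"] * i for i in range(100)}

# Attach them in one operation instead of 100 separate df[name] = ... inserts.
df = pd.concat([df, pd.DataFrame(new_cols)], axis=1)
```

The dict keys become the column names, which covers the "designated column name" part of the title.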

Drop rows of dataframe if the rows have continuously the same value

I am dealing with metered time series data, that should not have the exact same value for more than n steps. I want to build a script that, given a threshold n,
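One possible reading of this, sketched under assumptions: every row that belongs to a run of identical consecutive values longer than n is dropped (the excerpt is truncated, so the asker may instead want to keep the first n rows of each run). The column name `value` and the sample data are made up for illustration.

```python
import pandas as pd

# Hypothetical metered series; the real data is not shown in the excerpt.
df = pd.DataFrame({"value": [1, 1, 1, 1, 2, 3, 3, 4]})
n = 2  # threshold: runs longer than this are considered invalid

# A new run starts wherever the value differs from the previous row.
run_id = (df["value"] != df["value"].shift()).cumsum()

# Length of the run each row belongs to.
run_len = run_id.groupby(run_id).transform("size")

# Keep only rows whose run is at most n long.
filtered = df[run_len <= n]
```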

Why do factors get coerced to a number subsetting a data frame?

I was trying to get the diagonal of the iris data set and wrote the following for loop: diagonal_list <- list() for (j in seq_len(ncol(iris))) { diagon

Unmanaged memory jamming cluster during dask's merge_asof method

I am trying to merge large dataframes using dask.dataframe.multi.merge_asof, but I am running into issues with accumulating unmanaged memory on the cluster. I h

How do I plot my datetime on the x axis when this value is used as index?

I have a short question. This is my dataframe: gradient result date 2022-04-15 09:43:20 0.206947 0.10

Create a list from given data to use in read_fwf

In read_fwf, the colspecs parameter is assigned as a list, as in this example: data2 = pd.read_fwf("sample.txt",index_col='Order number',names=['Order number', 'code',

Problem reading a products CSV file with pandas in Python

I have a products CSV file and I am trying to read it with pandas in Python, but I get this error. My code: import pandas as pd df = pd.read_csv('D:\\work\\am

How to match the unique ids that I created in df1 to df2 based on two column values?

I have two dataframes, and I am struggling to match the unique ids that I created in df1 to df2 based on 'name' and 'version' values. I need to add a column to

How to convert DataFrame.append() to pandas.concat()?

In pandas 1.4.0: append() was deprecated, and the docs say to use concat() instead. FutureWarning: The frame.append method is deprecated and will be removed fr
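A minimal sketch of the conversion for the common single-row case. The frame and the `new_row` dict are hypothetical, since the asker's code is cut off. `DataFrame.append` was removed in pandas 2.0, so the `concat` form is the one that keeps working:

```python
import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})
new_row = {"a": 3, "b": "z"}

# Old form (deprecated in 1.4.0, removed in 2.0):
#   df = df.append(new_row, ignore_index=True)

# Equivalent with concat: wrap the row in a one-row DataFrame first.
df = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)
```

When appending inside a loop, it is usually cheaper to collect the rows in a list and call `pd.concat` once at the end.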

Generate dict from dataframe with grouping columns

I am trying to generate a JSON file or dict from my dataframe (grouping the columns). My dataframe is df1 = pd.DataFrame({ 'USER': ['ALL','ALL','BOB','STEVE',

Add new logic in Python

Want to add logic that calculates and outputs truckloads able to be built each day. Still want this broken out by ship-to party (so 1 ship-to party per shipment

How to get PRAW (the Python Reddit API Wrapper) to read submission ID?

Goal: I have collected hundreds of reddit posts' details in Excel sheets. Now, I want to collect comments on these Reddit posts using PRAW. Method: At first, I

Datetime format recognition in exported Excel with xlsxwriter

I didn't find a solution for this: From a dataframe I generate an Excel file and some columns need to be in format hh:mm:ss (with no limit to 24h, for example a valu

Which Java class is compatible with Python pandas DataFrame when using DJL (Deep Java Library)?

I'm trying to import a Python TensorFlow custom model to spring-boot using DJL TensorFlow, and the model gets a pandas DataFrame as both input and output. I'm wonde

How to convert symmetric matrix to adjacency table

How could I convert symmetric matrix: A B C D A 1 2 3 4 B 2 1 2 3 C 3 2 1 2 D 4 3 2 1 into adjacency matrix?: A A 1 A B 2 A C 3 A D 4 B A 3 B B 1 B C 2 B D
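A hedged sketch of one way to get the long (row label, column label, value) form in pandas, using the 4x4 matrix from the question. The output column names `source`, `target` and `weight` are assumptions made for illustration:

```python
import pandas as pd

labels = ["A", "B", "C", "D"]
m = pd.DataFrame(
    [[1, 2, 3, 4],
     [2, 1, 2, 3],
     [3, 2, 1, 2],
     [4, 3, 2, 1]],
    index=labels,
    columns=labels,
)

# stack() turns each (row, column) cell into one entry of a Series
# with a two-level index; reset_index() flattens it into columns.
long = m.stack().reset_index()
long.columns = ["source", "target", "weight"]
```

This keeps all 16 pairs, including the diagonal and both directions of each symmetric pair.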

Linear regression of two dataframes

I have two dataframes: df = pd.DataFrame([{'A': -4, 'B': -3, 'C': -2, 'D': -1, 'E': 2, 'F': 4, 'G': 8, 'H': 6, 'I': -2}]) df2 looks like this (just a cutout; i