Category "dataframe"

Merge two dfs with multiple entries of same value in joining column

I have two data frames. The first is input which looks like the following: Merchant SKU Quantity Per Box NOB Shipment Status id_using_regex prepped_by_in

How to convert the dummy variable columns into several columns?

I know how to unstack rows into columns, but how to deal with the following dataframe? date dummy avg lable 1-19 1 20 l1 1-19 0 40 l1 1-27 1 100 l2 1-27 0 140
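
One possible approach, sketched under assumptions: the excerpt is cut off, so the sample frame below is illustrative (the last `lable` value is assumed to be `l2`), and the output column names `avg_dummy_0` and `avg_dummy_1` are hypothetical. The sketch pivots on `dummy` with pandas `pivot_table`, so each dummy value gets its own `avg` column.

```python
import pandas as pd

# Illustrative data modelled on the excerpt; the final label is an assumption.
df = pd.DataFrame({
    "date": ["1-19", "1-19", "1-27", "1-27"],
    "dummy": [1, 0, 1, 0],
    "avg": [20, 40, 100, 140],
    "lable": ["l1", "l1", "l2", "l2"],
})

# One row per (date, lable); one column per value of `dummy`.
wide = (
    df.pivot_table(index=["date", "lable"], columns="dummy",
                   values="avg", aggfunc="first")
      .add_prefix("avg_dummy_")   # hypothetical naming scheme
      .reset_index()
)
wide.columns.name = None  # drop the leftover "dummy" axis label

print(wide)
```

`aggfunc="first"` is used because each (date, lable, dummy) combination appears once here; with duplicates, an aggregation such as `"mean"` may be more appropriate.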

changing dtype in polars

I created a data frame using Polars. When data is inserted, the dtype of the column automatically changes to match what was inserted (I think it's a feature of Polars?), but

How to import data with dates as index from excel with pandas

I am importing the data with this command df = pd.read_excel('C:/Users/Me/Data.xlsx', sheet_name='Prices') and this is the result: The date is a common column

Change df columns from lists to vectors

I've been using R for a while, but lists perplex me. For some reason in some cases my function outputs a data frame of lists: str() returns something like:

How to locate print output, or convert it into a JPEG?

I'm trying to show more than one dataframe using tkinter. There are 2 options for me: showing the dataframe directly by using print() and saving the dataframe as j

Calculate MAPE and apply to PySpark grouped Dataframe [@pandas_udf]

Goal: Calculate mean_absolute_percentage_error (MAPE) for each unique ID. y - real value yhat - predicted value Sample PySpark Dataframe: join_df +----------+--

ValueError: X has 19 features, but LinearRegression is expecting 20 features as input

I'm trying to do polynomial regression using this code here: x_train,x_test,y_train,y_test = train_test_split(self.X, self.y, test_size=split, random_state=rand

Batch conversion of xlsx files to txt in Python

I am trying to convert files with the extension xlsx to txt files. All items have the same name and are marked with a number. The problem is that there are no n

How to deal with SettingWithCopyWarning in Pandas

Background: I just upgraded pandas from 0.11 to 0.13.0rc1. Now the application is emitting many new warnings. One of them looks like this: E:\FinReporter\FM_EXT
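
A minimal sketch of the pattern that commonly triggers this warning, and two usual ways around it. The frame and the column names `quote` and `status` are hypothetical, since the asker's code is cut off in the excerpt.

```python
import pandas as pd

# Hypothetical data; not taken from the question.
df = pd.DataFrame({"quote": [10.0, -1.0, 25.0], "status": ["ok", "ok", "ok"]})
mask = df["quote"] < 0

# Chained indexing such as df[mask]["status"] = "bad" may assign to a
# temporary copy. A single .loc call selects rows and column in one step.
df.loc[mask, "status"] = "bad"

# When a filtered subset is meant to be modified independently,
# take an explicit copy first.
subset = df[df["quote"] > 0].copy()
subset["quote"] = subset["quote"] * 2

print(df)
print(subset)
```

The explicit `.copy()` makes the intent clear: changes to `subset` are not expected to flow back into `df`.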

data column not recognized in the ggplot geom_hline

I was wondering why variable mean_y is not recognized by my geom_hline(yintercept = unique(mean_y)) call? library(tidyverse) set.seed(20) n_groups <- 2 n_in

how to fill a row in a subcolumn inside a multi column dataframe?

I have a multi-column dataframe called full_week whose first column is the employees' names and whose other columns are columns with each weekday name starting f

combine two rows with negligible threshold on a groupby dataframe

I have a raw dataframe(simplified) as below: ColumnA startime endtime A 2022-02-23 08:22:32.113000+00:00 2022-02-23 10:54:04.163000+00:00 A 2022-02-23 10:54:04

Convert text file into dataframe with custom multiple delimiter in python

I am new to Python. I have a txt file. It contains data like 0: 480x640 2 persons, 1 cat, 1 clock, 1: 480x640 2 persons, 1 chair, Done. date (0.635s) Tue

Apply a function to multiple rows in pandas

Suppose I have a dataframe like this 0 5 10 15 20 25 ... action_0_Q0 0.299098 0.093973 0.761735 0.0

How to get this single-column data into a data frame with appropriate columns

I am a beginner learning pandas and data science. I have data as follows: Rahul 1 2 5 Suresh 4 2 1 Dharm 1 3 4. I would like it in my dataframe as Rah
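
A rough sketch, assuming the data arrives as one flat column in which every record is a name followed by exactly three numbers. The desired output is cut off in the excerpt, so the column names `name`, `a`, `b`, `c` are hypothetical.

```python
import pandas as pd

# The flat single-column data from the excerpt.
raw = pd.Series(["Rahul", 1, 2, 5, "Suresh", 4, 2, 1, "Dharm", 1, 3, 4])

values_per_record = 4  # assumption: one name plus three numbers per record

# Reshape the flat values into rows of four, then label the columns.
records = raw.to_numpy().reshape(-1, values_per_record)
df = pd.DataFrame(records, columns=["name", "a", "b", "c"])

# The reshaped values are generic objects; convert the numeric columns.
df = df.astype({"a": int, "b": int, "c": int})

print(df)
```

The reshape only works if the total number of values is a multiple of `values_per_record`; ragged records would need to be parsed differently.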

extract emotions from text in dataframe in senticnet

I am a novice in Python and I am trying to extract emotions from sentences in a dataframe through SenticNet. This is my code, but it's not correct. I don't know what's the

Build an Authenticated GET API in R

I can't figure out how to set up an API correctly. I have an example in Python and would like to understand how to reproduce it with R, how to correctly choose

Create new columns using key-value pairs from a dataframe column

I have a data frame with many columns. One of the columns is named 'attributes' and it contains a list of dictionaries with keys and values. I want to extract each ke

String-join pandas dataframe columns and skip nan values

I'm trying to join column values into a new column, but I want to skip NaN values: df['col'] = df['col1'].map(str) + ',' + df['col2'].map(str) + ',' + df['col3'].
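
One way this could be done, sketched with made-up sample values: join each row's non-null entries with a row-wise `apply`. The column names `col1`, `col2`, `col3` and `col` come from the excerpt; the helper `join_non_null` is a hypothetical name.

```python
import numpy as np
import pandas as pd

# Sample values are illustrative only.
df = pd.DataFrame({
    "col1": ["a", np.nan, "c"],
    "col2": ["x", "y", np.nan],
    "col3": [np.nan, "z", "w"],
})

def join_non_null(row):
    """Join the non-missing values of one row with commas."""
    return ",".join(str(v) for v in row if pd.notna(v))

df["col"] = df[["col1", "col2", "col3"]].apply(join_non_null, axis=1)

print(df["col"].tolist())
```

A row-wise `apply` is simple to read but not the fastest option on large frames.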