Category "dataframe"

Add 'total' row for each group in a column in df

I have a dataframe where the column size can be grouped. When the dataframe is arranged by size, I would like to show the totals of each column for each group a

Pandas - DataError: No numeric types to aggregate

I have a DataFrame with 5 columns, where the column I need to aggregate is of string type and has NaN values. I tried replacing the NaN values with 0 and then con
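
A minimal sketch of one common way around this error, assuming the column holds numbers stored as strings. The column names `group` and `amount` and the sample values are hypothetical, not taken from the question. `pd.to_numeric` with `errors="coerce"` turns unparseable entries into NaN, which `mean()` then skips by default:

```python
import pandas as pd

# Hypothetical data: 'amount' holds numbers stored as strings, plus bad/missing entries.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "amount": ["10", None, "2.5", "oops"],
})

# Convert to a numeric dtype; values that cannot be parsed become NaN.
df["amount"] = pd.to_numeric(df["amount"], errors="coerce")

# The column is now numeric, so it can be aggregated; NaN is skipped by default.
means = df.groupby("group")["amount"].mean()
print(means)
```

Whether NaN should instead be filled with 0 before aggregating depends on the data; filling changes the mean, so it is left out here.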

Combinations divided into rows with no repetition of elements

I'm working with lists in Python. I have a list of colleagues which is colleagues=['Jack', 'Jessica', 'John', 'Mark', 'Mary', 'Paul'] I want to calculate all po
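
The excerpt is cut off, so the exact goal is unclear. As a rough sketch, assuming the aim is every unordered pair of colleagues with nobody paired with themselves, `itertools.combinations` from the standard library covers it. Note that Python concatenates adjacent string literals, so a missing comma between two names silently merges them into one element.

```python
from itertools import combinations

# Every name must be separated by a comma; adjacent literals would be joined.
colleagues = ["Jack", "Jessica", "John", "Mark", "Mary", "Paul"]

# All unordered pairs, each person appearing at most once per pair.
pairs = list(combinations(colleagues, 2))

for first, second in pairs:
    print(first, second)
```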

Rearrange pandas Dataframes

I decided to simplify my post and replace the images with code that has the same structure (and problem), so that everyone can 'copy-paste' this example to try

How to replace column values with dictionary keys

I have a df, A B one six two seven three level five one and a dictionary my_dict={1:"one,two",2:"three,four"} I want to replace df.A with my_di

df.to_csv function prints out the content instead of writing data to a file

df.to_csv(output_file) is supposed to write the content of a DataFrame to a file. While the function is working for 99.9% of the files in my directory, there is

Python Pandas - Find difference between two data frames

I have two data frames df1 and df2, where df2 is a subset of df1. How do I get a new data frame (df3) which is the difference between the two data frames? In o
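
One possible approach, sketched under the assumption that both frames share the same columns and that rows are compared on all of them. The sample data is hypothetical. A left merge with `indicator=True` marks each row of df1 as present in both frames or only in df1, and the latter rows form the difference:

```python
import pandas as pd

# Hypothetical frames: df2 is a subset of the rows in df1.
df1 = pd.DataFrame({"id": [1, 2, 3, 4], "val": ["a", "b", "c", "d"]})
df2 = pd.DataFrame({"id": [2, 4], "val": ["b", "d"]})

# Merge on all shared columns; the '_merge' column records where each row came from.
merged = df1.merge(df2, how="left", indicator=True)

# Keep only rows found in df1 but not in df2, then drop the helper column.
df3 = merged[merged["_merge"] == "left_only"].drop(columns="_merge")
print(df3)
```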

Python Dataframes - Breaking out single rows with duplicate columns into multiple rows and fewer columns

I have a data frame like this: A B C Date1 Time1 Value1 Date2 Time2 Value2 abc def ghi 01-01-2000 15:00:00 100 01-01-2000 19:00:00 110 There are duplicate col

Pandas - Add a new column extracting value from arrays based on other column value

I am currently stuck trying to extract a value from a list/array depending on values of a dataframe. Imagine I have this array. This array I can manually create

R: slicing over dates in a dataframe using custom time window

I have a dataframe of player rankings over many years (2000-2020), which looks like: Now, I wish to group_by() and summarise() and calculate statistics for di

RStudio: Selecting the column with the latest available data from a dataframe

I am trying to extract data from the World Bank and import it into RStudio for a regression analysis. The data can be found here and as you can see, the online

Slicing a dataframe using matches to build a new dataframe with Pandas?

I am trying to get my code to take in a dataframe, find all occurrences of "START:", then iterate through each occurrence to create 'slices' (Where the first ro

Categorical column after melt in pandas

Is it possible to end up with a categorical variable column after a melt operation in pandas? If I set up the data like this: import pandas as pd import numpy a

Coalesce columns and create another column to specify source

I'm using dplyr::coalesce() to combine several columns into one. Originally, across columns, each row has only one column with actual value while the other colu

R Dataframe By Group Calculation

I have a dataframe like below (the real data has many more people and clubs): Year Player Club 2005 Phelan Chicago Fire 2007 Phelan Boston Pant 2

Is there an optimized way to iterate over an Excel file and pass each query to pd.read_sql() as a string, one by one?

#here I have to apply the loop which can provide me the queries from excel for respective reports: df1 = pd.read_sql(SQLqueryB2, con=con1) df2 = pd.rea

Pandas Pivot table - How to compute the following default ratio?

I am able to compute the default rate in number (e.g., the percentage of customers who fell into default), with the code below, getting the following output: impor

Pandas dataframe: divide features into groups of high correlation

I have a dataframe with over 280 features. I ran a correlation map to detect groups of features that are highly correlated: Now, I want to divide the features into

Adding a new column in pandas dataframe from another dataframe with differing indices

This is my original dataframe. This is my second dataframe, containing one column. I want to add the column of the second dataframe to the original dataframe at th