Category "dataframe"

Pandas approximating/rounding large numbers from csv

I am reading numbers from a csv file into a pandas dataframe. When the numbers I am reading are approximately >1E12, pandas will approximate the number to 3

Read .csv file in R

I am a beginner to R, I have a file like below. state population Alabama 4779736 Alaska 710231 Arizona 6392017

How to create ratios using value counts and separate fields in Python?

Using the data frame shown below I'd like to create manager-to-assistant and manager-to-associate percentages/ratios per location. I'm looking for the

R replace string in df with partial match in a list

I have a dataframe (df) in R and I want to create a new column (city1_n) that contains a line stored in the list key whenever there is a partial match between c

Truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()

I want to filter my dataframe with an or condition to keep rows with a particular column's values that are outside the range [-0.25, 0.25]. I tried: df = df[(df
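A minimal sketch of the filter described in this excerpt, assuming a hypothetical column named `value` and made-up sample data. The error in the title typically appears when Python's `or` is applied to two Series; the element-wise `|` operator, with each comparison in its own parentheses, avoids it.

```python
import pandas as pd

# Hypothetical sample data; the column name "value" is an assumption.
df = pd.DataFrame({"value": [-0.5, -0.1, 0.0, 0.3, 0.25]})

# Use element-wise | instead of `or`, and parenthesize each comparison,
# because | binds more tightly than < and >.
outside = df[(df["value"] < -0.25) | (df["value"] > 0.25)]

print(outside)
```

Because the range [-0.25, 0.25] is closed, the boundary value 0.25 is treated as inside and is dropped here.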

Compare two excel files for the difference using pandas with multiple tabs

I found this nice script online which does a great job comparing the differences between 2 excel sheets but there's an issue - it doesn't work if the excel file

How to create a dummy variable corresponding to a change in a value in R

I have the following data: week <- c(1,2,3,4,1,2,3,4,1,2,3,4) product <- c("A", "A", "A", "A", "B", "B", "B", "B", "C", "C", "C", "C") price <- c(5,5,6

I have a dataframe with a JSON substring in one of the columns. I want to extract variables and make columns for them

imports json df = pd.read_json("C:/xampp/htdocs/PHP code/APItest.json", orient='records') print(df) I would like to create three columns extra: ['name','l

How to "transpose" data from one date to another in Python

Sorry, I had a lot of trouble explaining my problem in the title, but I hope it will be more understandable with this example: I have a data source that tells me

How to select all the rows with 0

I have a dataset where I have some 0 values in it. I want to print all the rows having 0. I was able to print a single column, but can't find a way to print al
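One possible approach, sketched on made-up data (the column names `a`, `b`, `c` are assumptions): compare the whole frame to 0 and keep the rows where any column matches.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's dataset.
df = pd.DataFrame({
    "a": [1, 0, 3, 4],
    "b": [5, 6, 0, 8],
    "c": [9, 10, 11, 12],
})

# (df == 0) is a boolean frame; any(axis=1) is True for each row
# that contains at least one zero in any column.
rows_with_zero = df[(df == 0).any(axis=1)]

print(rows_with_zero)
```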

How do I select values from one array based on a boolean array?

Let's say I have 2 numpy arrays, with the same 1200x1200 shape. The first one contains boolean values. The second one is an image, that was converted to boolean

Summarize two dataframes in r

I have two dataframes df1 # var1 var2 # 1 X01 Red # 2 X02 Green # 3 X03 Red # 4 X04 Yellow # 5 X05 Red # 6 X06 Green df2 # X01 X02

Groupby and create a dummy =1 if column values do not contain 0, =0 otherwise

My df id var1 A 9 A 0 A 2 A 1 B 2 B 5 B 2 B 1 C 1 C 9 D 7 D 2 D 0 .. desired output will ha
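A rough sketch using the sample rows visible in the excerpt. The desired output is cut off, so the new column name `no_zero` and the exact output layout are assumptions.

```python
import pandas as pd

# Rows copied from the excerpt; anything after the ".." is unknown.
df = pd.DataFrame({
    "id": list("AAAABBBBCCDDD"),
    "var1": [9, 0, 2, 1, 2, 5, 2, 1, 1, 9, 7, 2, 0],
})

# For each id, check whether every var1 value is non-zero,
# then broadcast that group-level result back to every row as 1/0.
df["no_zero"] = (
    df["var1"].ne(0).groupby(df["id"]).transform("all").astype(int)
)

print(df)
```

With this data, groups A and D contain a 0 and get 0, while B and C get 1.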

Pandas Lookup to be deprecated - elegant and efficient alternative

The Pandas lookup function is to be deprecated in a future version. As suggested by the warning, it is recommended to use .melt and .loc as an alternative. df =
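As an illustration only, here is one alternative that is not the `.melt`/`.loc` route named in the warning: `pd.factorize` combined with NumPy integer indexing. The frame below is made up, with a hypothetical `col` column naming which column to read in each row.

```python
import numpy as np
import pandas as pd

# Hypothetical data: "col" says which of the other columns to pick per row.
df = pd.DataFrame({
    "col": ["A", "A", "B", "B"],
    "A": [80, 23, 10, 22],
    "B": [80, 55, 76, 67],
})

# factorize turns the labels into integer codes plus the unique labels.
idx, cols = pd.factorize(df["col"])

# Reorder the columns to match the unique labels, then pick one value per row.
picked = df.reindex(cols, axis=1).to_numpy()[np.arange(len(df)), idx]

print(picked)
```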

Pandas: Calculate Difference between a row and all other rows and create column with the name

We have data as below Name value1 Value2 finallist 0 cosmos 10 20 [10,20] 1 network 30 40 [30,40] 2 unab 20 40 [20,40]

Ways to select multiple columns in base R using the native pipe |>?

What are good ways to select multiple columns of a data frame in base R using the native pipe |>? (i.e., without the tidyverse/dplyr to reduce external depen

Updating a Value of a Pandas DataFrame with a Function

I have a function which updates a dataframe that I have passed in: def update_df(df, x, i): for i in range(x): list = ['name' + str(i), i + 2, i - 1

Dataframe Operation Splicing

I have a single column dataframe without headers and I want to split it into multiple columns as follows The current dataframe - 1 2 3 4 5 . . 100 I want to re

Python: how to use a string value for a custom sort?

I have a dataframe like this time_posted 0 5 days ago 1 an hour ago 2 a day ago 3 6 hours ago 4 4 hours ago I tried this df.sort_values(by='time_p

Unable to read a column of an Excel file by column name using Pandas

Excel Sheet I want to read values of the column 'Site Name' but in this sheet, the location of this tab is not fixed. I tried, df = pd.read_excel('TestFile.xlsx