Category "dataframe"

Get for each row the last column name with a certain value

I have this kind of dataframe, and I'm looking to get, for each row, the name of the last column equal to 1. Here is an example of my dataframe: col1 col2

How to select several rows when reading a csv file using pandas?

I have a very large csv file with millions of rows and a list of the row numbers that I need, like rownumberList = [1,2,5,6,8,9,20,22]. I know there is somethi
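
One possible approach, sketched under assumptions: `pandas.read_csv` accepts a callable for `skiprows`, so unwanted lines can be skipped while the file is parsed. The in-memory CSV and the `value` column are hypothetical stand-ins for the large file, and the row numbers are assumed to be 0-based positions of data rows (header excluded).

```python
import io

import pandas as pd

# Hypothetical stand-in for the large file: a header plus 30 data rows.
csv_text = "value\n" + "\n".join(str(n * 10) for n in range(30))

# Row numbers from the question, assumed to be 0-based data-row positions.
rownumberList = [1, 2, 5, 6, 8, 9, 20, 22]
wanted = set(rownumberList)  # set lookup keeps the per-line check cheap

# File line 0 is the header; data row k sits on file line k + 1.
# The callable returns True for every line that should be skipped.
selected = pd.read_csv(
    io.StringIO(csv_text),
    skiprows=lambda line: line != 0 and (line - 1) not in wanted,
)

print(selected)
```

If the row numbers are meant as 1-based or as raw file line numbers, the offset inside the lambda would need adjusting.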

check if timestamp column is in date range from another dataframe

I have a dataframe, df_A, with two columns, 'amin' and 'amax', which define a set of time ranges. My objective is to find whether a column in df_B lies between any o

Spark dataframe transform multiple rows to column

I am a novice to Spark, and I want to transform the source dataframe below (loaded from a JSON file): +--+-----+-----+ |A |count|major| +--+-----+-----+ | a| 1| m

Pandas Dataframe: Replacing NaN with row average

I am trying to learn pandas but I have been puzzled with the following. I want to replace NaNs in a DataFrame with the row average. Hence something like df.fil
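
A minimal sketch of one way to do this, assuming an all-numeric frame: `fillna` matches a Series against column labels, so transposing first lets the row means line up with the rows. The frame and its column names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical numeric frame with one NaN in each row.
df = pd.DataFrame(
    {
        "c1": [1.0, np.nan, 7.0],
        "c2": [np.nan, 5.0, 8.0],
        "c3": [3.0, 9.0, np.nan],
    }
)

# Mean of each row; NaN values are ignored by default.
row_means = df.mean(axis=1)

# After transposing, the original rows are columns, so the row means
# (indexed by row label) match them; transpose back afterwards.
filled = df.T.fillna(row_means).T

print(filled)
```

An equivalent but usually slower alternative is `df.apply(lambda row: row.fillna(row.mean()), axis=1)`.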

DATAFRAME TO BIGQUERY - Error: FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp1yeitxcu_job_4b7daa39.parquet'

I am uploading a dataframe to a bigquery table. df.to_gbq('Deduplic.DailyReport', project_id=BQ_PROJECT_ID, credentials=credentials, if_exists='append') And I

What is the difference between combine_first and fillna?

These two functions seem equivalent to me. You can see that they accomplish the same goal in the code below, as columns c and d are equal. So when should I use
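
One difference that is easy to demonstrate, as a sketch with made-up data: `fillna` keeps the index of the object it is called on, while `combine_first` returns the union of both indexes. When both objects share the same index, the two calls give the same values.

```python
import numpy as np
import pandas as pd

# Hypothetical series: b has an extra label "z" that a lacks.
a = pd.Series([1.0, np.nan], index=["x", "y"])
b = pd.Series([10.0, 20.0, 30.0], index=["x", "y", "z"])

# fillna only patches holes in a; the result keeps a's index.
filled = a.fillna(b)

# combine_first also patches holes in a, but the result spans
# the union of both indexes, so "z" comes along from b.
combined = a.combine_first(b)

print(filled)
print(combined)
```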

Grouping by multiple columns to find duplicate rows pandas

I have a df id val1 val2 1 1.1 2.2 1 1.1 2.2 2 2.1 5.5 3 8.8 6.2 4 1.1 2.2 5 8.8 6.2 I want t

How can I merge an empty data frame and a data frame in R

I'm trying to merge two data frames like this: data1 <- data.frame(hola = as.numeric(), toma = as.character()) data2 <- data.frame(hola = as.numeric(1), t

Pandas - dataframe groupby - how to get sum of multiple columns

This should be an easy one, but somehow I couldn't find a solution that works. I have a pandas dataframe which looks like this: index col1 col2 col3 col4

Python for Google Sheets: write dataframes to different sheets in the same workbook

Using the code below, I am able to write the dataframe df1 to the default first sheet (starting at cell ‘B7’) of the Google Sheet workbook. In the s

Python Pandas - Concat dataframes with different columns ignoring column names

I have two pandas.DataFrames which I would like to combine into one. The dataframes have the same number of columns, in the same order, but have column headings

How to split a DataFrame based on consecutive index?

I have a DataFrame 'work' with non consecutive index, here is an example: Index Column1 Column2 4464 10.5 12.7 4465 11.3 12.8 4466 10.3 22.8 5123 1
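
A sketch of one common idiom, assuming an integer index: a new group starts wherever the index jumps by something other than 1, and a cumulative sum over those break points gives each run its own id. The first three rows follow the excerpt; the remaining rows are invented for illustration.

```python
import pandas as pd

# First three rows follow the excerpt; the 5123/5124 rows are hypothetical.
work = pd.DataFrame(
    {
        "Column1": [10.5, 11.3, 10.3, 1.0, 2.0],
        "Column2": [12.7, 12.8, 22.8, 3.0, 4.0],
    },
    index=[4464, 4465, 4466, 5123, 5124],
)

# True where the index does not follow its predecessor by exactly 1
# (the first row is always a break because its diff is NaN).
idx = work.index.to_series()
breaks = idx.diff() != 1

# Cumulative sum of the breaks labels each consecutive run.
group_id = breaks.cumsum()

# One sub-frame per consecutive run, in order of appearance.
parts = [part for _, part in work.groupby(group_id)]

for part in parts:
    print(part, end="\n\n")
```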

Passing dataframe and using its name to create the csv file

I have a requirement where I need to pass different dataframes and write the rows of each dataframe to a csv file, and the name of the file needs to be the datafram

How to reorder indexed rows based on a list in Pandas data frame

I have a data frame that looks like this: company Amazon Apple Yahoo name A 0 130 0 C 173 0 0 Z 0 0

Why does lm generate NA for each independent variable?

I tried to make a linear regression with the lm function, but the output is NA for every independent variable. The dataframe is numeric. I have already tried t

Find the column name which has the maximum value for each row

I have a DataFrame like this one: In [7]: frame.head() Out[7]: Communications and Search Business General Lifestyle 0 0.745763 0.050847 0.118644
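
For reference, a minimal sketch using `DataFrame.idxmax` with `axis=1`, which returns for each row the label of the column holding the largest value. The column names echo the excerpt, but the numbers are invented.

```python
import pandas as pd

# Hypothetical values; only the column names are taken from the excerpt.
frame = pd.DataFrame(
    {
        "Business": [0.05, 0.60, 0.20],
        "General": [0.12, 0.30, 0.50],
        "Lifestyle": [0.75, 0.10, 0.30],
    }
)

# Column label of the row-wise maximum.
top_column = frame.idxmax(axis=1)

print(top_column)
```

When a row has ties, `idxmax` reports the first column that reaches the maximum.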

How To Solve KeyError: u"None of [Index([..], dtype='object')] are in the [columns]"

I'm trying to create a SVM model from what I found in github here, but it keeps returning this error. Traceback (most recent call last): File "C:\Users\Me\Do

Filter rows in csv file based on another csv file and save the filtered data in a new file

Good day all. I was trying to filter file2 based on file1, where file1 is a subset of file2. But file2 has a description column that I need to be able to an