Category "dataframe"

How to split a DataFrame based on consecutive index?

I have a DataFrame 'work' with a non-consecutive index; here is an example: Index Column1 Column2 4464 10.5 12.7 4465 11.3 12.8 4466 10.3 22.8 5123 1

Passing dataframe and using its name to create the csv file

I have a requirement where I need to pass different dataframes and print the rows in dataframes to the csv file and the name of the file needs to be the datafram

How to reorder indexed rows based on a list in Pandas data frame

I have a data frame that looks like this: company Amazon Apple Yahoo name A 0 130 0 C 173 0 0 Z 0 0

Why does lm generate NA for each independent variable?

I tried to make a linear regression with the lm function, but the output is NA for every independent variable. The dataframe is numeric. I have already tried t

Find the column name which has the maximum value for each row

I have a DataFrame like this one: In [7]: frame.head() Out[7]: Communications and Search Business General Lifestyle 0 0.745763 0.050847 0.118644

How To Solve KeyError: u"None of [Index([..], dtype='object')] are in the [columns]"

I'm trying to create a SVM model from what I found in github here, but it keeps returning this error. Traceback (most recent call last): File "C:\Users\Me\Do

Filter rows in csv file based on another csv file and save the filtered data in a new file

Good day all. I was trying to filter file2 based on file1, where file1 is a subset of file2. But file2 has a description column that I need to be able to an

Spark Scala Split dataframe into equal number of rows

I have a Dataframe and wish to divide it into an equal number of rows. In other words, I want a list of dataframes where each one is a disjointed subset of the

Sum over previous periods for each period for each subject - R

A MWE is as follows: library(dplyr) Period <- c(1, 1, 1, 2, 2, 2, 3, 3, 3) Subject <- c(1, 2, 3, 1, 2, 3, 1, 2, 3) set.seed(1) Values <- round(rnor

How can I remove all non-numeric characters from all the values in a particular column in pandas dataframe?

I have a dataframe which looks like this: A B C 1 red78 square big235 2 green circle small123 3 blue45 triangle big657

Removing NA values from a DataFrame in Python 3.4

import pandas as pd import statistics df=print(pd.read_csv('001.csv',keep_default_na=False, na_values=[""])) print(df) I am using this code to create a data

VS Code no longer showing option to view DataFrame in Data Viewer

I'm working with pandas in VS Code and I've been using the View value in Data Viewer option to look at my Data frames while debugging. For some reason VS Code h

How to replace pandas dataframe column names with another list or dictionary matching

I need to replace column names of a pandas DataFrame having names like 'real_tag' and rename them like 'Descripcion' list = [{'real_tag': 'FA0:4AIS0007', 'Descr

How to use write.table to download a dataframe into a nice csv/Excel file?

I am trying to use the write.table() function, within Shiny downloadHandler(), to download the df reactive dataframe as a .csv file, per the reproducible code a

Python: create a pandas data frame from a list

I am using the following code to create a data frame from a list: test_list = ['a','b','c','d'] df_test = pd.DataFrame.from_records(test_list, columns=['my_let

MemoryError: Unable to allocate 1.88 GiB for an array with shape (2549150, 99) and data type object

I have a problem. I want to normalize with pd.json_normalize(...) a list with inside dict but unfortunately I got a MemoryError. Is there an option to work arou

Initialize a column with missing values and copy+transform another column of a dataframe into the initialized column

I have a messy column in a csv file (column A of the dataframe). using CSV, DataFrames df = DataFrame(A = ["1", "3", "-", "4", missing, "9"], B = ["M", "F", "R

Rolling OLS Regressions and Predictions by Group

Uncomfortable output of mode() in pandas Dataframe

I have a dataframe with several columns (the features). >>> print(df) col1 col2 a 1 1 b 2 2 c 3 3 d 3 2 I woul

How to subset Pandas Dataframe using an OR operator whilst avoiding "FutureWarning: elementwise comparison failed;"

I have a Pandas dataframe (tempDF) of 5 columns by N rows. Each element of the dataframe is an object (string in this case). For example, the dataframe looks li
