Category "dataframe"

split the lines of a data frame into a variable number of lines based on a character in R [duplicate]

I have this df: df = data.frame(ID = c(1,2,3), A = c("h;d;c", "j;k", "k")) and I want to retrieve a new df with split rows ba

How to store the variables output inside a function during concurrent.futures.ProcessPoolExecutor from concurrent.futures

I am currently trying to store the output obtained in a function during multiprocessing by using concurrent.futures.ProcessPoolExecutor from concurrent.futures

How to convert a JSON to a pandas dataframe when the value is entirely in string format

I am trying to convert the data from a JSON to a dataframe. My JSON: {"data":"key=IAfpK, age=58, key=WNVdi, age=64, key=jp9zt, age=47, key=0Sr4C, age=68, key=CGEqo,
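
One way to approach the question above, as a minimal sketch: it assumes the string under "data" is a flat run of "key=<token>, age=<number>" pairs, as in the visible part of the excerpt. The function name `kv_string_to_frame` and the shortened sample payload are illustrative, not taken from the original post.

```python
import json
import re

import pandas as pd


def kv_string_to_frame(payload: str) -> pd.DataFrame:
    """Turn '{"data": "key=..., age=..., ..."}' into a key/age dataframe."""
    text = json.loads(payload)["data"]
    # Assumption: every record is a 'key=<token>, age=<digits>' pair.
    pairs = re.findall(r"key=(\w+),\s*age=(\d+)", text)
    return pd.DataFrame(pairs, columns=["key", "age"]).astype({"age": int})


# Shortened, hypothetical sample built from the values visible in the excerpt.
sample = '{"data": "key=IAfpK, age=58, key=WNVdi, age=64, key=jp9zt, age=47"}'
df = kv_string_to_frame(sample)
print(df)
```

A regular expression is used here instead of splitting on commas, so a trailing incomplete pair is simply ignored rather than raising an error.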

Converting tensorflow dataset to pandas dataframe

I am very new to deep learning and computer vision. I want to do a face recognition project. For that I downloaded some images from the Internet and converte

Timeseries dataframe returns an error when using Pandas Align - ValueError: cannot join with no overlapping index names

My goal: I have two time-series data frames, one with a time interval of 1m and the other with a time interval of 5m. The 5m data frame is a resampled version o

How to keep the top 500 rows of each CSV in a loop (Python) and overwrite each file

I am trying to read more than 100 csv files in python to keep the TOP 500 rows (they each have more than 55,0000 rows). So far I know how to do that, but I need
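
A possible sketch for this task, assuming the files sit in a single folder, end in ".csv", and have a header row. The function name `keep_top_rows` and the folder layout are assumptions for illustration only.

```python
from pathlib import Path

import pandas as pd


def keep_top_rows(folder: str, n: int = 500) -> int:
    """Truncate every CSV in `folder` to its first n data rows, in place.

    Returns the number of files rewritten.
    """
    count = 0
    for path in sorted(Path(folder).glob("*.csv")):
        # nrows stops parsing after n data rows, so large files are not fully loaded.
        head = pd.read_csv(path, nrows=n)
        # Overwrite the original file; index=False avoids adding an extra column.
        head.to_csv(path, index=False)
        count += 1
    return count
```

Calling `keep_top_rows("path/to/folder")` overwrites the originals, so it is worth trying it on a copy of the folder first.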

Pandas: return rows that have two matching columns commonality

I am trying to write a commonality script which will return rows in a pandas dataframe that have two matching columns, and also will sum up the number of rows w

How to select top level columns in multi header pandas dataframe

I have a multi header dataframe and it looks like that: SPY ARKW Open Hig
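
A small sketch of the usual selection patterns for a two-level column header. The frame below is made up: the tickers come from the excerpt, while the second-level names "Open" and "High" and all values are assumptions, since the original table is cut off.

```python
import pandas as pd

# Hypothetical two-level header: ticker on top, price field below.
columns = pd.MultiIndex.from_product([["SPY", "ARKW"], ["Open", "High"]])
df = pd.DataFrame(
    [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]],
    columns=columns,
)

# Selecting one top-level label returns a frame with only the lower level.
spy = df["SPY"]

# Listing the distinct top-level labels, in order of appearance.
tickers = df.columns.get_level_values(0).unique().tolist()

# Taking one lower-level field across all top-level labels.
opens = df.xs("Open", axis=1, level=1)

print(tickers)
print(spy)
print(opens)
```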

Is there a way to dynamically create new arrays from a dataframe

I have a table that looks like |Category|number|absorbance|protein1|protein2| |--------|------|----------|--------|--------| |a|int|float|float|float| |a|int|fl

Load Pandas Dataframe to S3 passing s3_additional_kwargs

Please excuse my ignorance / lack of knowledge in this area! I'm looking to upload a dataframe to S3, but I need to pass 'ACL':'bucket-owner-full-control'. i

How do I melt a pandas with custom nam

I have a table like this device_type version pool testMean testP50 testP90 testP99 testStd WidgetMean WidgetP50 WidgetP90 WidgetP99 WidgetStd PNB0Q

Python Script to find file names from CSV will not concatenate

I am writing a script that will allow me to extract a segment of image files from a large folder. I put the image file names into a dataframe. I am having prob

Creating new columns from another pandas column

I would like to create a new column from the genres column. The genres column contains one or multiple genres and I would like to create a column for each genre

Sort dataframe multiindex level and by column

#Updated: pandas version 0.23.0 solves this problem with Sorting by a combination of columns and index levels I have struggled with this and I suspect there is

Find specific value knowing row pandas

I have a dataframe with this structure: A indexer attr1_rank attr2_rank attr3_rank attr4_rank ... attrn_rank P 1 2 1 3 4 ... n S 2 1 2 4 3 ... n How can I add

how to merge multiple datasets with differences in merge-index strings?

Hello, I am struggling to find a solution to what is probably a very common problem. I want to merge two csv-files with soccer data. They basically store different data

How to solve (NaN error) when given column specific name

I have many text files that include data as follows: 350.0 2.1021 0.0000 1.4769 0.0000 357.0 2.0970 0.0000 1.4758 0.0000 364.0 2.0920 0.0000

Replace values conditionally, better way

I have a large dataframe with city names and many are misspelled. Right now I have corrected them manually, one by one, using the following code: geo <- geo