Category "pandas"

pandas df.to_parquet write to multiple smaller files

Is it possible to use Pandas' DataFrame.to_parquet functionality to split writing into multiple files of some approximate desired size? I have a very large Data

Replacing ID values of polygons in a geodataframe to values of polygons from another geodataframe

I have polygons inside another, bigger single polygon and I want to be able to replace the ID values (for example) of the former polygons with those of the latter. S

Pandas: Sampling from a DataFrame according to a target distribution

I have a Pandas DataFrame containing a dataset D of instances which all have some continuous value x. x is distributed in a certain way, say uniform, could be a

Plotly chart is a mess of lines after index converted to pandas datetime

My plotly chart is just a mess of zig-zagging lines (see chart here). This only happens after I use df['Date'] = pd.to_datetime(df.index) to convert the index t

Convert pandas data frame column, which has values of vectors, into tensors

My question is how to convert a vector in a pandas data frame into tensors. The data frame has a resume column which has vector representations of each resume d

Write to an existing .xlsm using pandas and XlsxWriter

I would like to write a dataframe to an existing .xlsm file which already has content. Write pandas dataframe to xlsm file (Excel with Macros enabled) describes

Display count on top of seaborn barplot

I have a dataframe that looks like: User A B C ABC 100 121 OPEN BCD 200 255 CLOSE BCD 500 134 OPEN DEF 600 1

I'm getting a different output than expected when using df.loc to change some values of the df

I have a data frame, and I want to assign a quartile number based on the quartile variable, which gives me the ranges that I later use in the for. The problem i

How to add a new row after every unique entry in pandas dataframe

I have to add a new row at the end of each person's information. In the new row, all the information will be the same as in the last row, like name, last_upd

Why does pandas.json_normalize(json_results) raise a NotImplementedError?

I have a json variable named json_results and I am running pandas.json_normalize(json_results). It raises the following error: in _json_normalize raise NotI

Save image with fig.write_image in Python Plotly

I would like to save images with plotly fig.write_image using a for loop, where each image name includes a customised id and Timestamp value with a string forma

median in pandas dropping center value

I am working in pandas and want to implement an algorithm that requires I assess a modified centered median on a window, but omitting the middle value. So for i

Having coding line graphs after iloc command line

I'm trying to graph a line with the x-axis being the hour to the sum of 24 hours and the y-axis being the sums of the first 4 .15 min increments of kWh values.

How do I change the values in a pandas column that are selected by a regex?

I'm cleaning up data for a personal project and am standardizing the large number of categories. The seemingly low hanging fruit have similar enough names such

How to get smallest index in dataframe after using groupby

If create_date field does not correspond to period between from_date and to_date, I want to extract only the large index records using group by 'indicator' and

Printing values in new columns based on a condition from another column

I have the following dataframe: Time Tab User Description 27.10.2021 15:58:00 Tab Alpha [email protected] Tab Alpha of type PARTSTUDIO opened by User A 27.10.2021

Comparing two pandas dataframes with different sizes

I want to compare two dataframes with content of 1s and 0s. I run for loops to check every element of the dataframes and at the end, I want to replace the "1" v

Can I store a Parquet file with a dictionary column having mixed types in their values?

I am trying to store a Python Pandas DataFrame as a Parquet file, but I am experiencing some issues. One of the columns of my Pandas DF contains dictionaries as

How to reset cumulative sum every time there is a NaN in a pandas dataframe?

If I have a Pandas data frame like this: 1 2 3 4 5 6 7 1 NaN 1 1 1 NaN 1 1 2 NaN NaN 1 1 1 1 1 3 NaN NaN NaN 1 NaN

How to use Python faker for dependent columns

Scenario: If column1 = ‘Value’ then column2 = ‘AAA’. How can we use faker to generate mock data for these dependent columns? Need to consi