Category "pandas"

replace the empty value in the dataframe with a list of python values

There is a list of shops

|Shop ID|
|-------|
| Shop1 |
| Shop2 |
| Shop3 |

There is a list of events that took place in the store |Shop ID| Event | Start_date

Error: pandas hashtable keyerror

I have successfully read a csv file using pandas. When I try to print a particular column from the data frame, I get a KeyError. Hereby I am shari

How to apply Target Encoding in test dataset?

I am working on a project, where I had to apply target encoding for 3 categorical variables: merged_data['SpeciesEncoded'] = merged_data.groupby('Species')['Wnv

Make Seaborn Distplot and Barplot the same color [duplicate]

I have been unable to figure out how to set the colors between distplot and barplot to be the same. Despite setting the color argument in both

AWS Athena table from python output with dates - dates get wrongly converted

I have a pandas DataFrame containing a date column ("2022-02-02"). I write this table to parquet using pyarrow. df[col] = df[col].astype(str) df.to_parquet(loc)

Binning 2D data with circles instead of rectangles - from pandas df

I have a dataframe of x, y data and need to bin it into circles, i.e. a grid of circles of a certain size and spacing centered on some point. So for example some da

How Do I Upload Data Externally in Explainerdashboard

I am trying to upload external data into the dashboard using explainer.set_x_row_func() and explainer.set_y_func(). Does anyone know how to do this? Below is ho

Pandas merge returns NaN values

Please consider 2 pandas dataframes, df1 and df2: df1 = pd.read_csv('df1.csv', sep=';') df2 = pd.read_csv('df2.csv', sep=';') We convert to date fields: df1['

Add a new record for each missing second in a DataFrame with TimeStamp [duplicate]

Consider the following Pandas DataFrame: | date | counter | |-------------------------------------|--------------

Comparing 2 columns with different rows in different csv files, and output status to another csv file

I have 2 csv files as shown below. They contain different numbers of rows and the columns are not aligned/sorted along a common index. I need to compare the col

Error with delimiters on dataframe when trying to upload it to MSSQL

So I've been trying to upload a dataframe to a specific table in MSSQL. I've been trying to use the BCPANDAS library to upload the data to it. However th

Get statistics for each group (such as count, mean, etc) using pandas GroupBy?

I have a data frame df and I use several columns from it to groupby: df[['col1','col2','col3','col4']].groupby(['col1','col2']).mean() In the above way I almos
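For readers skimming this entry, here is a minimal sketch of per-group statistics using named aggregation. The column names follow the excerpt above; the sample values and the output names (`count`, `mean_col3`, `mean_col4`) are made up for illustration, and this is only one possible approach.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "col1": ["a", "a", "b", "b"],
    "col2": ["x", "x", "y", "z"],
    "col3": [1.0, 3.0, 5.0, 7.0],
    "col4": [10, 20, 30, 40],
})

# Group by two columns and compute several statistics in one pass.
# "size" counts rows per group; "mean" averages the named column.
stats = (
    df.groupby(["col1", "col2"])
      .agg(
          count=("col3", "size"),
          mean_col3=("col3", "mean"),
          mean_col4=("col4", "mean"),
      )
      .reset_index()
)

print(stats)
```

Named aggregation keeps the result columns flat, which tends to be easier to work with than the MultiIndex columns produced by passing a list of functions.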

Poor accuracy score for Semi-Supervised Support Vector Machine

I am using a Semi-Supervised approach for Support Vector Machine in Python for image classification on PASCAL VOC 2007 data. I have tried with the default

dtale show in jupyter notebook

I am exploring this new Python package named dtale. It is very convenient for pandas data frames visualization. https://pypi.org/project/dtale/ It worked onc

Inconsistent indexing of subplots returned by `pandas.DataFrame.plot` when changing plot kind

I know that this issue is known and was already discussed. But I am encountering a strange behaviour; maybe someone has an idea why: When I run this: plot = df.p

How to check if a value in a DataFrame is of type Decimal

I am writing a data test for some API calls that return a DataFrame with a date type and a type Decimal. I can't find a way to verify the Decimal the DataFrame
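One way such a check might look, as a rough sketch: `Decimal` values are stored in an `object` column, so the dtype alone does not identify them and each value has to be tested. The helper name `column_is_decimal` and the sample frame are hypothetical, not taken from the question.

```python
from decimal import Decimal

import pandas as pd


def column_is_decimal(series: pd.Series) -> bool:
    """Return True if every non-null value in the series is a decimal.Decimal."""
    non_null = series.dropna()
    # Element-wise isinstance check; an all-null column counts as True here.
    return bool(non_null.map(lambda v: isinstance(v, Decimal)).all())


# Hypothetical data standing in for the API response.
df = pd.DataFrame({
    "price": [Decimal("1.10"), Decimal("2.50")],
    "qty": [1, 2],
})

print(column_is_decimal(df["price"]))
print(column_is_decimal(df["qty"]))
```

In a data test, the helper can be wrapped in a plain `assert` for each column expected to hold `Decimal` values.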

Get index and column with multiple headers and index_col in Pandas DataFrame

I have a dataframe with multiple headers and column indexes, and would like to retrieve the list of entries that are non-zero. The dataframe is constructed from

How to edit/ sort a non-column column in Python?

I wrote the script below, and I'm 98% content with the output. However, the unorganized manner/ disorder of the 'Approved' field bugs me. As you can see, I trie

Geopandas not plotting correct colors

My Geopandas DataFrame has 3 polygons and 9 points with color_rgba column computed with matplotlib.colors.to_rgba function: import contextily as ctx import geop

Numpy where function in python

I have a data frame like this: pd.DataFrame({'Material': ['Steel (16MnCr5)', 'X', 'X', 'X', 'Carbon black', 'Sulfur', 'Copper'], 'Weight': [4, 8, 0, 8, 6, 9, 3