Category "pandas"

reshaping the dataset in python

I have this dataset: Account lookup FY11USD FY12USD FY11local FY12local Sales CA 1000 5000 800 4800 Sales JP 5000 6500 10 15 Trying to arrive to get the data

Can we append a dataframe to a Snowflake table that already has data, when some columns are the same and some are different?

I have a dataframe which contains some columns, and a Snowflake table which has some columns. Some columns are the same and some columns are different between them. As

How to store the variables output inside a function during concurrent.futures.ProcessPoolExecutor from concurrent.futures

I am currently trying to store the output obtained in a function during multiprocessing by using concurrent.futures.ProcessPoolExecutor from concurrent.futures

How to convert a JSON to a pandas dataframe when the value is completely in string format

I am trying to convert the data from a json to dataframe. My JSON {"data":"key=IAfpK, age=58, key=WNVdi, age=64, key=jp9zt, age=47, key=0Sr4C, age=68, key=CGEqo,
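One possible approach, as a minimal sketch: it assumes the string under "data" is a flat, comma-separated run of alternating `key=...` and `age=...` pairs, which is all the truncated excerpt above shows. The sample string is shortened to the first three pairs from the excerpt, and the regular expression is an illustrative choice, not something taken from the question.

```python
import json
import re

import pandas as pd

# Shortened sample built from the first three pairs visible in the excerpt.
raw = '{"data": "key=IAfpK, age=58, key=WNVdi, age=64, key=jp9zt, age=47"}'

# Decode the JSON first; the payload is one plain string.
text = json.loads(raw)["data"]

# Assumption: every record is a "key=<word>, age=<digits>" pair, in that order.
pairs = re.findall(r"key=(\w+),\s*age=(\d+)", text)

# One row per pair; convert age from str to int.
df = pd.DataFrame(pairs, columns=["key", "age"]).astype({"age": int})
print(df)
```

If the real data has more fields per record or a different ordering, the pattern would need to change accordingly.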

How to line plot timeseries data on a bar plot

I have the following data frame: data = {'date': ['3/24/2020', '3/25/2020', '3/26/2020', '3/27/2020'], 'Total1': [133731.9147, 141071.6383, -64629.74024

group by and count unique entries with more than one entry in time period

I have a pandas df as follows: Date UserID 2022-01-01 ABC 2022-01-02 ABC 2022-01-03 ABC 2022-01-01 DEF 2022-01-05 DEF

Converting tensorflow dataset to pandas dataframe

I am very new to deep learning and computer vision. I want to do a face recognition project. For that I downloaded some images from the Internet and converte

Timeseries dataframe returns an error when using Pandas Align - ValueError: cannot join with no overlapping index names

My goal: I have two time-series data frames, one with a time interval of 1m and the other with a time interval of 5m. The 5m data frame is a resampled version o

How can I extract day of week from timestamp in pandas

I have a timestamp column in a dataframe as below, and I want to create another column called day of week from that. How can I do it? Input: Pickup date/time
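A short sketch of how this is commonly done with the `.dt` accessor. The column name "Pickup date/time" comes from the excerpt above; the two sample timestamps and the output column names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample rows; only the column name is taken from the question.
df = pd.DataFrame(
    {"Pickup date/time": ["2022-01-03 08:15:00", "2022-01-08 17:40:00"]}
)

# Parse the strings into datetimes so the .dt accessor is available.
pickup = pd.to_datetime(df["Pickup date/time"])

# Day name as text, and as a number where Monday=0 and Sunday=6.
df["day_of_week"] = pickup.dt.day_name()
df["day_of_week_num"] = pickup.dt.dayofweek
print(df)
```

If the timestamps are in a non-standard layout, passing an explicit `format=` to `pd.to_datetime` is usually safer than relying on inference.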

Probability of selling the same items again in a pandas dataframe

I need to know the probability of selling similar items together, based on a sales history formatted like this: pd.DataFrame({"sale_id": [1, 1, 1, 2, 2, 3, 3, 3

thresh in dropna for DataFrame in pandas in python

df1 = pd.DataFrame(np.arange(15).reshape(5,3)) df1.iloc[:4,1] = np.nan df1.iloc[:2,2] = np.nan df1.dropna(thresh=1 ,axis=1) It seems that no nan value has bee
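For reference, `thresh` is the minimum number of non-NA values a row or column must have in order to be kept, and `dropna` returns a new frame instead of modifying `df1`. The sketch below rebuilds the frame from the excerpt above (cast to float up front, which is an addition here so the NaN assignments do not need a dtype change) and compares a few threshold values.

```python
import numpy as np
import pandas as pd

# Same frame as in the excerpt, cast to float so NaN can be assigned cleanly.
df1 = pd.DataFrame(np.arange(15).reshape(5, 3)).astype(float)
df1.iloc[:4, 1] = np.nan
df1.iloc[:2, 2] = np.nan

# Non-NA counts per column: column 0 has 5, column 1 has 1, column 2 has 3.
counts = df1.notna().sum()

# thresh=1 keeps every column with at least one non-NA value, so nothing is dropped.
kept_1 = df1.dropna(thresh=1, axis=1)

# thresh=2 drops column 1, which has only one non-NA value.
kept_2 = df1.dropna(thresh=2, axis=1)

# thresh=4 keeps only column 0.
kept_4 = df1.dropna(thresh=4, axis=1)

print(counts.tolist(), list(kept_1.columns), list(kept_2.columns), list(kept_4.columns))
```

Note that `dropna` removes whole rows or columns; it never removes individual NaN cells from a column that is kept.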

What causes these Int64 columns to cause a TypeError?

I have a pandas DataFrame with several flag/dummy variables of type Int64. I am aggregating on other fields and taking the mean value in order to calculate a pe

How to keep the top 500 rows of each CSV in a loop (Python) and overwrite each file

I am trying to read more than 100 csv files in python to keep the TOP 500 rows (they each have more than 55,0000 rows). So far I know how to do that, but I need
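A minimal sketch of one way to do this, assuming "top 500" means the first 500 data rows of each file. To stay self-contained it creates a few throwaway CSV files in a temporary folder; the folder, the file names, and the column `x` are all hypothetical stand-ins for the real files.

```python
from pathlib import Path
import tempfile

import pandas as pd

# Hypothetical setup: three sample files of 1000 rows each in a temp folder.
folder = Path(tempfile.mkdtemp())
for i in range(3):
    pd.DataFrame({"x": range(1000)}).to_csv(folder / f"file_{i}.csv", index=False)

# Read only the first 500 data rows of each file, then overwrite it in place.
for csv_path in sorted(folder.glob("*.csv")):
    top = pd.read_csv(csv_path, nrows=500)
    top.to_csv(csv_path, index=False)

row_counts = [len(pd.read_csv(p)) for p in sorted(folder.glob("*.csv"))]
print(row_counts)
```

Using `nrows=500` avoids loading the full file just to discard most of it. Since overwriting is destructive, it may be worth testing on copies first.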

How can I add a path to the CSV files created?

I'm splitting a CSV file based on column "ColumnName". How can I make all the CSV files created save into a specified path? data = pd.read_csv(r'C:\Users\...\O
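A sketch of one way to direct the split files into a chosen folder with `pathlib`. The column name "ColumnName" is from the excerpt above; the sample data, the temporary output folder, and the `split_output` name are assumptions made so the example runs on its own. In practice `out_dir` would be whatever target path is wanted.

```python
from pathlib import Path
import tempfile

import pandas as pd

# Hypothetical stand-in for the frame loaded with pd.read_csv in the question.
data = pd.DataFrame({"ColumnName": ["a", "b", "a"], "value": [1, 2, 3]})

# Assumed output folder; replace with the desired destination path.
out_dir = Path(tempfile.mkdtemp()) / "split_output"
out_dir.mkdir(parents=True, exist_ok=True)

# Write one CSV per distinct value of "ColumnName" into out_dir.
written = []
for name, group in data.groupby("ColumnName"):
    target = out_dir / f"{name}.csv"
    group.to_csv(target, index=False)
    written.append(target)

print([p.name for p in written])
```

Building each path as `out_dir / f"{name}.csv"` keeps the folder and the file name separate, which avoids manual string concatenation with backslashes on Windows.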

Pandas: return rows that have two matching columns commonality

I am trying to write a commonality script which will return rows in a pandas dataframe that have two matching columns, and also will sum up the number of rows w

import pandas throws TypeError: expected string or bytes-like object

After pip installing a private repo in my Conda environment I now get the error TypeError: expected string or bytes-like object when trying to import pandas. I

How to select top level columns in multi header pandas dataframe

I have a multi header dataframe and it looks like that: SPY ARKW Open Hig

Creating custom colourmap for geopandas.explore plot

all code: def rgb2hex(r,g,b): return '#{:02x}{:02x}{:02x}'.format(r,g,b) def rg(num): num = int(np.round((num / 100) * 124)) r = (124 - num) g

Convert JSON format column to new columns

I have a sub-Yelp Dataset in csv, and attributes column is in json format. I'm trying to convert that column to new columns, but none of the relevant code on di

BigQuery Results to pandas DataFrame in Chunks

I am trying to save the results of a BigQuery query to a pandas DataFrame using bigquery.Client.query.to_dataframe() This query can return millions of rows. Gi