Category "dataframe"

Can I use itertools.count to add values in a column, resetting at a certain point?

I'm trying to create a list of timestamps from a column in a dataframe that resets to zero after a certain time. So, if the limit was 4, I want the count to ad

pandas df not showing all rows after loading from MS SQL

I'm using Pandas with latest sqlalchemy (1.4.36) to query a MS SQL DB, using the following Python 3.10.3 [Win] snippet: import pandas as pd

Python pandas df.copy() is not deep

I have (in my opinion) a strange problem with python pandas. If I do: cc1 = cc.copy(deep=True) for the dataframe cc and then ask for a certain row and column: p

Python iterating over dataframe based on user input

I'm trying to filter the data frame (stops_trips_vehicles) based on user input - specifying hour (departure_time column) and the stop name (stop_name column). Eac

Creating custom Quantiles within data frame?

If I have the following table: tibble(year = c("2020", "2020", "2020","2021", "2021", "2021"), website = c("facebook", "google", "youtube","facebook", "

Is there a more Pythonic way to write this? Filling multiple lists to desired length

I'm struggling with making my lists of strings a DataFrame because they are different sizes. So, I'm running code to get the length of the largest list and then

Error: pandas hashtable KeyError

I have successfully read a csv file using pandas. When I try to print a particular column from the data frame, I get a KeyError. Hereby I am shari

How can I insert text file values into a dataframe column?

def read_text_file(file_path): with open(file_path, 'r') as f: stop_words(f.read()) # iterate through all file for file in sorte

Binning 2D data with circles instead of rectangles - from pandas df

I have a dataframe of x, y data and need to bin it into circles, i.e. a grid of circles of certain size and spacing centered on some point. So for example some da

How Do I Upload External Data in Explainerdashboard

I am trying to upload external data into the dashboard using explainer.set_x_row_func() and explainer.set_y_func(). Does anyone know how to do this? Below is ho

Pandas merge returns NaN values

Please consider two pandas dataframes, df1 and df2: df1 = pd.read_csv('df1.csv', sep=';') df2 = pd.read_csv('df2.csv', sep=';') We convert to date fields: df1['

Add a new record for each missing second in a DataFrame with TimeStamp [duplicate]

Given the following Pandas DataFrame: | date | counter | |-------------------------------------|--------------

Replacing negative values in specific columns of a dataframe

This is driving me crazy! I want to replace all negative values in columns containing string "_p" with the value multiplied by -0.5. Here is the code, where Tdf
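
A minimal sketch of one way to do the replacement described in this excerpt, assuming a small made-up frame: the name `df` and the column names `a_p`, `b_p`, and `other` are hypothetical stand-ins, not the asker's data. It uses `DataFrame.where`, which keeps values where the condition holds and substitutes the alternative elsewhere.

```python
import pandas as pd

# Hypothetical data; column names are assumptions for illustration only.
df = pd.DataFrame({
    "a_p": [1.0, -2.0, 3.0],
    "b_p": [-4.0, 5.0, -6.0],
    "other": [-1.0, -1.0, 1.0],
})

# Select the columns whose name contains the string "_p".
p_cols = [c for c in df.columns if "_p" in c]

# Keep non-negative values; replace negative ones with value * -0.5.
df[p_cols] = df[p_cols].where(df[p_cols] >= 0, df[p_cols] * -0.5)

print(df)
```

Columns without "_p" in their name (here `other`) are left untouched, including their negative values.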

Comparing 2 columns with different rows in different csv files, and output status to another csv file

I have 2 csv files as shown below. They contain different numbers of rows and the columns are not aligned/sorted along a common index. I need to compare the col

%>% .$column_name equivalent for R base pipe |>

I frequently use the dplyr piping to get a column from a tibble into a vector as below iris %>% .$Sepal.Length iris %>% .$Sepal.Length %>% cut(5) How

Is there any difference between a Python script run in Airflow and the same script run in plain Python?

I was writing the below code but it runs endlessly in Airflow, while on my system it takes 5 min to run: gc=pygsheets.authorize(service_account_file='file.json'

Get statistics for each group (such as count, mean, etc) using pandas GroupBy?

I have a data frame df and I use several columns from it to groupby: df[['col1','col2','col3','col4']].groupby(['col1','col2']).mean() In the above way I almos
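
A minimal sketch of per-group statistics with `GroupBy.agg`, assuming a small made-up frame. The column names follow the excerpt, but the data and the name `stats` are hypothetical, and this is only one possible approach, not necessarily what the asker needs.

```python
import pandas as pd

# Hypothetical data; only the column names come from the excerpt.
df = pd.DataFrame({
    "col1": ["a", "a", "b", "b", "b"],
    "col2": ["x", "x", "y", "y", "z"],
    "col3": [1.0, 3.0, 2.0, 4.0, 6.0],
})

# Compute several statistics per (col1, col2) group in one pass.
stats = (
    df.groupby(["col1", "col2"])["col3"]
    .agg(["count", "mean"])
    .reset_index()
)

print(stats)
```

Passing a list of function names to `agg` yields one output column per statistic, and `reset_index` turns the group keys back into ordinary columns.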

dtale show in jupyter notebook

I am exploring this new Python package named dtale. It is very convenient for pandas data frames visualization. https://pypi.org/project/dtale/ It worked onc

Pyspark Window function on entire data frame

Consider a pyspark data frame. I would like to summarize the entire data frame, per column, and append the result for every row. +-----+----------+-----------+

How to check if a value in a DataFrame is of type Decimal

I am writing a data test for some api calls that return a DataFrame with a date type and a type Decimal. I can't find a way to verify the Decimal the DataFrame