Category "pandas"

Get DataFrame with the number of rows for each time interval

Given the following DataFrame of pandas in Python: | ID | date | |--------------|------------------------------------

Splitting dataframe into multiple dataframes

I have a very large dataframe (around 1 million rows) with data from an experiment (60 respondents). I would like to split the dataframe into 60 dataframes (a d

How to use a df column in a vertica_python SQL query?

I have a dataframe with names that I set to a dictionary, like this: {1: "Bob", 41: "John", 126: "Jim", 167: "Pete"} I am using Vertica. I want to be able to p

df.isna().sum() is not working on titanic dataset

I tried the Titanic model on Kaggle, and it is weird that isna().sum() outputs wrong information. import os import pandas as pd import numpy as np import statsmode

Not able to see all the methods under dt accessor in Jupyter notebook

Maybe a silly question. I have been trying to use the dt accessor in pandas to use datetime methods on certain date fields in my DataFrame. Not sure why, but the a

How to name the column when using the value_counts function in pandas?

I was counting the number of occurrences of angle and dist with the code below: g = new_df.value_counts(subset=['Current_Angle','Current_dist'], sort=False) the out

How to select values from pandas dataframe by column value

I am doing an analysis of a dataset with 6 classes, zero based. The dataset is many thousands of items long. I need two dataframes with classes 0 & 1 fo

How to add tags when uploading to S3 from pandas?

Pandas lets you pass an AWS S3 path directly to .to_csv() and .to_parquet(). There's a storage_options argument for passing S3 specific arguments. I would like

Removing [' and '] from CSV

I have several GB of CSV files where values in one of the columns look like this: Which is a consequence of this: urls.append(re.findall(r'http\S+', hashtags_r

Python Pandas Error tokenizing data

I'm trying to use pandas to manipulate a .csv file but I get this error: pandas.parser.CParserError: Error tokenizing data. C error: Expected 2 fields in li

string split with expand=True. Can anyone explain what is the meaning?

all_data['Title']= all_data['Name'].str.split(', ', expand=True)[1].str.split('.', expand=True)[0] Can anyone explain what is the meaning of this line of code?
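
A minimal sketch of what that chained expression does, assuming the Name column holds strings shaped like "Surname, Title. Given names" (the sample values below are invented for illustration). With expand=True, str.split returns a DataFrame with one column per piece instead of a Series of lists, so [1] and [0] select columns, not list elements.

```python
import pandas as pd

# Hypothetical sample data; the real all_data['Name'] column is not shown in the question.
all_data = pd.DataFrame({"Name": ["Smith, Mr. John", "Doe, Mrs. Jane"]})

# Step 1: split on ", " and expand into columns: column 0 = surname, column 1 = the rest.
after_comma = all_data["Name"].str.split(", ", expand=True)[1]

# Step 2: split the rest on "." and keep column 0, which is the title.
all_data["Title"] = after_comma.str.split(".", expand=True)[0]

print(all_data)
```

Splitting the chain into two named steps like this makes it easier to inspect each intermediate result.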

Converting pandas.DataFrame to bytes

I need to convert the data stored in a pandas.DataFrame into a byte string where each column can have a separate data type (integer or floating point). Here is a

How to check that None is not passed as an argument where a pandas DataFrame is expected

I have a function which looks like below. def some_func(df:pd.Dataframe=pd.Dataframe()): if not df or df.empty: //some dataframe operations I want to ens
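
One possible way to guard against None, sketched under the assumption that the function should simply skip its work when no usable frame is given. The return value (a row count) is a placeholder invented for this example. Note that `not df` raises a ValueError for a DataFrame because its truth value is ambiguous, so the sketch tests `df is None` explicitly.

```python
from typing import Optional

import pandas as pd


def some_func(df: Optional[pd.DataFrame] = None) -> int:
    """Return the number of rows processed (placeholder behaviour for this sketch)."""
    # Check for None first; only then is it safe to touch DataFrame attributes.
    if df is None or df.empty:
        return 0
    # ... the real dataframe operations would go here ...
    return len(df)


print(some_func())                              # no argument
print(some_func(None))                          # explicit None
print(some_func(pd.DataFrame({"a": [1, 2]})))   # a real frame
```

Using None as the default also avoids creating a shared DataFrame object at function definition time.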

Pandas read json ValueError: Protocol not known

I ran this code a while ago and it worked, but now there is a ValueError: Protocol not known. Could anyone help? Thanks. import json temp = json.dumps([status.

How to create a dictionary of two pandas DataFrame columns

What is the most efficient way to organise the following pandas Dataframe: data = Position Letter 1 a 2 b 3 c 4 d 5

I want to speed up a nested loop when creating a df keyword counts (keywords appearing with other keywords)

Using pandas, I have a df that is 14000 rows by 56 columns (keywords). I have a keyword list (full_keys) that is 1406 items and an empty (0) dataframe (called key

Search for "does-not-contain" on a DataFrame in pandas

I've done some searching and can't figure out how to filter a dataframe by df["col"].str.contains(word) however I'm wondering if there is a way to do the rever
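
A short sketch of one common approach: build the boolean mask with str.contains and invert it with the ~ operator. The column values and the search word are made up for illustration, and na=False is an assumption about how missing values should be treated (here they count as "does not contain").

```python
import pandas as pd

# Hypothetical data; the question's actual DataFrame is not shown.
df = pd.DataFrame({"col": ["apple pie", "banana", "apple tart", None]})
word = "apple"

# True where the string contains the word; missing values are treated as False.
mask = df["col"].str.contains(word, na=False)

# ~ flips the boolean mask, keeping only rows that do NOT contain the word.
without_word = df[~mask]

print(without_word)
```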

Python / Bokeh / Pandas AttributeError: unexpected attribute 'responsive' to Figure

I'm trying to use bokeh and pandas to create a graph. If ", responsive = True" is not included, the code works. If it is included, it doesn't work. Any sugge

Python pandas: fill a dataframe row by row

The simple task of adding a row to a pandas.DataFrame object seems to be hard to accomplish. There are 3 stackoverflow questions relating to this, none of which