I tried the Titanic model on Kaggle, and it is weird that isna().sum() outputs wrong information. import os import pandas as pd import numpy as np import statsmode
Maybe a silly question. I have been trying to use the dt accessor in pandas to use datetime methods on certain date fields in my DataFrame. Not sure why, but the a
I was counting the number of occurrences of angle and dist with the code below: g = new_df.value_counts(subset=['Current_Angle','Current_dist'], sort=False) the out
I am doing an analysis of a dataset with 6 classes, zero based. The dataset is many thousands of items long. I need two dataframes with classes 0 & 1 fo
Pandas lets you pass an AWS S3 path directly to .to_csv() and .to_parquet(). There's a storage_options argument for passing S3 specific arguments. I would like
I have several GB of CSV files where values in one of the columns look like this: Which is a consequence of this: urls.append(re.findall(r'http\S+', hashtags_r
I'm trying to use pandas to manipulate a .csv file but I get this error: pandas.parser.CParserError: Error tokenizing data. C error: Expected 2 fields in li
all_data['Title']= all_data['Name'].str.split(', ', expand=True)[1].str.split('.', expand=True)[0] Can anyone explain what is the meaning of this line of code?
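The chained call above can be unpacked step by step. This is a minimal sketch, assuming the Name column follows the usual Titanic "Surname, Title. Given names" layout; the two sample names are hypothetical stand-ins, not rows quoted from the question.

```python
import pandas as pd

# Hypothetical sample rows in the "Surname, Title. Given names" layout.
all_data = pd.DataFrame(
    {"Name": ["Braund, Mr. Owen Harris", "Heikkinen, Miss. Laina"]}
)

# Step 1: split on ", " into separate columns and keep column 1,
# i.e. everything after the surname ("Mr. Owen Harris").
after_comma = all_data["Name"].str.split(", ", expand=True)[1]

# Step 2: split that on "." and keep column 0, the text before the
# first period ("Mr"). A one-character pattern is treated literally.
all_data["Title"] = after_comma.str.split(".", expand=True)[0]

print(all_data)
```

With expand=True each split returns a DataFrame with one column per piece, which is why the pieces are picked with [1] and [0].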
I need to convert the data stored in a pandas.DataFrame into a byte string where each column can have a separate data type (integer or floating point). Here is a
I have a function which looks like below. def some_func(df:pd.Dataframe=pd.Dataframe()): if not df or df.empty: //some dataframe operations I want to ens
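One possible way to write that guard, as a sketch only: the excerpt is cut off, so it is an assumption here that the goal is to skip the work when no DataFrame or an empty one is passed. Two details of the snippet are worth noting: the class is spelled pd.DataFrame, and `not df` on a DataFrame raises a ValueError because its truth value is ambiguous. The return strings below are hypothetical placeholders.

```python
from typing import Optional

import pandas as pd


def some_func(df: Optional[pd.DataFrame] = None) -> str:
    # Use None as the default instead of a shared pd.DataFrame() instance,
    # and test for it explicitly; `not df` would raise ValueError.
    if df is None or df.empty:
        return "nothing to do"  # placeholder for the "empty" branch
    return f"{len(df)} rows"    # placeholder for the real operations


print(some_func())
print(some_func(pd.DataFrame({"a": [1, 2]})))
```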
I ran this code a while ago and it worked, but now it raises ValueError: protocol not known. Could anyone help? Thanks. import json temp = json.dumps([status.
What is the most efficient way to organise the following pandas Dataframe: data = Position Letter 1 a 2 b 3 c 4 d 5
Using Pandas I have a df that is 14000 rows by 56 columns (keywords) I have a keyword list (full_keys) that is 1406 items and an empty (0) dataframe (called key
I've done some searching and can't figure out how to filter a dataframe by df["col"].str.contains(word) however I'm wondering if there is a way to do the rever
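For the reverse filter, a boolean mask can be negated with the ~ operator. A minimal sketch, where the column values and the word are made-up examples:

```python
import pandas as pd

# Hypothetical data; "col" and word mirror the names in the question.
df = pd.DataFrame({"col": ["apple pie", "banana", "apple tart", "cherry"]})
word = "apple"

# na=False makes missing values count as "does not contain",
# so the mask stays strictly boolean and can be negated safely.
mask = df["col"].str.contains(word, regex=False, na=False)

# ~mask flips True/False, keeping only rows that do NOT contain the word.
without_word = df[~mask]
print(without_word)
```

Note that with na=False, rows whose value is missing end up in the negated result; whether that is wanted depends on the data.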
I'm trying to use bokeh and pandas to create a graph. If ", responsive = True" is not included, the code works. If it is included, it doesn't work. Any sugge
The simple task of adding a row to a pandas.DataFrame object seems to be hard to accomplish. There are 3 Stack Overflow questions relating to this, none of which
I have 2 data frames with identical columns. Column 'key' will have unique values. Data frame 1:- A B key C 0 1 k1 2 1 2 k2 3 2 3 k3 5 Data f
I want to read in a very large csv (cannot be opened in excel and edited easily) but somewhere around the 100,000th row, there is a row with one extra column ca
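If the aim is simply to skip the malformed row, read_csv can be told to drop lines with too many fields. This is a sketch under that assumption (the excerpt is truncated), using a tiny in-memory CSV as a hypothetical stand-in for the large file. The on_bad_lines parameter needs pandas 1.3 or newer; older versions used error_bad_lines=False instead.

```python
import io

import pandas as pd

# Hypothetical stand-in for the large file: the third line has one
# extra field compared with the two-column header.
raw = "a,b\n1,2\n3,4,5\n6,7\n"

# on_bad_lines="skip" silently drops rows with too many fields
# instead of raising a tokenizing error.
df = pd.read_csv(io.StringIO(raw), on_bad_lines="skip")
print(df)
```

Passing on_bad_lines="warn" instead reports each skipped line, which helps confirm that only the expected row was dropped.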
Having a series like this: ds = Series({'wikipedia':10,'wikimedia':22,'wikitravel':33,'google':40}) google 40 wikimedia 22 wikipedia 10 wikitra
This code generates the error IndexError: invalid index to scalar variable. at the line: results.append(RMSPE(np.expm1(y_train[testcv]), [y[1] for y in y_test])