Category "pandas"

Calculate the pair-wise correlation between distinct class pairs over two feature columns and the target variable?

Most similar questions relating to calculating this involve a single correlation value for each feature column, showing how the features in a dataset correlate

Having trouble expanding/normalizing a dataframe column of dictionary values into a dataframe/other columns

I'm trying to expand a dataframe column of dictionaries into its own dataframe/other columns. I have already tried using json_normalize, iteration, and list c

Split / Explode a column of dictionaries into separate columns with pandas

I have data saved in a PostgreSQL database. I am querying this data using Python 2.7 and turning it into a Pandas DataFrame. However, the last column of this dat

Why does Dask take a long time to compute regardless of the size of the dataframe?

Why does a Dask dataframe take a long time to compute regardless of the size of the dataframe? How can this be avoided? What is the reason beh

DF with values for Time Intervals

I am trying to make a manual dataframe. I would like to have a time stamp with a time interval, for example: df1: Time Interval Price 10:00 - 11:00 $15 11:00

Using pandas to replace the header of a dataframe

I have an XYZ file in the following format X[m] Y[m] DensD_1200c[m] 625268.27 234978.67 7.24 625268.34 234978.52 7.24 625268.38

Pandas rolling average of a column of dates

I'm trying to calculate the rolling average of a column of datetime objects. In my scenario, the input data are the last day below freezing each year for ~100 y

How to iterate over rows of each column in a dataframe

My current code functions and produces a graph if there is only 1 sensor, i.e. if col2 and col3 are deleted in the example data provided below, leaving one col

geopandas doesn't find point in polygon even though it should?

I have some lat/long coordinates and need to confirm if they are within the city of Atlanta, GA. I'm testing it out but it doesn't seem to work. I got a geojson f

Google Colab: read an .xlsx file from GitHub with pandas

From Google Colab, I am trying to create a df from an xlsx file I have in a GitHub repo. As the URL I take the permalink from GitHub; the repo is public and account

How to plot some datasets in pandas based on different thresholds in python

I have a data frame that has 3 columns and I want to plot a line graph based on some thresholds. Here is the data frame date income ratio 0 2022-0

Classify DataFrame rows based on first matching condition

I have a pandas DataFrame; each column represents a quarter, with the most recent quarters placed to the right. Not all the information arrives at the same time, so

How to export Pandas DataFrame to HTML but without any formatting?

I want to export a DF with Pandas to an HTML formatted table, but I don't want any of the default styling that Pandas does to its tables, and would prefer just

Renaming identical column names in Pandas [duplicate]

I have a cycle_2 df with the following column names: 3ls 3rs 3ls 3rs 3 absolute_cost 3.00 9.40 9.40

Parse Year Week columns to Date

I have a data frame with columns Year and Week that I am trying to parse to date into a new column called Date. import datetime df['Date']=datetime.datetime.fro

How can I apply multiple conditions in Pandas, with Python?

How can I apply multiple conditions in pandas? For example I have this dataframe Country VAT RO RO1449488 RO RO1449489 RO RO1449486

How to transform TfidfVectorizer() outputs in dataframes

I found this answer about the model and specific outputs (How to get top n terms with highest tf-idf score - Big sparse matrix). It was great. I would like to k

Is there a method to split multi-line cells into separate rows?

I have a data frame (as shown in the image link); some cells contain multiple lines and also have an unequal number of values. How can I split an

How to import a .sql file into DuckDB database?

I'm exploring DuckDB for one of my projects. Here I have a sample database file downloaded from https://www.wiley.com/en-us/SQL+for+Data+Scientists%3A+A+Beginner

Pandas - Key Exception - Length mismatch: sometimes Expected axis has 3 elements, sometimes 2 elements

I have built a script to update stock values from Yahoo Finance with pandas. Sometimes the script works fine, but at some point it gets an error: Key Exception