Category "pandas"

geopandas doesn't find point in polygon even though it should?

I have some lat/long coordinates and need to confirm whether they are within the city of Atlanta, GA. I'm testing it out but it doesn't seem to work. I got a geojson f

Google Colab: Read .xlsx file from GitHub with pandas

From Google Colab, I am trying to create a df from an xlsx file I have in a GitHub repo. As the URL I take the permalink from GitHub, the repo is public and account

How to plot some datasets in pandas based on different thresholds in python

I have a data frame that has 3 columns and I want to plot a line graph based on some thresholds. Here is the data frame date income ratio 0 2022-0

Classify DataFrame rows based on first matching condition

I have a pandas DataFrame in which each column represents a quarter, with the most recent quarters placed to the right. Not all the information arrives at the same time, so

How to export Pandas DataFrame to HTML but without any formatting?

I want to export a DF with Pandas to an HTML formatted table, but I don't want any of the default styling that Pandas applies to its tables, and would prefer just

Renaming identical column names in Pandas [duplicate]

I have a cycle_2 df with the following column names: 3ls 3rs 3ls 3rs 3 absolute_cost 3.00 9.40 9.40
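
One way to approach the duplicate-name problem in this question is to suffix each repeated name with a counter. The sketch below is a minimal illustration only: the helper name `dedupe_names` is hypothetical, and the sample list is an assumption based on the visible part of the excerpt.

```python
from collections import Counter


def dedupe_names(names):
    """Return names with repeats suffixed _1, _2, ... in order of appearance."""
    seen = Counter()
    result = []
    for name in names:
        count = seen[name]
        # The first occurrence keeps its name; later ones get a numeric suffix.
        result.append(name if count == 0 else f"{name}_{count}")
        seen[name] += 1
    return result


# Sample names assumed from the visible part of the excerpt.
columns = ["3ls", "3rs", "3ls", "3rs"]
print(dedupe_names(columns))  # ['3ls', '3rs', '3ls_1', '3rs_1']
```

With a DataFrame, the result could be assigned back with `df.columns = dedupe_names(df.columns)`. Note that this sketch does not guard against a suffixed name colliding with an existing column.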

Parse Year Week columns to Date

I have a data frame with columns Year and Week that I am trying to parse to date into a new column called Date. import datetime df['Date']=datetime.datetime.fro
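
A year/week pair can be turned into a date with the standard library, assuming the weeks follow ISO 8601 numbering (the excerpt does not say which convention applies). The helper name `year_week_to_date` and the choice of Monday as the default day are assumptions made for illustration.

```python
import datetime


def year_week_to_date(year, week, weekday=1):
    """Return the date for an ISO year/week; weekday 1 is Monday, 7 is Sunday."""
    # date.fromisocalendar is available from Python 3.8 onward.
    return datetime.date.fromisocalendar(int(year), int(week), weekday)


print(year_week_to_date(2022, 1))  # 2022-01-03, the Monday of ISO week 1
```

For a DataFrame with `Year` and `Week` columns, one option is `df['Date'] = [year_week_to_date(y, w) for y, w in zip(df['Year'], df['Week'])]`.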

How can I apply multiple conditions in Pandas, with Python?

How can I apply multiple conditions in pandas? For example I have this dataframe Country VAT RO RO1449488 RO RO1449489 RO RO1449486

How to transform TfidfVectorizer() outputs in dataframes

I found this answer about the model and specific outputs (How to get top n terms with highest tf-idf score - Big sparse matrix). It was great. I would like to k

Is there a method to split multiline cells into separate rows

I have a data frame (as shown in the image link). Some cells contain multiple lines and also have unequal numbers of values. How can I split an

How to import a .sql file into DuckDB database?

I'm exploring DuckDB for one of my projects. Here I have a sample database file downloaded from https://www.wiley.com/en-us/SQL+for+Data+Scientists%3A+A+Beginner

Pandas - Key Exception - Length mismatch: sometimes Expected axis has 3 elements, sometimes 2 elements

I have built a script to update stock values from yahoo finance with pandas. Sometimes the script works fine, but at some point it gets an error: Key Exception

Polynomial Expansion without sklearn

I want to try and recreate this function from scratch (without using sklearn): # The matrix is M which is 1000x10 matrix. from sklearn.preprocessing import Po
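
The sklearn import in this excerpt is cut off, so the following is only a generic sketch of a polynomial feature expansion written with the standard library. The function name `polynomial_expand`, its parameters, and the default degree of 2 are assumptions, not taken from the question.

```python
from itertools import combinations_with_replacement
from math import prod


def polynomial_expand(rows, degree=2, include_bias=True):
    """Expand each row into all monomials of its entries up to `degree`."""
    expanded = []
    for row in rows:
        # Optional constant term, followed by monomials grouped by total degree.
        features = [1] if include_bias else []
        for d in range(1, degree + 1):
            for combo in combinations_with_replacement(row, d):
                features.append(prod(combo))
        expanded.append(features)
    return expanded


print(polynomial_expand([[2, 3]], degree=2))  # [[1, 2, 3, 4, 6, 9]]
```

For a row `[a, b]` and degree 2 this yields `[1, a, b, a*a, a*b, b*b]`. A row with 10 entries would expand to 66 features at degree 2 (1 constant, 10 linear, 55 quadratic).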

Problem with websocket output into dataframe with pandas

I have a websocket connection to binance in my script. The websocket runs forever as usual. I got each pair's output as separate outputs for my multiple stream

Change labels on facet plots

I currently have a plot with 6 facets labeled 1 to 6 in pandas. I wish to change these labels to road type if possible (Motorway, A Road, B Road etc.). The code I

I am trying to merge two dataframes

I have this dataframe firm formtype Date_Filed GameStop Corp. 8-K 2021-04-01 I want to change the Date_Filed to 2021-04-01 00:00:00. I am using

pandas.query in a chain not giving expected results

I have a pandas dataframe that looks like this: df = pd.DataFrame( { "ID": [1, 2, 3, 4], "Name": ["Alpha", "Beta", "Gamma", "Delta"],

Calculations on a pandas DataFrame column conditional on another column

I notice several 'set value of new column based on value of another'-type questions, but from what I gather, I have not found that they address dividing values

Type error for a column that exists within the dataframe I am trying to call

Essentially, I am getting a key error in my jupyter notebook when trying to merge two data frames. As I understand it, a key error will only occur if said colum

DataFrame challenge: mapping ID to value in different row. Preferably with Polars

Consider this example: import polars as pl df = pl.DataFrame({ 'ID': ['0', '1', '2', '3', '4', '5','6', '7', '8', '9', '10'], 'Name' : ['A','','','','B