Category "pandas"

How to count the same rows between multiple CSV files in Pandas?

I merged 3 different CSV (D1, D2, D3) Netflow datasets into one big dataset (df) and applied KMeans clustering to it. To merge them I did not use

Efficient way to unnest (explode) multiple list columns in a pandas DataFrame

I am reading multiple JSON objects into one DataFrame. The problem is that some of the columns are lists. Also, the data is very big and because of that I canno

Converting a SAS macro to Python with pandas?

I'm converting a program of SAS code into a Python equivalent. One section that I'm struggling with is how to convert a macro program in SAS when the variables

How to append a DataFrame to an existing Excel file that contains formulas

If you have an Excel file that contains formulas and want to append a DataFrame to it with Python, how do you do it? I used this code but did not get any output: mypath="C:\\Users\\egoyrat\\

pandas wide_to_long returning an empty DataFrame

I was working on a pretty simple task: applying wide_to_long to a DataFrame, but every time I ran it, I got an empty DataFrame. I was almost sure I was doing it

Get CSV from Google Drive and then load to pandas

My goal is to read a .csv file from Google Drive and load it into a DataFrame. I tried some answers here, but the file is not public and needs authen

ValueError: Incompatible indexer with Series while adding a date to a DataFrame

I am new to Python and I can't figure out why I get this error: ValueError: Incompatible indexer with Series. I am trying to add a date to my data frame. The da

Problem when importing statsmodels.regression.rolling (AttributeError: 'pandas._libs.properties.CachedProperty' object has no attribute 'func')

When I run the below code: from statsmodels.regression import rolling I get this error message: AttributeError Traceback (most recen

Django - Create downloadable Excel file using Pandas & Class Based View

I'm relatively new to Django and have been looking for a way to export my DataFrame to Excel using Pandas and CBV. I have found this code: from django.http impo

Update patch edge colours in Geopandas plot

I've plotted a GeoDataFrame as a choropleth using the following code (geopandas 0.2.1, matplotlib 2.0.2, in a Jupyter notebook, using %inline): fig, ax = plt.su

How do I subset the columns of a dataframe based on the index of another dataframe?

The rows of clin.index (row length = 81) are a subset of the columns of common_mrna (col length = 151). I want to keep the columns of common_mrna only if the col

Using categorical variables in statsmodels OLS class

I want to use statsmodels OLS class to create a multiple regression model. Consider the following dataset: import statsmodels.api as sm import pandas as pd im

Merging two dataframes without losing data

I have two dataframes: df_1 = Material TypeOf 4100 N200 4101 M200 4200 M200 4500 N200 .

Integer out of range when inserting a large number of rows into Postgres

I have tried multiple solutions and workarounds for this issue, but I am probably still missing something. I want to insert a list of values into my database.

Retrieve only months with at least 28 sample days - pandas DataFrame

Hello to the people of the web, I have a dataframe with 'DATE' (datetime) as the index and TMAX as a column of values. What I'm trying to do is

Randomly split a dataframe into groups with an even distribution of values

I have a dataframe of two groups (A and B) and within those groups, 6 subgroups (a, b, c, d, e, and f). Example data below: index group subgroup value 0

Scraping multi-page PDF files from a URL

I want to scrape the information on this PDF in Python. I'm not sure where to start because it isn't organized at all. I'm used to scraping HTML. I tried conver

How do I reorder a long string of concatenated dates and timestamps separated by commas using Python?

I have a string type column called 'datetimes' that contains multiple dates with their timestamps, and I'm trying to extract the earliest and last dates (withou

How to create variables based on column names in dataframe?

I wanted to create variables in Python based on the column names of my dataframe. Not sure if this is possible as I am quite new to Python. Let's say my df looks