Category "pandas"

My text classifier model doesn't improve with multiple classes

I'm trying to train a model for text classification, and the model takes a list of at most 300 integers embedded from articles. The model trains without problem

Oracle Cube Replica in Python Pandas

Oracle has a CUBE function for the GROUP BY clause that takes 2 or more columns and groups the result set by all possible combinations of the columns passed to the cub

Delete row based on value in any column of the dataframe

There are several posts on how to drop rows if one column in a dataframe holds a certain undesired string, but I am struggling with how to do that if I have to

Re-evaluate data types in Pandas columns

Sorry if this question is a duplicate! I have a DataFrame like 0 1 2 3 4 0 1 33 40 75 73 45 2 46 59 40 53 17 3 43

Matplotlib and Pandas Plotting amount of numbers in certain range

I have a pandas DataFrame that looks like this: I would like to create this kind of plot for every year [1...10] with the Score range of [1...10]. This means that

Rename Pandas columns with inplace

I want to rename a few columns. When I set inplace = False, I am able to view the result. Why does setting inplace = True result in None? (I have refreshed the runtime

Python Pandas: how to combine trip segments into a journey with transport smart card data

Currently working with an interesting transport smart card dataset. Each line in the current data represents a trip (e.g. bus trip from A to B). Any trips within

How do I extract integer values from PHD using pyodbc?

I am trying to connect to a Honeywell PHD server using Python 3.x and extract data. I am connecting with this syntax: import pyodbc import pandas as pd import n

Adjust the size of folium popups

Site_Number Site_Description Region_Site Latitude Longitude S1_AverageSpeed S1_85thSpeed S2_AverageSpeed S2_85thSpeed S3_AverageSpeed S3_85thSpeed String_for_P

Pandas Rolling window to calculate sum of the same items of the last n days

Following up with this question, now I would like to calculate the sum/mean of a different column given the same grouping on a rolling window. Here is the code

returning dataframe as dictionary, can't get right format

I'm using Python to try and return a dataset after transforming it a bit. Currently, it works like this, with the data it's returning. [ { "index":

Select two sets of columns by column names in Pandas

Take the DataFrame in the answer of Loc vs. iloc vs. ix vs. at vs. iat? for example. df = pd.DataFrame( {'age':[30, 2, 12, 4, 32, 33, 69], 'color':['blue', 'g

Calculating Drawdown in Pandas

I have the following DataFrame: Profit Cumulative Date 1/6/2005 248.8500 248.85 1/12/2005 48.3500

pandas dividing each row by its total

I have this df: Name num1 num2 num3 A 1 2 3 B 4 5 6 C 7 8 9 My goal is to divide each row

How to query a numerical column name in pandas?

Let's suppose I create a DataFrame with columns and query it, i.e. pd.DataFrame([[1,2],[3,4],[5,6]],columns=['a','b']).query('a>1') This will give me a b

Get for each row the last column name with a certain value

I have this kind of DataFrame, and I'm looking to get, for each row, the name of the last column equal to 1. Here is an example of my DataFrame: col1 col2

Best way to store many pandas dataframes in a single file

I have 20,000 ~1000-row dataframes, each of which has a name, in a 170GB pickle file at the moment. I'd like to write these to a file so I can load them i

How to select several rows when reading a csv file using pandas?

I have a very large CSV file with millions of rows and a list of the row numbers that I need, like rownumberList = [1,2,5,6,8,9,20,22]. I know there is somethi

Plotting Time Series using pandas

I have a .csv file containing time series data with headers like Description, Date and Values. I am looking to make a line graph for this time series in such th