Category "pandas"

Re-evaluate data types in Pandas columns

Sorry if this question is a duplicate. I have a DataFrame like 0 1 2 3 4 0 1 33 40 75 73 45 2 46 59 40 53 17 3 43

Matplotlib and Pandas Plotting amount of numbers in certain range

I have a pandas DataFrame that looks like this: I would like to create this kind of plot for every year [1...10] with the Score range of [1...10]. This means that

Rename Pandas columns with inplace

I want to rename a few columns. When I set inplace = False, I am able to view the result. Why does setting inplace = True result in None? (I have refreshed the runtime
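The behaviour described in this excerpt can be reproduced with a minimal sketch. The DataFrame and column names below are made up for illustration and are not taken from the original question; the point is only that `rename` returns a new DataFrame when `inplace=False` and returns `None` when `inplace=True`, because the change is applied to the original object instead.

```python
import pandas as pd

# Hypothetical example data, not from the original question
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# inplace=False (the default): df is untouched, a renamed copy is returned
renamed = df.rename(columns={"a": "x"}, inplace=False)

# inplace=True: df itself is modified and the call returns None
result = df.rename(columns={"b": "y"}, inplace=True)

print(list(renamed.columns))  # ['x', 'b']
print(result)                 # None
print(list(df.columns))       # ['a', 'y']
```

So assigning the result of an `inplace=True` call to a variable, or displaying it directly, shows `None`; the renamed columns are visible by inspecting `df` afterwards.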

Python Pandas : how to combine trip segments into a journey with Transport smart card data

Currently working with an interesting transport smart card dataset. Each line in the current data represents a trip (e.g. bus trip from A to B). Any trips within

How do I extract integer values from PHD using pyodbc?

I am trying to connect to a Honeywell PHD server using Python 3.x and extract data. I am connecting with this syntax: import pyodbc import pandas as pd import n

Adjust the size of folium popups

Site_Number Site_Description Region_Site Latitude Longitude S1_AverageSpeed S1_85thSpeed S2_AverageSpeed S2_85thSpeed S3_AverageSpeed S3_85thSpeed String_for_P

Pandas Rolling window to calculate sum of the same items of the last n days

Following up on this question, now I would like to calculate the sum/mean of a different column given the same grouping on a rolling window. Here is the code

returning dataframe as dictionary, can't get right format

I'm using Python to try and return a dataset after transforming it a bit. Currently, it works like this, with the data it's returning. [ { "index":

Select two sets of columns by column names in Pandas

Take the DataFrame in the answer of Loc vs. iloc vs. ix vs. at vs. iat? for example. df = pd.DataFrame( {'age':[30, 2, 12, 4, 32, 33, 69], 'color':['blue', 'g

Calculating Drawdown in Pandas

I have the following DataFrame: Profit Cumulative Date 1/6/2005 248.8500 248.85 1/12/2005 48.3500

pandas dividing rows by its total

I have this df: Name num1 num2 num3 A 1 2 3 B 4 5 6 C 7 8 9 My goal is to divide each row
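One possible reading of this question, going by its title, is that each numeric value should be divided by the total of its own row. The sketch below rebuilds the small frame shown in the excerpt and assumes `Name` is meant as a label rather than a value to divide; that assumption is mine, since the excerpt is cut off.

```python
import pandas as pd

# Data as shown in the excerpt
df = pd.DataFrame(
    {
        "Name": ["A", "B", "C"],
        "num1": [1, 4, 7],
        "num2": [2, 5, 8],
        "num3": [3, 6, 9],
    }
)

# Assumption: treat Name as the row label so only numeric columns remain
values = df.set_index("Name")

# Divide every row by that row's sum; axis=0 aligns the divisor on the index
shares = values.div(values.sum(axis=1), axis=0)

print(shares)
```

Each row of `shares` then sums to 1, which is a quick way to check the result.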

How to query a numerical column name in pandas?

Let's suppose I create a DataFrame with columns and query it, i.e. pd.DataFrame([[1,2],[3,4],[5,6]],columns=['a','b']).query('a>1') This will give me a b

Get for each row the last column name with a certain value

I have this kind of dataframe, and I'm looking to get, for each row, the name of the last column whose value equals 1. Here is an example of my dataframe: col1 col2

Best way to store many pandas dataframes in a single file

I have 20,000 ~1000-row dataframes, each of which has a name, in a 170GB pickle file at the moment. I'd like to write these to a file so I can load them i

How to select several rows when reading a csv file using pandas?

I have a very large csv file with millions of rows and a list of the row numbers that I need, like rownumberList = [1,2,5,6,8,9,20,22]. I know there is somethi

Plotting Time Series using pandas

I have a .csv file containing time series data with headers like Description, Date and Values. I am looking to make a line graph for this time series in such th

How to extract only English words from a big text corpus using nltk?

I want to remove all non-dictionary English words from a text corpus. I have removed stopwords, tokenized, and count-vectorized the data. I need to extract only the E

check if timestamp column is in date range from another dataframe

I have a dataframe, df_A, with two columns 'amin' and 'amax', which define a set of time ranges. My objective is to find whether a column in df_B lies between any o