Category "pandas"

pandas: most elegant way to pivot table on pattern in name of columns

Given the following DataFrame: pd.DataFrame({ 'x': [0, 1], 'y': [0, 1], 'a_idx': [0, 1], 'a_val': [2, 3], 'b_idx': [4, 5], 'b_val': [6, 7], }) What

Remove non-business days rows from pandas dataframe

I have a dataframe with time series data for wheat in df. df = wt["WHEAT_USD"] 2016-05-02 02:00:00+02:00 4.780 2016-05-02 02:01:00+02:00 4.777 2016-05-02

Problems while plotting time series against user logins?

I have a large pandas dataframe, which is a log of user ids that log in to a website: id datetime 130 2018-05-17 19:46:18 133 2018-05-17 20:5

Converting Pandas dataframe into Spark dataframe error

I'm trying to convert Pandas DF into Spark one. DF head: 10000001,1,0,1,12:35,OK,10002,1,0,9,f,NA,24,24,0,3,9,0,0,1,1,0,0,4,543 10000001,2,0,1,12:36,OK,10002,1

pandas select range from index column

I need to make a function to select a range of the index (first col). 1880 Aachen 1 Valid L5 21.0 Fell 50.77500 6.08333 (50.775000, 6.083330)

Melt and pivot a dataframe in Python?

I'm working with a publicly available election data set that I've imported into Pandas as a df: fips_code county total_2008 dem_2008 gop

Split cell into multiple rows in pandas dataframe

I have a dataframe that contains order data. Each order has multiple packages stored as comma-separated strings in the [package & package_code] columns. I want to split

Remove rows that contain False in a column of pandas dataframe

I assume this is an easy fix and I'm not sure what I'm missing. I have a data frame as such: index c1 c2 c3 2015-03-07 01:2

Pandas dataframe in pyspark to hive

How to send a pandas dataframe to a hive table? I know if I have a spark dataframe, I can register it to a temporary table using df.registerTempTable("table_

python dataframe pandas drop column using int

I understand that to drop a column you use df.drop('column name', axis=1). Is there a way to drop a column using a numerical index instead of the column name?
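
One way to approach this, as a minimal sketch: look up the column label at the given position through `df.columns` and pass that label to `drop`. The frame and its column names `a`, `b`, `c` are made up for illustration and do not come from the question.

```python
import pandas as pd

# Hypothetical frame; the column names are illustrative only.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

# Translate position 1 into its label ("b"), then drop by label.
dropped = df.drop(columns=df.columns[1])

# Several positions at once: index df.columns with a list of positions.
dropped_many = df.drop(columns=df.columns[[0, 2]])

# Caveat: drop works on labels, so if column names are duplicated,
# every column sharing the looked-up label is removed.
print(list(dropped.columns))
print(list(dropped_many.columns))
```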

compare multiple columns of pandas dataframe with one column

I have a dataframe: df- A B C D E 0 V 10 5 18 20 1 W 9 18 11 13 2 X 8 7 12 5 3 Y 7 9 7 8 4 Z 6 5 3 90

pandas data mining from Eurostat

I'm starting work on analysing data from statistical institutions like Eurostat using Python, and so pandas. I found out there are two methods to get data from Eurost

Pandas get topmost n records within each group

Suppose I have pandas DataFrame like this: df = pd.DataFrame({'id':[1,1,1,2,2,2,2,3,4],'value':[1,2,3,1,2,3,4,1,1]}) which looks like: id value 0 1
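
A minimal sketch using the DataFrame from the question, assuming "topmost" means the first n rows of each `id` group in their existing order; the excerpt is cut off before the expected output, so that reading is a guess, and `n = 2` is chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'id': [1, 1, 1, 2, 2, 2, 2, 3, 4],
                   'value': [1, 2, 3, 1, 2, 3, 4, 1, 1]})

n = 2  # illustrative choice, not taken from the question

# head(n) on a groupby keeps the first n rows of every group and
# preserves the original row order and index.
top_n = df.groupby('id').head(n)
print(top_n)
```

If the largest values per group are wanted instead, sorting by `value` in descending order before grouping is one option.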

How to determine whether a column/variable is numeric or not in Pandas/NumPy?

Is there a better way to determine whether a variable in Pandas and/or NumPy is numeric or not? I have a self-defined dictionary with dtypes as keys and nume

Get DataFrame with the number of rows for each time interval

Given the following DataFrame of pandas in Python: | ID | date | |--------------|------------------------------------

Splitting dataframe into multiple dataframes

I have a very large dataframe (around 1 million rows) with data from an experiment (60 respondents). I would like to split the dataframe into 60 dataframes (a d

How to use a df column in a vertica_python SQL query?

I have a dataframe with names that I set to a dictionary, like this: {1: "Bob", 41: "John", 126: "Jim", 167: "Pete"} I am using Vertica. I want to be able to p
