Category "pandas"

How to create a new table in a MySQL DB from a pandas dataframe

I recently transitioned from using SQLite for most of my data storage and management needs to MySQL. I think I've finally gotten the correct libraries installed

Check for existence of multiple columns

Is there a more sophisticated way to check if a dataframe df contains 2 columns named Column 1 and Column 2: if numpy.all(map(lambda c: c in df.columns, ['Colum
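One possible approach to the question above, as a minimal sketch: compare the required names against `df.columns` as a set. The sample frame and the variable names `required`, `has_all` and `missing` are made up for illustration.

```python
import pandas as pd

# Hypothetical frame; only the two column names come from the question.
df = pd.DataFrame({"Column 1": [1, 2], "Column 2": [3, 4], "Other": [5, 6]})

required = ["Column 1", "Column 2"]

# set.issubset accepts any iterable, so df.columns can be passed directly.
has_all = set(required).issubset(df.columns)

# Equivalent check using pandas only: Index.isin returns a boolean array.
has_all_alt = pd.Index(required).isin(df.columns).all()

# Listing the missing names is often more useful than a bare True/False.
missing = [c for c in required if c not in df.columns]
```

Collecting `missing` makes it easy to raise an error that names the absent columns.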

Convert a Column to Column Header

I have a list of dict containing x and y. I want to make x the index and y the column headers. How can I do it? import pandas pt1 = {"x": 0, "y": 1, "val": 3,}
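A possible way to reshape such records, sketched under the assumption that each (x, y) pair occurs at most once: build a frame from the list and pivot it. Only the first record comes from the question; the other two are invented for illustration.

```python
import pandas as pd

records = [
    {"x": 0, "y": 1, "val": 3},  # pt1 from the question
    {"x": 0, "y": 2, "val": 6},  # hypothetical
    {"x": 1, "y": 1, "val": 7},  # hypothetical
]

df = pd.DataFrame(records)

# x becomes the index, the distinct y values become the column headers.
# Combinations that never occur are filled with NaN.
wide = df.pivot(index="x", columns="y", values="val")
```

If an (x, y) pair can repeat, `pivot` raises an error; `pivot_table` with an aggregation function is the usual alternative in that case.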

dataframe transformation python products attributes values

I have an Excel file with products like this below. Is it possible to align the same kind of attributes to same column using python? I have this category name

pandas: most elegant way to pivot table on pattern in name of columns

Given the following DataFrame: pd.DataFrame({ 'x': [0, 1], 'y': [0, 1], 'a_idx': [0, 1], 'a_val': [2, 3], 'b_idx': [4, 5], 'b_val': [6, 7], }) What

Remove non-business days rows from pandas dataframe

I have a dataframe with a timeseries data of wheat in df. df = wt["WHEAT_USD"] 2016-05-02 02:00:00+02:00 4.780 2016-05-02 02:01:00+02:00 4.777 2016-05-02
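A minimal sketch of one way to drop weekend rows, assuming the data has a `DatetimeIndex`. The timestamps and prices below are invented, and this filter only removes Saturdays and Sundays; public holidays would need a separate calendar.

```python
import pandas as pd

# Hypothetical daily series spanning Friday 2016-05-06 to Monday 2016-05-09.
idx = pd.date_range("2016-05-06 02:00", periods=4, freq="D", tz="Europe/Berlin")
prices = pd.Series([4.780, 4.777, 4.775, 4.790], index=idx, name="WHEAT_USD")

# dayofweek is 0 for Monday through 6 for Sunday, so < 5 keeps Monday to Friday.
weekdays_only = prices[prices.index.dayofweek < 5]
```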

Problems while plotting time series against user logins?

I have a large pandas dataframe, which is a log of user IDs that log in to a website: id datetime 130 2018-05-17 19:46:18 133 2018-05-17 20:5

Converting Pandas dataframe into Spark dataframe error

I'm trying to convert Pandas DF into Spark one. DF head: 10000001,1,0,1,12:35,OK,10002,1,0,9,f,NA,24,24,0,3,9,0,0,1,1,0,0,4,543 10000001,2,0,1,12:36,OK,10002,1

pandas select range from index column

I need to make a function to select a range of the index (first col). 1880 Aachen 1 Valid L5 21.0 Fell 50.77500 6.08333 (50.775000, 6.083330)

Melt and pivot a dataframe in Python?

I'm working with a publicly available election data set that I've imported into Pandas as a df: fips_code county total_2008 dem_2008 gop

Split cell into multiple rows in pandas dataframe

I have a dataframe containing order data. Each order has multiple packages stored as comma-separated strings in the [package & package_code] columns. I want to split
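One way this could be done, as a sketch that assumes pandas 1.3 or later (where `DataFrame.explode` accepts a list of columns) and that both columns hold the same number of items per row. The `order_id` column and all values are hypothetical.

```python
import pandas as pd

orders = pd.DataFrame({
    "order_id": [1, 2],                # hypothetical key column
    "package": ["p1,p2", "p3"],
    "package_code": ["c1,c2", "c3"],
})

# Turn each comma-separated string into a list of items.
as_lists = orders.assign(
    package=orders["package"].str.split(","),
    package_code=orders["package_code"].str.split(","),
)

# Explode both list columns together so package and code stay aligned.
long_form = as_lists.explode(["package", "package_code"], ignore_index=True)
```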

Remove rows that contain False in a column of pandas dataframe

I assume this is an easy fix and I'm not sure what I'm missing. I have a data frame as such: index c1 c2 c3 2015-03-07 01:2

Pandas dataframe in pyspark to hive

How to send a pandas dataframe to a hive table? I know if I have a spark dataframe, I can register it to a temporary table using df.registerTempTable("table_

python dataframe pandas drop column using int

I understand that to drop a column you use df.drop('column name', axis=1). Is there a way to drop a column using a numerical index instead of the column name?
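A short sketch of one option: look up the label at the given position through `df.columns` and pass it to `drop`. The sample frame is invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

# Drop the column at position 1 by translating the position into its label.
without_second = df.drop(columns=df.columns[1])

# Several positions can be dropped at once by indexing with a list.
only_middle = df.drop(columns=df.columns[[0, 2]])
```

Because the drop still happens by label, a frame with duplicate column names would lose every column sharing that label. Selecting the positions to keep with `df.iloc` avoids that.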

compare multiple columns of pandas dataframe with one column

I have a dataframe: df- A B C D E 0 V 10 5 18 20 1 W 9 18 11 13 2 X 8 7 12 5 3 Y 7 9 7 8 4 Z 6 5 3 90

pandas data mining from Eurostat

I'm starting a project to analyse data from statistical institutions such as Eurostat using Python, and therefore pandas. I found out there are two methods to get data from Eurost

Pandas get topmost n records within each group

Suppose I have pandas DataFrame like this: df = pd.DataFrame({'id':[1,1,1,2,2,2,2,3,4],'value':[1,2,3,1,2,3,4,1,1]}) which looks like: id value 0 1
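A minimal sketch using the frame from the question. "Topmost" could mean either the first n rows of each group or the n largest values, so both readings are shown; n = 2 is an arbitrary choice.

```python
import pandas as pd

df = pd.DataFrame({'id': [1, 1, 1, 2, 2, 2, 2, 3, 4],
                   'value': [1, 2, 3, 1, 2, 3, 4, 1, 1]})

# Reading 1: the first two rows of each group, in their original order.
first_two = df.groupby('id').head(2)

# Reading 2: the two largest values per group. Sort first, take the head of
# each group, then restore the original row order.
top_two = (
    df.sort_values('value', ascending=False)
      .groupby('id')
      .head(2)
      .sort_index()
)
```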

How to determine whether a column/variable is numeric or not in Pandas/NumPy?

Is there a better way to determine whether a variable in Pandas and/or NumPy is numeric or not? I have a self-defined dictionary with dtypes as keys and nume
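A sketch of some built-in checks that may replace a hand-made dtype dictionary. The sample frame is invented for illustration.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_numeric_dtype

df = pd.DataFrame({"ints": [1, 2], "floats": [1.5, 2.5], "text": ["a", "b"]})

# Per-column check. Note that is_numeric_dtype also treats bool as numeric.
numeric_flags = {col: is_numeric_dtype(df[col]) for col in df.columns}

# Select every numeric column in one call.
numeric_cols = df.select_dtypes(include="number").columns.tolist()

# NumPy-only check for a single dtype.
float_is_numeric = np.issubdtype(np.dtype("float64"), np.number)
```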