Category "pandas"

How to select top level columns in multi header pandas dataframe

I have a multi-header dataframe that looks like this: SPY ARKW Open Hig

Creating custom colourmap for geopandas.explore plot

all code: def rgb2hex(r,g,b): return '#{:02x}{:02x}{:02x}'.format(r,g,b) def rg(num): num = int(np.round((num / 100) * 124)) r = (124 - num) g

Convert JSON format column to new columns

I have a subset of the Yelp dataset in CSV, and the attributes column is in JSON format. I'm trying to convert that column to new columns, but none of the relevant code on di

BigQuery Results to pandas DataFrame in Chunks

I am trying to save the results of a BigQuery query to a pandas DataFrame using bigquery.Client.query.to_dataframe(). This query can return millions of rows. Gi

How to do point biserial correlation for multiple columns in one iteration

I am trying to calculate a point biserial correlation for a set of columns in my datasets. I am able to do it for an individual variable; however, if I need to calcu

using statsmodels with a groupby

Consider this simple example import pandas as pd import statsmodels.formula.api as sm df = pd.DataFrame({'Y' : [1,2,3,4,5,6,7], 'X' : [2,3,4

Pandas and scikit-learn: KeyError: [....] not in index

I do not understand why I get the error KeyError: '[ 1351 1352 1353 ... 13500 13501 13502] not in index' when I run this code: cv = KFold(n_splits=10) fo

Flatten a nested JSON? [duplicate]

I am trying to flatten the following JSON hierarchically: https://justpaste.it/6e60p I am using the pandas json_normalize function

Convert pandas.groupby to dict

Consider, dataframe d: d = pd.DataFrame({'a': [0, 2, 1, 1, 1, 1, 1], 'b': [2, 1, 0, 1, 0, 0, 2], 'c': [1, 0, 2, 1, 0, 2, 2]

Python: How to create multi-line cells in Excel when exporting a pandas dataframe

I have the following pandas Dataframe df = pd.DataFrame([ [['First Line', 'Second line']], [['First line', 'second line', 'third line']], [['first l

Load Pandas Dataframe to S3 passing s3_additional_kwargs

Please excuse my ignorance / lack of knowledge in this area! I'm looking to upload a dataframe to S3, but I need to pass 'ACL':'bucket-owner-full-control'. i

Python plotly Scattermapbox define colors by category

I want to draw some colored areas on a map. The coordinates are defined in a dataframe and I want each area to have a different color depending on the test_type

Python dictionary, how can I create a key with a string and the actual key combined?

I hope this is quite an easy question, but without much Python background I can't find an answer. df = pd.DataFrame( {'Messung': ['10bar','10bar',

OptionError:'Pattern matched multiple keys' pandas

I am trying to read an Excel file. import requests url = 'http://www.nepalstock.com/todaysprice/export' r = requests.get(url, allow_redirects=True) open('todaypr

Search and filter text from a column using Pyspark

I am new to data scraping. I am reading the data from a file that has one JSON object per row {"name": "Soul Sweet \u2018Taters (Step-by-Step!)", "ingredients":

How do I melt a pandas with custom nam

I have a table like this device_type version pool testMean testP50 testP90 testP99 testStd WidgetMean WidgetP50 WidgetP90 WidgetP99 WidgetStd PNB0Q

Percent change using Pandera for Pandas DataFrame

I have the following DataFrame. I need to validate balance and other numeric measures over a date range. I want to check whether, for any group and date, the ba

Python Script to find file names from CSV will not concatenate

I am writing a script that will allow me to extract a segment of image files from a large folder. I put the image file names into a dataframe. I am having prob

Network Flow Dataframe - Merging Memory Error - Unable to allocate array with shape and data type

I have 3 big CSV files, all with the same 76 columns. The row counts differ: 17809, 124262, and 108779 rows. I am trying to merge these 3 d