Category "pandas"

SQLAlchemy ORM conversion to pandas DataFrame

Is there a solution converting a SQLAlchemy <Query object> to a pandas DataFrame? Pandas has the capability to use pandas.read_sql but this requires use o

pandas row values to column headers

I have a dataframe like this: df = pd.DataFrame({'id1':[1,1,1,1,2,2,2],'id2':[1,1,1,1,2,2,2],'value':['a','b','c','d','a','b','c']}) id1 id2 value 0 1

Python Pandas replace NaN in one column with value from corresponding row of second column

I am working with this Pandas DataFrame in Python. File heat Farheit Temp_Rating 1 YesQ 75 N/A 1 NoR 115 N/A

How to fix Python Numpy/Pandas installation?

I would like to install the Python Pandas library (0.8.1) on Mac OS X 10.6.8. This library needs Numpy>=1.6. I tried this: $ sudo easy_install pandas Searching

How can I split the document path to the foldername and the document name in python?

I need to split the document path into the folder name and the document name in Python. It is a large dataframe with many rows. For the filename with no docu

How to change specific column to rows without changing the other columns in pandas?

I have a dataframe like this: Date ID Age Gender Fruits 1.1.19 1 50 F Apple 2.1.19 1 50

joblib.Memory and pandas.DataFrame inputs

I've been finding that joblib.Memory.cache results in unreliable caching when using dataframes as inputs to the decorated functions. Playing around, I found tha

pandas using qcut on series with fewer values than quantiles

I have thousands of series (rows of a DataFrame) that I need to apply qcut on. Periodically there will be a series (row) that has fewer values than the desired

Plot multiple Y axes

I know pandas supports a secondary Y axis, but I'm curious if anyone knows a way to put a tertiary Y axis on plots. Currently I am achieving this with numpy+pyp

pandas create Cross-Validation based on specific columns

I have a dataframe of a few hundred rows that can be grouped by ID as follows: df = Val1 Val2 Val3 Id 2 2 8 b 1 2 3 a 5

Pandas split dataframe column for every character

I have multiple dataframe columns that look like this: Day1 0 DDDDDDDDDDBBBBBBAAAAAAAAAABBBBBBDDDDDDDDDDDDDDDD 1 DDDDDDDDDDBBBB

add columns different length pandas

I have a problem adding columns in pandas. I have a DataFrame of dimension n x k, and in the process I will need to add columns of dimension m x 1, where m = [1,

Python pandas DataFrame from first and last row of csv

All - I am looking to create a pandas DataFrame from only the first and last lines of a very large csv. The purpose of this exercise is to be able to easily g

Get info on multiple stock tickers quickly using yfinance

I am trying to get the current price and market cap of all of the tickers in the S&P500 and the way I am currently doing it is very slow, so I was wondering

Pandas, group by with max return AssertionError:

There's something wrong with pandas, and I would like your opinion. I have this DataFrame where I need to get the max values; the code is just below: df_stack=pd.

Pandas dataframe count values above threshold using groupby - code optimization

I have a large pandas dataframe where I want to count the number of values above a threshold (zero) in each column grouped by the values in one name column. Th

pandas reshape multiple columns fails with KeyError

For a pandas dataframe defined by: import pandas as pd df = pd.DataFrame({'id':[1,2,3], 're_foo':[1,2,3], 're_bar':[4,5,6], 're_foo_baz':[0.4, 0.8, .9],

ModuleNotFoundError: No module named 'validate_email'

I am trying to execute the following code in python pandas. from email_validator import validate_email from pandas import DataFrame, read_csv import pandas as

Pandas: sum DataFrame rows for given columns

I have the following DataFrame: In [1]: df = pd.DataFrame({'a': [1, 2, 3], 'b': [2, 3, 4], 'c': ['dd', 'ee', 'ff'],