Category "pandas"

How can I split the document path to the foldername and the document name in python?

I need to split the document path into the folder name and the document name in Python. It is a large dataframe including many rows. For the filename with no docu
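A minimal sketch of one possible approach, assuming the paths use "/" as a separator and live in a column called "path" (both the column name and the sample values are hypothetical, since the excerpt is cut off before it shows the data):

```python
import posixpath

import pandas as pd

# Hypothetical sample data; the column name "path" is an assumption.
df = pd.DataFrame({"path": ["reports/2019/summary.pdf", "notes/todo.txt"]})

# dirname keeps everything before the last "/", basename keeps the last component.
df["folder"] = df["path"].map(posixpath.dirname)
df["document"] = df["path"].map(posixpath.basename)
```

How rows without a document name should be treated is not visible in the truncated excerpt, so this sketch does not try to handle that case.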

How to change specific column to rows without changing the other columns in pandas?

I have a dataframe like this: Date ID Age Gender Fruits 1.1.19 1 50 F Apple 2.1.19 1 50

joblib.Memory and pandas.DataFrame inputs

I've been finding that joblib.Memory.cache results in unreliable caching when using dataframes as inputs to the decorated functions. Playing around, I found tha

pandas using qcut on series with fewer values than quantiles

I have thousands of series (rows of a DataFrame) that I need to apply qcut on. Periodically there will be a series (row) that has fewer values than the desired

Plot multiple Y axes

I know pandas supports a secondary Y axis, but I'm curious if anyone knows a way to put a tertiary Y axis on plots. Currently I am achieving this with numpy+pyp

pandas create Cross-Validation based on specific columns

I have a dataframe of a few hundred rows that can be grouped by id as follows: df = Val1 Val2 Val3 Id 2 2 8 b 1 2 3 a 5

Pandas split dataframe column for every character

I have multiple dataframe columns which look like this: Day1 0 DDDDDDDDDDBBBBBBAAAAAAAAAABBBBBBDDDDDDDDDDDDDDDD 1 DDDDDDDDDDBBBB
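One way this could be done, sketched on shortened, made-up strings (only the column name "Day1" comes from the excerpt; the desired output shape is an assumption):

```python
import pandas as pd

# Hypothetical, shortened data; the real strings in the question are longer.
df = pd.DataFrame({"Day1": ["DDBBAA", "DDBB"]})

# Turn each string into a list of characters, then build one column per position.
# Shorter strings end up with missing values in the trailing columns.
chars = pd.DataFrame(df["Day1"].map(list).tolist(), index=df.index)
```

Here `chars` has one column per character position, numbered from 0.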

add columns different length pandas

I have a problem with adding columns in pandas. I have a DataFrame of dimension n x k, and in the process I will need to add columns of dimension m x 1, where m = [1,

Python pandas DataFrame from first and last row of csv

All - I am looking to create a pandas DataFrame from only the first and last lines of a very large csv. The purpose of this exercise is to be able to easily g

Get info on multiple stock tickers quickly using yfinance

I am trying to get the current price and market cap of all of the tickers in the S&P500 and the way I am currently doing it is very slow, so I was wondering

Pandas: groupby with max returns AssertionError

Something seems wrong with pandas, and I would like your opinion. I have this DataFrame where I need to get the max values; the code is just below: df_stack=pd.

Pandas dataframe count values above threshold using groupby - code optimization

I have a large pandas dataframe where I want to count the number of values above a threshold (zero) in each column grouped by the values in one name column. Th
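A possible vectorized sketch of the counting described here, on made-up data (the column names "name", "x" and "y" are assumptions, as the excerpt does not show the frame):

```python
import pandas as pd

# Hypothetical data: one grouping column and two numeric columns.
df = pd.DataFrame({
    "name": ["a", "a", "b", "b"],
    "x": [1, 0, 3, -2],
    "y": [0, 0, 5, 7],
})

threshold = 0
value_cols = df.columns.drop("name")

# Compare every value to the threshold once, then sum the booleans per group.
counts = df[value_cols].gt(threshold).groupby(df["name"]).sum()
```

Summing booleans counts the `True` entries, so `counts` holds the number of values above the threshold per name and column.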

pandas reshape multiple columns fails with KeyError

For a pandas dataframe defined by: import pandas as pd df = pd.DataFrame({'id':[1,2,3], 're_foo':[1,2,3], 're_bar':[4,5,6], 're_foo_baz':[0.4, 0.8, .9],

ModuleNotFoundError: No module named 'validate_email'

I am trying to execute the following code in python pandas. from email_validator import validate_email from pandas import DataFrame, read_csv import pandas as

Pandas: sum DataFrame rows for given columns

I have the following DataFrame: In [1]: df = pd.DataFrame({'a': [1, 2, 3], 'b': [2, 3, 4], 'c': ['dd', 'ee', 'ff'],
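A short sketch of a row-wise sum over chosen columns, using only the three columns visible in the excerpt (the rest of the frame is cut off, and the result column name "total" is an assumption):

```python
import pandas as pd

# Only the columns visible in the truncated excerpt are reproduced here.
df = pd.DataFrame({"a": [1, 2, 3], "b": [2, 3, 4], "c": ["dd", "ee", "ff"]})

# Select the columns to add up, then sum across each row (axis=1).
cols = ["a", "b"]
df["total"] = df[cols].sum(axis=1)
```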

How to change the parameter dynamically in pd.DateOffset in python code?

date_ranges_values = request.POST['range'] ft = [df.index[-1] + DateOffset(date_ranges_values = lambda x:x) for x in range(0, 24))] Suppose I get value in

T-Test in Python for multiple group comparisons

I would like to conduct a simple t-test in python, but I would like to compare all possible groups to each other. Let's say I have the following data: import p

Percentage difference between any two columns of pandas dataframe

I would like to have a function defined for percentage diff calculation between any two pandas columns. Let's say that my dataframe is defined by: R1 R2 R3

Create Python DataFrame from dictionary where keys are the column names and values form the row

I am familiar with Python but new to pandas DataFrames. I have a dictionary like this: a={'b':100,'c':300} And I would like to convert it to a DataFrame, wher
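One common way to get a single-row frame from such a dictionary, sketched under the assumption that the keys should become the columns and the values the only row (as the title states):

```python
import pandas as pd

a = {"b": 100, "c": 300}

# Wrapping the dict in a list makes pandas treat it as one row (one record).
df = pd.DataFrame([a])
```

Passing the bare dict of scalars without the list would raise an error, because pandas cannot infer an index for it.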