Category "pandas"

Extracting a .7z File into a Pandas Data Frame

I am using a Jupyter notebook (Google Colab) to try to extract data from a .7z file into a pandas DataFrame using Linux commands. The data is from http://untr

Calculate Decay Rate in Python

I have a dataset which roughly follows an exponential decay: df_A Period Count 0 1600 1 894 2 959 3 773 4 509 5 206 I want

Create numpy array from function applied to (multiple) pandas columns

I have a pd.DataFrame containing rows of values: import pandas as pd df = pd.DataFrame({"col1": [1, 2, 3, 4, 5, 6], "col2": [6, 5, 4, 3, 2, 1]}) I now want to f

pandas ExcelWriter.book does not read my Excel file and even breaks the existing file

I want to stack a series of dataframes in one Excel file and I wrote the code below. if os.path.isfile(result) is False: with pd.ExcelWriter(result, engine='

Functional Programming: How does one create a new column in a multi-index data frame that is a function of another column?

Suppose the below simplified dataframe. (The actual df is much, much bigger.) How does one assign values to a new column f such that f is a function of another

Pandas+Uncertainties producing AttributeError: type object 'dtype' has no attribute 'kind'

I want to use Pandas + Uncertainties. I am getting a strange error; below is an MWE: from uncertainties import ufloat import pandas number_with_uncertainty = ufloa

Error while converting csv to parquet file using pandas

I would like to upload a CSV as a Parquet file to an S3 bucket. Below is the code snippet. df = pd.read_csv('right_csv.csv') csv_buffer = BytesIO() df.to_parquet(csv_b

Multiple aggregations of the same column using pandas GroupBy.agg()

Is there a pandas built-in way to apply two different aggregating functions f1, f2 to the same column df["returns"], without having to call agg() multiple times
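One way this is commonly done is to pass several functions to a single agg() call. The sketch below is illustrative only: the grouping column "group", the sample values, and the use of "mean" and "max" in place of f1 and f2 are assumptions; only the column name "returns" comes from the question.

```python
import pandas as pd

# Hypothetical data; only the column name "returns" appears in the question.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "returns": [0.1, 0.3, -0.2, 0.4],
})

# A list of functions applies each one to the same column in one agg() call.
stats = df.groupby("group")["returns"].agg(["mean", "max"])

# Named aggregation does the same but lets you choose the output column names.
named = df.groupby("group").agg(
    mean_return=("returns", "mean"),
    max_return=("returns", "max"),
)

print(stats)
print(named)
```

Custom callables can be passed in the list in place of the string names.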

add a column in dataframe based on existing value in another dataframe

I have a dataframe DF3: zone_id combine 0 ABD 10 BCD 20 ABC 30 ABE and a second dataframe, combinaison_df: zone_id combine 0

How can I create a cross-tab of two columns in a dataframe in Python and generate a total row and column in the output?

I have created a dataframe from a CSV file and now I'm trying to create a cross-tab of two columns ("Personal_Status" and "Gender"). The output should look like
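A minimal sketch of one possible approach, using pd.crosstab with margins=True to append a total row and column. The sample values and the label "Total" are assumptions made for illustration; only the column names "Personal_Status" and "Gender" come from the question.

```python
import pandas as pd

# Hypothetical rows; the real data comes from the asker's CSV file.
df = pd.DataFrame({
    "Personal_Status": ["Single", "Married", "Single", "Married", "Single"],
    "Gender": ["F", "M", "M", "F", "F"],
})

# margins=True adds a totals row and column; margins_name sets their label.
table = pd.crosstab(
    df["Personal_Status"],
    df["Gender"],
    margins=True,
    margins_name="Total",
)

print(table)
```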

Missing data error on adfuller test although I cleaned for inf and nans

Currently I am working on a data set which has many time-dependent variables. I ran adfuller for all and changed the non-stationary ones to percentage change (t

Pandas approximating/rounding large numbers from csv

I am reading numbers from a csv file into a pandas dataframe. When the numbers I am reading are approximately >1E12, pandas will approximate the number to 3

How to create ratios using value counts and separate fields in Python?

Using the data frame shown below I'd like to create manager-to-assistant and manager-to-associate percentages/ratios per location. I'm looking for the

Searching a value within range between columns in pandas (not date columns and no sql)

Thanks in advance for the help. I have two dataframes as given below. I need to create a column category in the sold frame based on information in the size frame. It should c

Truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()

I want to filter my dataframe with an or condition to keep rows with a particular column's values that are outside the range [-0.25, 0.25]. I tried: df = df[(df
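This error typically appears when Python's `or` or `and` is applied to whole Series. A hedged sketch of the usual element-wise alternative follows; the column name "col" and the sample values are hypothetical, since the asker's code is cut off above.

```python
import pandas as pd

# Hypothetical column name and values; the question's own code is truncated.
df = pd.DataFrame({"col": [-0.5, -0.1, 0.0, 0.2, 0.3]})

# Use the element-wise operator | (not `or`) and parenthesize each comparison,
# because | binds more tightly than < and >.
mask = (df["col"] < -0.25) | (df["col"] > 0.25)
filtered = df[mask]

print(filtered)
```

An equivalent mask under the same assumptions is `~df["col"].between(-0.25, 0.25)`, which treats both endpoints as inside the range by default.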

Sort MultiIndex table based on another table

I have a multiIndex data frame like this probe_names PLAGL1 GRB10 MEST H19 KCNQ1OT1 MEG3 MEG8 SNRPN \ Patient_1 0 0.55 0.53 0.53

Compare two excel files for the difference using pandas with multiple tabs

I found this nice script online which does a great job comparing the differences between 2 excel sheets but there's an issue - it doesn't work if the excel file

I have a dataframe with a JSON substring in one of the columns. I want to extract variables and make columns for them

import json df = pd.read_json("C:/xampp/htdocs/PHP code/APItest.json", orient='records') print(df) I would like to create three extra columns: ['name','l

How to "transpose" data from one date to another in Python

Sorry, I had a lot of trouble explaining my problem in the title, but I hope it will be more understandable with this example: I have a data source that tells me

Pandas rolling window cumsum, with incomplete series

I have a pandas df as follows: YEAR MONTH USERID TRX_COUNT 2020 1 1 1 2020 2 1 2 2020 3 1 1 2020 12