Category "dataframe"

How can I remove all non-numeric characters from all the values in a particular column in pandas dataframe?

I have a dataframe which looks like this:
        A         B         C
1   red78    square    big235
2   green    circle  small123
3  blue45  triangle    big657
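One possible approach, sketched on the sample data above: `Series.str.replace` with the regex `\D` (any non-digit character) strips everything except digits. The choice of column `C` and the helper name `strip_non_digits` are assumptions made for illustration; the question does not say which column is meant.

```python
import pandas as pd

# Sample data rebuilt from the excerpt above.
df = pd.DataFrame(
    {
        "A": ["red78", "green", "blue45"],
        "B": ["square", "circle", "triangle"],
        "C": ["big235", "small123", "big657"],
    },
    index=[1, 2, 3],
)


def strip_non_digits(column: pd.Series) -> pd.Series:
    """Hypothetical helper: drop every non-digit character from a string column."""
    # \D matches any character that is not a decimal digit.
    return column.str.replace(r"\D", "", regex=True)


# Assumption: column "C" is the one to clean.
df["C"] = strip_non_digits(df["C"])
print(df)
```

The cleaned values are still strings; `pd.to_numeric(df["C"])` would convert them if numbers are needed. Note that values without any digits (such as `green` in column `A`) become empty strings.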

Removing NA values from a DataFrame in Python 3.4

import pandas as pd
import statistics
df=print(pd.read_csv('001.csv',keep_default_na=False, na_values=[""]))
print(df)
I am using this code to create a data

VS Code no longer showing option to view DataFrame in Data Viewer

I'm working with pandas in VS Code and I've been using the "View Value in Data Viewer" option to look at my DataFrames while debugging. For some reason VS Code h

How to replace pandas dataframe column names with another list or dictionary matching

I need to replace column names of a pandas DataFrame having names like 'real_tag' and rename them like 'Descripcion' list = [{'real_tag': 'FA0:4AIS0007', 'Descr

How to use write.table to download a dataframe into a nice csv/Excel file?

I am trying to use the write.table() function, within Shiny downloadHandler(), to download the df reactive dataframe as a .csv file, per the reproducible code a

Python: create a pandas data frame from a list

I am using the following code to create a data frame from a list: test_list = ['a','b','c','d'] df_test = pd.DataFrame.from_records(test_list, columns=['my_let

MemoryError: Unable to allocate 1.88 GiB for an array with shape (2549150, 99) and data type object

I have a problem. I want to use pd.json_normalize(...) to normalize a list containing dicts, but unfortunately I get a MemoryError. Is there an option to work arou

Initialize a column with missing values and copy+transform another column of a dataframe into the initialized column

I have a messy column in a CSV file (column A of the dataframe).
using CSV, DataFrames
df = DataFrame(A = ["1", "3", "-", "4", missing, "9"], B = ["M", "F", "R

Rolling OLS Regressions and Predictions by Group

I have a Pandas dataframe with some data on race car drivers. The relevant columns look like this: |Date |Name |Distance |avg_speed_calc |---- |-

Uncomfortable output of mode() in pandas Dataframe

I have a dataframe with several columns (the features).
>>> print(df)
   col1  col2
a     1     1
b     2     2
c     3     3
d     3     2
I woul

How to subset Pandas Dataframe using an OR operator whilst avoiding "FutureWarning: elementwise comparison failed;"

I have a Pandas dataframe (tempDF) of 5 columns by N rows. Each element of the dataframe is an object (string in this case). For example, the dataframe looks li

pandas diff() giving 0 value for first difference, I want the actual value instead

I have df:
Hour  Energy Wh
1     4
2     6
3     9
4     15
I would like to add a column that shows the per hour differenc
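A minimal sketch of one way to get this, assuming the desired first entry is the first reading itself (the excerpt is cut off before the expected output). `Series.diff()` leaves `NaN` in the first row, and `fillna` can fill that gap from the original column. The column name `Per hour` is hypothetical.

```python
import pandas as pd

# Sample data rebuilt from the excerpt above.
df = pd.DataFrame({"Hour": [1, 2, 3, 4], "Energy Wh": [4, 6, 9, 15]})

# diff() has no previous row to subtract for the first entry, so it yields NaN there.
# Assumption: the first row should carry the original reading instead.
df["Per hour"] = df["Energy Wh"].diff().fillna(df["Energy Wh"])
print(df)
```

Because `diff()` introduces `NaN`, the new column is floating point; `.astype(int)` can be appended if integers are preferred.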

Combining Python variables into SQL queries

I am pulling data from an online database using SQL/PostgreSQL queries and converting it into a Python dataframe using Pandas. I want to be able to change the d

Joining on datetime64[ns, UTC] fails using pandas.join

I'm trying to join two pandas.DataFrames on a datetime64[ns, UTC] field and it's failing with a ValueError (described below) that is not intuitive to me. Consid

Chunking DataFrame by gaps in datetime index

First of all, my apologies if the title was too ambiguous. I have a pd.DataFrame with datetime64 as a dtype of index. These indices, however, are not equally

Python Pandas add Filename Column CSV

My Python code works correctly in the below example. My code combines a directory of CSV files and matches the headers. However, I want to take it a step furthe

Pandas - find specific value in entire dataframe

I have a dataframe and I want to search all columns for the text value 'Apple'. I know how to do it with one column, but how can I apply this to ALL column
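A short sketch of one way to do this, assuming an exact match on the whole cell value is wanted: `DataFrame.eq` compares every cell against the value, and `any(axis=1)` keeps rows where at least one column matched. The sample frame and its column names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; the question's actual frame is not shown.
df = pd.DataFrame(
    {
        "fruit": ["Apple", "Pear", "Plum"],
        "snack": ["Nuts", "Chips", "Apple"],
    }
)

# Boolean frame: True wherever a cell equals 'Apple' exactly.
mask = df.eq("Apple")

# Rows with a match in at least one column.
rows = df[mask.any(axis=1)]

# Names of the columns that contain the value somewhere.
matching_columns = mask.any(axis=0)[lambda s: s].index.tolist()
print(rows)
print(matching_columns)
```

For substring matches rather than exact equality, a per-column `str.contains` check on the string columns would be the analogous step.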

How to extract values from key value map?

I have a column of type map, where the keys and values change. I am trying to extract the value and create a new column. Input: ----------------+ |symbols

Dataframe Column name not defined PowerBI Python Integration

I wrote code to visualize a matplotlib bar chart in a Python Jupyter notebook, but now I want to integrate that code with Power BI. That dataset includes 3

Insert a row to pandas dataframe

I have a dataframe:
s1 = pd.Series([5, 6, 7])
s2 = pd.Series([7, 8, 9])
df = pd.DataFrame([list(s1), list(s2)], columns = ["A", "B", "C"])
   A  B  C
0  5