Category "pandas"

Read timeout in pd.read_parquet from S3, and understanding configs

I'm trying to simplify access to datasets in various file formats (csv, pickle, feather, partitioned parquet, ...) stored as S3 objects. Since some users I supp

Create a DataFrame from a dictionary of datetime and int

I have a dictionary of datetime and int values like below. end_date = datetime.datetime.strptime("01-12-2020", "%d-%m-%Y") details = { datetime.datetime.strptime

Adding new column based on combined criteria in Pandas Groupby

Following on from my previous question (thanks to those responding) I'm stuck again in achieving what I suspect is possible using a groupby in Pandas. Here's wh

How to create a new column showing when a change to an observation occurred?

I have a data-frame formatted like so: Contract Agreement_Date Date A 2017-02-10 2020-02-03 A 2017-02-10 2020-02-04 A 2017-02-11 2020-02-09 A 2017-02-11 2020-0

Inserting Data to SQL Server from a Python Dataframe Quickly

I have been trying to insert data from a dataframe in Python to a table already created in SQL Server. The data frame has 90K rows and I wanted the best possible

What's the computational complexity of .iloc[] in pandas dataframes?

I'm trying to understand what's the execution complexity of the iloc function in pandas. I read the following Stack Exchange thread (Pandas DataFrame search is

How do I force a blank for rows in a dataframe that have any str or character apart from numerics?

I have a dataframe >temp Age Rank PhoneNumber State City 10 1 99-22344-1 Ga abc 15 12 No Ma xyz For the column(Phone Numbe

How to reshape my dataset in specific way?

I have a dataset: name val a a1 a a2 b b1 b b2 b b3 c c1 I want to make all possible permutations "names" which are not

How to create new columns based on values of other columns which could contain numbers or NaN?

I have a few dataframes that I'm merging based on known, populated fields. The resulting dataframe will always contain a set of columns, but may or may not have

Add a new record for each missing row in a DataFrame with TimeStamp without replacing the original records

Given the following Pandas DataFrame: | date | counter | |-------------------------------------|------------------| | 2

How can I read this csv data?

I'm getting an error: Error tokenizing data. C error: Expected 1 fields in line 88, saw 4 while trying to read this data: import pandas as pd df = pd.read_csv

Pandas - Take value from n months before

I am working with datetime. Is there any way to get a value from n months before? For example, the data looks like: dft = pd.DataFrame( np.random.randn(100, 1),

Scraping the English Vivino.com reviews from the website

I have two questions about web scraping information from Vivino.com: 1.) With the code below I can scrape information and reviews from the Vivino website, howev

Remove a specific character at the beginning of each line of a txt file using Python

I'm currently working on a script in python. I want to convert an xls file into a txt file but I also want to clean and manage the data. In the xls files, there

Need to add rows above the headers of a DataFrame in pandas

I have a dataframe and need to add 8 rows above its header. I am sharing the dataframe and the desired output. Dataframe:- Toll No. Vr.name

Creating multiple figures out of for loop

I am trying to loop through my table and to create 3 different figures. This is my code .... tab_stat = pd.read_table('test.txt', delim_whitespace=True) radius

Most efficient way to search over a DataFrame in Python [duplicate]

I have a DataFrame with this kind of data: df = pd.DataFrame({ 'id' : ['a', 'a', 'b', 'b', 'c', 'c'], 'alias' : ['value'+str(i) fo

Ingesting a Null Int Column: Pandas and Pandera

I am using pandas with pandera for schema validation, but I've run into a problem since there's a null integer column in the data. from prefect import task, Flo

How to conditionally assign values from another dataframe?

I want to merge 2 dataframes without using the function '.merge' and I try to assign a value to a dataframe column based on an interval and an id. intervals = p