Category "pandas"

How to get PRAW (the Python Reddit API Wrapper) to read submission ID?

Goal: I have collected hundreds of reddit posts' details in Excel sheets. Now, I want to collect comments on these Reddit posts using PRAW. Method: At first, I

Datetime format recognition in exported Excel with xlsxwriter

I couldn't find a solution for this: I generate an Excel file from a dataframe, and some columns need to be in hh:mm:ss format (with no 24h limit, for example a valu

Repeated values in prediction with sequential model

The problem is with the result: I get the same value in the 'future' field in all rows, as follows. open high low close

Place the first value of Column B in Column C if Column A has same names in python pandas with loop [duplicate]

I have the following data set in Python. Input: I want to bring the first value of Column B that belongs to Column A for a unique A val
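
The excerpt above is cut off, so the following is only a minimal sketch of one way to read the question: copy the first Column B value of each Column A group into a new Column C. It uses a vectorized `groupby(...).transform("first")` rather than the loop mentioned in the title, and the column names `A`, `B`, `C` and the sample values are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample data; the original data set is not shown in the excerpt.
df = pd.DataFrame(
    {
        "A": ["x", "x", "y", "y", "y"],
        "B": [10, 20, 30, 40, 50],
    }
)

# For every row, take the first B value seen within that row's A group.
df["C"] = df.groupby("A")["B"].transform("first")

print(df)
```

`transform` returns a result aligned with the original index, so it can be assigned straight back as a column without a merge.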

Which Java class is compatible with a Python pandas DataFrame when using DJL (Deep Java Library)?

I'm trying to import a custom Python TensorFlow model into Spring Boot using DJL TensorFlow, and the model takes a pandas DataFrame as both input and output. I'm wonde

Pandas find consecutive ones, column wise

I have an output data frame like the one below, and I want to format it so that I can use it further down the pipeline. A few pointers about the data

Querying deeply nested and complex JSON data with multiple levels

I am struggling to break down the method required to extract data from deeply nested complex JSON data. I have the following code to obtain the JSON. import req

Linear regression of two dataframes

I have two dataframes: df = pd.DataFrame([{'A': -4, 'B': -3, 'C': -2, 'D': -1, 'E': 2, 'F': 4, 'G': 8, 'H': 6, 'I': -2}]) df2 looks like this (just a cutout; i

Can I use itertools.count to add values in a column, resetting at a certain point?

I'm trying to create a list of timestamps from a column in a dataframe, that resets after a certain time to zero. So, if the limit was 4, I want the count to ad

pandas df not showing all rows after loading from MS SQL

I'm using Pandas with latest sqlalchemy (1.4.36) to query a MS SQL DB, using the following Python 3.10.3 [Win] snippet: import pandas as pd

Python pandas df.copy() is not deep

I have (in my opinion) a strange problem with Python pandas. If I do: cc1 = cc.copy(deep=True) for the dataframe cc and then ask for a certain row and column: p
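
The excerpt is truncated, so the cause in the asker's case is unknown. One documented behaviour that can produce this impression is that `copy(deep=True)` copies the data but does not recursively copy Python objects stored in object columns. The sketch below illustrates that with a hypothetical dataframe; the column names and values are assumptions, and only `cc` and `cc1` come from the excerpt.

```python
import pandas as pd

# Hypothetical frame: column "a" holds Python lists, column "b" holds plain integers.
cc = pd.DataFrame({"a": [[1, 2], [3, 4]], "b": [10, 20]})
cc1 = cc.copy(deep=True)

# Assigning a scalar in the copy leaves the original untouched.
cc1.loc[0, "b"] = 99

# Mutating a list in place affects both frames, because the deep copy
# duplicated the references to the lists, not the lists themselves.
cc1.loc[0, "a"].append(5)

print(cc)
print(cc1)
```

If fully independent nested objects are needed, one option is to apply `copy.deepcopy` from the standard library to the affected object column.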

How to divide a groupby Object by pandas Series efficiently? Or how to convert yfinance multiple ticker data to another currency?

I am pulling historical price data for the S&P500 index components with yfinance and would now like to convert the Close & Volume from USD into EUR. Thi

Pagination not working in Python Session.put()

I am trying to upload a file to a website (that has an inbuilt API) using the following code. The code reads a list of medical codes/diagnoses codes etc. (1 col

Read Excel file in Python using pandas

I am trying to read an Excel file in PyCharm using pandas. I installed the package successfully. My issue is that I am trying to use file location in addition to i

Join two pyarrow tables

I have ORC files with the following data. Table A: Name age school address phone tony 12 havard UUU 666 tommy 13 abc

replace the empty value in the dataframe with a list of python values

There is a list of shops |Shop ID| |-------| | Shop1 | | Shop2 | | Shop3 | There is a list of events that took place in the store |Shop ID| Event | Start_date

Error: pandas hashtable keyerror

I have successfully read a CSV file using pandas. When I try to print a particular column from the data frame, I get a KeyError. Hereby I am shari

How to apply Target Encoding in test dataset?

I am working on a project, where I had to apply target encoding for 3 categorical variables: merged_data['SpeciesEncoded'] = merged_data.groupby('Species')['Wnv

Make Seaborn Distplot and Barplot the same color [duplicate]

I have been unable to figure out how to set the colors between distplot and barplot to be the same. Despite setting the color argument in both

AWS Athena table from python output with dates - dates get wrongly converted

I have a pandas DataFrame containing a date column ("2022-02-02"). I write this table to parquet using pyarrow. df[col] = df[col].astype(str) df.to_parquet(loc)