Category "pandas"

How to convert YYYYMM to YYYY-MM datetime format without day?

I have two datasets that have monthly frequencies. For one of them, df, I had to aggregate some data to turn it from daily to monthly using the following code: d
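A minimal sketch of one way to approach the conversion asked about in the title, assuming the months are stored as YYYYMM integers. The column names and sample values here are hypothetical, since the excerpt is cut off before the asker's own code.

```python
import pandas as pd

# Hypothetical sample data: months encoded as YYYYMM integers.
df = pd.DataFrame({"yyyymm": [202201, 202202, 202212]})

# Parse the values as datetimes; the day defaults to the 1st of the month.
parsed = pd.to_datetime(df["yyyymm"].astype(str), format="%Y%m")

# Option 1: monthly Period values, which carry no day component.
df["month_period"] = parsed.dt.to_period("M")

# Option 2: plain "YYYY-MM" strings (no longer a datetime-like dtype).
df["month_str"] = parsed.dt.strftime("%Y-%m")

print(df)
```

A Period column keeps date arithmetic and sorting available, while the string column is only suitable for display or as a join key.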

Difference between using pandas drop_duplicates or value_counts on a whole frame or one column

I am a new Python user, just trying to finish my homework, but I am willing to dig deeper when I run into questions. The problem is from my professor's sample code for d
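As a rough illustration of the distinction raised in the title, the sketch below applies both methods to a whole frame and to a single column. The frame and its column names are made up for the example and are not the professor's sample code.

```python
import pandas as pd

# Hypothetical frame with one fully repeated row: ("a", 1) appears twice.
df = pd.DataFrame({"name": ["a", "a", "b", "a"], "score": [1, 1, 2, 3]})

# Whole frame: keep one copy of each distinct (name, score) row.
unique_rows = df.drop_duplicates()

# One column: keep the first row seen for each distinct name.
unique_names = df.drop_duplicates(subset="name")

# Whole frame: count how often each distinct (name, score) row occurs.
row_counts = df.value_counts()

# One column: count how often each distinct name occurs.
name_counts = df["name"].value_counts()

print(unique_rows, unique_names, row_counts, name_counts, sep="\n\n")
```

In short, drop_duplicates returns the surviving rows, while value_counts returns a Series of counts indexed by the distinct values.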

How to calculate the session change of daily bars

I have a DF that looks like: date volume open close high low previous close 2022-05-02 1756159.0 118.38 119.57 120.34 116.49 2022-05-03 3217838.0 119.72 122.4

How to solve error with limits in boxplot (seaborn)?

The code used to plot the box plot: import seaborn as sns ax= sns.boxplot(x = "Current_Sim_Az_obj1",y= "RUT_Distance",data = df2,whis = (0,100),meanline= True,s

Refreshing data from csv in python using pandas

I'm new to Python and trying to learn it on the go. I'm trying to make a data entry phonebook using Python with pandas. Here is the code I wrote: import pandas

Make a list from a data frame that has repeated and non repeat values in columns

I have a data frame like this data = [['Ma', 1,'too'], ['Ma', 1,'taa'], ['Ma', 1,'tuu',],['Ga', 2,'too'], ['Ga', 2,'taa'], ['Ga', 2,'tuu',]] df = pd.DataFra

How to replace a list inside a multidimensional array?

I was solving this question on SO and faced a few problems with the methods I was trying. OP has a list which looks like this, a = [[[100, 90, 80, 255],

Reading a CSV file with pandas works in Google Colab but raises an error in PyCharm; how do I resolve it? [duplicate]

I am getting a URL error; how do I resolve it? C:\Python\python.exe E:/data_science/Python_basic/module1_eda/EDA.py Traceback (most recent

pandas datetime to unix timestamp seconds

From the official documentation of pandas.to_datetime we can say, unit : string, default ‘ns’ unit of the arg (D,s,ms,us,ns) denote the unit,
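One common way to get Unix timestamps in seconds is to subtract the epoch and floor-divide by a one-second Timedelta. The sketch below assumes timezone-naive datetimes that represent UTC; the sample values are hypothetical.

```python
import pandas as pd

# Hypothetical timezone-naive datetimes, interpreted as UTC.
s = pd.Series(pd.to_datetime(["1970-01-01 00:00:01", "2000-01-01 00:00:00"]))

# Subtract the Unix epoch, then floor-divide by one second to get integers.
unix_seconds = (s - pd.Timestamp("1970-01-01")) // pd.Timedelta("1s")

print(unix_seconds.tolist())

# Round trip: unit="s" tells to_datetime that the integers are seconds.
restored = pd.to_datetime(unix_seconds, unit="s")
```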

How can I save a dataframe into an excel sheet based on number of the worksheet (not a name)?

Here is my DF. data3 = {'DCF Years': ['1st', '2nd', '3rd','4th','5th'], 'DCF Amt': ['8.5', '6.5', '10.5', '4.5', '12.5']} df = pd.DataFrame (data3, columns

Expanding Records Based On Date Range Pandas

I am attempting to expand the records in a data frame between two dates. Given the input file of single entry for each record, I want to expand it based on a gi

Expanding dataset and filling missing dates in Pandas

The raw dataset is below: DF When the start and end dates differ, we require daily granularity. Daily granularity ensures each row has the same start and end d

Splitting Array of Lists into named subarrays

Splitting Arrays for Test Train Essentially I am attempting to convert a pandas dataframe into numpy arrays so that I can run it through a Test/Train. My goal h

Pandas datetime column to ordinal

I'm trying to create a new Pandas dataframe column with ordinal day from a datetime column: import pandas as pd from datetime import datetime print df.ix[0:5]
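A small sketch of one possible approach, assuming the column already has a datetime64 dtype. The frame and column names are hypothetical because the excerpt is truncated before the asker's data appears.

```python
import datetime

import pandas as pd

# Hypothetical frame with a datetime64 column.
df = pd.DataFrame({"date": pd.to_datetime(["2000-01-01", "2000-01-02"])})

# Each element is a pandas Timestamp, which inherits toordinal() from
# datetime.datetime (day 1 is 0001-01-01 in the proleptic Gregorian calendar).
df["ordinal"] = df["date"].map(lambda ts: ts.toordinal())

print(df)
```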

"Reindex" only fills the first two rows with new values

I am new to stackoverflow. I hope I can formulate my question clearly. I am using reindex to fill out missing dates in a pandas dataframe: df = pd.read_csv('myf

Python does not recognize R functions with "." in the function name

When I call the contrasts.fit function in the R package limma, I get an error, which I suspect is caused by the "." in the name; Python may not recognize contrasts.fit

How to get the previous row's close and apply it to the next row in a new column called previous close

I have a bunch of daily ohlc data like so: date,volume,open,close,high,low 2022-05-02,1756159.0,118.38,119.57,120.34,116.49 2022-05-03,3217838.0,119.72,122.4,12
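A minimal sketch of the usual shift-based approach, using only the date and close values of the two complete rows visible in the excerpt. It assumes one row per trading day; the other columns are omitted for brevity.

```python
import pandas as pd

# The two complete rows from the excerpt, reduced to date and close.
df = pd.DataFrame(
    {
        "date": pd.to_datetime(["2022-05-02", "2022-05-03"]),
        "close": [119.57, 122.4],
    }
)

# Make sure rows are in chronological order before shifting.
df = df.sort_values("date").reset_index(drop=True)

# shift(1) moves each close down one row, so every row sees the prior close.
# The first row has no predecessor and therefore gets NaN.
df["previous close"] = df["close"].shift(1)

print(df)
```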

I don't get SettingWithCopyWarning [duplicate]

I love pandas and have been working in the library for years now, but I have never understood SettingWithCopyWarning or its documentation. In

Question about selecting rows and columns from a DataFrame (Python) [duplicate]

I'm following this tutorial to select specific rows and columns from a DataFrame. The tutorial example shows that you can use: adult_names = t

CTF Who can train the neural network, CNN and Signal Processing?

I was trying the challenge in InsomniHack and could not figure it out for 4 weeks. In this example, print(y.shape, train_data.shape, train_labels.shape)(5000,)