I have been running seasonal_decompose() from statsmodels on about 20 completely different datasets. Is it standard that the seasonality is 7 when looking at a
I am trying to extract the data in the table at https://www.ecoregistry.io/emit-certifications/ra/10 . Using the Google developer tools > Network tab, I am able
I want to save my data in CSV format. I have some sentences and I want to save each sentence in a separate row, but the output is like this: This is my c
I want to be able to organize data for efficiency and constantly update the order of that data based on frequency of access, relevancy, and accuracy. For exampl
I have a data frame and I want to generate a new column of colour codes which starts from red for the lowest value of Opportunity and moves toward green for hi
I'm currently working with the Python framework Prefect (prefect.io). I wrote the code below: from prefect import Flow, task @task def say_hello(): print('H
I was searching for the best approaches to feature selection in a regression problem and came across a post suggesting mutual information for regression. I tried the same
I'm an aspiring data scientist. I came across the Titanic dataset and tried to use logistic regression for the problem. However, I got stuck. Since I have tw
I have a mining dataset with the following features: Rock_type and Gold in grams (AU). Rock type has 8 different rock types and Gold (AU) has pr
I ran a series of simulations and want to create a response surface of the performance based on my two parameters, tol and eta. The issue I'm having is actuall
If I have data that easily fits into memory, but I need to iterate over it hundreds or thousands of times, is there a faster way? For instance, if I have 400k d
My input y and both y_train and y_eval are binary ints. What am I doing wrong? I noticed the predictions coming out look like this: [0.,1.,0. ...] which is proba
parent_folder / subfolder1 / subsubfolder1/ a.py b.py subsubfolder2/ c.py d.py e.py subfolder2 / subsubfolder2/ f.py g.py subfolder3 / h.py i.py g.py I want to
While working on a project I have come across a weird error: fitting my model works perfectly, but when I apply grid search it gives me an error. The code p
I have the following 2 dfs: diag id encounter_key start_of_period end_of_period 1 AAA 2020-06-12 2021-07-07 1 BBB 2021-12-31 2022-01-04 drug id start_datetime
Following is my sample data: data = {850.0: 6, -852.0: 5, 992.0: 29, -993.0: 25, 990.0: 27, -992.0: 28, 965.0: 127, 988.0: 37, -994.0: 24, 996.0: 14, -996.0: 1
I tried to do it with sp.hstack() and with
I'm new to machine learning and I'm trying to train a model. I'm using this official Keras example as a guide to set up my dataset and feed it into the model: https
You can see my dataframe below. The x values differ, but the other values are the same as the values to their left; for example, column 15 and column 16 have the same value. I
I am struggling with a data-heavy project. I can run a file that uses a query file -- all the queries and converters are in there -- without problems, but whe