Ratio between predicted and actual in confidence intervals

I have a collection of vectors, each containing 6 parameters sampled within a $3\sigma$ range, i.e., each parameter takes values of mean $\pm 3\sigma$ (I will call this interval 3sig for short). My goal is to predict a vector of 2498 elements for each vector of parameters. To do so I created a RandomForest regressor, and it did the job. The next step is to evaluate the model performance by comparing the predicted results with the fiducial (real) values. A minimal sketch of this setup follows.
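For context, this is roughly what the setup looks like with toy data (every name here, X, Y, model, pred, is my own choice, not taken from the actual pipeline; sklearn's RandomForestRegressor supports multi-output regression natively). The question later applies np.exp to y_test, which suggests the real targets are stored as logs; the toy data ignores that detail:

import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(200, 6)))      # 6 parameters per sample
Y = pd.DataFrame(rng.normal(size=(200, 2498)))   # 2498-element target per sample

X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.2,
                                                    random_state=0)

model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(X_train, y_train)                      # multi-output handled natively
pred = pd.DataFrame(model.predict(X_test))       # columns 0..2497, index 0..n-1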

The first step is to filter the data into confidence intervals of 1sig, 2sig and 3sig and see how the predictor performs on each of these intervals, by analyzing the ratio between predicted and actual values (in particular, the maximum and minimum of this ratio). A sketch of this selection follows.
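Concretely, the selection I have in mind looks like this (a sketch, continuing from the toy setup above; mu, sigma and within_k_sigma are my own hypothetical names, and I am assuming mu and sigma are the per-parameter means and standard deviations):

# Masks for nested confidence intervals (mu, sigma assumed to be the
# per-parameter means and standard deviations of the parameter grid)
mu = X_train.mean()
sigma = X_train.std()

def within_k_sigma(X, mu, sigma, k):
    # True for rows whose 6 parameters all lie within mu +/- k*sigma
    return (np.abs(X - mu) <= k * sigma).all(axis=1)

mask_1sig = within_k_sigma(X_test, mu, sigma, 1)
mask_2sig = within_k_sigma(X_test, mu, sigma, 2)
mask_3sig = within_k_sigma(X_test, mu, sigma, 3)
# By construction, mask_1sig implies mask_2sig implies mask_3sig,
# so the 1sig rows are a subset of the 2sig rows, and so on.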

To get the ratio between the predictions "pred" and the actual values "y_test", I used:

Ratio_All = pred.div(np.exp(y_test).reset_index(drop=True).set_axis(list(range(28)), axis=1))

The reset_index part makes the row labels start at 0; otherwise pandas aligns the two dataframes on their labels and the division produces NaN values. The set_axis call does the same for the column labels.
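As a tiny, self-contained illustration of that alignment pitfall (toy frames of my own, not the actual data):

import numpy as np
import pandas as pd

a = pd.DataFrame(np.ones((3, 2)))                        # index 0..2, columns 0..1
b = pd.DataFrame(2 * np.ones((3, 2)),
                 index=[5, 7, 9], columns=['c0', 'c1'])  # non-matching labels

print(a.div(b))                                          # all NaN: labels do not align
print(a.div(b.reset_index(drop=True)
             .set_axis([0, 1], axis=1)))                 # 0.5 everywhere after relabelling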

With this ratio, I can extract the column-wise maximum and minimum values by applying .max() and .min():

import matplotlib.pyplot as plt

plt.figure(figsize=(8, 6), dpi=100)
plt.xscale('log')
plt.yscale('linear')
plt.xlim(2, 2500)
plt.xlabel(r'$\ell$')
plt.ylabel(r'$\hat{C}_\ell^\mathrm{TT}/C_\ell^\mathrm{TT}$')
plt.axhline(y=1, color='k', linestyle='--')  # ratio of 1 = perfect prediction
plt.plot(l, Ratio_All.max(), '-b')           # column-wise maximum of the ratio
plt.plot(l, Ratio_All.min(), '-r')           # column-wise minimum of the ratio

As output, I get the following figure:

[Figure: maximum (blue) and minimum (red) of the ratio over the 3sig data]

This figure corresponds to the maximum and minimum of the ratio over the confidence interval of 3sig. Now, doing the same for the data in 2sig and in 1sig, I should get two similar pairs of curves, but INSIDE the 3sig curves, because the 1sig data is a subset of the 2sig data and the 2sig data is a subset of the 3sig data (right?). But that's not what I get; instead, I get the following figure:

[Figure: max/min ratio curves for 3sig (black), 2sig (yellow), 1sig (green), and a new 3sig test set (red)]

(It's the same plot as before, but in log scale for $\ell < 30$.) The black lines correspond to the maximum and minimum of the ratio in 3sig, the green lines are the data in 1sig and the yellow lines are the data in 2sig. The red lines are a "test data" set in 3sig that I generated to test again, and it performs A LOT worse than the original training set...
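To spell out the subset expectation in code (a sketch, reusing the hypothetical masks from above; since Ratio_All has a reset positional index, I convert the masks to plain arrays before filtering):

# Masking the same ratio table with nested masks must give nested envelopes:
Ratio_1sig = Ratio_All[mask_1sig.to_numpy()]
Ratio_2sig = Ratio_All[mask_2sig.to_numpy()]

assert (Ratio_1sig.max() <= Ratio_2sig.max()).all()  # 1sig max inside 2sig max
assert (Ratio_1sig.min() >= Ratio_2sig.min()).all()  # 1sig min inside 2sig min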

My question is: is my approach wrong, and if so, where? How can I fix it?

My hypothesis is: the predictor is an expert at predicting data in 3sig, but performs worse when evaluated on the data in 1sig and 2sig. The solution would be to build a predictor for each interval (sketched below), but that does not explain why the red curve is worse than the black one, when they should be comparable (since it's the same confidence interval).
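That per-interval idea would look something like this (purely hypothetical, reusing the names from the earlier sketches):

# Hypothetical fix: train one regressor per confidence interval
models = {}
for k in (1, 2, 3):
    mask = within_k_sigma(X_train, mu, sigma, k)
    models[k] = RandomForestRegressor(n_estimators=50, random_state=0)
    models[k].fit(X_train[mask], y_train[mask])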


