Using forestci to create error bars for random forest regression algorithms

I am using a program called GALPRO, which uses a random forest regression algorithm to predict photometric redshifts. I supply training and testing data: x_train (dimensions = [90,13]), x_test (dimensions = [10,13]), y_train (dimensions = [90,2]) and y_test (dimensions = [10,2]).

The code below shows how GALPRO does the random forest regression calculation:

model = RandomForestRegressor(**self.params)
model.fit(x_train, y_train)

I then make point estimate predictions using:

# Use the model to make predictions on new objects
y_pred = model.predict(x_test)

I am then trying to create error estimates using random_forest_error from the forestci package:

y_error = fci.random_forest_error(model, x_train, x_test)

However, I get an error:

ValueError                                Traceback (most recent call last)
/tmp/ipykernel_2626600/1096083143.py in <module>
----> 1 point_estimates = model.point_estimate(save_estimates=True, make_plots=False)
      2 print(point_estimates)

/scratch/wiay/lara/galpro/galpro/model.py in point_estimate(self, save_estimates, make_plots)
    158         # Use the model to make predictions on new objects
    159         y_pred = self.model.predict(self.x_test)
--> 160         y_error = fci.random_forest_error(self.model, self.x_train, self.x_test)
    161 
    162         # Update class variables

~/.local/lib/python3.7/site-packages/forestci/forestci.py in random_forest_error(forest, X_train, X_test, inbag, calibrate, memory_constrained, memory_limit)
    279     n_trees = forest.n_estimators
    280     V_IJ = _core_computation(
--> 281         X_train, X_test, inbag, pred_centered, n_trees, memory_constrained, memory_limit
    282     )
    283     V_IJ_unbiased = _bias_correction(V_IJ, inbag, pred_centered, n_trees)

~/.local/lib/python3.7/site-packages/forestci/forestci.py in _core_computation(X_train, X_test, inbag, pred_centered, n_trees, memory_constrained, memory_limit, test_mode)
    135     """
    136     if not memory_constrained:
--> 137         return np.sum((np.dot(inbag - 1, pred_centered.T) / n_trees) ** 2, 0)
    138 
    139     if not memory_limit:

<__array_function__ internals> in dot(*args, **kwargs)

ValueError: shapes (90,100) and (100,10,2) not aligned: 100 (dim 1) != 10 (dim 1)

I'm not sure what this error means or why my dimensions are wrong as I am following a similar example. If anyone has any ideas please let me know!
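
One possible reading of the message, offered as a sketch rather than a confirmed diagnosis: y_train has two columns, so each tree predicts an array of shape (10, 2) instead of (10,). Stacking 100 such predictions gives the three-dimensional (100, 10, 2) operand in the traceback, and np.dot cannot align it with the (90, 100) in-bag matrix. The NumPy-only example below reproduces those shapes with synthetic data. The helper ij_variance_core is a hypothetical name that copies only the expression from line 137 of the traceback, the centering over trees is an assumption, and bias correction and calibration are left out.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, n_trees, n_outputs = 90, 10, 100, 2

# Synthetic stand-in for the in-bag count matrix (train samples x trees).
inbag = rng.integers(0, 3, size=(n_train, n_trees)).astype(float)

# Synthetic stand-in for stacking tree.predict(x_test) over all trees of a
# forest fitted on a two-column y_train: one (n_test, n_outputs) array per tree.
per_tree_pred = rng.normal(size=(n_trees, n_test, n_outputs))


def ij_variance_core(inbag, pred_centered, n_trees):
    """Expression copied from the traceback (forestci.py, line 137)."""
    return np.sum((np.dot(inbag - 1, pred_centered.T) / n_trees) ** 2, 0)


# Transposing the stack gives (n_outputs, n_test, n_trees); its transpose,
# which is what np.dot receives, is (100, 10, 2) as in the error message.
pred = per_tree_pred.T
# Assumption: centre each prediction on its mean over the trees.
pred_centered = pred - pred.mean(axis=-1, keepdims=True)

try:
    ij_variance_core(inbag, pred_centered, n_trees)
    mismatch = None
except ValueError as exc:
    mismatch = str(exc)
print("multi-output call failed with:", mismatch)

# Handling one output column at a time restores the 2-D shapes:
# (90, 100) dot (100, 10) -> (90, 10), summed over axis 0 -> (10,).
variances = np.column_stack(
    [ij_variance_core(inbag, pred_centered[k], n_trees) for k in range(n_outputs)]
)
print(variances.shape)
```

If this reading is right, one workaround to try is fitting a separate RandomForestRegressor on each column of y_train and calling fci.random_forest_error once per fitted model, so that every call sees one-dimensional predictions. Whether the installed forestci version handles multi-output forests directly is worth checking in its documentation.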

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
