Using forestci to create error bars for random forest regression algorithms
I am using a program called GALPRO, which uses a random forest regression algorithm, to predict photometric redshifts. I supply training and testing data: x_train (dimensions = [90,13]), x_test (dimensions = [10,13]), y_train (dimensions = [90,2]) and y_test (dimensions = [10,2]).
The code below shows how GALPRO does the random forest regression calculation:
from sklearn.ensemble import RandomForestRegressor
model = RandomForestRegressor(**self.params)
model.fit(x_train, y_train)
I then make point estimate predictions using:
# Use the model to make predictions on new objects
y_pred = model.predict(x_test)
I am then trying to create error estimates using the forestci package random_forest_error:
import forestci as fci
y_error = fci.random_forest_error(model, x_train, x_test)
However I get an error:
ValueError Traceback (most recent call last)
/tmp/ipykernel_2626600/1096083143.py in <module>
----> 1 point_estimates = model.point_estimate(save_estimates=True, make_plots=False)
2 print(point_estimates)
/scratch/wiay/lara/galpro/galpro/model.py in point_estimate(self, save_estimates, make_plots)
158 # Use the model to make predictions on new objects
159 y_pred = self.model.predict(self.x_test)
--> 160 y_error = fci.random_forest_error(self.model, self.x_train, self.x_test)
161
162 # Update class variables
~/.local/lib/python3.7/site-packages/forestci/forestci.py in random_forest_error(forest, X_train, X_test, inbag, calibrate, memory_constrained, memory_limit)
279 n_trees = forest.n_estimators
280 V_IJ = _core_computation(
--> 281 X_train, X_test, inbag, pred_centered, n_trees, memory_constrained, memory_limit
282 )
283 V_IJ_unbiased = _bias_correction(V_IJ, inbag, pred_centered, n_trees)
~/.local/lib/python3.7/site-packages/forestci/forestci.py in _core_computation(X_train, X_test, inbag, pred_centered, n_trees, memory_constrained, memory_limit, test_mode)
135 """
136 if not memory_constrained:
--> 137 return np.sum((np.dot(inbag - 1, pred_centered.T) / n_trees) ** 2, 0)
138
139 if not memory_limit:
<__array_function__ internals> in dot(*args, **kwargs)
ValueError: shapes (90,100) and (100,10,2) not aligned: 100 (dim 1) != 10 (dim 1)
I'm not sure what this error means or why my dimensions are wrong as I am following a similar example. If anyone has any ideas please let me know!
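One likely reading of the traceback is that the two-column y_train is the cause: each tree then predicts an array of shape (10, 2), so the transposed prediction array passed to np.dot is 3-D with shape (100, 10, 2), whereas the expression on line 137 of forestci.py appears to expect a 2-D array of shape (100, 10). The sketch below reproduces only the shapes with NumPy. The helper name `core` and the random placeholder arrays are my own assumptions, not forestci code beyond the single expression copied from the traceback, and the numbers it produces have no statistical meaning.

```python
import numpy as np

n_train, n_test, n_trees, n_outputs = 90, 10, 100, 2
rng = np.random.default_rng(0)

# Placeholder in-bag counts: one row per training sample, one column per tree.
inbag = rng.integers(0, 3, size=(n_train, n_trees))


def core(inbag, pred_centered, n_trees):
    # Same expression as line 137 of forestci.py in the traceback above.
    return np.sum((np.dot(inbag - 1, pred_centered.T) / n_trees) ** 2, 0)


# Single-output case: predictions are (n_test, n_trees), so the dot product aligns.
pred_single = rng.normal(size=(n_test, n_trees))
single = core(inbag, pred_single, n_trees)
print(single.shape)

# Two-output case: predictions are (n_outputs, n_test, n_trees); the transpose
# is (n_trees, n_test, n_outputs) and np.dot refuses to align the axes.
pred_multi = rng.normal(size=(n_outputs, n_test, n_trees))
raised = False
try:
    core(inbag, pred_multi, n_trees)
except ValueError as exc:
    raised = True
    print(exc)

# Treating each output column separately restores the 2-D shape.
per_output = np.stack(
    [core(inbag, pred_multi[k], n_trees) for k in range(n_outputs)], axis=1
)
print(per_output.shape)
```

If this reading is right, one possible workaround is to fit a separate forest for each column of y_train and call random_forest_error once per forest. Whether the installed forestci version handles multi-output forests directly is something to check in its documentation.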
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
