PyCaret ensemble_model() function accepts ONLY one model for the Bagging method

I'm using the PyCaret library.

First, I compare the best models on my dataset with:

top4 = compare_models(sort = 'RMSE', fold = 5, n_select=4)

Results:

          Model                            MAE     MSE     RMSE    R2      RMSLE   MAPE                TT (Sec)
rf        Random Forest Regressor          1.3217  5.8631  2.4211  0.8420  0.2463  1261552464798.9094  6.1560
et        Extra Trees Regressor            1.3251  5.9131  2.4313  0.8407  0.2468  771463480917.2954   3.3100
lightgbm  Light Gradient Boosting Machine  1.4299  6.1775  2.4852  0.8336  0.2524  1751772071502.1594  0.1680
knn       K Neighbors Regressor            1.3378  6.2570  2.5011  0.8315  0.2513  1410202704281.8367  0.2420

Next, I want to improve my RMSE by trying ensembling methods such as Bagging with the PyCaret library.

The Bagging method is implemented by PyCaret's ensemble_model() function.

I get an error with ensemble_model(top4) (the default method is "Bagging"):

ValueError: Estimator [RandomForestRegressor(bootstrap=True, ccp_alpha=0.0, criterion='mse',
                      max_depth=None, max_features='auto', max_leaf_nodes=None,
                      max_samples=None, min_impurity_decrease=0.0,
                      min_impurity_split=None, min_samples_leaf=1,
                      min_samples_split=2, min_weight_fraction_leaf=0.0,
                      n_estimators=100, n_jobs=-1, oob_score=False,
                      random_state=123, verbose=0, warm_start=False), ExtraTreesRegressor(bootstrap=False, ccp_alpha=0.0, criterion='mse',
                    max_depth=None, max_features='auto', max_leaf_nodes=None,
                    max_samples=None, min_impurity_decrease=0.0,
                    min_impurity_split=None, min_samples_leaf=1,
                    min_samples_split=2, min_weight_fraction_leaf=0.0,
                    n_estimators=100, n_jobs=-1, oob_score=False,
                    random_state=123, verbose=0, warm_start=False), LGBMRegressor(boosting_type='gbdt', class_weight=None, colsample_bytree=1.0,
              importance_type='split', learning_rate=0.1, max_depth=-1,
              min_child_samples=20, min_child_weight=0.001, min_split_gain=0.0,
              n_estimators=100, n_jobs=-1, num_leaves=31, objective=None,
              random_state=123, reg_alpha=0.0, reg_lambda=0.0, silent='warn',
              subsample=1.0, subsample_for_bin=200000, subsample_freq=0), KNeighborsRegressor(algorithm='auto', leaf_size=30, metric='minkowski',
                    metric_params=None, n_jobs=-1, n_neighbors=5, p=2,
                    weights='uniform')] does not have the required fit() method.

It seems ensemble_model() accepts ONLY one model as its argument.

To make it work, I have to call ensemble_model() with only one of the 4 best models, e.g. ensemble_model(top4[0]) or ensemble_model(top4[1]) ...

I don't understand this, because Bagging = Bootstrap + aggregating, so the aggregating step should use several models! I would like to do Bagging with the 4 best models returned by compare_models() above.

But PyCaret's ensemble_model() uses only ONE model. Can you explain this, please? Thanks.
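A note on terminology that may help here: as bagging is usually defined, the "several models" in the aggregating step are several fits of the same base learner, each trained on a different bootstrap resample, which would be consistent with ensemble_model() taking a single estimator. Below is a minimal pure-Python sketch of that idea. The names fit_line, bagging_fit and bagging_predict and the toy data are hypothetical, chosen only for illustration; this is not PyCaret's implementation.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Hypothetical base learner: least-squares line for a single feature."""
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    # A resample may contain only one distinct x; fall back to a constant then.
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var if var else 0.0
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

def bagging_fit(xs, ys, n_estimators=10, seed=123):
    """Train n_estimators copies of ONE base learner on bootstrap resamples."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_estimators):
        # Bootstrap: draw n row indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def bagging_predict(models, x):
    """Aggregating: average the predictions of all fitted copies."""
    return mean(m(x) for m in models)

# Toy data lying on y = 2x + 1 (an assumption made for this sketch only).
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]
models = bagging_fit(xs, ys)
prediction = bagging_predict(models, 4.5)
print(len(models), prediction)
```

Combining several different fitted models is usually called blending or stacking rather than bagging; PyCaret provides blend_models() and stack_models() for that, and both take a list of estimators.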



Solution 1:[1]

It seems that you are looking for Gray codes.

C# code

public static IEnumerable<int[]> GrayCodes(int length, int radix) {
  if (length < 0)
    throw new ArgumentOutOfRangeException(nameof(length));
  if (radix < 0)
    throw new ArgumentOutOfRangeException(nameof(radix));

  if (0 == length || 0 == radix)
    yield break;

  static int digit(long n, int radix, int i) =>
    (int)(Math.Floor(n / Math.Pow(radix, i)) % radix);

  double count = Math.Pow(radix, length);

  long max = count > long.MaxValue ? long.MaxValue : (long)count;

  for (long i = 0; i < max; ++i) {
    int[] result = new int[length];
    int shift = 0;

    for (int j = length - 1; j >= 0; j--) {
      var x = (digit(i, radix, j) + shift) % radix;

      shift += radix - x;
      result[length - j - 1] = x;
    }

    yield return result;
  }
}

Demo

var report = string.Join(Environment.NewLine, GrayCodes(2, 3)
  .Select(g => string.Join(" ", g)));

Console.Write(report);

Outcome:

0 0
0 1
0 2
1 2
1 0
1 1
2 1
2 2
2 0

If you want to flatten the sequence, add SelectMany:

var report = string.Join(", ", GrayCodes(2, 3).SelectMany(g => g));

Console.WriteLine(report);

Console.WriteLine();

// Your case, length = 3, radix = 2
report = string.Join(", ", GrayCodes(3, 2).SelectMany(g => g));

Console.WriteLine(report);

Outcome:

0, 0, 0, 1, 0, 2, 1, 2, 1, 0, 1, 1, 2, 1, 2, 2, 2, 0

0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0
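For readers working in Python, here is a rough sketch of the same shift-based construction. It is a port written for illustration, not part of the original answer: the name gray_codes is an assumption, and it walks the counting order with itertools.product instead of extracting digits with Math.Pow, so it stays in integer arithmetic.

```python
from itertools import product

def gray_codes(length, radix):
    """Yield tuples of `length` base-`radix` digits, most significant first.

    Consecutive tuples differ in exactly one position.
    """
    if length < 0:
        raise ValueError("length must be non-negative")
    if radix < 0:
        raise ValueError("radix must be non-negative")
    if length == 0 or radix == 0:
        return

    # product(...) enumerates digit tuples in ordinary counting order.
    for digits in product(range(radix), repeat=length):
        shift = 0
        result = []
        for d in digits:
            x = (d + shift) % radix
            # Same running shift as in the C# loop above.
            shift += radix - x
            result.append(x)
        yield tuple(result)

for code in gray_codes(2, 3):
    print(" ".join(map(str, code)))
```

The flattened form corresponds to itertools.chain.from_iterable(gray_codes(3, 2)).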

Edit: If "only one change at every step" means not just "in one position" (so 2 1 0 -> 2 1 2 is allowed) but "in one position and by +1 or -1 only", you can still use a Gray code. The algorithm (let me name it "To and Fro") is below.

  1. Start from all zeroes: 0..000
  2. Increment the nth (last) digit while possible: 0..000 -> 0..001 -> ... -> 0..00d, where d = radix - 1 is the largest digit
  3. Increment the (n-1)th digit and start decrementing the nth digit: 0..00d -> 0..01d -> ... -> 0..011 -> 0..010
  4. Increment the (n-1)th digit again and start incrementing the nth digit: 0..010 -> 0..020 -> 0..021 -> ... -> 0..02d
  5. When neither of the last two digits can move any further (0..0dd in the demo below), increment the (n-2)th digit and start moving the nth digit in the opposite direction, etc.

Demo

# start from 000
    000 -> 001 -> 002 -> 
# increment n-1 th digit, decrementing n-th
 -> 012 -> 011 -> 010 ->    
# increment n-1 th digit, incrementing n-th
 -> 020 -> 021 -> 022 -> 
# increment n-2 th digit, decrementing n-th
 -> 122 -> 121 -> 120 ->
# decrement n-1 th digit, incrementing n-th
 -> 110 -> 111 -> 112 ->
# decrement n-1 th digit, decrementing n-th
 -> 102 -> 101 -> 100 -> 
# increment n-2 th digit, incrementing n-th
 -> 200 -> 201 -> 202 ->
# increment n-1 th digit, decrementing n-th
 -> 212 -> 211 -> 210 -> 
# increment n-1 th digit, incrementing n-th
 -> 220 -> 221 -> 222

C# Code: (Please, fiddle)

public static IEnumerable<int[]> GrayCodes(int length, int radix) {
  if (length < 0)
    throw new ArgumentOutOfRangeException(nameof(length));
  if (radix < 0)
    throw new ArgumentOutOfRangeException(nameof(radix));

  if (0 == length || 0 == radix)
    yield break;

  int[] signs = Enumerable.Repeat(1, length).ToArray();
  int[] current = new int[length];

  for (bool keep = true; keep; ) {
    yield return current.ToArray();

    keep = false;

    for (int i = current.Length - 1; i >= 0; --i) {
      int d = current[i] + signs[i];

      if (d >= 0 && d < radix) {
        current[i] = d;

        for (int j = i + 1; j < signs.Length; ++j)
          signs[j] = -signs[j];
        
        keep = true;

        break;
      }
    }
  }
}

Demo:

Console.Write(string.Join(Environment.NewLine, GrayCodes(3, 3)
  .Select(line => string.Join(" ", line))));

Outcome:

0 0 0
0 0 1
0 0 2
0 1 2
0 1 1
0 1 0
0 2 0
0 2 1
0 2 2
1 2 2
1 2 1
1 2 0
1 1 0
1 1 1
1 1 2
1 0 2
1 0 1
1 0 0
2 0 0
2 0 1
2 0 2
2 1 2
2 1 1
2 1 0
2 2 0
2 2 1
2 2 2
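As a cross-check, here is a Python sketch of the same "To and Fro" idea, together with a small check that every step changes exactly one digit by +1 or -1. The names to_and_fro and single_unit_steps are hypothetical; the logic follows the C# code above.

```python
def to_and_fro(length, radix):
    """Yield digit tuples where each step changes one digit by +1 or -1."""
    if length < 0 or radix < 0:
        raise ValueError("length and radix must be non-negative")
    if length == 0 or radix == 0:
        return

    signs = [1] * length    # current direction of travel for each position
    current = [0] * length

    while True:
        yield tuple(current)

        # Find the rightmost digit that can still move in its direction.
        for i in range(length - 1, -1, -1):
            d = current[i] + signs[i]
            if 0 <= d < radix:
                current[i] = d
                # Every digit to the right now turns around.
                for j in range(i + 1, length):
                    signs[j] = -signs[j]
                break
        else:
            return  # no digit can move: the sequence is complete

def single_unit_steps(codes):
    """True if each consecutive pair differs by exactly 1 in one position."""
    return all(
        sum(abs(a - b) for a, b in zip(prev, nxt)) == 1
        for prev, nxt in zip(codes, codes[1:])
    )

codes = list(to_and_fro(3, 3))
print(len(codes), single_unit_steps(codes))
```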

Solution 2:[2]

The Math

There may be multiple answers. I see Gray codes in the other answer, but this is my own solution. Examples are given in decimal (S=10), as I think decimal is easier to follow than binary. But the method works for any value of S.

For 2 digits decimals (L=2, S=10), I can have:

00, 01, 02, 03, ... 09, 19, 18, 17, 16, ... 10, 20, 21, 22, ...

That is, after 09, instead of 10, we just go to 19 instead, and start counting backwards to 18, 17, ... 10. Then we go to 20, and repeat:

20, ... 29, 39, ... 30, 40, ... 49, 59, ...

Ultimately we will reach 90: 30, 40, ... 50, 60, ... 70, 80, ... 90

But we cannot go to 100, because that would change 2 digits at once. Instead, we can use the same trick of counting backwards:

090, 190, 191, ... 199, 189, 188, ... 180, 170, 171, ... 179, 169, ...

The Function

In Python. Explanation as comments in code.

def f(n, l, s):
    # Get all digits of n. 
    # Example for decimals (s=10), if n = 143, digits = [1, 4, 3]
    digits = []
    cur_n = n
    for i in range(l):
        cur_digit = cur_n%s 
        digits.append(cur_digit)
        cur_n = cur_n // s  # integer division; avoids float rounding for large n
    digits.reverse() # digits = [1, 4, 3]

    # Continuing the same decimals example if n = 143, the output should be 153.
    # Explanation below. We first copy the digits.
    output_digits = list(digits)

    # First digit is the same.
    output_digits[0] = digits[0]

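    # Value of the leading digits digits[0..i], read as a base-s number.
    prefix = 0

    # For each digit
    for i in range(len(digits) - 1):
        prefix = prefix * s + digits[i]
        next_digit = digits[i+1]

        # If the prefix is odd (for even s, same as digits[i] being odd)
        if prefix%2 == 1:
            # Then the next digit is counted backwards: s - 1 - next_digit
            # (9 - next_digit for decimals)
            next_output_digit = s - 1 - next_digit
        else:
            # Else, the next_output_digit is the same as next_digit
            next_output_digit = next_digit
        output_digits[i+1] = next_output_digit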

    # Convert [1, 5, 3] to ["1", "5", "3"]
    output_digits_string = [str(digit) for digit in output_digits]
    # Convert to "153"
    output_string = "".join(output_digits_string)
    # Finally, return output_string
    return int(output_string)

To test the algorithm, run these 2 examples:

for i in range(199): print(f(i,3,10)) # Decimals, 3 digits, first 199 values
for i in range(8): print(f(i,3,2)) # Binaries, 3 digits, all 8 values
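A compact variant that returns the digits as a list may be easier to check, since it keeps leading zeros and also copes with s > 10. This is a sketch with a hypothetical name (reflected_digits), not the function above. It decides whether to count a digit backwards from the parity of the whole leading prefix; for even s that is the same as the parity of the preceding digit, and it is also what makes odd s (for example s=3) come out right.

```python
def reflected_digits(n, l, s):
    """Return the l digits (base s, most significant first) of the n-th code."""
    # Split n into its base-s digits, least significant first, then reverse.
    digits = []
    for _ in range(l):
        n, d = divmod(n, s)
        digits.append(d)
    digits.reverse()

    out = [digits[0]]  # the first digit is kept as it is
    prefix = 0         # value of digits[0..i] read as a base-s number
    for i in range(l - 1):
        prefix = prefix * s + digits[i]
        nxt = digits[i + 1]
        # Odd prefix: this digit is in a "counting backwards" run.
        out.append(s - 1 - nxt if prefix % 2 == 1 else nxt)
    return out

def one_step_apart(a, b):
    """True if a and b differ by exactly 1 in exactly one position."""
    return sum(abs(x - y) for x, y in zip(a, b)) == 1

decimal_codes = [reflected_digits(i, 3, 10) for i in range(1000)]
ternary_codes = [reflected_digits(i, 3, 3) for i in range(27)]
print(reflected_digits(143, 3, 10))
print(all(one_step_apart(a, b) for a, b in zip(ternary_codes, ternary_codes[1:])))
```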

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Register Sole