Which package has griddata? scipy or matplotlib.mlab?

I ran my teacher's code and got the error "cannot import name 'griddata' from 'matplotlib.mlab'". I then tried 'from scipy import griddata', but got "griddata() got an unexpected keyword argument 'interp'" because of the line 'z = griddata(x, y, z, xi, yi, interp='linear')'. I don't know how to modify this code.

Here is the code:

import sys

import numpy as np
from matplotlib import cm
from matplotlib.mlab import griddata  # I used 'from scipy.interpolate import griddata' and it could import, but then showed an error

x = np.genfromtxt(sys.argv[2], usecols=(0))
y = np.genfromtxt(sys.argv[2], usecols=(1))
z = np.genfromtxt(sys.argv[2], usecols=(2))

xi = np.linspace(x.min(), x.max(), 1000)
yi = np.linspace(y.min(), y.max(), 1000)
zi = griddata(x, y, z, xi, yi, interp='linear')

When I ran it I got the error "griddata() got an unexpected keyword argument 'interp'". I changed 'interp' to 'method' because the SciPy docs show it accepts method='linear', but that also got the error "griddata() got an unexpected keyword argument 'method'".
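For reference, scipy.interpolate.griddata is not a drop-in replacement for the removed matplotlib.mlab.griddata: it takes the scattered points as a single tuple, and the target positions as a full 2-D grid (e.g. from meshgrid) rather than two 1-D axes, with the keyword method= instead of interp=. A minimal sketch, using synthetic data in place of the file read from sys.argv:

```python
import numpy as np
from scipy.interpolate import griddata  # SciPy replacement for mlab.griddata

# Synthetic scattered samples standing in for the three file columns.
rng = np.random.default_rng(0)
x = rng.random(200)
y = rng.random(200)
z = np.sin(6 * x) * np.cos(6 * y)

xi = np.linspace(x.min(), x.max(), 100)
yi = np.linspace(y.min(), y.max(), 100)
Xi, Yi = np.meshgrid(xi, yi)  # SciPy wants the full target grid, not 1-D axes

# Points go in as an (x, y) tuple, and the keyword is method=, not interp=.
zi = griddata((x, y), z, (Xi, Yi), method='linear')
print(zi.shape)  # (100, 100)
```

Points falling outside the convex hull of the input data come back as NaN with method='linear'; fill_value= can override that.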



Solution 1:

To get this information you need to get the booster object back. I assume you are using the scikit-learn interface, so for example, with a model with 3 estimators (trees) and a maximum depth of 7:

import xgboost as xgb
from sklearn.datasets import make_classification

X, y = make_classification(random_state=99)
clf = xgb.XGBModel(objective='binary:logistic', n_estimators=3, max_depth=7)
clf.fit(X, y)

In this case, we pull out the booster object and also convert the tree information to a dataframe:

booster = clf.get_booster()

tree_df = booster.trees_to_dataframe()
tree_df[tree_df['Tree'] == 0]

    Tree  Node    ID Feature     Split   Yes    No Missing       Gain  Cover
0      0     0   0-0     f11 -0.233068   0-1   0-2     0-1  48.161629  25.00
1      0     1   0-1      f1 -1.081945   0-3   0-4     0-3   0.054384   9.25
2      0     2   0-2     f14  0.480458   0-5   0-6     0-5   8.410727  15.75
3      0     3   0-3    Leaf       NaN   NaN   NaN     NaN  -0.150000   1.00
4      0     4   0-4    Leaf       NaN   NaN   NaN     NaN  -0.535135   8.25
5      0     5   0-5     f18  0.261421   0-7   0-8     0-7   5.638095   6.50
6      0     6   0-6      f9 -1.585489   0-9  0-10     0-9   0.727795   9.25
7      0     7   0-7     f18 -0.640538  0-11  0-12    0-11   4.342857   4.00
8      0     8   0-8      f0  0.072811  0-13  0-14    0-13   1.028571   2.50
9      0     9   0-9    Leaf       NaN   NaN   NaN     NaN   0.163636   1.75
10     0    10  0-10    Leaf       NaN   NaN   NaN     NaN   0.529412   7.50
11     0    11  0-11    Leaf       NaN   NaN   NaN     NaN  -0.120000   1.50
12     0    12  0-12    Leaf       NaN   NaN   NaN     NaN   0.428571   2.50
13     0    13  0-13    Leaf       NaN   NaN   NaN     NaN  -0.000000   1.00
14     0    14  0-14    Leaf       NaN   NaN   NaN     NaN  -0.360000   1.50

Visualizing the first tree: the depth of a decision tree is the number of splits from the root to a leaf, so this tree has a depth of 4:

xgb.plotting.plot_tree(booster, num_trees=0)

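That depth can also be checked by hand from the trees_to_dataframe output: walk the Yes/No child pointers from the root and count splits along the longest path. A sketch using just the child pointers transcribed from the tree-0 table above (the children dict and depth helper are illustrative, not part of the xgboost API):

```python
# Child pointers (ID -> (Yes, No)) transcribed from the tree-0 table above;
# leaf rows have no children, so they are simply absent from the dict.
children = {
    '0-0': ('0-1', '0-2'),
    '0-1': ('0-3', '0-4'),
    '0-2': ('0-5', '0-6'),
    '0-5': ('0-7', '0-8'),
    '0-6': ('0-9', '0-10'),
    '0-7': ('0-11', '0-12'),
    '0-8': ('0-13', '0-14'),
}

def depth(node='0-0'):
    """Number of splits on the longest root-to-leaf path."""
    if node not in children:  # leaf
        return 0
    yes, no = children[node]
    return 1 + max(depth(yes), depth(no))

print(depth())  # 4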

Maybe there's a better solution, but as a quick approach I used the solution from this post to iterate through the JSON dump and calculate the depth of each tree:

import json

def item_generator(json_input, lookup_key):
    """Recursively yield every value stored under lookup_key in nested JSON."""
    if isinstance(json_input, dict):
        for k, v in json_input.items():
            if k == lookup_key:
                yield v
            else:
                yield from item_generator(v, lookup_key)
    elif isinstance(json_input, list):
        for item in json_input:
            yield from item_generator(item, lookup_key)

def tree_depth(json_text):
    """Deepest 'depth' value in one tree's dump, plus one for the leaf level."""
    json_input = json.loads(json_text)
    return max(item_generator(json_input, 'depth')) + 1

[tree_depth(x) for x in booster.get_dump(dump_format = "json")]

[4, 4, 5]

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
