One example of Decision Tree Learning

One example of extrapolating from data with a decision tree

T Miyamoto
1 min read · Sep 30, 2020

A decision tree is a machine learning algorithm based on a flowchart-like structure. It comes in two flavors, classification trees and regression trees, depending on whether the target values are discrete or continuous. It is often said that tree-based models cannot extrapolate beyond the range of their training data, so they have no clear advantage for making predictions on time-series datasets.
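To make that limitation concrete, here is a minimal sketch (not part of the setup below, just an illustration) of a tree trained on a simple linear trend: outside the training range, every prediction collapses to the value of the outermost leaf.

import numpy as np
from sklearn.tree import DecisionTreeRegressor

x_train = np.arange(50).reshape(-1, 1)
y_train = 0.5 * x_train.ravel()                      # simple linear trend
tree = DecisionTreeRegressor(max_depth=5, random_state=0).fit(x_train, y_train)

x_test = np.array([[60.0], [70.0], [80.0]])          # points beyond the training range
print(tree.predict(x_test))                          # all predictions repeat the right-most leaf value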

But under certain conditions we can extrapolate, for instance when the dataset has cyclical properties. Such a dataset, with noise, can be created as follows:

import numpy as np
import pandas as pd

# Daily date range with month, day, a small noise term, and a cyclical target v1
# (np.random.randint's high bound is exclusive, so noise0 takes the values -1 or 0)
df_data = pd.DataFrame(pd.date_range(start='2020-05-25', end='2020-09-30', freq='D'), columns=['date0'])
df_data['month0'] = df_data['date0'].dt.month
df_data['day0'] = df_data['date0'].dt.day
df_data['noise0'] = np.random.randint(low=-1, high=1, size=(df_data.shape[0],))
df_data['v1'] = df_data['month0'] * 0.1 + np.sin(df_data['day0'] * 1.0) / (2 + 1.5 * np.sin(df_data['day0'] * 1.5)) + df_data['noise0']
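To see the cyclical shape of v1 before training, the series can be plotted. This is a minimal sketch that assumes matplotlib is available; it is not required for the rest of the example.

import matplotlib.pyplot as plt

# Quick visual check: a slow monthly drift plus a day-of-month cycle with noise
plt.plot(df_data['date0'], df_data['v1'])
plt.xlabel('date')
plt.ylabel('v1')
plt.show()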

Then I split the data, train a DecisionTreeRegressor imported from sklearn.tree, and use the fitted model to make predictions:

from sklearn.tree import DecisionTreeRegressor
from sklearn.inspection import permutation_importance
from sklearn.metrics import r2_score

# Hold out the last part of the series for validation
df_train = df_data.iloc[:105, :].copy()
df_validation = df_data.iloc[105:, :].copy()
print('train=\n', df_train[['date0', 'month0', 'day0', 'noise0']].to_numpy())

regressor0 = DecisionTreeRegressor(random_state=0, max_depth=10)
clf0 = regressor0.fit(df_train[['month0', 'day0']].to_numpy(), df_train['v1'].to_numpy())
df_validation['forecast1'] = regressor0.predict(df_validation[['month0', 'day0']].to_numpy())

# Permutation importances on the training set and R² on the validation set
imp_a0 = permutation_importance(clf0, df_train[['month0', 'day0']].to_numpy(), df_train['v1'].to_numpy())
R2v = r2_score(y_true=df_validation['v1'].to_numpy(), y_pred=df_validation['forecast1'].to_numpy())
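For reference, a minimal way to inspect these quantities, using only the variables defined above, is:

# Validation-period R² and the permutation importance of month0 and day0
print('R2 on validation period:', R2v)
print('permutation importances (month0, day0):', imp_a0.importances_mean)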

I’ve got the following results:
