Simple CART model example

My goal is to try a decision tree as a regression model. My data looks like the pandas DataFrame below: there are two features, F1 and F2, and a numeric label. How can I build a CART model from this using sklearn or TensorFlow? (I've searched for examples, but they look complex for a beginner like me.)

import pandas as pd
df = pd.DataFrame({'F1': ['a', 'a', 'b', 'b'], 'F2': ['a', 'b', 'a', 'b'], 'Label': [10, 20, 100, 200]})

F1 F2 Label
a  a  10
a  b  20
b  a  100
b  b  200



The scikit-learn documentation states:

scikit-learn uses an optimised version of the CART algorithm; however, scikit-learn implementation does not support categorical variables for now.

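So you could use sklearn.tree.DecisionTreeRegressor, but the categorical features F1 and F2 must first be encoded as numbers (for example, with one-hot encoding). A basic example on a numeric dataset: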

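from sklearn.datasets import load_diabetes
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

# Load a built-in regression dataset (all features are numeric)
X, y = load_diabetes(return_X_y=True)
regressor = DecisionTreeRegressor(random_state=0)
# 10-fold cross-validation; returns one R^2 score per fold
scores = cross_val_score(regressor, X, y, cv=10)
print(scores)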
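Applied to the data in the question, a minimal sketch might look like the following. It assumes one-hot encoding with pandas.get_dummies is an acceptable way to turn F1 and F2 into numbers (other encodings would also work), and the variable names are only illustrative. With just four distinct rows, an unrestricted tree simply memorises the training data, so this shows the mechanics rather than a useful model.

```python
import pandas as pd
from sklearn.tree import DecisionTreeRegressor, export_text

# The data from the question
df = pd.DataFrame({
    "F1": ["a", "a", "b", "b"],
    "F2": ["a", "b", "a", "b"],
    "Label": [10, 20, 100, 200],
})

# scikit-learn trees need numeric input, so one-hot encode the
# categorical features (assumption: one-hot is the chosen encoding)
X = pd.get_dummies(df[["F1", "F2"]], dtype=int)
y = df["Label"]

# Fit a CART-style regression tree
tree = DecisionTreeRegressor(random_state=0)
tree.fit(X, y)

# Predict on the training rows and show the learned splits as text
predictions = tree.predict(X)
print(predictions)
print(export_text(tree, feature_names=list(X.columns)))
```

To predict for new rows, encode them the same way and make sure the resulting columns match those of X.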

There is also a TensorFlow implementation, in the TensorFlow Decision Forests (tfdf) package.

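import tensorflow_decision_forests as tfdf
import pandas as pd

dataset = pd.read_csv("project/dataset.csv")
# The label is numeric, so request a regression task (the default is classification)
tf_dataset = tfdf.keras.pd_dataframe_to_tf_dataset(dataset, label="my_label", task=tfdf.keras.Task.REGRESSION)

model = tfdf.keras.CartModel(task=tfdf.keras.Task.REGRESSION)
model.fit(tf_dataset)

# summary() prints the report itself and returns None
model.summary()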
