I can give you some basics, but the proper way to do this is to learn Prolog, which is very different from most other programming languages.
In Prolog you define predicates (similar to functions) which contain conditions for the predicate to be satisfied, for example:
dt(Feat1, Feat2, Feat3, labelA) :-
    Feat1 =< 3,
    Feat2 == 'green',
    Feat3 >= 0.54.
dt(Feat1, Feat2, Feat3, labelB) :-
    Feat1 =< 3,
    Feat2 == 'green',
    Feat3 < 0.54.
dt(Feat1, Feat2, _Feat3, labelC) :-
    Feat1 > 3,
    Feat2 == 'green'.
dt(_Feat1, Feat2, _Feat3, labelB) :-
    Feat2 == 'blue'.
- Here the predicate is dt and it has 4 clauses. For dt to be satisfied, at least one of the clauses needs to be true (i.e. all the conditions that clause contains must be true).
- Variables start with a capital letter, for instance Feat1, Feat2, Feat3 (a leading underscore, as in _Feat3, marks a variable that the clause does not use). Prolog will try to find an instantiation of the variables which satisfies one of the clauses: it tries the clauses in order and backtracks to the next one when a condition fails, so in effect it works as a solver.
- In each clause above the label is the last parameter of the predicate. Basically each clause means "if all the conditions after :- are satisfied, then the label is labelX".
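To illustrate how clauses are tried one after another, here is a minimal sketch using a made-up predicate, weight_class, which is not part of dt.pl and exists only for this illustration. Two of its clauses can succeed for the same input, so Prolog offers both answers on backtracking:

```prolog
% Hypothetical toy predicate, only meant to show clause order and backtracking.
weight_class(Weight, small) :- Weight < 10.
weight_class(Weight, large) :- Weight >= 10.
weight_class(Weight, huge)  :- Weight >= 100.

% Example queries (typing ; asks for the next solution):
%
% ?- weight_class(150, Class).
% Class = large ;
% Class = huge.
%
% findall/3 collects every solution into a list:
%
% ?- findall(Class, weight_class(150, Class), Classes).
% Classes = [large, huge].
```

In the dt example the conditions of the four clauses are mutually exclusive, so at most one label is returned for a given set of feature values.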
Assuming the 4 clauses above have been saved in a file dt.pl, one can load them in, for instance, the SWI-Prolog interpreter like this:
?- consult('dt.pl').
true.
Now the way to use this toy decision tree would be:
?- dt(4,'green',0.3, Label).
Label = labelC .
?- dt(4,'blue',0.3, Label).
Label = labelB.
- Note the capital L for Label: it means that we now let Prolog find a value of Label which satisfies the predicate dt given the values provided for the features.
This is the basic idea. A more advanced and more generic representation of the decision tree would be possible using recursion and unification, but this would require a more advanced understanding of Prolog.
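As a rough sketch of what such a generic representation might look like, the tree can be stored as a nested term and walked by a recursive predicate. Everything below is an assumption made for illustration: the term shapes leaf(Label) and node(Test, IfTrue, IfFalse), the predicate names tree, classify and holds, the feature names f1, color and f3, and the leaf(unknown) fallback for colors the original clauses do not cover. It relies on memberchk/2 as provided by SWI-Prolog.

```prolog
% Hypothetical encoding of the same toy tree as a data structure.
% An example is a list of Feature-Value pairs.
tree(node(eq(color, blue),
          leaf(labelB),
          node(eq(color, green),
               node(le(f1, 3),
                    node(ge(f3, 0.54), leaf(labelA), leaf(labelB)),
                    leaf(labelC)),
               leaf(unknown)))).   % assumption: fallback for other colors

% classify(+Tree, +Example, -Label): walk the tree recursively.
classify(leaf(Label), _Example, Label).
classify(node(Test, IfTrue, IfFalse), Example, Label) :-
    (   holds(Test, Example)
    ->  classify(IfTrue, Example, Label)
    ;   classify(IfFalse, Example, Label)
    ).

% holds(+Test, +Example): look up the feature value and apply the test.
holds(le(Feature, Threshold), Example) :-
    memberchk(Feature-Value, Example), Value =< Threshold.
holds(ge(Feature, Threshold), Example) :-
    memberchk(Feature-Value, Example), Value >= Threshold.
holds(eq(Feature, Expected), Example) :-
    memberchk(Feature-Value, Example), Value == Expected.

% Example query:
%
% ?- tree(T), classify(T, [f1-4, color-green, f3-0.3], Label).
% Label = labelC (T is also printed)
```

With this layout the tree is ordinary data, so a different tree can be classified by the same classify predicate without writing new clauses.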