Dealing with observation with arbitrary number of categories with arbitary number of values
Suppose to have a set of elements $X = \{x_1, x_2, ..., x_n\}$. Each element is characterised by a set of features. The features characterising a particular element $x_i$ can belong to one of $q$ different categories. Each different category $f_q$ can have a different value $v_{q_i}$, belonging to a set of possible values $V_q = \{v_{q_1}, v_{q_2} ...\}$. So, an observation $x_i$ may be described as $x_i = \{f_{q_1} = v_{{q_1}_i}, f_{q_1} = v_{{q_1}_j}, ... f_{q_i} = v_{{q_i}_i}\}$. In order to make myself extra-clear, I will state some property of the element $x_i$, along with an example.
For an element $x_i$:
- It is possible that a particular feature category $f_{q_i}$ may appear more than once, but with a different value. So, for example, $x_i = \{..., f_{q_i} = 1, f_{q_i} = 4, ...\}$ (1, 4 are examples of values);
- Both $f_{q_i}$ (the id used to describe the category) and its set of values $V_{q_i}$ are categorical variables.
- An element $x_i$ may have an arbitrary number of feature categories describing it.
- The number of unique pairs of $(f_{q_i}, v_{{q_i}_k})$ appearing in the dataset is around $903$.
- The set of elements $X$ can appear in a set of observations $O = \{o_1, o_2, ..., o_m\}$, whose generic element $o_j$ can group multiple $x_i \in X$ as a sequence, and a final object $x_j$. The scope of the problem is to infer, for some given observations $O_{-j}$ the last element $x_j$, given the sequence.
How would you convert the categorical features in a meaningful way? I want to underline that my question refers only on how to convert these features within this particular setting, as the approaches I tried so far cannot solve the problem of an element sharing multiple arbitary categories of feature. The problem has been stated only because I wanted to make clear that target encoding is not a viable option, or is just possible in terms of how many times a particular pair $(f_{q_i}, v_{{q_i}_k})$ appears as belonging to an object in the set of final elements $X_J = \{x_{j_1}, x_{j_2}, ..., x_{j_m}\}$. This was, in fact, my initial approach. It is also possible that I have not fully understood target encoding, at this point.
Topic target-encoding categorical-encoding machine-learning
Category Data Science