Transformer time series classification using time2vec positional embedding

I want to use a transformer model to classify fixed-length time series. I was following along with this Keras tutorial, which uses time2vec as the positional embedding. According to the original time2vec paper, the representation is calculated as $$ \boldsymbol{t2v}(\tau)[i] = \begin{cases} \omega_i \tau + \phi_i, & i = 0\\ F(\omega_i \tau + \phi_i), & 1 \leq i \leq k \end{cases} $$
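For reference, here is a minimal sketch of that formula as a Keras layer, assuming $F = \sin$ as in the original paper; the class name `Time2Vec`, the parameter `k`, and the weight names are mine, not the tutorial's:

```python
# Minimal sketch of the time2vec formula as a Keras layer (assumes F = sin).
import tensorflow as tf
from tensorflow import keras

class Time2Vec(keras.layers.Layer):
    def __init__(self, k, **kwargs):
        super().__init__(**kwargs)
        self.k = k  # number of periodic components; output size is k + 1

    def build(self, input_shape):
        # One linear component (i = 0) and k periodic components (1 <= i <= k).
        self.w_linear = self.add_weight(name="w_linear", shape=(1,), initializer="glorot_uniform")
        self.b_linear = self.add_weight(name="b_linear", shape=(1,), initializer="zeros")
        self.w_periodic = self.add_weight(name="w_periodic", shape=(1, self.k), initializer="glorot_uniform")
        self.b_periodic = self.add_weight(name="b_periodic", shape=(self.k,), initializer="zeros")

    def call(self, tau):
        # tau: shape (batch, timesteps, 1), the scalar the embedding is computed from.
        linear = self.w_linear * tau + self.b_linear                            # omega_0 * tau + phi_0
        periodic = tf.sin(tf.matmul(tau, self.w_periodic) + self.b_periodic)    # F(omega_i * tau + phi_i)
        return tf.concat([linear, periodic], axis=-1)                           # (batch, timesteps, k + 1)
```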

The mentioned tutorial simply concatenates this embedding with the input. Now, I understand the intention of the original time2vec paper to be that you generate different possible periods and scales of a given time variable $ \tau $ and collect them into one vector. The tutorial, however, applies the formula to the input directly, so $ \tau $ is not the time variable or index, but the input value at that time. I have found this way of implementing time2vec for transformer models in other projects on GitHub as well. But by doing this, you are not really adding any positional information, are you? The embedding depends only on the value at that time, not on the time itself (see the illustration below). Am I misunderstanding or missing something here?
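To make the distinction concrete, here is a hypothetical illustration reusing the `Time2Vec` layer sketched above (shapes and variable names are mine, not the tutorial's code):

```python
# Contrast the two usages of the hypothetical Time2Vec layer.
import tensorflow as tf

batch, timesteps = 32, 128
x = tf.random.normal((batch, timesteps, 1))  # the time series values

t2v = Time2Vec(k=7)

# Tutorial-style: the embedding is computed from the *values* x, so two windows
# with the same value at different positions get the same embedding.
emb_from_values = t2v(x)

# Positional use: the embedding is computed from the *time index* (or a scaled
# timestamp feature), so it depends on the position, not on the signal.
t = tf.reshape(tf.range(timesteps, dtype=tf.float32), (1, timesteps, 1))
t = tf.repeat(t, batch, axis=0)
emb_from_time = t2v(t)
```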

Topic: transformer, embeddings, keras

Category: Data Science


Your observation is interesting. All the tutorials I have seen so far use the time2vec layer in the way you describe: they simply transform the input features into the time2vec embedding. In my opinion, this does not reproduce the effect described in the original paper. A correct usage would take a single time-related feature (e.g. epoch time), scale it, and then use that feature to produce the time2vec embedding.
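A hedged sketch of that usage, assuming the raw timestamps are available as an epoch-time column and reusing the hypothetical `Time2Vec` layer from the question (function name, shapes, and the min-max scaling choice are mine):

```python
# Sketch only: `timestamps` holds epoch seconds, shape (batch, timesteps, 1);
# `features` holds the remaining input features, shape (batch, timesteps, d).
import tensorflow as tf

def add_time2vec_embedding(features, timestamps, k=7):
    # Scale the time feature (here: min-max per window) so the periodic weights
    # are not forced to cope with huge raw epoch values.
    t_min = tf.reduce_min(timestamps, axis=1, keepdims=True)
    t_max = tf.reduce_max(timestamps, axis=1, keepdims=True)
    tau = (timestamps - t_min) / (t_max - t_min + 1e-8)

    # Compute the embedding from the scaled *time* feature, not from the signal,
    # and concatenate it to the inputs before the transformer encoder.
    t2v = Time2Vec(k=k)
    return tf.concat([features, t2v(tau)], axis=-1)
```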
