Topic models for non-textual data?
I want to apply unsupervised clustering to a dataset in which each observation has a mix of textual and non-textual features.
For each observation, I combine the features into a single vector of ~1000 dimensions. To cluster I have two potential ideas:
- Use an autoencoder (or some other embedding?) to reduce the dimensionality of the data, then cluster with k-means.
- Use a topic model. Is that possible? If so, wouldn't it be the better of the two methods in most circumstances?
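
To make the comparison concrete, here is a rough sketch of both ideas using scikit-learn. Everything specific in it is an assumption for illustration only: the random Poisson matrix stands in for my real ~1000-dimensional vectors, PCA stands in for the autoencoder (it is only a linear reduction), and the number of clusters, the 20 reduced dimensions, and the choice of LDA as the topic model are arbitrary.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA, LatentDirichletAllocation

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 200 observations, 1000 non-negative count features.
X = rng.poisson(lam=1.0, size=(200, 1000))

n_clusters = 5  # assumed value, not something I know for my data

# Idea 1: reduce dimensionality, then run k-means on the reduced vectors.
# PCA is used here only as a simple linear stand-in for an autoencoder.
Z = PCA(n_components=20, random_state=0).fit_transform(X.astype(float))
kmeans_labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(Z)

# Idea 2: fit a topic model directly on the feature matrix.
# scikit-learn's LDA expects non-negative, count-like input.
lda = LatentDirichletAllocation(n_components=n_clusters, random_state=0)
theta = lda.fit_transform(X)        # per-observation topic proportions
lda_labels = theta.argmax(axis=1)   # hard assignment: the dominant topic
```

One difference I can already see: k-means gives each observation a single cluster, while `theta` gives a mixture over topics, and I only get a hard label by taking the argmax. LDA as implemented here also needs non-negative inputs, so real-valued or standardized features would have to be binned or otherwise transformed first.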
Why are topic models (in my experience) not commonly used for non-textual data? Is this just a relic of their name/original application, or is there something more fundamental?
Thanks!
Topic unsupervised-learning topic-model k-means clustering
Category Data Science