Word Embedding for Item Names (integer, one-hot encoding)
I am looking for a way to measure the similarity between two item names using integer encoding or one-hot encoding.
For example, "lane connector" vs. "truck crane".
I have 100,000 item names, each consisting of 2 to 3 words like the ones above.
Each item also has a size (36mm, 12M, 2400*1200, ...) and a unit (ea, m2, m3, hr, ...).
I want to represent (item name, size, unit) as a vector. To do this, I need to convert the text to numbers in some way. All I have found so far are word2vec-style approaches, but my data has no surrounding context corpus, so I don't think it is possible to learn any context from it.
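To make the goal concrete, here is a minimal sketch of what I have in mind, using only the Python standard library. It integer-encodes the words, sizes, and units, turns each (item name, size, unit) row into a concatenated one-hot (multi-hot for the name) vector, and compares rows with cosine similarity. The sample rows, the helper names (`tokenize`, `build_vocab`, `multi_hot`, `encode`, `cosine`), and the choice to treat size as a plain categorical token are all my own assumptions for illustration, not an established method.

```python
import math
import re


def tokenize(name):
    """Lowercase an item name and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", name.lower())


def build_vocab(values):
    """Integer encoding: map each distinct value to an index, in first-seen order."""
    vocab = {}
    for value in values:
        if value not in vocab:
            vocab[value] = len(vocab)
    return vocab


def multi_hot(tokens, vocab):
    """Return a vector with 1.0 at the index of every known token, 0.0 elsewhere."""
    vec = [0.0] * len(vocab)
    for token in tokens:
        if token in vocab:
            vec[vocab[token]] = 1.0
    return vec


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical sample rows: (item name, size, unit)
items = [
    ("lane connector", "36mm", "ea"),
    ("truck crane", "12M", "hr"),
    ("crane truck", "12M", "hr"),
]

# One vocabulary per field; size is treated as a categorical token (an assumption).
word_vocab = build_vocab(t for name, _, _ in items for t in tokenize(name))
size_vocab = build_vocab(size.lower() for _, size, _ in items)
unit_vocab = build_vocab(unit for _, _, unit in items)


def encode(item):
    """Concatenate the name, size, and unit encodings into one vector."""
    name, size, unit = item
    return (
        multi_hot(tokenize(name), word_vocab)
        + multi_hot([size.lower()], size_vocab)
        + multi_hot([unit], unit_vocab)
    )


vectors = [encode(item) for item in items]

print(cosine(vectors[0], vectors[1]))  # no shared word, size, or unit
print(cosine(vectors[1], vectors[2]))  # same words in a different order
```

With this encoding, "lane connector" and "truck crane" share no component, while "truck crane" and "crane truck" get identical vectors because word order is ignored. It also shows the limitation I am worried about: two different words are always orthogonal, so names that mean similar things but use different words would score zero.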
Topic word word-embeddings nlp python
Category Data Science