Using LSTM for multi label classification

Question


funkyFunk

May 3, 2022, 22:05

I am trying to use LSTMs to train a model that predicts authors from review data and metadata:

author  phone  country day     review 
james   iphone chile   tuesday the book was really amazing

How do I pass all these features into the network?

Topic lstm keras multilabel-classification deep-learning python

Category Data Science

Tanvir Sajed · Accepted Answer · July 16, 2021, 18:25

Since the number of words varies from one review to the next, I would suggest using a Keras Sequential() model to build an LSTM encoder for the review itself. The final hidden state of the review LSTM encoder can then be fed into another LSTM encoder that reads 3 words (phone, country, and day). Think of this second LSTM encoder as processing a sequential three-word message. Its final output can then be connected to a softmax layer to predict the author.
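A minimal sketch of that two-encoder idea is below. It makes some assumptions that are not part of the answer: it uses the Keras functional API rather than Sequential() (because there are two inputs), it reads "fed into" as passing the review LSTM's final states as the initial state of the metadata LSTM, and all sizes (REVIEW_VOCAB, META_VOCAB, NUM_AUTHORS, MAX_REVIEW_LEN, UNITS) are made-up placeholders. Reviews are assumed to be integer-encoded and right-padded with 0.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical sizes, chosen only for illustration.
REVIEW_VOCAB = 5000    # distinct review tokens (index 0 reserved for padding)
META_VOCAB = 200       # distinct phone/country/day tokens
NUM_AUTHORS = 10       # number of author classes
MAX_REVIEW_LEN = 50    # reviews are padded/truncated to this length
UNITS = 32             # both LSTMs share this size so the states are compatible

# Review branch: its own embedding plus an LSTM encoder.
review_in = keras.Input(shape=(MAX_REVIEW_LEN,), dtype="int32", name="review")
review_emb = layers.Embedding(REVIEW_VOCAB, 64, mask_zero=True)(review_in)
_, state_h, state_c = layers.LSTM(UNITS, return_state=True)(review_emb)

# Metadata branch: a separate embedding over the 3-token "message"
# (phone, country, day), started from the review encoder's final states.
meta_in = keras.Input(shape=(3,), dtype="int32", name="meta")
meta_emb = layers.Embedding(META_VOCAB, 16)(meta_in)
meta_out = layers.LSTM(UNITS)(meta_emb, initial_state=[state_h, state_c])

# Softmax over authors.
author_probs = layers.Dense(NUM_AUTHORS, activation="softmax")(meta_out)

model = keras.Model(inputs=[review_in, meta_in], outputs=author_probs)
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",  # expects integer author ids
    metrics=["accuracy"],
)
```

Training would then look like `model.fit([review_ids, meta_ids], author_ids, ...)`, where the three arrays are integer-encoded versions of the columns in the question. Concatenating the two encoder outputs before the softmax would be another reasonable way to combine the branches.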

The reason I suggested two different LSTMs has to do with word embeddings. Word embeddings are learned feature vectors that the layers of the LSTM then transform. Having two separate embedding layers means that the same word can have a different meaning in the review embedding than in the metadata embedding, i.e. it ends up with a different embedding vector after training. This keeps the two sets of features independent. You could argue that you need 4 different LSTM and Embedding layers because there are 4 different features. That is definitely a viable option, but every additional LSTM adds hyperparameters and increases the training time considerably.

That being said, you may not need an LSTM to solve this problem. You can easily convert the phone, country, and day values into integers. There are 7 days in a week, so the value of day can only be an integer from 0 to 6. That reduces the dimension of the features significantly, thereby reducing the training time. You can try TF-IDF or a naive bag-of-words model for the review feature and use a Random Forest or SVM classifier to predict the author. It would be interesting to run experiments with and without LSTMs and compare the results in terms of accuracy and performance.
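As a rough sketch of the non-LSTM baseline, here is one way it could be wired up with scikit-learn. The DataFrame is toy data: only the first row comes from the question, the rest is invented for illustration. Using ColumnTransformer, OrdinalEncoder, and a Pipeline is my own choice of tooling, not something the approach requires.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder

# Toy data in the shape of the question's table.
# Only the first row is from the question; the others are made up.
df = pd.DataFrame({
    "author":  ["james", "maria", "james", "maria"],
    "phone":   ["iphone", "android", "iphone", "android"],
    "country": ["chile", "peru", "chile", "peru"],
    "day":     ["tuesday", "friday", "monday", "friday"],
    "review":  [
        "the book was really amazing",
        "not my favourite story",
        "amazing characters and plot",
        "the story felt too slow",
    ],
})

features = ColumnTransformer([
    # A single column name (not a list) hands TfidfVectorizer 1-D text.
    ("review", TfidfVectorizer(), "review"),
    # Map each categorical value to an integer; unseen values become -1.
    ("meta",
     OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1),
     ["phone", "country", "day"]),
])

clf = Pipeline([
    ("features", features),
    ("forest", RandomForestClassifier(n_estimators=100, random_state=0)),
])

X = df.drop(columns="author")
y = df["author"]
clf.fit(X, y)

# Predict the author for one row of features.
pred = clf.predict(X.iloc[[0]])
```

Swapping RandomForestClassifier for an SVM (for example sklearn.svm.LinearSVC) only changes the last pipeline step, which makes it easy to compare the classifiers on a held-out split.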
