How to deal with one output for multiple inputs?
Hi!
I want to train a model that predicts the sentiment of news headlines. I have multiple unordered news headlines per day, but only one sentiment score per day.
What is a convenient way to handle this many-to-one (not 1:1) mapping?
I could:
- Concatenate all headlines into one string, but that feels a bit wrong, since an LSTM or CNN would pick up cross-sentence word relations that don't exist.
- Predict one score per headline (1:1) and take the average in the application. But that might miss some cross-news dependencies.
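To make the second option concrete, this is roughly what I have in mind. It is only a minimal sketch: `score_headline` is a hypothetical stand-in (a toy keyword counter) for the trained per-headline model, and `daily_score` is a name I made up for the averaging step.

```python
from statistics import mean
from typing import List

# Toy vocabularies, purely illustrative (an assumption, not real training data).
POSITIVE = {"gain", "record", "growth"}
NEGATIVE = {"loss", "crash", "fraud"}


def score_headline(headline: str) -> float:
    """Hypothetical stand-in for a trained per-headline sentiment model."""
    words = headline.lower().split()
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return float(positives - negatives)


def daily_score(headlines: List[str]) -> float:
    """Average the per-headline scores into one score for the day.

    The mean ignores headline order and works for any non-empty
    number of headlines, but each headline is scored in isolation.
    """
    return mean(score_headline(h) for h in headlines)
```

Averaging already covers the order and variable-count requirements, but since every headline is scored on its own, it cannot capture any interaction between headlines.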
What I want:
- only one value/category is predicted for multiple headlines
- the order of the news doesn't matter (ideally without shuffling)
- the number of headlines per day is variable (I would also be open to just picking 10 random headlines)
What's the usual handling for this?
Topic deep-learning sentiment-analysis text-mining neural-network
Category Data Science