Biasing an SVM towards a particular subset of the training data
I'm training an SVM model for sentiment analysis on social media data, e.g. tweets.
The model will be trained on a small selection of a particular company's tweets in order to classify new ones. However, since that training set is too small to produce an accurate model, I will be combining the company's data with a much larger dataset of general tweets.
Because the company's tweets are specialised, their content differs slightly from that of the general dataset. Since the data to be predicted is company-specific, it seems logical to me to bias the model's training towards the company-related tweets, giving them greater importance in order to improve accuracy. My first thought was simply to increase the magnitude of the polarity labels for the company's tweets, i.e. general tweets are labelled -1 or 1 and company tweets are labelled -3 or 3, for example.
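To make the goal concrete, here is a minimal sketch of one way "greater importance" could be expressed in scikit-learn without changing the labels: the `sample_weight` argument of `SVC.fit`. The tweets, labels, and the weight of 3.0 are all made-up placeholders, and I'm assuming TF-IDF features with a linear kernel purely for illustration. I'm not sure whether this or the label-scaling idea is the better approach.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVC

# Toy placeholder data (hypothetical, for illustration only).
general_tweets = ["I love this", "this is awful", "great day", "terrible service"]
general_labels = [1, -1, 1, -1]
company_tweets = ["the new app update is great", "the app keeps crashing"]
company_labels = [1, -1]

texts = general_tweets + company_tweets
y = np.array(general_labels + company_labels)  # labels stay -1 / 1

# Per-sample weights: company tweets count more than general ones.
# The value 3.0 is arbitrary and would need tuning (e.g. by cross-validation).
COMPANY_WEIGHT = 3.0
sample_weight = np.array(
    [1.0] * len(general_tweets) + [COMPANY_WEIGHT] * len(company_tweets)
)

# Assumed feature extraction: TF-IDF over the combined corpus.
X = TfidfVectorizer().fit_transform(texts)

# sample_weight rescales the penalty C per sample, so misclassifying
# a company tweet costs more than misclassifying a general tweet.
clf = SVC(kernel="linear")
clf.fit(X, y, sample_weight=sample_weight)
```

With this approach the classifier still sees only two classes, whereas labelling company tweets as -3 and 3 would, as far as I can tell, make a classifier such as `SVC` treat them as two additional classes.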
Is this the right idea/method?
Topic sentiment-analysis scikit-learn svm social-network-analysis dataset
Category Data Science