Deciding Initial Weights In A Linear Classifier For Sentiment Analysis

Question

Suhail Gupta

May 26, 2022 16:00

I would like to build a simple sentiment analysis classifier using logistic regression. I downloaded lists of positive and negative words from cs.uic.edu; together they contain more than 6000 words. A linear classifier has the form (Wikipedia reference):

$$\sum_j w_j x_j$$

where $w_j$ is the weight of the feature $x_j$. So, for example, if the weight of the word awesome is 3, then for the following sentence:

Food is awesome and music is awesome.

according to the formula, the score becomes:

$$3 \times 2 = 6$$

where 3 is the weight of the word awesome and 2 is its feature value (the number of times it occurs in the sentence).
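
To make the arithmetic concrete, here is a minimal sketch of the weighted sum over word counts. The function name `linear_score` and the one-word `weights` dictionary are hypothetical and only mirror the example above; the tokenization is a simple assumption, not a recommended preprocessing step.

```python
from collections import Counter


def linear_score(sentence, weights):
    """Return sum_j w_j * x_j, where x_j is the count of word j in the sentence."""
    # Lowercase and strip simple punctuation so "awesome." matches "awesome".
    tokens = [t.strip(".,!?;:").lower() for t in sentence.split()]
    counts = Counter(tokens)
    # Words that have no entry in `weights` contribute nothing to the sum.
    return sum(weights.get(word, 0.0) * n for word, n in counts.items())


weights = {"awesome": 3.0}  # hypothetical weight taken from the example
score = linear_score("Food is awesome and music is awesome.", weights)
print(score)
```

Here "awesome" occurs twice, so the only non-zero term in the sum is the weight 3 multiplied by the count 2.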

My question is: how do I decide which coefficients to start with? Is it a manual process? There are more than 6000 words. What is the right way to approach this?

Topic machine-learning-model logistic-regression sentiment-analysis classification machine-learning

Category Data Science

Brian Spiering · Accepted Answer · August 11, 2021 17:37

If you are adding up the occurrences of positive or negative words to predict sentiment, there is no reason to build a machine learning model.
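
As a rough illustration of that point, counting lexicon hits needs no training at all. The sketch below assumes two tiny stand-in word sets; in practice they would be the downloaded positive and negative lists. The function name `lexicon_sentiment` and the tie-breaking rule are assumptions for illustration.

```python
def lexicon_sentiment(sentence, positive_words, negative_words):
    """Label a sentence by comparing counts of positive and negative words."""
    tokens = [t.strip(".,!?;:").lower() for t in sentence.split()]
    pos = sum(t in positive_words for t in tokens)
    neg = sum(t in negative_words for t in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"  # assumed tie-breaking rule


# Hypothetical stand-ins for the downloaded word lists.
positive_words = {"awesome", "good", "great"}
negative_words = {"awful", "bad", "terrible"}

label = lexicon_sentiment(
    "Food is awesome and music is awesome.", positive_words, negative_words
)
print(label)
```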

To build a logistic regression model, you need labeled data. It is not clear what labels you are using for your problem.

The initial weights for a model depend on the optimization technique. Logistic regression is often optimized with gradient descent. If gradient descent is used, the initial weights are typically set to small random values (or zeros) rather than chosen by hand. The model then learns how to adjust the weights to minimize errors on the target.
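
A minimal sketch of that process, in plain Python, might look like the following. The two-word vocabulary, the four labeled count vectors, the learning rate, and the number of epochs are all made-up assumptions for illustration; a real model would be trained on labeled sentences, usually with an existing library.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(x, w, b):
    """Probability that the count vector x has positive sentiment."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logistic_regression(X, y, lr=0.1, epochs=500, seed=0):
    """Batch gradient descent on the mean log loss, starting from random weights."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    # Small random initial weights: nobody picks these by hand.
    w = [rng.uniform(-0.01, 0.01) for _ in range(d)]
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * d
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(xi, w, b) - yi  # gradient of log loss w.r.t. the logit
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        # Step against the averaged gradient.
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Hypothetical toy data: columns are counts of ["awesome", "terrible"].
X = [[2, 0], [1, 0], [0, 1], [0, 2]]
y = [1, 1, 0, 0]  # 1 = positive sentiment, 0 = negative sentiment

w, b = train_logistic_regression(X, y)
print(w, b)
print(predict_proba([2, 0], w, b))
```

On this toy data the learned weight for "awesome" ends up positive and the one for "terrible" negative, even though both started as near-zero random numbers.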
