How to encode ordinal data before applying linear regression in Stata?

I have a data set with student performance marks (continuous, the dependent variable) and teacher qualification (ordinal, the independent variable, with categories Masters, Bachelors, High School). I want to run a regression to check the impact of teacher qualification on students' marks.

How can I encode ordinal data before applying linear regression?


I think the best way is to dummy-encode teacher qualification, so that every qualification level except one enters the regression with its own indicator variable and coefficient. Note that dummy encoding always works relative to a reference (base) level. So when "Masters" is the base level, the coefficient on "Bachelors" is the difference in expected marks between "Bachelors" and "Masters", and likewise for "High School".
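Written out, and assuming "Masters" is chosen as the base level, the model would look roughly like this (the symbols are my own notation, not taken from the question):

```latex
\text{marks}_i = \beta_0
  + \beta_1 \,\text{Bachelors}_i
  + \beta_2 \,\text{HighSchool}_i
  + \varepsilon_i
```

Here Bachelors_i and HighSchool_i are 0/1 indicators, β0 is the expected mark for students of teachers with a Masters, and β1 and β2 are the differences from that base level.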

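You can dummy-encode in Stata with the i. factor-variable prefix, e.g. summarize i.size. In a regression you would use reg y i.x. The i. prefix expects a numeric variable with non-negative integer values, so a string variable has to be converted first.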
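A minimal sketch of the whole workflow, assuming the qualification is stored as a string. The variable names (marks, qual_str, qual), the label name quallbl and the data values are made up for illustration:

```stata
* Toy data with hypothetical names and values, for illustration only
clear
input double marks str11 qual_str
62 "High School"
58 "High School"
71 "Bachelors"
75 "Bachelors"
83 "Masters"
80 "Masters"
end

* Define the value label first so the numeric codes follow the ordinal order
label define quallbl 1 "High School" 2 "Bachelors" 3 "Masters"

* Convert the string variable to a labelled numeric variable using that label
encode qual_str, generate(qual) label(quallbl)

* i.qual expands into indicators; the lowest code (High School) is the base by default
regress marks i.qual

* ib3.qual makes code 3 (Masters) the base level instead
regress marks ib3.qual
```

Defining the label before encode matters here: otherwise encode assigns codes in alphabetical order, which would not match the ordinal ranking of the qualifications.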

See the Stata docs for details.
