# Treating discrete variables as if they were continuous

Francesca Vandrola writes,

Reading several papers published recently in political science journals (e.g., Journal of Politics, Political Behavior), I find quite consistently that:

I) Authors have discrete variables such as

– Income, say, measured as follows: (1 = \$15,000 or under, 2 = \$15,001–\$25,000, 3 = \$25,001–\$35,000, 4 = \$35,001–\$50,000, 5 = \$50,001–\$65,000, 6 = \$65,001–\$80,000, 7 = \$80,001–\$100,000, 8 = over \$100,000)

– External efficacy, say, measured as follows: an index that sums responses from four questions: “People like me don’t have any say about what the government does”, “I don’t think public officials care much what people like me think”, “How much do you feel that having elections makes the government pay attention to what the people think?”, and “Over the years, how much attention do you feel the government pays to what the people think when it decides what to do?”. The first two questions are coded 0 = agree, 0.5 = neither, and 1 = disagree. The third and fourth questions are coded 1 = a good deal, 0.5 = some, and 0 = not much.

– Church attendance, say, measured as an index of religious attendance: 1 = never/no religious preference, 2 = a few times a year, 3 = once or twice a month, 4 = almost every week, and 5 = every week.

And so on and so forth.

II) Authors include the above variables in their models (as explanatory variables) as if they were continuous. Why? I see the point of not including some of the variables as categorical predictors (say, if the variable has 9 categories), but I am less clear on there being a good rationale for some other cases. Wouldn’t it be preferable, especially if there are sufficient observations, to include some of those predictors in a categorical fashion? Maybe they will indeed behave like an index and have a linear effect… but maybe they won’t.
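To make the question concrete, here is a minimal sketch in Python (using only NumPy) of the comparison being asked about: fit the same outcome once with a five-level predictor entered as a single numeric column and once with one indicator per level, then compare the two fits. Everything in it is an assumption for illustration: the data are simulated, the variable name `attend`, the helper functions, and the `true_means` values are hypothetical, and the deliberately nonlinear pattern (a jump at the top category) is chosen only to show what a departure from linearity could look like. It is not taken from any of the papers mentioned.

```python
import numpy as np

def fit_ols(X, y):
    """Return least-squares coefficients and the residual sum of squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return beta, float(resid @ resid)

def linear_design(x):
    """Intercept plus the category code entered as one numeric column."""
    return np.column_stack([np.ones(len(x)), x])

def categorical_design(x, levels):
    """Intercept plus one indicator per level, with the first level as baseline."""
    dummies = [(x == lev).astype(float) for lev in levels[1:]]
    return np.column_stack([np.ones(len(x))] + dummies)

def f_stat_nested(rss_small, rss_big, k_small, k_big, n):
    """F statistic comparing a smaller OLS model nested inside a bigger one."""
    numerator = (rss_small - rss_big) / (k_big - k_small)
    denominator = rss_big / (n - k_big)
    return numerator / denominator

# Hypothetical simulated data: a 5-level ordinal predictor whose effect
# is NOT linear in the category codes (assumed values, for illustration only).
rng = np.random.default_rng(0)
levels = np.arange(1, 6)
n = 2000
attend = rng.choice(levels, size=n)
true_means = np.array([0.0, 0.1, 0.2, 0.3, 1.5])
y = true_means[attend - 1] + rng.normal(0.0, 1.0, size=n)

X_lin = linear_design(attend)
X_cat = categorical_design(attend, levels)
beta_lin, rss_lin = fit_ols(X_lin, y)
beta_cat, rss_cat = fit_ols(X_cat, y)

# The linear coding is a special case of the categorical coding, so the
# categorical fit can never have a larger residual sum of squares.
F = f_stat_nested(rss_lin, rss_cat, X_lin.shape[1], X_cat.shape[1], n)

print("slope under linear coding:", round(float(beta_lin[1]), 3))
print("level effects vs. baseline:", np.round(beta_cat[1:], 3))
print("F statistic for linear vs. categorical:", round(float(F), 2))
```

A large F statistic here suggests the level-by-level effects are not well described by a straight line in the codes; a small one suggests the simpler linear coding loses little. With few observations per category the indicator estimates will be noisy, which is one practical reason the linear coding is often preferred.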