

From: David Hoaglin <dchoaglin@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: find categorical variables
Date: Thu, 22 Mar 2012 06:22:38 -0400

Jakob,

In this situation (and in the binary vs. continuous discussion), the decision should be based, first, on a clear understanding of the definition of the variable. That stage does not involve looking at the data. It involves understanding the "measurement process."

If a "continuous" variable takes too few values in a particular set of data, it might be appropriate to treat it as an (ordered) categorical variable. In a regression-like model, that choice may depend on whether the variable is the response or a predictor. A similar consideration applies when the variable is a count.

Data that are naturally "continuous" or counts are sometimes collected in categories. Income is one common example. Analysts sometimes use the midpoint of the category, but that distorts the data by not accounting for variation that would have been present if the data had not been collected in categories. Also, an open-ended top category may require special treatment.

In building a regression model, when one has enough data, it may be useful to turn a continuous variable into a detailed set of categories and fit a separate coefficient for each category, so that the data can guide the choice of functional form for that variable.

If the analyst has not understood the nature of all the variables, what are the results worth?

David Hoaglin
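As a rough illustration of the point about fitting a separate coefficient for each category, here is a minimal Python sketch. It is not taken from the thread: the data, the cutpoints, and the helper name `category_means` are all hypothetical. It relies on one fact: when a model contains only an indicator for each category of a single predictor, the least-squares fitted value for each category is the mean of the response in that category. Looking at how those means change from one category to the next gives a first impression of the functional form.

```python
from statistics import mean


def category_means(x, y, cutpoints):
    """Return the mean of y within each category of x.

    Categories are defined by sorted cutpoints: category k holds the
    observations with exactly k cutpoints at or below x. With one
    indicator per category and no other predictors, these means are the
    least-squares fitted values for the categories.
    Assumes every category contains at least one observation.
    """
    groups = [[] for _ in range(len(cutpoints) + 1)]
    for xi, yi in zip(x, y):
        k = sum(xi >= c for c in cutpoints)  # index of the category for xi
        groups[k].append(yi)
    return [mean(g) for g in groups]


# Hypothetical data: the response is a curved (quadratic) function of x.
x = list(range(100))
y = [(xi / 10) ** 2 for xi in x]

# Assumed cutpoints giving five categories of equal width.
cutpoints = [20, 40, 60, 80]
means = category_means(x, y, cutpoints)

# Change in the category mean from each category to the next.
steps = [b - a for a, b in zip(means, means[1:])]

print("category means:", means)
print("steps between adjacent categories:", steps)
```

If the relation were linear, the steps between adjacent equal-width categories would be roughly constant. Here they grow steadily, which suggests curvature and might prompt a quadratic or similar term. With real data the means would be noisy, so this works only when each category has enough observations.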

**References**: **st: find categorical variables**, *From:* Jakob Petersen <jpeterb@essex.ac.uk>
