Statalist



Re: st: Identify Categorical/Dichotomous and Continuous Variables


From   Steven Samuels <sjhsamuels@earthlink.net>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Identify Categorical/Dichotomous and Continuous Variables
Date   Sun, 5 Oct 2008 13:26:50 -0400

Agreed.

-Steve
On Oct 5, 2008, at 1:11 PM, Nick Cox wrote:

Good advice, but to repeat for Frank and any others puzzled: Stata does
not think of such variables as nominal, indicator, etc. If something is
(say) coded 0, 1 or missing then any user is perfectly entitled to think
of it as an indicator/dummy, etc., but Stata does not think that way.
It's in your mind, not Stata's.

Nick
n.j.cox@durham.ac.uk

Steven Samuels wrote:

To turn any variable in Stata into a nominal variable, you create
indicator variables. This is what SPSS does when you use a
categorical variable as a predictor in regression. There are two ways
of doing this in Stata: a) -xi- or b) -tab- with the -gen()- option.
See http://www.ats.ucla.edu/stat/Stata/webbooks/reg/chapter3/statareg3.htm
Section 3.3 for some examples.
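
(A minimal sketch of both approaches, using the auto data shipped with
Stata. The choice of -price- and -rep78- and the stub name rep78_ are
illustrative assumptions, not part of the original post.)

    sysuse auto, clear

    * a) -xi- expands i.rep78 into indicators named _Irep78_2 ... _Irep78_5
    xi: regress price i.rep78

    * b) -tabulate- with -generate()- creates rep78_1 ... rep78_5
    tabulate rep78, generate(rep78_)
    * leave out one indicator (here rep78_1) as the reference category
    regress price rep78_2 rep78_3 rep78_4 rep78_5

Both regressions should give the same coefficients, since each leaves
out the first category of -rep78- as the base.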

On Oct 5, 2008, at 12:33 PM, Nick Cox wrote:


This is not quite true. In particular, -anova- has an idea of the
distinction. If you specify that a variable is categorical or
continuous, or imply that by default, -anova- takes action
accordingly.
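
(For illustration only, assuming the auto data and an arbitrary choice
of variables: in the -anova- of the time, terms were treated as
categorical unless declared otherwise with the -continuous()- option.
From Stata 11 on, the same intent is written with the c. prefix.)

    sysuse auto, clear

    * Stata 11 or later: rep78 is categorical by default, c.mpg is continuous
    anova price rep78 c.mpg

    * Stata 10 or earlier (the syntax current when this thread was written)
    * anova price rep78 mpg, continuous(mpg)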

But in general, as others have emphasised or implied, Stata puts the
onus on users to decide how they want variables to be treated. If you
want -foreign- in the auto data to be a binary response for -logit-,
that's fine. If you want to average it with -summarize-, that's fine
too. Sometimes, Stata will refuse to do something on principle; more
usually, it assumes that you are smart enough to know what you want to
do.
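
(A short sketch of that point with the auto data. The predictors -mpg-
and -weight- are an arbitrary choice made for illustration.)

    sysuse auto, clear

    * -foreign- used as a binary response
    logit foreign mpg weight

    * the same variable averaged like any other numeric variable;
    * for a 0/1 variable the mean is the proportion coded 1
    summarize foreign

Stata runs both without complaint, because nothing in the dataset marks
-foreign- as an indicator.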

# of distinct values is, as Svend will agree, a criterion to be used
circumspectly. I often deal with rainfall data usually measured by
convention to a resolution of 0.1 mm. I bet that the number of distinct
values met in practice is fewer than that in the typical classifications
of death, disease or economic activity.

Nick
n.j.cox@durham.ac.uk

Svend Juul wrote:

As Martin responded: Stata has no formal distinction between
continuous and categorical numeric variables. However, the command

codebook, compact

may tell you what you want. The -Unique- column tells you
how many "unique" (meaning different) values each variable
has.
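
(A rough sketch of how that count might be used, assuming the auto data.
The cutoff of 5 distinct values is an arbitrary assumption, and a small
count is only a hint that a variable may be categorical, not proof.)

    sysuse auto, clear

    * compact overview, including the Unique column, for a few variables
    codebook rep78 foreign mpg price, compact

    * list numeric variables with at most 5 distinct nonmissing values
    foreach v of varlist _all {
        * skip string variables
        capture confirm numeric variable `v'
        if _rc continue
        * -tabulate- leaves the number of distinct values in r(r);
        * skip variables with too many values for -tabulate-
        capture quietly tabulate `v'
        if _rc continue
        if r(r) <= 5 display "`v': " r(r) " distinct values"
    }

In the auto data this should pick out -rep78- and -foreign-.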

Frank wrote:

I am new to Stata: moved from SPSS a week ago. I am hoping
that someone can help me with what I imagine is a simple
issue. I saved an SPSS file as a Stata one. I am working
my way through the user guide and the data management
manual, but I am having difficulty with confirming whether
Stata recognizes variables as continuous (or scale) or
categorical/dichotomous (or nominal). In SPSS, you can
easily identify whether the type of measure is a scale,
nominal, or string with its drop down menu in the variable
view. It would be a great help, and I would appreciate it
very much if someone would tell me the method to confirm
the data type for categorical/dichotomous and for
continuous variables. Thank you.
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
*


