



Re: st: Confirming whether a variable is binary or continuous


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Confirming whether a variable is binary or continuous
Date   Fri, 16 Mar 2012 22:34:25 +0000

You can also do this by e.g.

assert inlist(var, 0, 1)
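
Inside a program, a failed -assert- stops execution, so it may be more convenient to wrap the check in -capture- and inspect _rc. A minimal sketch, assuming a hypothetical numeric variable named myvar and assuming that missing values should be ignored rather than counted as a failure:

```stata
* Sketch: test whether myvar (hypothetical name) takes only the values 0 and 1.
* Missing values are skipped here; drop the -if- qualifier to treat them as failures.
capture assert inlist(myvar, 0, 1) if !missing(myvar)
if _rc == 0 {
    display "myvar takes only the values 0 and 1"
}
else if _rc == 9 {
    * -assert- returns error code 9 when the assertion is false
    display "myvar has values other than 0 and 1"
}
else {
    * any other problem (e.g. myvar is a string variable): report it
    error _rc
}
```

Note that a constant variable (all 0 or all 1) also passes this check, which differs from a test that requires exactly two distinct values.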

Nick

On Fri, Mar 16, 2012 at 10:28 PM, daniel klein
<klein.daniel.81@googlemail.com> wrote:
> Bert,
>
> as you already realized, there is no possibility to tell whether a
> variable is intended to be a binary indicator or merely happens to
> only have values 0 and 1. For this purpose you will need more
> information on that variable. An option, indicating continuous
> variables, seems to be a good idea.
>
> However, I would like to add some thoughts here.
>
> For checking binary variables, -tabulate- is useful, but r(r) is not
> all it has to offer. Note that a variable with values 1 and 2 will
> also result in r(r) = 2 and will therefore be declared a binary
> variable by your program. Here is how I checked for binary variables
> in one of my programs using -tabulate- with the -matrow()- option:
>
> [...]
> tempname M
> qui ta <var> ,matrow(`M')
> if (r(r) != 2) | (`M'[1, 1] != 0) | (`M'[2, 1] != 1) {
>        di "<var> is not a binary variable"
> }
> [...]
>
> You will have to make sure <var> is not a string variable, as it is
> not allowed to use option -matrow()- with string variables. If you do
> not want to check, you can use -levelsof- to get the values of any
> variable. In any case, user-written software is not required here
> (although the first versions of -levelsof- were, at least partly,
> user-written by Nick Cox, as far as I know).
>
> I would not use -compress- as it is, in general, a bad idea to make
> (any) changes to the user's dataset if these changes are not the very
> purpose of your program. You could use -preserve- to avoid permanent
> changes but my guess is your program will execute faster if you just
> use -tabulate- (as shown above) in a loop for all numeric variables
> (not declared "continuous" by the user).
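
The loop suggested in the quoted message could look roughly like the following. This is only a sketch: it loads the auto dataset shipped with Stata purely for illustration, and the local macro contvars (standing in for a user-supplied list of continuous variables) is a hypothetical name, not part of anyone's program.

```stata
* Sketch: flag 0/1 variables among the numeric variables without changing the data.
sysuse auto, clear                      // example data only; replaces data in memory

* contvars is a hypothetical list of variables the user declared continuous
local contvars "price weight"

* collect numeric variables, then drop the ones declared continuous
ds, has(type numeric)
local numvars `r(varlist)'
local candidates : list numvars - contvars

tempname M
foreach v of local candidates {
    capture quietly tabulate `v', matrow(`M')
    if _rc continue                     // e.g. too many distinct values for -tabulate-
    if r(r) == 2 {
        * exactly two distinct values: check that they are 0 and 1
        if `M'[1, 1] == 0 & `M'[2, 1] == 1 {
            display "`v' is binary (0/1)"
        }
    }
}
```

In the auto data this should flag only foreign. Restricting the loop to numeric variables up front sidesteps the restriction on -matrow()- with string variables, and nothing in the dataset is modified, so neither -compress- nor -preserve- is needed.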
