Re: st: Confirming whether a variable is binary or continuous

From   Nick Cox <[email protected]>
To   [email protected]
Subject   Re: st: Confirming whether a variable is binary or continuous
Date   Fri, 16 Mar 2012 22:34:25 +0000

You can also do this by e.g.

assert inlist(var, 0, 1)
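
A minimal sketch of how that check might be used inside a do-file, assuming the shipped auto dataset and its variable foreign purely for illustration. -capture- keeps a failed -assert- from stopping the do-file, and _rc is then nonzero. Note that missing values fail the check, and a variable that is all 0s or all 1s passes it.

```stata
* Illustration only: auto.dta and the variable foreign are assumed here
sysuse auto, clear

* -assert- exits with an error if any observation fails the condition;
* -capture- traps that error and leaves the return code in _rc
capture assert inlist(foreign, 0, 1)
if _rc == 0 {
    display "foreign takes only the values 0 and 1"
}
else {
    display "foreign is not a 0/1 variable (_rc = " _rc ")"
}
```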


On Fri, Mar 16, 2012 at 10:28 PM, daniel klein
<[email protected]> wrote:
> Bert,
> as you already realized, there is no possibility to tell whether a
> variable is intended to be a binary indicator or merely happens to
> only have values 0 and 1. For this purpose you will need more
> information on that variable. An option, indicating continuous
> variables, seems to be a good idea.
> However, I would like to add some thoughts here.
> For checking for binary variables, -tabulate- is useful, but the
> information in r(r) is not all it has to offer. Note that a variable
> with values 1 and 2 will also result in r(r) = 2 and therefore will be
> declared a binary variable by your program. Here is how I checked for
> binary variables in one of my programs, using -tabulate- with the
> -matrow()- option:
> [...]
> tempname M
> qui ta <var> ,matrow(`M')
> if (r(r) != 2) | (`M'[1, 1] != 0) | (`M'[2, 1] != 1) {
>        di "<var> is not a binary variable"
> }
> [...]
> You will have to make sure <var> is not a string variable, as the
> -matrow()- option is not allowed with string variables. If you do
> not want to check, you can use -levelsof- to get the values of any
> variable. In any case, user-written software is not required here
> (although the first versions of -levelsof- were, at least partly,
> user-written by Nick Cox, as far as I know).
> I would not use -compress- as it is, in general, a bad idea to make
> (any) changes to the user's dataset if these changes are not the very
> purpose of your program. You could use -preserve- to avoid permanent
> changes but my guess is your program will execute faster if you just
> use -tabulate- (as shown above) in a loop for all numeric variables
> (not declared "continuous" by the user).
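
The loop described in the quoted message could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the auto dataset as stand-in data, picks up every numeric variable with -ds- rather than honouring a user-supplied list of continuous variables, and skips any variable that -tabulate- cannot handle.

```stata
* Illustration only: auto.dta stands in for the user's data
sysuse auto, clear

* Collect the numeric variables; -matrow()- is not allowed with strings
quietly ds, has(type numeric)

foreach v in `r(varlist)' {
    tempname M
    * -capture- guards against variables with too many distinct values
    capture quietly tabulate `v', matrow(`M')
    if _rc continue
    * Exactly two distinct values, and those values are 0 and 1
    if r(r) == 2 {
        if `M'[1, 1] == 0 & `M'[2, 1] == 1 {
            display "`v' takes only the values 0 and 1"
        }
    }
}
```

Nothing in the dataset is changed by this loop, so neither -compress- nor -preserve- is needed. As noted in the quoted message, passing the check still does not show that the variable was intended as a binary indicator.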
