

From: "Dimitriy V. Masterov" <dvmaster@gmail.com>

To: statalist@hsphsun2.harvard.edu

Subject: Re: st: Confirming whether a variable is binary or continuous

Date: Fri, 16 Mar 2012 18:01:47 -0400

Bert,

Personally, I would use a t-test on a binary variable as long as I had enough data. I would use bitest with a small sample.

I've never encountered a continuous variable with only 2 levels before in my work, but I can see how that's possible. That seems more of a data cleaning issue than a statistical one. You can compress your data and then use storage type (see ds, has(type typelist) for details) as an additional check on your binary variables to filter out those cases.

DVM

On Fri, Mar 16, 2012 at 5:43 PM, Bert Jung <bjung59@gmail.com> wrote:
> Thanks Eric and Dimitriy,
>
> That would work but is it legitimate? It would seem to me that the
> correct test for a continuous variable that just happens to have 2
> levels should be -ttest-.
>
> I guess my problem cannot be resolved without prior knowledge about
> the variable. Purely from the data one wouldn't be able to tell if
> the variable is binary by definition or by chance. I will add an
> option to my program that allows the user to specify this distinction
> ex ante and then double-check using your suggestion with -tab-.
>
> Apologies for not thinking through this properly before posting.
>
> Thanks!
> Bert
>
> On Fri, Mar 16, 2012 at 5:28 PM, Eric Booth <eric.a.booth@gmail.com> wrote:
>> One way is to -tabulate- the var and then use the stored value in r(r) to tell how many values it has. You could also grab values from the user-written packages -egenmore- (from SJ, see the nvals() fcn) and -distinct- (from SSC).
>>
>> Example:
>>
>> *********
>> sysuse auto, clear
>>
>> ds, has(type numeric)
>> foreach x in `r(varlist)' {
>>     quietly tabulate `x'
>>     if r(r) == 2 di in red `"`x' is binary"'
>>     if r(r) != 2 di "`x' is not binary"
>> }
>> *********
>>
>> - Eric
>>
>> __
>> Eric A. Booth
>> Public Policy Research Institute
>> Texas A&M University
>> ebooth@ppri.tamu.edu
>> +979.845.6754
>>
>> On Mar 16, 2012, at 4:18 PM, Bert Jung wrote:
>>
>>> Dear Statalisters,
>>>
>>> I am writing a short program to make a balance table that compares
>>> covariates across a treatment and control group. I am looking for a
>>> way to confirm whether a variable is binary in order to use -prtest-
>>> for proportions rather than -ttest- for continuous variables.
>>>
>>> One option is to check the actual data values and do -prtest- if there
>>> are only 0's and 1's. But a continuous but rare outcome could
>>> accidentally also take these values, e.g. the number of
>>> hospitalizations in the past 3 months.
>>>
>>> Is there a safer way to confirm that a variable is binary?
>>>
>>> Thanks for any pointers,
>>> Bert

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
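The checks discussed in this thread can be combined into one loop. The following is only a sketch, assuming the auto dataset shipped with Stata as a stand-in for real data and assuming that "binary" means the nonmissing values are exactly 0 and 1 (a stricter test than counting two levels with r(r)). It also reports the storage type after -compress- as the additional check suggested above. As noted in the thread, no check of this kind can tell whether a variable is binary by definition or by chance.

```stata
* Sketch only: flag numeric variables whose nonmissing values are exactly 0 and 1
sysuse auto, clear
quietly compress                      // demote variables to the smallest storage type that holds them

ds, has(type numeric)
foreach x in `r(varlist)' {
    * Number of nonmissing observations that are neither 0 nor 1
    quietly count if !inlist(`x', 0, 1) & !missing(`x')
    local other = r(N)

    * Storage type after -compress- (a 0/1 variable should be byte)
    local stype : type `x'

    * Require that both levels, 0 and 1, actually occur
    quietly summarize `x'
    if `other' == 0 & r(min) == 0 & r(max) == 1 {
        display as result "`x' takes only the values 0 and 1 (storage type: `stype')"
    }
    else {
        display as text "`x' is not a 0/1 variable (storage type: `stype')"
    }
}
```

A variable that passes this check could then be routed to -prtest- and the rest to -ttest-, with a user-supplied option overriding the choice when a count variable happens to take only the values 0 and 1.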

**References**:

- **st: Confirming whether a variable is binary or continuous** *From:* Bert Jung <bjung59@gmail.com>
- **Re: st: Confirming whether a variable is binary or continuous** *From:* Eric Booth <eric.a.booth@gmail.com>
- **Re: st: Confirming whether a variable is binary or continuous** *From:* Bert Jung <bjung59@gmail.com>
