



Re: st: dropping vars from analysis under conditions


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: dropping vars from analysis under conditions
Date   Tue, 17 Apr 2012 11:35:23 +0100

I don't get quite what you want here.

"Too few cases" seems to mean for you "too few observations with value
1", which is not the same.

The "identical model" implies to me the same predictors. So you need
to work out in advance which variables don't satisfy in all possible
combinations. Stata does not to my knowledge have a syntax like the
one you want. There are circumstances in which Stata automatically
omits predictors but that is not what you want.

Expansion by time spent also sounds very dubious. If that means #
observations for # units of time spent, well, the frequency
interpretation depends on units of time being discrete, and on which
units you use, and there is now a cluster structure.

All that said, this identifies variables with at least 15 values of 1
in every distinct -year- of nlswork.dta.

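webuse nlswork, clear
* start with an empty list so that reruns do not accumulate names
local OK
* find the 0/1 indicator variables (missing values allowed)
findname, all(inlist(@, 0, 1, .))
foreach v in `r(varlist)' {
      * number of 1s in each year, stored on every observation of that year
      bysort year : egen ones = total(`v')
      su ones, meanonly
      * keep the variable only if even the sparsest year has at least 15
      if r(min) >= 15 local OK `OK' `v'
      drop ones
}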

di "`OK'"

-findname- (SJ, SSC) is used to find the dummies.
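
Adapted to the variables named in the original question, the same idea might look like the sketch below. It is only a sketch under assumptions: the data in memory are for a single country, -female- is the 0/1 gender indicator, 40 is the threshold mentioned in the question, and the local macro name -keep- is made up for illustration.

```stata
* Sketch only: variable names come from the original question;
* the macro name "keep" and the threshold of 40 are illustrative.
local keep
foreach v in ch_res_dum2 ch_res_dum3 ch_res_dum4 ch_res_dum5 {
      * number of 1s for each gender
      bysort female : egen ones = total(`v')
      su ones, meanonly
      * retain the dummy only if both genders have at least 40 ones
      if r(min) >= 40 local keep `keep' `v'
      drop ones
}

* the same predictors are then used for both genders
bysort female : logit rel edu_genrank agesep_tvc sepcoh_cont `keep' interval2 interval3 interval4 interval5 interval6 interval7
```

Because a dummy is dropped unless it passes for both genders, the model stays identical across the two fits. Repeating this country by country could still give different predictor lists in different countries.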

On Tue, Apr 17, 2012 at 10:55 AM, K.O. Ivanova <K.O.Ivanova@uvt.nl> wrote:
> Hello, all.
>
> I am trying to run an identical model for several different countries (for all, the data are expanded by the amount of time the respondent was looking for a partner):
>
> by female: logit rel edu_genrank agesep_tvc sepcoh_cont ch_res_dum2 ch_res_dum3 ch_res_dum4 ch_res_dum5 interval2 interval3 interval4 interval5 interval6 interval7
>
> The thing is that for some countries, the child residence dummies (ch_res_dum2, etc.) have too few cases for one (or both) of the genders (for example, in one country, there are only 15 women who say that they have shared custody of their kid).
>
> I know that there is a way to tell Stata to drop a variable from the analysis under certain conditions but I don't know what it is. Basically, I want to run my syntax but then add a statement which tells Stata to drop one (or more than one) of these residence dummies from that list of predictors if the number of cases for that dummy is smaller than... e.g.:
>
> by female: logit rel edu_genrank agesep_tvc sepcoh_cont ch_res_dum2 ch_res_dum3 ch_res_dum4 ch_res_dum5 interval2 interval3 interval4 interval5 interval6 interval7 (do not consider ch_res_dum2 OR ch_res_dum3 OR ch_res_dum4 OR ch_res_dum5 if n for ch_res_dum(2,3,4,5) < 40).
>
> Thank you for the help!
>
> Katya
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/


