
# Re: st: dropping vars from analysis under conditions

From: Maarten Buis
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: dropping vars from analysis under conditions
Date: Tue, 17 Apr 2012 12:16:39 +0200

```
--- On Tue, Apr 17, 2012 at 11:55 AM, K.O. Ivanova wrote:
> I am trying to run an identical model for several different countries.
> The thing is that for some countries, the child residence dummies
> (ch_res_dum2, etc.) have too few cases for one (or both) of the genders
> (for example, in one country, there are only 15 women who say that they
> have shared custody of their kid). Basically, I want to run my syntax
> but then add a statement which tells Stata to drop one (or more than
> one) of these residence dummies from that list of predictors if the
> number of cases for that dummy is smaller than...

It is tricky enough to compare groups (in your case countries) with
non-linear models. See, for example, Williams (2009). My stance on this
issue is that you are fine as long as you interpret your results as
descriptive and not causal, but many people (_not_ including me) are
uncomfortable with "just" descriptive results.

However, this problem becomes a lot worse when you use different sets
of covariates. Unlike in linear models, in non-linear models this
changes the meaning of the dependent variable. Think of it this way:
Your dependent variable is a probability, which is a measure of central
tendency but also a measure of uncertainty. That uncertainty comes
from all the (observed and unobserved) variables you did not include in
your model. In essence, you classified all those variables as "luck".
So the choice of which variables you include in your model (and thus
which variables you leave out) determines which variables you treat as
systematic and which you treat as just "luck". So when you compare
countries and the models for these countries also differ in their
covariates, then the trick of interpreting the differences as
descriptive starts to unravel as well.

Hope this is not too depressing,
Maarten

Williams, R. 2009. Using heterogeneous choice models to compare logit
and probit coefficients across groups. Sociological Methods & Research
37: 531–559.

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------

```
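
The point about omitted covariates can be checked with a small simulation. The sketch below is not from the original post, and it uses Python rather than Stata so that it runs with the standard library alone. The names `fit_logit`, `solve`, and `simulate`, the sample size, and the true coefficients (1 for `x1`, 2 for `x2`) are all illustrative assumptions. Two independent covariates drive a binary outcome, and the logit is fitted once with both covariates and once with `x2` left out. Because `x2` is independent of `x1`, dropping it from a linear regression would leave the `x1` coefficient unchanged apart from sampling noise. In the logit, the `x1` coefficient should instead shrink toward zero, because `x2` has been moved into the "luck" part of the model.

```python
import math
import random


def solve(a, b):
    """Solve the small linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def fit_logit(rows, y, iterations=10):
    """Maximum-likelihood logit via Newton-Raphson; each row starts with 1.0 for the constant."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for xi, yi in zip(rows, y):
            eta = sum(b * v for b, v in zip(beta, xi))
            p = 1.0 / (1.0 + math.exp(-eta))
            w = p * (1.0 - p)
            for i in range(k):
                grad[i] += (yi - p) * xi[i]
                for j in range(k):
                    hess[i][j] += w * xi[i] * xi[j]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


def simulate(n=20000, seed=12345):
    """Return the x1 coefficient with x2 in the model and with x2 left out."""
    rng = random.Random(seed)
    x1 = [rng.gauss(0, 1) for _ in range(n)]
    x2 = [rng.gauss(0, 1) for _ in range(n)]  # drawn independently of x1
    y = []
    for a, b in zip(x1, x2):
        # Assumed data-generating process: logit Pr(y = 1) = 1*x1 + 2*x2
        p = 1.0 / (1.0 + math.exp(-(1.0 * a + 2.0 * b)))
        y.append(1 if rng.random() < p else 0)
    full = fit_logit([[1.0, a, b] for a, b in zip(x1, x2)], y)
    short = fit_logit([[1.0, a] for a in x1], y)
    return full[1], short[1]


b_full, b_short = simulate()
print(f"x1 coefficient with x2 in the model: {b_full:.2f}")
print(f"x1 coefficient with x2 left out:     {b_short:.2f}")
```

If the two printed coefficients differ clearly even though `x1` and `x2` are unrelated, that illustrates why fitting country-specific models with different sets of dummies makes the coefficients hard to compare: part of the difference reflects which variables were treated as "luck" in each country rather than a difference between the countries themselves.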