Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: daniel klein <klein.daniel.81@gmail.com>
To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject: Re: st: Re: Problem with combining sibling data
Date: Mon, 13 May 2013 15:01:58 +0200
Sorry for the follow-up post, but to elaborate on my statement

I do not understand why you want to exclude only -2 from the calculations, since, as you already correctly noted, we cannot "be sure of what [any of the negative values] mean" in terms of the "true" (underlying) value.

What Annabel might be tempted to do here (and this can be done with -egen-'s -rowtotal()- after defining missing values) is to add up all valid values in the four variables. Thus, all observations that have at least one valid value on any of the four original variables will have a "valid" value on the newly created variable holding the number of siblings.

Now my problem with this approach is that someone who ticks, e.g., "one older brother in the same house" and has missing values (i.e., -1 or -2) on all other variables will be treated as if he or she has a total of one sibling. This might be true, but we cannot know. If any of the missing values in the other three variables "masks" one or more other siblings, our results will be wrong.

Therefore I strongly recommend excluding all _cases_ from the calculation if there is one or more missing value in the original variables. There are (more or less complex) ways of dealing with missing values later on.

Best
Daniel

--

I will try to answer as well as I can.

If -1 and -2 represent missing values of some kind, it seems natural to exclude those from the calculation (that is, before or while adding the variables up). You can do so using -mvdecode- (see my first reply), -recode-, -replace-, or some technique as demonstrated in Nick's reply.

I do not understand why you want to exclude only -2 from the calculations, since, as you already correctly noted, we cannot "be sure of what [any of the negative values] mean" in terms of the "true" (underlying) value.

Best
Daniel
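To make the difference between the two approaches discussed in this thread concrete, here is a minimal sketch on toy data. The variable names sib1-sib4 and the three example observations are assumptions made up for illustration; they are not Annabel's actual variables, and the choice of .a and .b as the extended missing codes is arbitrary.

```stata
* Toy data; sib1-sib4 are hypothetical names for the four sibling variables
clear
input id sib1 sib2 sib3 sib4
1  1 -1 -2 -1
2  1  0  2  0
3 -2 -2 -1 -1
end

* Turn the negative codes into extended missing values
mvdecode sib1-sib4, mv(-1=.a \ -2=.b)

* Approach 1: add up whatever is valid. Observation 1 gets a total of 1
* although three of its four answers are unknown.
egen nsib_valid = rowtotal(sib1-sib4), missing

* Approach 2: case-wise exclusion. The sum is missing as soon as
* any one of the four variables is missing.
egen nmiss = rowmiss(sib1-sib4)
generate nsib = sib1 + sib2 + sib3 + sib4

list id nsib_valid nmiss nsib
```

With -generate-, the plain sum is missing whenever any component is missing, so no explicit -if- condition is needed; nmiss is only there to show how many answers are unknown for each case.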