Statalist The Stata Listserver



Re: st: RE: Filling gaps???


From   "Austin Nichols" <austinnichols@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: RE: Filling gaps???
Date   Thu, 30 Nov 2006 10:07:29 -0500

Honorati Masanja--
I prefer Scott Merryman's (and Michael Blasnik's second) solution to
the others for one simple reason: it accounts for the *possibility*
that Infected is missing.  The others fail to bring potential problems
with missing values to light, by coding cases where households have
missing values as one or zero.  You may have a good reason for coding
such households as "infected" or "not" even in the presence of missing
values at the individual level (such as a skip pattern of questions
that ensures that anyone for whom Infected is missing is sure to be
uninfected), but that extra step of what to do in the case of missing
values at the individual level should be coded explicitly, or at least
should be checked to see if it matters, IMHO:

egen anyone_coded_infected=max(Infected), by(HouseholdID)
bys HouseholdID (Infected): g HHinfected = Infected[_N]
tab HHinfected anyone_coded_infected, mi
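
To see what the cross-tabulation above brings to light, here is a minimal sketch on made-up data. The eight observations below are hypothetical and not from the original poster's dataset; PersonID is an assumed identifier. The sketch relies on two Stata behaviours: egen's max() ignores missing values within a group, and missing values sort last, so Infected[_N] is missing whenever any household member has Infected missing.

```stata
* Hypothetical toy data (an assumption, not from the thread):
* households 2 and 4 each have one member with Infected missing
clear
input HouseholdID PersonID Infected
1 1 0
1 2 1
2 1 0
2 2 .
3 1 0
3 2 0
4 1 1
4 2 .
end

* max() skips missing values, so this is 1 if anyone is coded 1
egen anyone_coded_infected = max(Infected), by(HouseholdID)

* missing sorts last, so the last value is missing if any member is missing
bys HouseholdID (Infected): g HHinfected = Infected[_N]

* off-diagonal cells are households where missing values matter
tab HHinfected anyone_coded_infected, mi
list, sepby(HouseholdID)
```

In this sketch households 2 and 4 both get HHinfected missing. For household 4 someone is already coded 1, so the missing value probably does not change the household's status; for household 2 it is the only thing standing between "not infected" and "unknown", which is the case that needs an explicit decision.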

On 11/30/06, Scott Merryman <smerryman@kc.rr.com> wrote:
In addition to the suggestions by Maarten and Philipp, another way would be:

clear
input str6 householdid  str8 personid  infected
010101         01010101        1
010101         01010102        1
010102         01010201        0
010102         01010202        1
010102         01010203        1
010103         01010301        0
010103         01010302        0
010103         01010303        0
010104         01010401        0
010104         01010402        1
end
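* flag individuals coded 1 (the expression is 0 for any other value)
gen hinfect = infected == 1
* within each household, sort so any 1 comes last and copy it to all members
bys householdid (hinfect): replace hinfect = hinfect[_N]
l, sepby(householdid)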

Scott
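
One point worth checking with the code above: in Stata a comparison such as infected == 1 evaluates to 0 when infected is missing, so a person with a missing value is treated like an uninfected one. The following is a minimal sketch of a variant that leaves the indicator missing in that case. Household 010105 and the name hinfect2 are made up for illustration and do not appear in the thread.

```stata
* Hypothetical toy data (an assumption): household 010105 has a missing value
clear
input str6 householdid str8 personid infected
010104 01010401 0
010104 01010402 1
010105 01010501 0
010105 01010502 .
end

* As posted: (infected == 1) is 0 when infected is missing
gen hinfect = infected == 1
bys householdid (hinfect): replace hinfect = hinfect[_N]

* Variant: keep the indicator missing where infected is missing
gen hinfect2 = infected == 1 if !missing(infected)
bys householdid (hinfect2): replace hinfect2 = hinfect2[_N]

list, sepby(householdid)
```

Here household 010105 gets hinfect equal to 0 but hinfect2 missing. Note that the variant also sets a household to missing when one member is coded 1 and another is missing, which may or may not be what is wanted.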

> -----Original Message-----
> From: Honorati Masanja
> I have a dataset with individuals in households. Each individual has a
> unique identifier. Some individuals in the households are infected and
> some are not. My problem is how to tell Stata to create a new variable
> that is 1 for households with at least one infected person and
> 0 for households without infected persons.