Someone asked how to count how many children are in a given household and
how to assign this number to each individual in the same household.
What I usually do is:
sort hhid
egen no_kids=count(id) if age<19, by(hhid)
/*this counts the kids in each hh and stores the count in the rows of
individuals who are <=18yo; rows with adults get a missing value because
they do not meet the if condition*/
egen no_childr=max(no_kids), by(hhid)
/*this assigns the total number of children to each individual within the
same hh*/
replace no_childr=0 if no_childr==.
drop no_kids
/*you do not need no_kids anymore, so drop it*/
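A possible shortcut, offered only as a sketch: egen's total() function accepts an expression, so the count can be built in one step without the temporary variable. The toy data below are made up for illustration, and total() assumes a reasonably recent Stata (older releases called it sum()).

```stata
* Toy data: hypothetical values, for illustration only
clear
input hhid id age
1 1 40
1 2 12
1 3 7
2 1 35
2 2 33
end

* age<19 evaluates to 1 for children and 0 otherwise, so the
* household total of that expression is the number of children
egen no_childr = total(age<19), by(hhid)
list, sepby(hhid)
```

Households without children get 0 directly, so the replace step is not needed here. A missing age is treated as not a child, because Stata ranks missing values above every number.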
The same trick can be used if you want to create variables with parents'
education: let's say you need to create mom_ed and dad_ed, but your data
only allows you to identify parents via a variable called relationship.
Say, relationship 1=dad, 2=mom, 3=children and you have one variable
called educ. Then:
sort hhid
gen mom_ed=educ if relationship==2
gen dad_ed=educ if relationship==1
/*at this point hhid does not matter, as the above variables will take a
missing value if the "if" condition is not satisfied*/
egen mom_educ=max(mom_ed), by(hhid)
egen dad_educ=max(dad_ed), by(hhid)
/*this assigns mom's and dad's education levels to all members of the
same hh*/
drop mom_ed dad_ed
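If you prefer to skip the temporary variables, one option (a sketch under the same coding assumption, relationship 1=dad and 2=mom, with invented toy data) is to wrap the condition in cond() inside egen's max():

```stata
* Toy data: hypothetical values, for illustration only
clear
input hhid id relationship educ
1 1 1 12
1 2 2 16
1 3 3 5
2 1 2 10
2 2 3 3
end

* cond() returns educ for the parent of interest and missing for
* everyone else; max() ignores missing values within each household
egen mom_educ = max(cond(relationship==2, educ, .)), by(hhid)
egen dad_educ = max(cond(relationship==1, educ, .)), by(hhid)
list, sepby(hhid)
```

In this toy example household 2 has no dad, so dad_educ stays missing for its members.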
You could also rectangularize the dataset to get the same results. If you
are interested in this, let me know; I have sample code somewhere.
