
# RE: st: generating parent variable in child level data

From: Nick Cox
To: "'statalist@hsphsun2.harvard.edu'"
Subject: RE: st: generating parent variable in child level data
Date: Mon, 8 Nov 2010 12:15:51 +0000

```
This crossed with Eric Booth's solution, which is in a way about halfway between mine and Mitch Abdon's.

In my solution, the age for the mother, for example, is recorded for the mother herself and also for the father. Values in such observations can be ignored if of no interest or set to missing afterwards.
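
* A minimal sketch (not part of the original posts) of setting those values
* to missing afterwards. It assumes that -relation- is a string variable
* coded as in the example data and that age_mother and age_father have
* already been created.
replace age_mother = . if inlist(relation, "mother", "father")
replace age_father = . if inlist(relation, "mother", "father")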

For more on -cond()- if desired, see

SJ-5-3  pr0016  . . Depending on conditions: a tutorial on the cond() function
. . . . . . . . . . . . . . . . . . . . . . .  D. Kantor and N. J. Cox
Q3/05   SJ 5(3):413--420                                 (no commands)
tutorial on the cond() function

Nick
n.j.cox@durham.ac.uk

Nick Cox
========

Here is another way to do it, without any loops:

egen age_mother = mean(   cond(relation == "mother", age, .)   ), by(hhid)

The way this works, reading inside out:

1. The expression

cond(relation == "mother", age, .)

yields the -age- of the mother when the person is the mother and missing otherwise.

2. The -egen- function -mean()- takes the mean of that expression, -by(hhid)-. As you would hope and expect, it ignores missings; only if all the values in a household are missing is the result itself missing.
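
* A sketch (an extension, not part of the original advice) applying the same
* idea to both parents and both variables. It assumes the variable names in
* the original question (hhid, relation, age, education) and that none of
* the four new variables exists yet. The loop is over the two parents only,
* not over households.
foreach p in mother father {
    egen age_`p'  = mean(cond(relation == "`p'", age, .)), by(hhid)
    egen educ_`p' = mean(cond(relation == "`p'", education, .)), by(hhid)
}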

Now the implication, or perhaps inference, is that there should be at most one mother in each household. If that's true, then other -egen- functions will yield the same result, such as -min()- and -max()-.

Conversely, you should check that it is true:

egen n_mothers = total(relation == "mother"), by(hhid)

If it's not true, then presumably you need to work out what you want for two or more mothers. (If there's no mother, the result is missing, as above.)
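
* One way to act on that check, sketched here as a suggestion rather than
* part of the original advice. It assumes n_mothers was created as shown:
* tabulate the counts, then stop with an error if any household has more
* than one mother.
tabulate n_mothers
assert n_mothers <= 1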

By the way, a -by()- option for -egen- is supported, just no longer documented.

For other problems in this territory, see

FAQ     . . Creating variables recording prop. of the other members of a group
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
4/05    How do I create variables summarizing for each
individual properties of the other members of a
group?
http://www.stata.com/support/faqs/data/members.html

FAQ     . . Creating variables recording whether any or all possess some char.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
2/03    How do I create a variable recording whether any
members of a group (or all members of a group)
possess some characteristic?
http://www.stata.com/support/faqs/data/anyall.html

Nick
n.j.cox@durham.ac.uk

Mitch Abdon
===========

Here is one way of doing this:

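* Start with the parent-level variables all missing
gen age_mother = .
gen age_father = .
gen educ_mother = .
gen educ_father = .

* Loop over households (this assumes that hhid is numeric)
levelsof hhid, local(levels)
foreach i of local levels {
    * With one mother (father) per household, r(mean) is just her (his) value
    quietly summarize age if relation == "mother" & hhid == `i'
    replace age_mother = r(mean) if hhid == `i'

    quietly summarize age if relation == "father" & hhid == `i'
    replace age_father = r(mean) if hhid == `i'

    quietly summarize education if relation == "mother" & hhid == `i'
    replace educ_mother = r(mean) if hhid == `i'

    quietly summarize education if relation == "father" & hhid == `i'
    replace educ_father = r(mean) if hhid == `i'
}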

If you don't need the observations for 'mother' and 'father', you can just drop them.
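
* A sketch of that last step (not in the original post), assuming that
* -relation- is coded exactly as "mother" and "father":
drop if inlist(relation, "mother", "father")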

Shikha Sinha
============

> I have a household-level data set in which each household has one row
> each for the father, the mother, and every child, something like the following:
>
> hhid   relation   age   education
> 1      father     40    8
> 1      mother     38    3
> 1      son        18    4
> 1      son        15    2
> 1      daughter   12    2
>
> I wish to generate parent-level variables (age, education) for every
> child in the same household. Please suggest how.

```