From: Nick Cox <njcoxstata@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: how to generate parent variables matched to their children in household level data set?
Date: Sat, 23 Feb 2013 09:33:13 +0000

I am at a loss to understand what you are asking. My previous posts showed that with your sample data the code I used does work. It remains a mystery why you first reported otherwise, and also why you imply that the problem you stated is still unsolved. I just solved it for you. It seems that you have not studied my code and its results.

The absence of a single clear indicator variable is immaterial here. You want to copy data from mothers' and fathers' observations to children's; for that, being able to link mother and father identifiers to children is necessary and sufficient, and the two links are made separately.

My mention of -merge- just hints at a different method, but I have given a method that works. I was not stating or implying that you need to -merge-; that's merely a good alternative.

If you want to know why my method works you need to study not only discussion of loops as in

SJ-2-2  pr0005  . . . . . . Speaking Stata: How to face lists with fortitude
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
        Q2/02   SJ 2(2):202--222                                 (no commands)
        demonstrates the usefulness of for, foreach, forvalues, and
        local macros for interactive (non programming) tasks

but also the use of -by:- as in

SJ-2-1  pr0004  . . . . . . . . . . Speaking Stata: How to move step by: step
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
        Q1/02   SJ 2(1):86--102                                  (no commands)
        explains the use of the by varlist : construct to tackle
        a variety of problems with group structure, ranging from
        simple calculations for each of several groups to more

My code relies on the fact that under the aegis of -by:- subscripts (42 in -foo[42]- is a subscript) are numbered within groups, so the subscript [1] refers to the first observation in each group.

As said, I don't see that you need any further code, so I have not studied your code beyond noticing that -forevar- is not a Stata command.
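
To illustrate the point about subscripts under -by:-, here is a minimal sketch using the first five observations of the sample data from this thread. The variable names fam, id, bmi, bmi_first and bmi_second are made up for the illustration and are not part of the code posted earlier.

```stata
* Illustration only: five observations from the thread's sample data,
* with hypothetical lower-case variable names.
clear
input str8 fam str10 id float bmi
"A9019637" "A901963701" 26.501
"A9019637" "A901963702" 20.483
"A9019637" "A901963703" 20.924
"A9019638" "A901963801" 27.209
"A9019638" "A901963802" 31.733
end

* Under by:, _n restarts at 1 in each household ...
bysort fam (id): gen pid = _n

* ... and subscripts are interpreted the same way: bmi[1] is the first
* observation of the current household, not of the whole dataset.
by fam: gen bmi_first  = bmi[1]
by fam: gen bmi_second = bmi[2]

list, sepby(fam)
```

Within each household bmi_first repeats the BMI of the person with pid 1 and bmi_second that of the person with pid 2. That is the mechanism the loop exploits when it compares ID_M with ID[`i'] household by household.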
Nick

On Sat, Feb 23, 2013 at 8:36 AM, Haena Lee <hannahlee419@gmail.com> wrote:
> Nick,
>
> I would love to merge father's and mother's data with children. That
> was my first choice.
> As you may have noticed, however, my data doesn't have one clear
> indicator variable of who is mother/father/child/grandparent. Although
> there are ID_F and ID_M, what makes me confused is, ID_F and ID_M are
> on the same row of children. I see "fid and mid" from your previous
> answer is also located on children's row. So how do I tell stata to
> generate a new indicator of "mothers" and to treat it as a property of
> mothers, not children? So that eventually I would extract moms from
> this raw data (e.g., keep ID BMI_M EMP_M if mom==1) and merge (1:many)
> it based on key variable (ID_fam) with children's data?
>
> Assuming looping would do this work,
>
> gen mom=.
> unab Y: ID
> unab Z: ID_M
> forevar x of newlist mom
> replace `x' ==1 if Y==Z
> }
>
> Please note that I am not familiar with the concept of looping. Just
> taught myself today for a little bit so I am not sure if those
> commands above would make sense. If not, let me know. I'd be happy to
> explain it again.
>
> Haena
>
> On Fri, Feb 22, 2013 at 7:54 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>> Note that I wrote that FAQ some years ago. Now I think why didn't I
>> approach that as a -merge- problem? Create a dataset with fathers'
>> data, one with mothers' data, and -merge- using those. There is still
>> some fiddling around. This all goes with the simple idea that we have
>> favourite tools.
>>
>> Nick
>>
>> On Sat, Feb 23, 2013 at 1:50 AM, Nick Cox <njcoxstata@gmail.com> wrote:
>>> That's an allusion to my FAQ
>>>
>>> FAQ     . . Creating variables recording prop. of the other members of a group
>>>         . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
>>>         4/05    How do I create variables summarizing for each
>>>                 individual properties of the other members of a
>>>                 group?
>>>
>>> http://www.stata.com/support/faqs/data-management/creating-variables-recording-properties/
>>>
>>> I don't know why you report problems. The code suggested there works
>>> as intended. Here it is again run on your example data:
>>>
>>> . by ID_fam (ID), sort: gen pid = _n
>>>
>>> . gen byte fid = .
>>> (7 missing values generated)
>>>
>>> . gen byte mid = .
>>> (7 missing values generated)
>>>
>>> . summarize pid, meanonly
>>>
>>> . forval i = 1 / `r(max)' {
>>>   2. by ID_fam: replace fid = `i' if ID_F == ID[`i'] & !missing(ID_F)
>>>   3. by ID_fam: replace mid = `i' if ID_M == ID[`i'] & !missing(ID_M)
>>>   4. }
>>> (3 real changes made)
>>> (0 real changes made)
>>> (0 real changes made)
>>> (3 real changes made)
>>> (0 real changes made)
>>> (0 real changes made)
>>> (0 real changes made)
>>> (0 real changes made)
>>>
>>> . l
>>>
>>>     +----------------------------------------------------------------------------------+
>>>     |       ID_F         ID_M      BMI           ID     ID_fam   Emp   pid   fid   mid |
>>>     |----------------------------------------------------------------------------------|
>>>  1. |                           26.501   A901963701   A9019637     1     1     .     . |
>>>  2. |                           20.483   A901963702   A9019637     1     2     .     . |
>>>  3. | A901963701   A901963702   20.924   A901963703   A9019637     .     3     1     2 |
>>>  4. |                           27.209   A901963801   A9019638     1     1     .     . |
>>>  5. |                           31.733   A901963802   A9019638     .     2     .     . |
>>>     |----------------------------------------------------------------------------------|
>>>  6. | A901963801   A901963802   18.018   A901963803   A9019638     .     3     1     2 |
>>>  7. | A901963801   A901963802   19.054   A901963804   A9019638     .     4     1     2 |
>>>     +----------------------------------------------------------------------------------+
>>>
>>> Using the same logic, we copy parents' employment and mothers' BMI as desired:
>>>
>>> . gen BMI_M = .
>>> (7 missing values generated)
>>>
>>> . gen Emp_M = .
>>> (7 missing values generated)
>>>
>>> . gen Emp_F = .
>>> (7 missing values generated)
>>>
>>> . summarize pid, meanonly
>>>
>>> . forval i = 1 / `r(max)' {
>>>   2. by ID_fam: replace BMI_M = BMI[`i'] if ID_M == ID[`i'] & !missing(ID_M)
>>>   3. by ID_fam: replace Emp_M = Emp[`i'] if ID_M == ID[`i'] & !missing(ID_M)
>>>   4. by ID_fam: replace Emp_F = Emp[`i'] if ID_F == ID[`i'] & !missing(ID_F)
>>>   5. }
>>> (0 real changes made)
>>> (0 real changes made)
>>> (3 real changes made)
>>> (3 real changes made)
>>> (1 real change made)
>>> (0 real changes made)
>>> (0 real changes made)
>>> (0 real changes made)
>>> (0 real changes made)
>>> (0 real changes made)
>>> (0 real changes made)
>>> (0 real changes made)
>>>
>>> Here are the results:
>>>
>>> . l
>>>
>>>     +-----------------------------------------------------------------------------------------------+
>>>     |       ID_F         ID_M      BMI           ID     ID_fam   Emp   pid    BMI_M   Emp_M   Emp_F |
>>>     |-----------------------------------------------------------------------------------------------|
>>>  1. |                           26.501   A901963701   A9019637     1     1        .       .       . |
>>>  2. |                           20.483   A901963702   A9019637     1     2        .       .       . |
>>>  3. | A901963701   A901963702   20.924   A901963703   A9019637     .     3   20.483       1       1 |
>>>  4. |                           27.209   A901963801   A9019638     1     1        .       .       . |
>>>  5. |                           31.733   A901963802   A9019638     .     2        .       .       . |
>>>     |-----------------------------------------------------------------------------------------------|
>>>  6. | A901963801   A901963802   18.018   A901963803   A9019638     .     3   31.733       .       1 |
>>>  7. | A901963801   A901963802   19.054   A901963804   A9019638     .     4   31.733       .       1 |
>>>     +-----------------------------------------------------------------------------------------------+
>>>
>>> Nick
>>>
>>> On Fri, Feb 22, 2013 at 10:45 PM, Haena Lee <hannahlee419@gmail.com> wrote:
>>>
>>>> I am working on investigating the relationship between maternal
>>>> employment status and prevalence of childhood obesity using a
>>>> nationally representative data (KNHANES). Suppose I have ID (all
>>>> observations including both children and parents), ID_fam (household
>>>> indicator),
>>>> ID_F (father's ID), ID_M (mother's ID), BMI (body mass index) and
>>>> finally Emp (employment status 1 if employed; 0 if non-employed) as
>>>> the following;
>>>>
>>>> ID_F         ID_M         BMI      ID           ID_fam     Emp
>>>>                           26.501   A901963701   A9019637   1
>>>>                           20.483   A901963702   A9019637   1
>>>> A901963701   A901963702   20.924   A901963703   A9019637   .
>>>>                           27.209   A901963801   A9019638   1
>>>>                           31.733   A901963802   A9019638   .
>>>> A901963801   A901963802   18.018   A901963803   A9019638   .
>>>> A901963801   A901963802   19.054   A901963804   A9019638   .
>>>>
>>>> And ultimately, I would like to have a data set like this following;
>>>>
>>>> ID (children)   ID_fam     BMI      Mom's Bmi   Mom's Emp   Dad's Emp
>>>> A901963703      A9019637   20.924   20.483      1           1
>>>> A901963803      A9019638   18.018   31.733      .           1
>>>> A901963804      A9019638   19.054   31.733      .           1
>>>>
>>>> Given this, my question is 1) how to map the properties of other
>>>> family members to children within each household, using loop, or 2)
>>>> how to generate an indicator of mother (1 if ID == ID_M; 0 otherwise)?
>>>> I found Nick Cox's helpful example and imitated it as the following;
>>>>
>>>> by ID_fam (ID), sort: gen pid = _n
>>>> gen byte fid = .
>>>> gen byte mid = .
>>>> summarize pid, meanonly
>>>> forval i = 1 / `r(max)' {
>>>> by ID_fam: replace fid = `i'
>>>> if ID_F == ID[`i'] & !missing(ID_F)
>>>> by ID_fam: replace mid = `i'
>>>> if ID_M == ID[`i'] & !missing(ID_M)
>>>> }
>>>>
>>>> And it didn't produce any meaningful values but missing.
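
The -merge- alternative mentioned in this thread could be sketched as follows, using the sample data from the original question. This is only one possible way to set it up and is not code from any of the posts. It assumes Stata 12 or later (for the grouped -rename- syntax), that ID uniquely identifies observations, and that a child is any observation with a nonmissing ID_M or ID_F. The tempfile names mothers and fathers are made up for the illustration, and the whole block has to be run as one do-file so that the temporary files stay in scope.

```stata
* Sample data as posted in the thread.
clear
input str10 ID_F str10 ID_M float BMI str10 ID str8 ID_fam byte Emp
""           ""           26.501 "A901963701" "A9019637" 1
""           ""           20.483 "A901963702" "A9019637" 1
"A901963701" "A901963702" 20.924 "A901963703" "A9019637" .
""           ""           27.209 "A901963801" "A9019638" 1
""           ""           31.733 "A901963802" "A9019638" .
"A901963801" "A901963802" 18.018 "A901963803" "A9019638" .
"A901963801" "A901963802" 19.054 "A901963804" "A9019638" .
end

* Lookup file keyed on ID_M: each person's own ID is renamed to ID_M,
* so that children can be matched to it on their ID_M.
preserve
keep ID BMI Emp
rename (ID BMI Emp) (ID_M BMI_M Emp_M)
tempfile mothers
save `mothers'
restore

* Same idea for fathers, keyed on ID_F.
preserve
keep ID Emp
rename (ID Emp) (ID_F Emp_F)
tempfile fathers
save `fathers'
restore

* Keep children only (assumption: they are the ones with a parent ID).
keep if !missing(ID_M) | !missing(ID_F)

* Many children may share one mother or father, hence m:1.
merge m:1 ID_M using `mothers', keep(master match) nogenerate
merge m:1 ID_F using `fathers', keep(master match) nogenerate

sort ID
list ID ID_fam BMI BMI_M Emp_M Emp_F
```

No mother or father indicator is needed here either: the children's ID_M and ID_F values serve directly as merge keys, and keep(master match) discards the people in the lookup files who are nobody's parent.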