Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down on April 23, and its replacement, **statalist.org**, is already up and running.


From: Haena Lee <hannahlee419@gmail.com>

To: statalist@hsphsun2.harvard.edu

Subject: st: how to generate parent variables matched to their children in household level data set?

Date: Sat, 23 Feb 2013 02:36:50 -0600

Nick,

I would love to merge fathers' and mothers' data with the children's. That was my first choice. As you may have noticed, however, my data don't have one clear indicator variable of who is mother/father/child/grandparent. Although there are ID_F and ID_M, what confuses me is that ID_F and ID_M are on the same row as the children. I see that "fid" and "mid" from your previous answer are also located on the children's rows.

So how do I tell Stata to generate a new indicator of "mothers" and to treat it as a property of mothers, not children? That way I could eventually extract the mothers from this raw data (e.g., keep ID BMI_M EMP_M if mom==1) and merge (1:many) them with the children's data on a key variable (ID_fam).

Assuming looping would do this work,

gen mom=.
unab Y: ID
unab Z: ID_M
forevar x of newlist mom
replace `x' ==1 if Y==Z
}

Please note that I am not familiar with the concept of looping. I only started teaching myself today, so I am not sure whether the commands above make sense. If not, let me know. I'd be happy to explain it again.

Haena

On Fri, Feb 22, 2013 at 7:54 PM, Nick Cox <njcoxstata@gmail.com> wrote:
> Note that I wrote that FAQ some years ago. Now I think why didn't I
> approach that as a -merge- problem? Create a dataset with fathers'
> data, one with mothers' data, and -merge- using those. There is still
> some fiddling around. This all goes with the simple idea that we have
> favourite tools.
>
> Nick
>
> On Sat, Feb 23, 2013 at 1:50 AM, Nick Cox <njcoxstata@gmail.com> wrote:
>> That's an allusion to my FAQ
>>
>> FAQ . . Creating variables recording prop. of the other members of a group
>> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
>> 4/05 How do I create variables summarizing for each
>> individual properties of the other members of a
>> group?
>>
>> http://www.stata.com/support/faqs/data-management/creating-variables-recording-properties/
>>
>> I don't know why you report problems. The code suggested there works
>> as intended. Here it is again run on your example data:
>>
>> . by ID_fam (ID), sort: gen pid = _n
>>
>> . gen byte fid = .
>> (7 missing values generated)
>>
>> . gen byte mid = .
>> (7 missing values generated)
>>
>> . summarize pid, meanonly
>>
>> . forval i = 1 / `r(max)' {
>>   2. by ID_fam: replace fid = `i' if ID_F == ID[`i'] & !missing(ID_F)
>>   3. by ID_fam: replace mid = `i' if ID_M == ID[`i'] & !missing(ID_M)
>>   4. }
>> (3 real changes made)
>> (0 real changes made)
>> (0 real changes made)
>> (3 real changes made)
>> (0 real changes made)
>> (0 real changes made)
>> (0 real changes made)
>> (0 real changes made)
>>
>> . l
>>
>>     +----------------------------------------------------------------------------------+
>>     |       ID_F         ID_M      BMI           ID     ID_fam   Emp   pid   fid   mid |
>>     |----------------------------------------------------------------------------------|
>>  1. |                           26.501   A901963701   A9019637     1     1     .     . |
>>  2. |                           20.483   A901963702   A9019637     1     2     .     . |
>>  3. | A901963701   A901963702   20.924   A901963703   A9019637     .     3     1     2 |
>>  4. |                           27.209   A901963801   A9019638     1     1     .     . |
>>  5. |                           31.733   A901963802   A9019638     .     2     .     . |
>>     |----------------------------------------------------------------------------------|
>>  6. | A901963801   A901963802   18.018   A901963803   A9019638     .     3     1     2 |
>>  7. | A901963801   A901963802   19.054   A901963804   A9019638     .     4     1     2 |
>>     +----------------------------------------------------------------------------------+
>>
>> Using the same logic, we copy parents' employment and mothers' BMI as desired:
>>
>> . gen BMI_M = .
>> (7 missing values generated)
>>
>> . gen Emp_M = .
>> (7 missing values generated)
>>
>> . gen Emp_F = .
>> (7 missing values generated)
>>
>> . summarize pid, meanonly
>>
>> . forval i = 1 / `r(max)' {
>>   2. by ID_fam: replace BMI_M = BMI[`i'] if ID_M == ID[`i'] & !missing(ID_M)
>>   3. by ID_fam: replace Emp_M = Emp[`i'] if ID_M == ID[`i'] & !missing(ID_M)
>>   4. by ID_fam: replace Emp_F = Emp[`i'] if ID_F == ID[`i'] & !missing(ID_F)
>>   5. }
>> (0 real changes made)
>> (0 real changes made)
>> (3 real changes made)
>> (3 real changes made)
>> (1 real change made)
>> (0 real changes made)
>> (0 real changes made)
>> (0 real changes made)
>> (0 real changes made)
>> (0 real changes made)
>> (0 real changes made)
>> (0 real changes made)
>>
>> Here are the results:
>>
>> . l
>>
>>     +-----------------------------------------------------------------------------------------------+
>>     |       ID_F         ID_M      BMI           ID     ID_fam   Emp   pid    BMI_M   Emp_M   Emp_F |
>>     |-----------------------------------------------------------------------------------------------|
>>  1. |                           26.501   A901963701   A9019637     1     1        .       .       . |
>>  2. |                           20.483   A901963702   A9019637     1     2        .       .       . |
>>  3. | A901963701   A901963702   20.924   A901963703   A9019637     .     3   20.483       1       1 |
>>  4. |                           27.209   A901963801   A9019638     1     1        .       .       . |
>>  5. |                           31.733   A901963802   A9019638     .     2        .       .       . |
>>     |-----------------------------------------------------------------------------------------------|
>>  6. | A901963801   A901963802   18.018   A901963803   A9019638     .     3   31.733       .       1 |
>>  7. | A901963801   A901963802   19.054   A901963804   A9019638     .     4   31.733       .       1 |
>>     +-----------------------------------------------------------------------------------------------+
>>
>> Nick
>>
>> On Fri, Feb 22, 2013 at 10:45 PM, Haena Lee <hannahlee419@gmail.com> wrote:
>>
>>> I am working on investigating the relationship between maternal
>>> employment status and prevalence of childhood obesity using a
>>> nationally representative data (KNHANES). Suppose I have ID (all
>>> observations including both children and parents), ID_fam (household
>>> indicator),
>>> ID_F (father's ID), ID_M (mother's ID), BMI (body mass index) and
>>> finally Emp (employment status 1 if employed; 0 if non-employed) as
>>> the following;
>>>
>>> ID_F         ID_M         BMI      ID           ID_fam     Emp
>>>                           26.501   A901963701   A9019637   1
>>>                           20.483   A901963702   A9019637   1
>>> A901963701   A901963702   20.924   A901963703   A9019637   .
>>>                           27.209   A901963801   A9019638   1
>>>                           31.733   A901963802   A9019638   .
>>> A901963801   A901963802   18.018   A901963803   A9019638   .
>>> A901963801   A901963802   19.054   A901963804   A9019638   .
>>>
>>> And ultimately, I would like to have a data set like this following;
>>>
>>> ID (children)   ID_fam     BMI      Mom's Bmi   Mom's Emp   Dad's Emp
>>> A901963703      A9019637   20.924   20.483      1           1
>>> A901963803      A9019638   18.018   31.733      .           1
>>> A901963804      A9019638   19.054   31.733      .           1
>>>
>>> Given this, my question is 1) how to map the properties of other
>>> family members to children within each household, using loop, or 2)
>>> how to generate an indicator of mother (1 if ID == ID_M; 0 otherwise)?
>>> I found Nick Cox's helpful example and imitated it as the following;
>>>
>>> by ID_fam (ID), sort: gen pid = _n
>>> gen byte fid = .
>>> gen byte mid = .
>>> summarize pid, meanonly
>>> forval i = 1 / `r(max)' {
>>> by ID_fam: replace fid = `i'
>>> if ID_F == ID[`i'] & !missing(ID_F)
>>> by ID_fam: replace mid = `i'
>>> if ID_M == ID[`i'] & !missing(ID_M)
>>> }
>>>
>>> And it didn't produce any meaningful values but missing. Please
>>> advise. Thank you so much for any help in advance.
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/

--
--------------------------------------
Haena Lee
Ph.D Student
Sociology Department
The University of Chicago
312 - 405 - 3223
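The -merge- route that Nick mentions, and that Haena asks how to set up, might be sketched as follows. This is only an illustration, not code from the thread. It assumes the example variables (ID, ID_fam, ID_F, ID_M, BMI, Emp), that the three ID variables are strings, and that ID is unique across the whole dataset; the tempfile names mothers and fathers are made up for the sketch. The idea is that no separate "mom" indicator is needed: each person's own row, with ID renamed to ID_M, can serve as the lookup record for any child whose ID_M points at it.

```stata
* Sketch only. Assumes ID, ID_F and ID_M are strings and ID is unique.
* Run as a do-file so that the tempfile macros stay defined throughout.
tempfile mothers fathers

* Lookup file for mothers: each person's own ID becomes the key ID_M
preserve
keep ID BMI Emp
rename (ID BMI Emp) (ID_M BMI_M Emp_M)
save `mothers'
restore

* Lookup file for fathers: same idea, keyed by ID_F
preserve
keep ID Emp
rename (ID Emp) (ID_F Emp_F)
save `fathers'
restore

* Children are taken to be the rows that name a mother or a father
keep if !missing(ID_M) | !missing(ID_F)

* Many children may share one parent, hence m:1 on the parent's ID
merge m:1 ID_M using `mothers', keep(master match) nogenerate
merge m:1 ID_F using `fathers', keep(master match) nogenerate

keep ID ID_fam BMI BMI_M Emp_M Emp_F
list, sepby(ID_fam)
```

Note that the key here is the parent's ID (ID_M or ID_F) rather than ID_fam. Merging on ID_fam alone would not identify which household member is the mother when a household holds more than two adults. The keep(master match) option retains children whose parent is not found in the data, leaving their parent variables missing.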

**Follow-Ups**:
- **Re: st: how to generate parent variables matched to their children in household level data set?** *From:* Nick Cox <njcoxstata@gmail.com>

**References**:
- **st: how to generate parent variables matched to their children in household level data set?** *From:* Haena Lee <hannahlee419@gmail.com>
- **Re: st: how to generate parent variables matched to their children in household level data set?** *From:* Nick Cox <njcoxstata@gmail.com>
- **Re: st: how to generate parent variables matched to their children in household level data set?** *From:* Nick Cox <njcoxstata@gmail.com>
