The Stata listserver

st: Re: Re: A question


From   "Martin Wittenberg" <[email protected]>
To   <[email protected]>
Subject   st: Re: Re: A question
Date   Mon, 25 Nov 2002 11:15:29 +0200

Double looping seems very inefficient if there are several thousand
observations. A simple merge as far as I can see would do the trick more
effectively. For that you would need a family identifier as well as an
individual identifier (it seems from the data below that the first bit of
the person ID is a family ID, but I'm not quite sure).
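If the guess about the ID layout is right, the two parts could be split off roughly as follows. This is only a sketch: it assumes the IDs are stored as 12-character strings (so the leading zero survives), that the first 10 characters are the family and the last 2 the person, and the variable names family_id, person_code and mother_code are my own. The three rows are copied from the sample in the original question.

```stata
* Sketch only: assumes person_id and mother_id are 12-character strings
* whose first 10 characters identify the family (an unconfirmed guess).
clear
input str12 person_id str12 mother_id
"042208201006" ""
"042208201097" "042208201001"
"042208201001" "042208201002"
end

* family_id, person_code and mother_code are illustrative names
generate str10 family_id   = substr(person_id, 1, 10)
generate str2  person_code = substr(person_id, 11, 2)
generate str2  mother_code = substr(mother_id, 11, 2)

list, clean
```

A blank mother_id simply yields a blank mother_code, which is convenient for dropping those cases later.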

If your data is in the following form:

Family ID        Person code       Mother's code
0422082010                06
0422082010                14
...
0422082010                97                01
0422082010                98                01

you would:
1. drop the person code variable
2. rename the Mother's code as person code. Drop any cases where this code
is missing, and keep only one copy of each duplicated code.
3. save this new data set under a new name (say temp.dta)
4. merge the *original* data set with this new data set (temp.dta) on the
family ID and person code
5. Any mother's code that finds no match among the person codes of its
family tells you that you have a problem.
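The steps above can be sketched as a short do-file. Treat it as an illustration under assumptions rather than tested production code: it uses the -merge 1:1- syntax of more recent Stata releases (older releases need sorted data and the plain -merge varlist using- form), a tempfile in place of temp.dta, and the hypothetical variable names family_id, person_code and mother_code. It also assumes person codes are unique within a family. The four rows are a subset of the sample data, so mother 02 is deliberately left without a matching person here.

```stata
* Sketch only: illustrative subset of the sample data, hypothetical names
clear
input str10 family_id str2 person_code str2 mother_code
"0422082010" "06" ""
"0422082010" "01" "02"
"0422082010" "97" "01"
"0422082010" "98" "01"
end
tempfile original mothers
save `original'

* Steps 1-2: keep only the mother's codes, renamed as person codes
drop person_code
rename mother_code person_code
drop if person_code == ""
* Several children can share a mother, so keep one row per mother
duplicates drop family_id person_code, force

* Step 3: save the list of mothers
save `mothers'

* Step 4: merge the original data with the list of mothers
use `original', clear
merge 1:1 family_id person_code using `mothers'

* Step 5: _merge == 2 marks a mother's code with no matching person
list family_id person_code if _merge == 2
```

Rows with _merge == 3 are people who are also somebody's mother, and rows with _merge == 1 are people who are nobody's mother; only _merge == 2 points to a problem. In this subset, code 02 is the one that gets flagged.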

Hope this helps.

Martin

----- Original Message -----
From: "Zhiqiang Wang" <[email protected]>
To: <[email protected]>
Sent: Monday, November 25, 2002 9:29 AM
Subject: st: Re: A question


> Jisheng
> I am not sure how long double loops will take for your data. It works for
> the small sample you gave.
> ---------
>     qui gen _find=0
>     local N=_N
>     foreach y of numlist 1/`N' {
>         foreach x of numlist 1/`N' {
>             qui replace _find=1 if mother_id[`y']==person_id[`x'] & `y'==_n
>         }
>     }
> ---------
> Cheers
>
> Zhiqiang
> Menzies School of Health Research
>
>
> ----- Original Message -----
> From: Jisheng Cui
> To: [email protected]
> Sent: Monday, November 25, 2002 2:37 PM
> Subject: st: A question
>
>
> Following is a sample data set with two columns giving each person's ID and
> mother's ID within a family. I would like to find the best way to check
> whether each mother's ID is one of the person IDs. If it is not, something
> is wrong with the data. Please note:
> (1) We do not need to check blank mother's IDs.
> (2) Some mother's IDs are duplicated within the family. If a mother's ID is
> one of the person IDs, then we skip its duplicates.
> (3) There are thousands of such families, so the program has to be
> efficient. Beware that -foreach- does not seem to work with the -by- command.
>
> With best wishes,
>
> Jisheng.
>
>
> person_id         mother_id
>
> 042208201006
> 042208201014
> 042208201008
> 042208201099
> 042208201005
> 042208201007
> 042208201097  042208201001
> 042208201098  042208201001
> 042208201001  042208201002
> 042208201094  042208201005
> 042208201093  042208201005
> 042208201002  042208201005
> 042208201095  042208201005
> 042208201096  042208201005
> 042208201003  042208201007
> 042208201009  042208201007
> 042208201010  042208201007
> 042208201011  042208201007
> 042208201013  042208201011
> 042208201012  042208201011
>
>
>
> ----------------------------------------------------------------------
> Dr. Jisheng Cui
> Senior Research Fellow
> Centre for Genetic Epidemiology
> School of Population Health
> The University of Melbourne
> Parkville, Victoria 3010, Australia
> Tel: +61 3 8344-0641, Fax: +61 3 9349-5815
> URL: http://ariel.its.unimelb.edu.au/~jisc
> ----------------------------------------------------------------------
>
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>



