
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: repeated measures


From   Phil Clayton <[email protected]>
To   [email protected]
Subject   Re: st: repeated measures
Date   Tue, 24 May 2011 00:14:30 +0930

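If the values are unique within an individual then you can fill out the values with something like:
bysort id (dob): replace dob=dob[1]
bysort id (ssn): replace ssn=ssn[_N]

(numeric missing values sort last and empty strings sort first, hence [1] for a numeric DOB and [_N] for a string SSN. If the values are not unique then this will make all DOBs equal to the earliest one recorded, and all SSNs equal to the one sorted last alphabetically)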
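
A minimal sketch of the same idea that fills only the blank rows, so any conflicting values stay visible for checking. It assumes dob is still a string in month-day-year order and that ssn is a string; dob_num is a hypothetical new variable name.

```stata
* Assumption: dob is a string such as "3-1-1963" in month-day-year order
generate dob_num = date(dob, "MDY")
format dob_num %td

* Numeric missing (.) sorts last, so the first row holds the earliest known DOB
bysort id (dob_num): replace dob_num = dob_num[1] if missing(dob_num)

* String missing ("") sorts first, so the last row holds a known SSN
bysort id (ssn): replace ssn = ssn[_N] if missing(ssn)
```

If an id has no recorded value at all, both commands leave its rows missing.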

You need to decide on what the rules are for determining an individual's DOB or SSN when there are typos - Stata will do whatever you tell it. I doubt there is a "right" answer for this kind of problem (beyond going back to the primary data source of course!) but there may be a right answer for your particular data.
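
Before settling on a rule, it may help to see which individuals actually have conflicting values. A rough sketch for SSN, best run before any values are overwritten; ssn_tag and ssn_n are hypothetical variable names, and the same pattern should work for DOB.

```stata
* Tag one row per distinct non-missing (id, ssn) pair; other rows get 0
egen byte ssn_tag = tag(id ssn)

* Count the distinct SSNs recorded for each id
egen ssn_n = total(ssn_tag), by(id)

* List every row for ids with more than one distinct SSN
list id ssn testdate if ssn_n > 1, sepby(id)
```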

You can get rid of the hyphens using -destring- with the -ignore- option. Be careful though - often it's better to keep IDs in string format because of precision issues with high numbers. See for example the "Use theory to check IDs if they are numeric" section here:
http://blog.stata.com/2011/04/18/merging-data-part-1-merges-gone-bad/
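
Both routes can be sketched as follows; ssn_clean and ssn_num are hypothetical variable names. Note that a numeric SSN loses any leading zero unless a display format restores it, which is one more reason to keep the string version.

```stata
* Option A: keep the SSN as a string and only strip the hyphens
generate ssn_clean = subinstr(ssn, "-", "", .)

* Option B: convert to numeric, ignoring the hyphens
destring ssn, generate(ssn_num) ignore("-")

* Display with leading zeros; the stored value is still a plain number
format ssn_num %09.0f
```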

Phil

On 23/05/2011, at 11:21 PM, Michael Eisenberg wrote:

> Colleagues,
> 
> I have what I hope will be a simple question.
> 
> I have a database of men with lab tests over several years.  Included
> in this is SSN (social security number) and DOB (date of birth).
> Unfortunately, not all the men had this data recorded for every lab
> test.
> 
> For example:
> 
> id   dob        ssn           testdate     testa
> 1    3-1-1963   123-45-6789   6/1/2009     45
> 1               123-45-6789   8/1/2009     90
> 
> 2    6-1-1966   123-00-6000   6/1/2009     45
> 
> 3                             6/1/2007     45
> 3               123-45-6789   6/1/2008     55
> 3    7-1-1963                 6/1/2010     76
> 
> 4               123-45-6789   12/1/2009    45
> 4               123-45-6789   12/15/2009   78
> 
> 
> 
> How can I have stata assign the known values for SSN or DOB to all of
> the rows for a given subject?  If there happen to be typos in the
> data, so that one subject has 2 different DOB, how will this be
> handled by stata?
> 
> also, is there a simple way to remove the dashes from the SSN so that
> I can destring this variable?
> 
> Thank you in advance.
> 
> Mike Eisenberg
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/



