Bookmark and Share

Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down at the end of May, and its replacement, statalist.org is already up and running.


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: st: Need help with combing observations by using by: command along with subscripting


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Need help with combing observations by using by: command along with subscripting
Date   Wed, 6 Feb 2013 18:26:27 +0000

The context is apparently that observations for each distinct person
occur in pairs.

If so, this is a general trick:

bysort PATIENTno : replace whatever = whatever[3 - _n] if missing(whatever)

If that looks a bit odd, let us just run through the possibilities.

If whatever[1] is missing, you want to copy whatever[2]. (If
whatever[2] is missing too, no harm done.)

If whatever[2] is missing, you want to copy whatever[1]. (If
whatever[1] is missing too, no harm done.)

-whatever- can be numeric or string: same code applies.

The rule [3 - _n] ties these cases together. If _n is 1 then 3 - _n is
2 and vice versa.

If it looked odd a few lines ago, it now seems all too obvious.

Note also that -- as in Rebecca's careful code -- you should only
overwrite _missing_ values, otherwise you can propagate missings and
lose what you have.

Now let's go back to the assumption that observations occur in pairs.
If that is not true, this trick may mess things up. But it is painless
to add at the
end

& _N == 2

bysort PATIENTno : replace whatever = whatever[3 - _n] if
missing(whatever) & _N == 2

That way we insist on this being done only for pairs of observations.

I find this trick to be slightly but pleasantly amusing when it is the
solution, perhaps because it hinges on Stata's treatment of _n and _N
and so is a very small insider's joke.

Nick

On Wed, Feb 6, 2013 at 5:44 PM, Rebecca Pope <rebecca.a.pope@gmail.com> wrote:
> Mike,
> The way you have specified the -bysort- currently, Stata is looking
> within the patient and date of birth for multiple records. You just
> want it to look within patient only.
>
> bys PATIENTno (dtbirth): replace medicalrecordnumber =
> medicalrecordnumber[1] if missing(medicalrecordnumber)
>
> The underlying assumption is that an MRN will always be present when
> the DOB is present. If not, you might be better off creating a look-up
> table of MRNs and using -merge- to update your missing values.
>
> preserve
> keep PATIENTno medicalrecordnumber
> keep if !missing(medicalrecordnumber)
> duplicates drop
> tempfile mrn
> save `"`mrn'"'
> restore
> merge m:1 PATIENTno using `"`mrn'"', update
>
> * warning: code not tested
>
> Cheers,
> Rebecca
>
> On Wed, Feb 6, 2013 at 11:12 AM, Michael Stewart
> <michaelstewartresearch@gmail.com> wrote:
>> Hi,
>>
>> I need help with following data where I am trying to combine
>> observations for each patient
>>
>> PATIENTno   dtbirth       medicalrecordnumber   accountnumber
>>     1             8/10/1907         1111111
>>     1              .                                                        dddddddd
>>    2             12/14/1910        222222222
>>    2                 .                                                       eeeeee
>>    3              4/16/1914         33333333                  rrrrrrr
>>
>>
>> PATIENTno,medicalrecordnumber & accountnumber -->STRING format
>>
>> dtbirth-->%td format
>>
>>
>> I need to combine row one and row two for patient 1 so that resultatnt
>> row will have information from both rows one and row two
>>
>> I tried to use subscripting as follows
>>
>> bysort PATIENTno dtbirth: repalce medicalrecordnumber= medicalrecordnumber[_n-1]
>>
>> then all values for the medicalrecordnumber are turning to blank spaces
>>
>> I also tried bysort PATIENTno dtbirth: repalce medicalrecordnumber=
>> medicalrecordnumber[1] but it didnt work either.
>>
>> Can you please help.
>>
>> --
>> Thank you ,
>> Yours Sincerely,
>> Mike.
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>> *   http://www.ats.ucla.edu/stat/stata/
>
>
>
> On Wed, Feb 6, 2013 at 11:12 AM, Michael Stewart
> <michaelstewartresearch@gmail.com> wrote:
>> Hi,
>>
>> I need help with following data where I am trying to combine
>> observations for each patient
>>
>> PATIENTno   dtbirth       medicalrecordnumber   accountnumber
>>     1             8/10/1907         1111111
>>     1              .                                                        dddddddd
>>    2             12/14/1910        222222222
>>    2                 .                                                       eeeeee
>>    3              4/16/1914         33333333                  rrrrrrr
>>
>>
>> PATIENTno,medicalrecordnumber & accountnumber -->STRING format
>>
>> dtbirth-->%td format
>>
>>
>> I need to combine row one and row two for patient 1 so that resultatnt
>> row will have information from both rows one and row two
>>
>> I tried to use subscripting as follows
>>
>> bysort PATIENTno dtbirth: repalce medicalrecordnumber= medicalrecordnumber[_n-1]
>>
>> then all values for the medicalrecordnumber are turning to blank spaces
>>
>> I also tried bysort PATIENTno dtbirth: repalce medicalrecordnumber=
>> medicalrecordnumber[1] but it didnt work either.
>>
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2014 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   Site index