Statalist



Re: st: Reshape, Duplicate Observations


From   Tirthankar Chakravarty <tirthankar.chakravarty@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Reshape, Duplicate Observations
Date   Sat, 22 Aug 2009 11:22:00 +0100

Here is another stab at a solution. I am still not very clear about
how you want this done though, so this might not be what you want:
***************************************
clear*
input str2 hid  str15 date       fd
   A   "01/03/2005"    0
   A   "04/05/2006"    1
   B   "02/03/1999"    0
   B   "09/07/2004"   1
   B   "09/07/2004"   0
   C   "05/02/2004"   0
   C   "03/11/2004"   1
   D   "05/08/1998"   0
end
save 1, replace
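******************
* Reshape the dated records wide (one row per id) and rename the key
* so that it can be matched against nid1 in the neighbour table
rename hid nid
bys nid: g j=_n
reshape wide date fd, i(nid) j(j)
rename nid nid1
save 2, replace
clear
******************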

input str2 hid  str2 nid1   dist1   str2 nid2   dist2
        A      B     .75      C     .25
        B      D     .35      A     .75
        C      E     .65      A     .25
        D      B     .35       ""       .
end
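* Attach nid1's wide records to the neighbour table; unmatched(master)
* keeps rows whose nid1 (here E) has no dated records
joinby nid1 using 2, unmatched(master)
save 3, replace
******************

* Join back to the long data so each hid-date row carries nid1's records
use 1
joinby hid using 3
drop _merge
list, clean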
***************************************
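
If the eventual aim is the within-a-year count described in the original
post, one possible next step is sketched below. It is meant to be run
directly after the code above, with its result still in memory. It rests
on assumptions that the thread does not settle: that the dates are
day/month/year (switch the mask to "MDY" if not), that "within a year"
means at most 365 days apart, and that only nid1's records matter, since
the code above joins on nid1 alone. The names d, d1-d3 and fd_1yr are
made up for illustration.
***************************************
* Assumes the joined data from the code above are in memory
g d = date(date, "DMY")          // assumed day/month/year
g fd_1yr = 0                     // sum of nid1's fd within 365 days
forvalues k = 1/3 {
    g d`k' = date(date`k', "DMY")
    replace fd_1yr = fd_1yr + fd`k' ///
        if abs(d - d`k') <= 365 & !missing(d`k', fd`k')
}
format d d1 d2 d3 %td
list hid date nid1 fd_1yr, clean
***************************************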

T


On Sat, Aug 22, 2009 at 4:59 AM, <abiswas@clarku.edu> wrote:
> Hi Tirthankar,
>
> Thanks for the reply. However, this is not what I need. For each
> observation, I need the hid, its date, the corresponding nids, and all
> possible dates for those nids. If you look at my Table 3, you will see
> exactly what I want. With your solution, each row contains only the
> dates from within its own hid group.
>
> Thanks.
>
> Arnab
>
>
>
>
>> <>
>> Is this perhaps what you want?
>> ***************************************
>> clear*
>> input str2 hid  str15 date       fd
>>     A   "01/03/2005"    0
>>     A   "04/05/2006"    1
>>     B   "02/03/1999"    0
>>     B   "09/07/2004"   1
>>     B   "09/07/2004"   0
>>     C   "05/02/2004"   0
>>     C   "03/11/2004"   1
>>     D   "05/08/1998"   0
>> end
>> save 1, replace
>> ******************
>>
>> bys hid: g j=_n
>> reshape wide date fd, i(hid) j(j)
>> save 2, replace
>> clear
>> ******************
>>
>> input str2 hid  str2 nid1   dist1   str2 nid2   dist2
>>        A      B     .75      C     .25
>>        B      D     .35      A     .75
>>        C      E     .65      A     .25
>>        D      B     .35       ""       .
>> end
>> save 3, replace
>> ******************
>>
>> use 1
>> joinby hid using 2
>> joinby hid using 3
>> list, clean
>> ***************************************
>>
>>
>> On Sat, Aug 22, 2009 at 1:05 AM, <abiswas@clarku.edu> wrote:
>>> I have a dataset with multiple observations (both unique and duplicate)
>>> for each identifier HID. Here is an example
>>>
>>>                           Table 1
>>>
>>>        HID    DATE       FD
>>>
>>>  1.     A   01/03/2005    0
>>>  2.     A   04/05/2006    1
>>>  3.     B   02/03/1999    0
>>>  4.     B   09/07/2004    1
>>>  5.     B   09/07/2004    0
>>>  6.     C   05/02/2004    0
>>>  7.     C   03/11/2004    1
>>>  8.     D   05/08/1998    0
>>>
>>>
>>> I have another dataset (already reshaped wide), as follows
>>>
>>>                           Table 2
>>>
>>>      HID    NID1   DIST1   NID2   DIST2
>>>
>>> 1.      A      B     .75      C     .25
>>> 2.      B      D     .35      A     .75
>>> 3.      C      E     .65      A     .25
>>> 4.      D      B     .35              .
>>>
>>>
>>> Now, I want to gather information by HID on a set of other variables,
>>> DATE and FD, so that each observation in Table 2 contains information
>>> on HID, DATE, NID* and their corresponding dates, DIST* and FD*. I am
>>> not allowed to drop the duplicate observations (obs. 4 & 5), since each
>>> of them contains important information. The outcome table that I am
>>> looking for is as follows
>>>
>>>
>>>                               Table 3
>>>
>>>      HID  DATE      FD  NID1  DIST1  DATE1     DATE2     DATE3     FD1  FD2  FD3  NID2  DIST2  DATE*  FD*
>>>
>>>  1.  A    01/03/05  0   B     .75    02/03/99  09/07/04  09/07/04  0    1    0    C     .25
>>>  2.  A    04/05/06  1   B     .75    02/03/99  09/07/04  09/07/04  0    1    0    C     .25
>>>  3.  B    02/03/99  0   D
>>>  4.  B    09/07/04  1   D
>>>  5.  B    09/07/04  0   D
>>>  6.  C    05/02/04  0   E
>>>  7.  C    03/11/04  1   E
>>>  8.  D    05/08/98  0   B
>>>
>>> Basically, my plan is to find, for each observation (HID and DATE), the
>>> corresponding NIDs whose dates are within a year of DATE, and their
>>> total number of FDs. That’s why I think I need to go through Table 3.
>>>
>>> Thanks in advance.
>>>
>>> *
>>> *   For searches and help try:
>>> *   http://www.stata.com/help.cgi?search
>>> *   http://www.stata.com/support/statalist/faq
>>> *   http://www.ats.ucla.edu/stat/stata/
>>>
>>
>>
>>
>> --
>> To every ω-consistent recursive class κ of formulae there correspond
>> recursive class signs r, such that neither v Gen r nor Neg(v Gen r)
>> belongs to Flg(κ) (where v is the free variable of r).
>>
>



-- 
To every ω-consistent recursive class κ of formulae there correspond
recursive class signs r, such that neither v Gen r nor Neg(v Gen r)
belongs to Flg(κ) (where v is the free variable of r).
