Statalist



Re: st: Reshape, Duplicate Observations


From   Tirthankar Chakravarty <[email protected]>
To   [email protected]
Subject   Re: st: Reshape, Duplicate Observations
Date   Sat, 22 Aug 2009 02:06:35 +0100

Is this perhaps what you want?
***************************************
clear*
input str2 hid  str15 date       fd
    A   "01/03/2005"    0
    A   "04/05/2006"    1
    B   "02/03/1999"    0
    B   "09/07/2004"   1
    B   "09/07/2004"   0
    C   "05/02/2004"   0
    C   "03/11/2004"   1
    D   "05/08/1998"   0
end
save 1, replace
******************

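* Number the records within each hid, then reshape so that each hid
* has one row holding all of its dates and FDs (date1 fd1 date2 fd2 ...)
bys hid: g j=_n
reshape wide date fd, i(hid) j(j)
save 2, replace
clear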
******************

input str2 hid  str2 nid1   dist1   str2 nid2   dist2
	 A      B     .75      C     .25
	 B      D     .35      A     .75
	 C      E     .65      A     .25
	 D      B     .35       ""       .
end
save 3, replace
******************

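* Attach to every record in file 1 the wide dates/FDs for the same hid
* (file 2) and that hid's neighbours and distances (file 3)
use 1, clear
joinby hid using 2
joinby hid using 3
list, clean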
***************************************
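
If the DATE1-DATE3 and FD1-FD3 columns in your Table 3 are meant to be
the dates and FDs of the neighbour NID1 rather than of HID itself (row 1
shows B's dates next to A), then a variant along the following lines may
be closer. This is only a sketch: it assumes that files 1, 2 and 3 from
the code above are in the working directory, it handles NID1 only, and
the tempfile name nbr is my own choice.
***************************************
* Key the wide file by neighbour id instead of hid
use 2, clear
rename hid nid1
tempfile nbr
save `nbr'

* Start from the long records and add the neighbours and distances
use 1, clear
joinby hid using 3

* Add the wide dates/FDs of the first neighbour; unmatched(master)
* keeps records whose nid1 has no records of its own (e.g. E)
joinby nid1 using `nbr', unmatched(master)
drop _merge
sort hid
list, clean
***************************************
The same steps could be repeated for NID2 after renaming date* and fd*
in the temporary file so that the names do not clash.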

T

On Sat, Aug 22, 2009 at 1:05 AM, <[email protected]> wrote:
> I have a dataset with multiple observations (both unique and duplicate)
> for each identifier HID. Here is an example
>
>                           Table 1
>
>        HID    DATE       FD
>
>  1.     A   01/03/2005    0
>  2.     A   04/05/2006    1
>  3.     B   02/03/1999    0
>  4.     B   09/07/2004    1
>  5.     B   09/07/2004    0
>  6.     C   05/02/2004    0
>  7.     C   03/11/2004    1
>  8.     D   05/08/1998    0
>
>
> I have another dataset (already reshaped widely) as follows
>
>                           Table 2
>
>      HID    NID1   DIST1   NID2   DIST2
>
> 1.      A      B     .75      C     .25
> 2.      B      D     .35      A     .75
> 3.      C      E     .65      A     .25
> 4.      D      B     .35              .
>
>
> Now, I want to gather information by HID on a set of other variables DATE
> and FD so that each observation in Table 2 contains information on HID
> DATE NID* and their corresponding dates, DIST* and FD*. I am not allowed
> to drop the duplicate observation (obs. 4 & 5) since each of them contains
> important information. The outcome table that I am looking for is as
> follows
>
>                           Table 3
>
>      HID  DATE      FD  NID1  DIST1  DATE1     DATE2     DATE3     FD1  FD2  FD3  NID2  DIST2  DATE*  FD*
>
>  1.  A    01/03/05  0   B     .75    02/03/99  09/07/04  09/07/04  0    1    0    C     .25
>  2.  A    04/05/06  1   B     .75    02/03/99  09/07/04  09/07/04  0    1    0    C     .25
>  3.  B    02/03/99  0   D
>  4.  B    09/07/04  1   D
>  5.  B    09/07/04  0   D
>  6.  C    05/02/04  0   E
>  7.  C    03/11/04  1   E
>  8.  D    05/08/98  0   B
>
> Basically, my plan is to know for each group (obs.) of HID and DATE the
> corresponding NIDS which are within a year from the DATE and their
> corresponding total number of FDs. That’s why I think I need to go through
> Table 3.
>
> Thanks in advance.
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>



-- 
To every ω-consistent recursive class κ of formulae there correspond
recursive class signs r, such that neither v Gen r nor Neg(v Gen r)
belongs to Flg(κ) (where v is the free variable of r).



