
st: AW: Computing and allocating time intervals in a wide dataset

From	"Martin Weiss" <[email protected]>
To	<[email protected]>
Subject	st: AW: Computing and allocating time intervals in a wide dataset
Date	Mon, 8 Jun 2009 23:31:18 +0200


I would advise you to soak up the wisdom of
http://www.stata.com/support/faqs/data/
before proceeding. Most of the time, you will at least get important hints
there...

Also note there are a couple of columns on data management in the SJ
archive:
http://www.stata-journal.com/archives.html



HTH
Martin


-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Thomas Speidel
Sent: Monday, 8 June 2009 23:25
To: [email protected]
Subject: st: Computing and allocating time intervals in a wide dataset

I am trying to compute the interval (in years) between the start and
the end of each activity and to allocate that interval to the relevant
age group(s).  For example, given the following dataset:

     id   activity   start   stop
      1          1       6     15
      1          2      22     25
      1          3      15     16
      1          4      22     28
      1          5      30      .
      1          6       .      .
      2          1      53     69
      2          2      69     79

I am trying to derive the following:

     id   activity   start   stop   grp_0_17   grp_1~24   grp_2~44   grp_4~64   grp_6~81
      1          1       6     15          9          0          0          0          0
      1          2      22     25          0        2.5         .5          0          0
      1          3      15     16          1          0          0          0          0
      1          4      22     28          0        2.5        3.5          0          0
      1          5      30      .          0          0          1          0          0
      1          6       .      .          .          .          .          .          .
      2          1      53     69          0          0          0       11.5        4.5
      2          2      69     79          0          0          0          0         10

The age groups are:
[0.5, 17.5]
[17.6, 24.5]
[24.6, 44.5]
[44.6, 64.5]
[64.6, 81]
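
The allocation itself comes down to an overlap calculation: for each
age group, take the smaller of the two upper limits minus the larger of
the two lower limits, and floor the result at zero. Below is a minimal
sketch of that arithmetic in Python, given only as an illustration; the
same min/max expression should carry over to other languages. The names
BOUNDS and allocate are hypothetical. The sketch assumes that adjacent
groups meet at 17.5, 24.5, 44.5 and 64.5 rather than leaving the 0.1
gaps listed above, because the desired output (2.5 + .5 for the
interval 22 to 25) implies those cut points.

```python
# Age-group limits. Assumption: neighbouring groups share a cut point
# (17.5, 24.5, 44.5, 64.5), as implied by the desired output above.
BOUNDS = [(0.5, 17.5), (17.5, 24.5), (24.5, 44.5), (44.5, 64.5), (64.5, 81.0)]

def allocate(start, stop):
    """Return the years of the interval [start, stop] falling in each group."""
    # Overlap of [start, stop] with [lo, hi]; negative overlap means none.
    return [max(0.0, min(stop, hi) - max(start, lo)) for lo, hi in BOUNDS]

print(allocate(22, 25))   # activity 2 of id 1
print(allocate(53, 69))   # activity 1 of id 2
```

Intervals with a missing start or stop would have to be resolved or
skipped before this step, since the sketch expects two numbers.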

If the dataset were in long format, as above, this would not be
terribly hard. One complication is that an interval must be split
correctly when it spans two or more age groups.  However, my data are
in wide format (one row per id), which makes it a nightmare even to
check or troubleshoot my code (I have 40 activities per id), and the
dataset is so large that I am reluctant to reshape it.
This is what the dataset above would look like in wide format:

     id   start1   stop1   start2   stop2   start3   stop3   start4   stop4   start5   stop5   start6   stop6
      1        6      15       22      25       15      16       22      28       30       .        .       .
      2       53      69       69      79        .       .        .       .        .       .        .       .
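
Because the columns follow a regular startK/stopK naming pattern, the
pairs can be visited with a loop over K without reshaping anything.
A minimal sketch in Python, for illustration only: the dictionary row
and the function activity_pairs are hypothetical, and None stands in
for a missing value.

```python
# Hypothetical wide row for id 1; None marks a missing value.
row = {"id": 1,
       "start1": 6,  "stop1": 15, "start2": 22, "stop2": 25,
       "start3": 15, "stop3": 16, "start4": 22, "stop4": 28,
       "start5": 30, "stop5": None, "start6": None, "stop6": None}

def activity_pairs(row, n_activities):
    """Yield (activity, start, stop) for each numbered start/stop pair."""
    for k in range(1, n_activities + 1):
        yield k, row[f"start{k}"], row[f"stop{k}"]

pairs = list(activity_pairs(row, 6))
for activity, start, stop in pairs:
    print(activity, start, stop)
```

With the real data the loop would run to 40 instead of 6, and each pair
could be checked one at a time, which may make troubleshooting easier
than inspecting 80 columns at once.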

-The activities do not necessarily follow a temporal sequence (e.g.
the 3rd observation above).
-Although the example does not show it, every id has exactly 40
activities, even though many of them may be completely missing.
-Whenever a start is present but its corresponding stop is missing (as
in the 5th obs. above), it means that at the time of the study the
person was still performing that activity, hence stop would be taken
from a variable called ageref. If start==ageref, then the interval
would be approximated as 1 year.
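
The missing-stop rule can be written as a small function applied to
each start/stop pair before any allocation. A minimal sketch in Python,
for illustration only: the function name resolve_interval is
hypothetical, None stands in for a missing value, and the sketch
assumes the 1-year approximation runs forward from start (the rule
above does not say in which direction).

```python
def resolve_interval(start, stop, ageref):
    """Apply the missing-stop rule; return (start, stop), or None to skip."""
    if start is None:
        return None                  # activity is missing altogether
    if stop is None:
        if start == ageref:
            # Assumption: approximate as 1 year, counted forward from start.
            return start, start + 1
        return start, ageref         # still ongoing: stop at ageref
    return start, stop               # both ends observed

print(resolve_interval(30, None, 30))   # ongoing, started at ageref
print(resolve_interval(30, None, 45))   # ongoing, started earlier
print(resolve_interval(None, None, 30)) # missing activity
```

The ageref values of 30 and 45 are made up for the example; the post
does not give them.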

I would appreciate any feedback on how to best tackle this problem.

Thanks,
Thomas Speidel


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

