
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



st: RE: Data management on multiple rows per subject


From   Joe Canner <[email protected]>
To   "[email protected]" <[email protected]>
Subject   st: RE: Data management on multiple rows per subject
Date   Tue, 27 Aug 2013 20:46:11 +0000

. bys id (event): gen double Astart=start[1]
. bys id (event): gen double Aend=end[1]
. drop if event=="B" & (!inrange(start,Astart,Aend) | !inrange(end,Astart,Aend))

This can probably be simplified if your data are more predictable (as it sounds like they might be), but you get the idea.
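As one possible simplification, here is a minimal sketch on the sample rows from the original message. It assumes each id has exactly one A row, that event is a string so A sorts before B, and that start and end are %tc datetimes (hence stored as double). The names start_s and end_s are made up for the illustration. Because a B range is said to be either fully inside A or not, the sketch tests start only.

```stata
* Rebuild the sample data; start_s and end_s are hypothetical helper names
clear
input byte id str18 start_s str18 end_s str1 event
 9 "10jan2011 00:00:00" "10jan2011 21:29:59" "A"
10 "19dec2010 00:00:00" "19dec2010 19:59:59" "A"
11 "23jan2011 08:15:00" "24jan2011 18:00:00" "A"
11 "24jan2011 10:14:59" "24jan2011 13:45:00" "B"
11 "26jan2011 06:00:00" "26jan2011 07:00:00" "B"
11 "26jan2011 07:30:00" "26jan2011 18:00:00" "B"
12 "17dec2010 02:44:59" "18dec2010 01:30:00" "A"
end

* Datetimes are stored as double so no milliseconds are lost
gen double start = clock(start_s, "DMYhms")
gen double end   = clock(end_s, "DMYhms")
format start end %tc
drop start_s end_s

* Sorting on event within id puts the A row first (assumes one A row per id)
bysort id (event): gen double Astart = start[1]
bysort id (event): gen double Aend   = end[1]

* Assumed: a B range is either fully inside A or fully outside,
* so checking start alone is enough
drop if event == "B" & !inrange(start, Astart, Aend)
drop Astart Aend

list, sepby(id)
```

If some B ranges could straddle the A boundaries, the check on end from the commands above would still be needed.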

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Thomas Speidel
Sent: Tuesday, August 27, 2013 4:33 PM
To: [email protected]
Subject: st: Data management on multiple rows per subject

I have a data management problem.  This is a sample of the data, which has 
multiple rows per subject:

   
   +------------------------------------------------------+
   | id                start                  end   event |
   |------------------------------------------------------|
   |  9   10jan2011 00:00:00   10jan2011 21:29:59       A |
   |------------------------------------------------------|
   | 10   19dec2010 00:00:00   19dec2010 19:59:59       A |
   |------------------------------------------------------|
   | 11   23jan2011 08:15:00   24jan2011 18:00:00       A |
   | 11   24jan2011 10:14:59   24jan2011 13:45:00       B |
   | 11   26jan2011 06:00:00   26jan2011 07:00:00       B |
   | 11   26jan2011 07:30:00   26jan2011 18:00:00       B |
   |------------------------------------------------------|
   | 12   17dec2010 02:44:59   18dec2010 01:30:00       A |
   +------------------------------------------------------+


Within id, I need to drop the B rows whose date range is not contained 
in the date range of the A row.
So, in the example above, this would be the result:

   
   +------------------------------------------------------+
   | id                start                  end   event |
   |------------------------------------------------------|
   |  9   10jan2011 00:00:00   10jan2011 21:29:59       A |
   |------------------------------------------------------|
   | 10   19dec2010 00:00:00   19dec2010 19:59:59       A |
   |------------------------------------------------------|
   | 11   23jan2011 08:15:00   24jan2011 18:00:00       A |
   | 11   24jan2011 10:14:59   24jan2011 13:45:00       B |
   |------------------------------------------------------|
   | 12   17dec2010 02:44:59   18dec2010 01:30:00       A |
   +------------------------------------------------------+

I know how to solve this using reshape, but the data are too complex to 
handle comfortably in reshape (too many rows per subject in some 
instances).
I thought of subscripting, but did not get far. Within subject, date 
ranges are either fully contained in A or they are not.
Thank you.

-- 
Thomas Speidel
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/

