
st: RE: combinations of while, if, by, and foreach commands


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: combinations of while, if, by, and foreach commands
Date   Fri, 19 Sep 2003 00:38:55 +0100

Paul Metcalfe wrote:

> Using Stata 8.1 SE, I'm trying to put together a loop for 
> what I imagine should be quite a straightforward task. 
> 
> The relevant part of my data looks like the following:
> 
> id        cons       time
> 5001574   .          32
> 5001574   .          31
> 5001574   0.278548   30
> 5001574   0.271683   29
> 5001574   0.378903   28
> 5001574   0.291933   27
> 5001574   0.319807   26
> 5001574   .          25
> 5001574   .          24
> 5001574   .          23
> 5001574   .          22
> 5001574   .          21
> 5001574   0.348804   20
> 5001574   0.247645   19
> 5001574   0.306516   18
> 5001574   0.303717   17
> 5001574   0.310532   16
> 
> I have about 8000 different id values in the full dataset, 
> observed for different stretches of time with different 
> numbers of gaps in the cons variable in different places 
> across the set of ids. 
> 
> What I would like to do is drop the observations at the end 
> of the time series where cons=., but keep the observations 
> in the middle.  There are varying numbers of gaps in the 
> cons time series for different ids, and I want to keep all 
> of them except the observations at the end of the time 
> series for each id. I've tried a number of different 
> combinations of the while, if and foreach commands, but 
> none of them has worked, so I hoped that someone on the 
> list could help.

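This seems a bit awkward, but it satisfies a Stataish 
preference for -by:- over looping over observations. 
I write t for your time variable, and I deal with blocks 
of missing values at both ends of each panel. 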

There is a block of missing values to drop at 
the start of each panel if and only if the _first_ 
value in each panel is missing. So let's get the 
cumulative sum of mi(cons) in that case. 

bysort id (t) : gen todrop = sum(mi(cons)) if mi(cons[1]) 

The values to drop will be those in which this cumulative 
sum is exactly the same as _n: for example, if 
the first three only are missing, the cumulative 
sum will be 1, 2, 3, 3 ... and only for the 
first three is this true. Here we lean on the fact 
that under -by id:- _n is evaluated within each panel. 
(Similarly, in the code above [1] always means the 
first observation within each panel.) 

bysort id (t) : drop if todrop == _n 
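
To see the counting argument on a concrete case, here is a small 
sketch. The data are made up for illustration (one panel, two 
leading missings, one interior gap), and obsno is just a helper 
variable I introduce so that _n can be listed next to the 
cumulative sum. 

```stata
* Toy panel (hypothetical values): two leading missings, one interior gap
clear
input id t cons
1 1 .
1 2 .
1 3 0.31
1 4 .
1 5 0.28
end

* cumulative count of missings, defined only when the panel starts with a missing
bysort id (t) : gen todrop = sum(mi(cons)) if mi(cons[1])

* helper (not in the original code): the within-panel observation number
bysort id (t) : gen obsno = _n

list id t cons todrop obsno, sepby(id)

* todrop runs 1, 2, 2, 3, 3 against obsno 1, 2, 3, 4, 5,
* so only the two leading observations satisfy todrop == obsno
count if todrop == obsno
assert r(N) == 2
```

Once a nonmissing value has been seen, the cumulative sum falls 
behind _n and can never catch up, which is why the interior gap 
at t = 4 is left alone. 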

To get the blocks at the end of each panel, 
we work with time measured backwards: 

gen bt = -t 
bysort id (bt) : gen todrop2 = sum(mi(cons)) if mi(cons[1]) 
bysort id (bt) : drop if todrop2 == _n

after which we clean up: 

drop todrop todrop2 bt 
tsset
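
As a check on the whole recipe, here is a sketch that runs the 
same steps on made-up data: panel 1 has missings at both ends 
and an interior gap, panel 2 has only an interior gap. The values 
are hypothetical, and because the toy data have not been -tsset- 
I finish with -sort id t- rather than -tsset-. 

```stata
* Toy data (hypothetical values)
clear
input id t cons
1 1 .
1 2 0.31
1 3 .
1 4 0.28
1 5 .
1 6 .
2 1 0.25
2 2 .
2 3 0.30
end

* leading blocks of missings
bysort id (t) : gen todrop = sum(mi(cons)) if mi(cons[1])
bysort id (t) : drop if todrop == _n

* trailing blocks of missings, using time measured backwards
gen bt = -t
bysort id (bt) : gen todrop2 = sum(mi(cons)) if mi(cons[1])
bysort id (bt) : drop if todrop2 == _n

* clean up and restore time order
drop todrop todrop2 bt
sort id t

* checks: every panel now starts and ends on a nonmissing value,
* and the two interior gaps are still there
by id : assert !mi(cons[1]) & !mi(cons[_N])
count
assert r(N) == 6
count if mi(cons)
assert r(N) == 2
```

A panel in which cons is missing throughout would lose all its 
observations at the first -drop-, which seems the right result 
for this problem. 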

Nick 
n.j.cox@durham.ac.uk 
