
st: RE: count non-missing elements in a series


From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: count non-missing elements in a series
Date: Thu, 25 Mar 2004 10:24:01 -0000

You don't give a very clear idea 
of your data structure. I guess that you 
have panel data, something like 

. tsset country year 

and a series of variables for different 
industries, say 

industry1-industry10 

What you want in question 1 can be got in 
various ways. Here is one: 

The first value for each panel is 1 if 
the investment value is not missing and 0 otherwise: 

. bysort country (year) : gen nrun = !mi(industry1) if _n == 1 

Subsequent values are always 0 if the investment 
value is missing and 1 greater than the previous 
value otherwise. 

. by country : replace nrun = cond(mi(industry1), 0, nrun[_n-1] + 1) if _n > 1 
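
As a quick check of that logic, here is a minimal sketch on made-up data. 
The values are invented, and the names country and year simply follow 
my guess at your data structure; only one country is used, with 
investment missing in 1968-1969. 

```stata
* Toy panel (invented values): one country, 1965-1972
clear
set obs 8
gen country = 1
gen year = 1964 + _n
gen industry1 = 100 + _n
replace industry1 = . if inlist(year, 1968, 1969)

* First observation in each panel: 1 if not missing, 0 otherwise
bysort country (year) : gen nrun = !mi(industry1) if _n == 1

* Later observations: reset to 0 on missing, else previous count + 1
by country : replace nrun = cond(mi(industry1), 0, nrun[_n-1] + 1) if _n > 1

list year industry1 nrun, sep(0)
```

Here nrun should run 1, 2, 3 for 1965-1967, be 0 in 1968-1969 and 
start again from 1 in 1970. This works because -replace- goes through 
the observations in order, so nrun[_n-1] is already updated when it 
is used. 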

Another way of doing it is to use -tsspell- from SSC. 
Install that by 

. ssc inst tsspell

Then for say -industry1- the definition of a spell is 
just a sequence of non-missing values of -industry1-. 

. tsspell, c(industry1 < .) 

The variable _seq created by -tsspell- is what 
you want. Spells are determined automatically 
and separately for each panel. 

If different industries have different patterns
of missing values, you would need something 
more like this: 

forval i = 1/10 { 
	tsspell, c(industry`i' < .) spell(_spell`i') seq(_seq`i') end(_end`i') 
} 

After which you might want to -drop- some of the created variables. 
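
If -tsspell- is not to hand, much the same loop can be sketched with 
by: alone. This is only an illustration: the data are invented, just 
two industries are used to keep it short, and the names nrun1 and 
nrun2 are my own choice, not anything -tsspell- would create. 

```stata
* Toy panel (invented values): 2 countries, 1965-1970, 2 industries
clear
set obs 12
gen country = cond(_n <= 6, 1, 2)
bysort country : gen year = 1964 + _n
gen industry1 = 10 * _n
gen industry2 = 5 * _n
replace industry1 = . if year == 1967
replace industry2 = . if inlist(year, 1965, 1969)

* One running count per industry, restarted within each country
sort country year
forval i = 1/2 {
	by country : gen nrun`i' = !mi(industry`i') if _n == 1
	by country : replace nrun`i' = cond(mi(industry`i'), 0, nrun`i'[_n-1] + 1) if _n > 1
}

list country year industry1 nrun1 industry2 nrun2, sepby(country)
```

Each industry gets its own counter, so different patterns of missing 
values are handled separately and there are no extra variables to 
-drop- afterwards. 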
	
Nick 
n.j.cox@durham.ac.uk 

Oleksandr Shepotylo
> 
>     I am using perpetual inventory method to calculate capital stock
> but my dataset has a lot of missing data on investment. I have two
> questions:
> 
>     1. For any given country and industry, I want to create an index
> that will tell me how many non-missing numbers in a row I have prior
> to any year t. For example, I have investment data for country i and
> industry j starting from 1965 to 1975 then missing in 1976-1977 and
> continuing in 1978-2001. The index should start from 1 in 1965 till
> 11 in 1975. then it equals 0 in 1976-1977 and starts from 1 again in
> 1978.
> 
>     2. Less important, but still interesting. How to impute missing
> data if the gap- like in previous example- is less or equal to 2?
> Imputation can be based on average growth in capital stock for last
> 3 years.