
st: RE: loop to fill in missing observations

From	"Nick Cox" <[email protected]>
To	<[email protected]>
Subject	st: RE: loop to fill in missing observations
Date	Fri, 15 Jun 2007 17:47:26 +0100

"Missing" in Stata circles should not be used to mean "omitted" or 
"not present". "Missing" means missing values on (existing) variables. 

Anyway, each country and year should be 
represented at least once. That can be done in place or by a -merge- 
with a suitable file. Merge mavens can work out the latter. 
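
For readers who want the -merge- route spelled out, here is a minimal sketch, not a worked-out solution. It assumes the -merge m:1- syntax of Stata 11 or later (the syntax current when this was posted was different), the 28 countries and the years 1975-2002 from the question, and a tempfile called skeleton, which is a made-up name. The four toy rows stand in for the real data.

```stata
* Toy data standing in for the real elections file
clear
input countryn year elecn
1 1990 1
1 1994 2
1 1994 3
2 1989 1
end

* Build a skeleton with exactly one row per country-year
preserve
clear
set obs 28
gen countryn = _n
expand 28
bysort countryn: gen year = 1974 + _n
tempfile skeleton
save `skeleton'
restore

* Several elections can share a country-year, so the data side is "m"
merge m:1 countryn year using `skeleton'
sort countryn year elecn
```

Observations with _merge == 2 are the country-years that were absent from the original data; their other variables are missing.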

I see that Fabrizio has set up a variable -fillin- with marks 
at places he wants to fill in. I am going to ignore this. I am
going to assume a clean dataset with just the data. 

Here is an in-place solution.  

qui forval x = 1/28 { 
	forval y = 1975/2002 { 
		count if countryn == `x' & year == `y' 
		if r(N) == 0 { 
			set obs `= _N + 1' 
			replace countryn = `x' in l 
			replace year = `y' in l 
		}
	}
} 			

In words, 

for each country { 
	for each year { 
		count how many obs for that country and year 
		if there aren't any { 
			add an extra observation 
			set that observation to the right values 
		} 
	}
}
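
To see the loop at work, here is a small sketch on the sample rows from the question below, without the -fillin- marker rows. The ranges are narrowed to 2 countries and the years 1989-2000 purely to keep the listing short; that narrowing is an assumption for illustration, not part of the original problem.

```stata
* Sample rows from the question (marker rows omitted)
clear
input countryn year elecn
1 1990 1
1 1994 2
1 1994 3
1 1997 4
2 1989 1
2 1992 2
2 1995 3
2 1999 4
2 2000 5
end

* Same logic as the loop above, on narrowed ranges
qui forval x = 1/2 {
	forval y = 1989/2000 {
		count if countryn == `x' & year == `y'
		if r(N) == 0 {
			set obs `= _N + 1'
			replace countryn = `x' in l
			replace year = `y' in l
		}
	}
}

* New observations have elecn missing; existing ones are untouched
sort countryn year elecn
list, sepby(countryn)
```

The two elections of 1994 in country 1 both survive, because an observation is added only when the count for that country and year is zero.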

The difference between using -set obs- to add an extra observation
and -expand 2 in l- is that in the former all new values
are born missing, whereas -expand- copies the values of the last
observation. 
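
A quick way to check that difference is the sketch below, which uses two made-up toy rows and lists the last observation after each command.

```stata
clear
input countryn year elecn
1 1990 1
1 1994 2
end

* -set obs- appends an observation in which every variable is missing
set obs `= _N + 1'
list in l

* Undo that, then append with -expand-: the new observation is a copy
* of the last one (countryn 1, year 1994, elecn 2)
drop in l
expand 2 in l
list in l
```

With -expand- the copied values of any other variables would have to be blanked out by hand, which is why -set obs- is the more convenient choice here.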

Nick 
[email protected] 

Fabrizio Gilardi
 
> I have a dataset of national elections in 28 countries. Observations
> are elections. This means that there can be several elections in the
> same year, and on the other hand only years when an election was held
> are included in the dataset.
> 
> I want to fill in missing years for each country. My idea was to loop
> over countries and years to check for every country if a given year
> is present, and if not fill it in. To do so, I have created an
> appropriate number of missing observations to be filled in, and a
> counter variable to identify them.
> 
> Concretely, the dataset looks like this:
> 
> countryn		year	 elecn	fillin
> 1			1990	1		.
> 1			1994	2		.
> 1			1994	3		.
> 1			1997	4		.
> 2			1989	1		.
> 2			1992	2		.
> 2			1995	3		.
> 2			1999	4		.
> 2			2000	5		.
> .			.		.		1
> .			.		.		2
> .			.		.		3
> 
> 
> And my code is the following:
> 
> g n=.
> local z=1
> qui forval x=1/28 {
> 	forval y=1975/2002 {
> 		sum year if year==`y' & countryn==`x'
> 		replace n=r(N)
> 		replace year=`y' if n==0 & fillin==`z'
> 		replace countryn=`x' if n==0 & fillin==`z'
> 		count if countryn==`x' & year==`y' & fillin==`z'
> 		local w=r(N)
> 		local z=`z'+`w'
> 	}
> }
> 
> Now, for some countries it works fine, but for most some missing
> years are not filled in. It does not seem to depend on whether in the
> country there was more than one election in some year, and I cannot
> find any pattern that could help me identify the problem.
> 
> What am I doing wrong?

