st: RE: Missing values

From   "Nick Cox" <>
To   <>
Subject   st: RE: Missing values
Date   Tue, 25 Jun 2002 15:53:10 +0100

Babigumira Ronnie

> I have data on household expenditure. Households bought "x" kgs (quan) of
> food crops (exp) worth a certain amount of money (unitvalu). However, I
> have cases where the quantity bought is missing (either because the
> household couldn't recall or because of an error in data collection), but
> the amount spent by these households is known. The variables are
> lc1code   housecode  exp   quan  unitvalu
> 11233      112331    566    1        500
> I would like to replace the missing quantities purchased with community
> (lc1code) averages. If the lc1code, food item (exp), unitvalu are the same
> then we can deduce the quantity (quan) that can be purchased by that
> amount of money. What I now want to do is to replace all missing
> quantities with a value imputed from community averages. To make it
> clearer:
> If we know that 500/= buys 1kg of cassava in a given community, then a
> respondent in the community who spends 500/= on cassava should
> automatically be purchasing 1kg.
> I want to write code that would do this automatically for all
> missing cases; however, I can't figure out where to start. I would
> appreciate any help.

Nick Winter has made suggestions using -collapse-.

An alternative would be to use -egen-.

Form groups based on your three variables:

egen mean = mean(quan), by(lc1code exp unitvalu)
replace quan = mean if missing(quan)
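
To see what those two lines do, here is a minimal sketch on made-up data.
Only the first observation comes from the example row in the question; the
other three rows, the variable name meanquan, and the indicator imputed are
assumptions added for illustration. The indicator is optional, but it keeps
a record of which quantities were filled in.

```stata
clear
input lc1code housecode exp quan unitvalu
11233 112331 566 1 500
11233 112332 566 . 500
11233 112333 566 1 500
11233 112334 566 . 750
end

* Mean of the nonmissing quantities within each community, item and value
egen meanquan = mean(quan), by(lc1code exp unitvalu)

* Mark the observations about to be filled in (hypothetical helper variable)
generate byte imputed = missing(quan) & !missing(meanquan)

* Fill in missing quantities from the group mean
replace quan = meanquan if missing(quan)

* Groups with no observed quantity at all are left missing
count if missing(quan)
list, sepby(lc1code exp unitvalu)
```

In this sketch household 112332 picks up the mean of its group, while
household 112334 stays missing because no other household in its group
reported a quantity.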

However, if this still leaves missing values,
you might want to go on to a more general regression


