Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | David Hoaglin <dchoaglin@gmail.com> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: Re: Fill Missing values at the End |
Date | Thu, 9 Jan 2014 07:30:19 -0500 |
Dear Sadia,

If I have the right thread, one of your earlier postings, stating the problem, is the one that I have copied (in part) below.

Filling in those missing observations is a form of imputation. In extrapolating with a rate of growth estimated from the available data, you are making a strong assumption, namely that the estimated rate also holds in the years you did not observe. Many people would not be comfortable with such a strong assumption.

After filling in those observations, will you analyze the completed data as panel data? If so, you face difficulties. One of them is that, because the imputed observations are not actual observations, they do not vary as actual data would, and hence estimates of standard errors will be biased downward. In effect, you will be fooling yourself about the number of observations you have. Another potential problem is bias introduced by the extrapolation model.

An extensive literature discusses the challenges of missing data and a variety of approaches for dealing with them. One approach, which you may want to consider, is multiple imputation (mi). Stata has flexible and nicely designed commands for mi.

David Hoaglin

[I copied the text below from an earlier posting.]

I am working with panel data. For some of the individuals, the starting or ending values are missing. I have to fill in these values by first calculating the growth rate of the five nearest known values.

For example, for the individual Aqueela the values for the years 1996-2000 are missing. I will first calculate the growth rate of the values from 1991-1995. The growth rate is 1.05%. Now the value in 1996 = value in 1995 * (1 + growth rate), the value in 1997 = value in 1996 * (1 + growth rate), and so on.

For the second individual, the starting values are missing from 1990-1994. First I will have to find the growth rate of the values from 1995-1999. Then the value in 1994 = value in 1995 / (1 + growth rate), the value in 1993 = value in 1994 / (1 + growth rate), and so on.

I only have to fill in the missing values if the number of missing values is less than or equal to five.
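The fill rule described in the quoted posting can be sketched as follows. This is only an illustration in Python, not Stata code and not anything posted in the thread. The posting does not say how the growth rate is calculated, so the sketch assumes an average compound rate over the five nearest known values. It also assumes that values are positive, that there are no gaps in the middle of a series, and that the limit of five applies separately to each end. The names compound_growth and fill_ends are hypothetical.

```python
def compound_growth(values):
    """Average compound growth rate per period across the given values.

    Assumption: the posting does not define "growth rate"; this uses
    (last / first) ** (1 / (n - 1)) - 1 on positive values.
    """
    return (values[-1] / values[0]) ** (1.0 / (len(values) - 1)) - 1.0


def fill_ends(series, window=5, max_gap=5):
    """Extrapolate missing values at the start and end of one individual's series.

    `series` is a list ordered by year, with None marking a missing value.
    A leading or trailing gap is filled only if it has at most `max_gap`
    missing values. Assumes no missing values between known ones.
    """
    filled = list(series)
    known = [i for i, v in enumerate(series) if v is not None]
    if len(known) < 2:
        return filled  # not enough data to estimate a growth rate

    first, last = known[0], known[-1]
    known_values = [series[i] for i in known]

    # Trailing gap: value[t] = value[t-1] * (1 + g), with g from the last known values.
    n_trailing = len(series) - 1 - last
    if 0 < n_trailing <= max_gap:
        g = compound_growth(known_values[-window:])
        for i in range(last + 1, len(series)):
            filled[i] = filled[i - 1] * (1.0 + g)

    # Leading gap: value[t] = value[t+1] / (1 + g), with g from the first known values.
    if 0 < first <= max_gap:
        g = compound_growth(known_values[:window])
        for i in range(first - 1, -1, -1):
            filled[i] = filled[i + 1] / (1.0 + g)

    return filled


# Example: five known values followed by five missing ones.
example = [100.0, 105.0, 110.25, 115.7625, 121.550625] + [None] * 5
print(fill_ends(example))
```

Values filled this way are deterministic functions of the observed ones, which is exactly why standard errors from the completed data would be too small.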