
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Re: Fill Missing values at the End


From   David Hoaglin <[email protected]>
To   [email protected]
Subject   Re: st: Re: Fill Missing values at the End
Date   Thu, 9 Jan 2014 07:30:19 -0500

Dear Sadia,

If I have the right thread, one of your earlier postings, stating the
problem, is the one that I have copied (in part) below.

Filling in those missing observations is a form of imputation. In
extrapolating, using a rate of growth estimated from the available
data, you are making a strong assumption.  Many people would not be
comfortable with such a strong assumption.

After filling in those observations, will you analyze the completed
data as panel data?  If so, you face difficulties.  One of them is
that, because the imputed observations are not actual observations,
they do not vary as actual data would, and hence estimates of standard
errors will be biased downward.  In effect, you will be fooling
yourself about the number of observations you have.  Another potential
problem is bias, introduced by the extrapolation model.
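
To make the point about standard errors concrete, here is a minimal sketch in Python (not Stata). Everything in it is an assumption for illustration: the five log-values are hypothetical, the helper `ols_slope_se` is made up, and the "imputation" places the missing years exactly on the fitted trend line, which is a simplification of the growth-rate extrapolation described below. The fitted slope does not change, but its reported standard error shrinks, even though no new information was added.

```python
import math
from typing import List, Tuple


def ols_slope_se(x: List[float], y: List[float]) -> Tuple[float, float, float]:
    """Return (intercept, slope, standard error of slope) for simple OLS."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    # Residual sum of squares around the fitted line.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    return intercept, slope, se_slope


# Hypothetical observed data: log of the variable for 1991-1995.
years = [1991, 1992, 1993, 1994, 1995]
log_values = [4.60, 4.68, 4.66, 4.75, 4.78]

a, b, se_observed = ols_slope_se(years, log_values)

# "Impute" 1996-2000 exactly on the fitted trend (zero residuals).
future = [1996, 1997, 1998, 1999, 2000]
imputed = [a + b * t for t in future]

_, b_completed, se_completed = ols_slope_se(years + future, log_values + imputed)

print(f"slope, observed only : {b:.4f} (SE {se_observed:.4f})")
print(f"slope, completed data: {b_completed:.4f} (SE {se_completed:.4f})")
```

The imputed points have zero residuals, so they leave the residual sum of squares unchanged while increasing both the apparent degrees of freedom and the spread of the years. That is why the second standard error is smaller.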

An extensive literature discusses the challenges of missing data and a
variety of approaches for dealing with them.  One approach, which you
may want to consider, is multiple imputation (mi).  Stata has flexible
and nicely designed commands for mi.

David Hoaglin

[I copied the text below from an earlier posting.]

I am working with panel data. For some of the individuals, the starting
or ending values are missing. I have to fill in these values by first
calculating the growth rate of the five known values.

For example, for the individual Aqueela, the values for the years
1996-2000 are missing. I will first calculate the growth rate of the
values from 1991-1995. The growth rate is 1.05%. Now the value in
1996 = value in 1995 * (1 + growth rate).

And the value in 1997 = value in 1996 * (1 + growth rate).

And so on.

For the second individual, the starting values are missing from 1990-1994.
First I will have to find the growth rate of the values from 1995-1999.
Then the value in 1994 = value in 1995 / (1 + growth rate), and

the value in 1993 = value in 1994 / (1 + growth rate), and so on.

I only have to fill in the missing values if the number of missing
values is less than or equal to five.
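
The procedure described above can be sketched as follows, in Python rather than Stata. This is only an illustration under stated assumptions: the function name `fill_ends` and its parameters are hypothetical, the series is one individual's values in year order with `None` for missing, all known values are positive, and the "growth rate" is taken to be the compound average growth rate over the five adjacent known values (the post does not say how the rate is computed).

```python
from typing import List, Optional


def fill_ends(values: List[Optional[float]],
              window: int = 5,
              max_gap: int = 5) -> List[Optional[float]]:
    """Fill leading/trailing missing values by compound-growth extrapolation.

    A run of missing values at either end is filled only if it has at most
    `max_gap` entries and the `window` adjacent known values are all present.
    """
    out = list(values)
    known = [i for i, v in enumerate(out) if v is not None]
    if len(known) < window:
        return out  # not enough data to estimate a growth rate
    first, last = known[0], known[-1]

    def growth(block: List[float]) -> float:
        # Assumed definition: compound average growth rate over the block.
        return (block[-1] / block[0]) ** (1.0 / (len(block) - 1)) - 1.0

    # Trailing gap: value[t] = value[t-1] * (1 + g)
    tail = out[last - window + 1: last + 1]
    n_trailing = len(out) - 1 - last
    if 0 < n_trailing <= max_gap and None not in tail:
        g = growth(tail)
        for i in range(last + 1, len(out)):
            out[i] = out[i - 1] * (1.0 + g)

    # Leading gap: value[t] = value[t+1] / (1 + g)
    head = out[first: first + window]
    if 0 < first <= max_gap and None not in head:
        g = growth(head)
        for i in range(first - 1, -1, -1):
            out[i] = out[i + 1] / (1.0 + g)

    return out


# Hypothetical series: five known values growing 5% a year, then five missing.
example = fill_ends([100.0 * 1.05 ** k for k in range(5)] + [None] * 5)
```

Note that this only mechanizes the extrapolation; it does not address the concerns about standard errors and bias raised in the reply above.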
